196 points by kaycebasques 3 days ago | 11 comments
  • teleforce 3 days ago
    Fun fact: back in 2021, in an interview for 30 years of Linux, Linus was asked what is most special about Linux, and he pointed to the Linux file system as the fastest in the business [1].

    [1] 30 Years of Linux: An Interview With Linus Torvalds: Linux and Git - Part 1:

    https://www.tag1consulting.com/blog/interview-linus-torvalds...

  • michelb 2 days ago
    What are the latest developments on this front? Still ZFS for stability, XFS for performance? Now that we have NAND storage, would we create filesystems differently? Are there ways to make a step-change in overall performance somewhere?

    Disclaimer: I know nothing about filesystems.

    • evil-olive 2 days ago
      > Now that we have NAND storage, would we create filesystems differently? Are there ways to make a step-change in overall performance somewhere?

      you might enjoy Allan Jude's "Scaling ZFS for NVMe" talk [0] from 2022. the tl;dw is that modern NVMe drives have gotten so fast that filesystem developers are having to revisit assumptions that have been true for decades about which operations will be slow vs. fast.

      flash storage that exposed a SATA interface and pretended to be a superbly fast hard drive was an evolutionary improvement of sorts. NVMe is more of a revolutionary improvement, it does away with the spinning-rust assumptions. for example, SATA limits a device to one command queue with 32 commands; NVMe makes it almost unlimited (64k queues with up to 64k commands) [1]

      there is also a corner of the NVMe spec [2] that allows you to treat an SSD directly as key-value storage instead of block storage with a filesystem implementing a key-value layer on top of it. this is promising but AFAIK no mainstream filesystem is using it yet, and I'm not sure if support has even trickled down into consumer drives or if it's market-segmented off for datacenter use only.

      0: https://www.youtube.com/watch?v=v8sl8gj9UnA

      1: https://en.wikipedia.org/wiki/NVM_Express#Comparison_with_AH...

      2: https://nvmexpress.org/specification/key-value-command-set-s...

    • avar 2 days ago
      If you know nothing about filesystems, then you're almost certainly limited by the underlying hardware, not the filesystem.

      And "performance" is meaningless in this context without specifying what you want to perform. Raw sequential read/write throughput? Random metadata operations on directories containing millions of files? Something in between?

      > Are there ways to make a step-change in overall performance somewhere?

      Yes, it's called O_DIRECT (or equivalent), and is very useful to those that actually need it.

      A general-use filesystem is inherently something that requires you to compromise on performance.
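      For the curious, a minimal sketch of what O_DIRECT looks like in practice (Python on Linux; the 4096-byte block size and the path are assumptions, and real code should query the device's logical block size):

```python
import mmap
import os

BLOCK = 4096  # assumed; real code should query the device's logical block size
path = "/var/tmp/direct_demo.bin"

# O_DIRECT requires the buffer address, length, and file offset to be
# block-aligned; an anonymous mmap gives us page-aligned memory.
buf = mmap.mmap(-1, BLOCK)
buf.write(b"A" * BLOCK)

try:
    # O_DIRECT bypasses the page cache: I/O goes straight to the device.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        os.pwrite(fd, buf, 0)
    finally:
        os.close(fd)
    print("wrote", BLOCK, "bytes, bypassing the page cache")
except OSError as e:
    # not every filesystem supports O_DIRECT (tmpfs, for example, does not)
    print("O_DIRECT not supported here:", e)
finally:
    if os.path.exists(path):
        os.remove(path)
```

      Whether this is faster depends entirely on the workload: O_DIRECT trades the page cache's read-ahead and write-behind for predictable, unbuffered I/O, which is why it mostly benefits applications (like databases) that do their own caching.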

    • chaostheo a day ago
      Still ZFS for stability?? Pull the power plug while writing and you'll see what that stability is worth in reality. I would say ZFS is only safe as long as you have a UPS in front of your server; otherwise, after a power outage, you had better have a good backup and be able to do a full restore.
    • 6SixTy 2 days ago
      A lot of NAND SSDs have controllers that take away much of the control the rest of the system has over the actual flash cells. Pretty much any flash-specific file systems are designed for embedded applications where there is no controller to perform wear leveling or TRIM operations.

      Most of the benefits that could still be gained are at the application level, maybe the CPU, not the OS or the filesystem.

    • mtillman 2 days ago
      XFS is a file system and ZFS is both a file system and a volume manager. ZFS also recently added RAIDZ expansion which I was really excited about.
  • pjdesno 3 days ago
    Note that there are some significant non-Unix filesystems along the way. In particular, a few years later: "Reimplementing the Cedar File System Using Logging and Group Commit", Robert Hagmann, Xerox PARC. SIGOPS Operating Systems Review, November 1987.

    That's the origin of the journaling approach used in post-FFS unix file systems through the 90s. (I don't count ext2 as "post-FFS", as it's basically FFS on a disk with LBA addressing)

    • convolvatron 3 days ago
      wow. I always thought this was due to M. Rosenblum in 1992. Looking at the paper now it doesn't reference this one. I wonder what other group ideation happened in the late 80s around this. I guess it was already well established in databases.
      • Upvoter33 2 days ago
        Journaling and log-structured file systems are different (but related) things
    • gjvc 3 days ago
      Xerox PARC strikes again
      • pjmlp 2 days ago
        Still waiting for the Mesa, Cedar, Smalltalk, and Interlisp-D set of approaches to software development, OSes, and systems programming to take over, but I guess we can't have everything at once, and should be happy with what we already got into the mainstream out of Xerox PARC.
        • gjvc 2 days ago
          They don't need to take over in order to be a wild success. I'm reminded of the 1997 speech from Steve Jobs where he said "We have to let go of this notion that for Apple to win, Microsoft has to lose".

          Perhaps AIs will recognise their value and convince people! :-)

  • kazinator 2 days ago
    This is kind of sparse, pardon the pun. The previous chapter covers one Unix filesystem and that's it, as if that's all there was in 1974-1984. The 1984 chapter is again just one BSD Unix filesystem.

    How about something lesser known, but excellent?

  • gjvc 3 days ago
    As the old saying goes: "You can tune a filesystem, but you can't tune a fish."
  • AdieuToLogic 2 days ago
    FFS earned its flowers, no doubt, and should be respected for its historic contributions.

    FreeBSD's ZFS[0] is really awesome and has proven to be a deserving successor in all scenarios when legacy needs[1] are not a concern.

    0 - https://docs.freebsd.org/en/books/handbook/zfs/

    1 - https://docs.freebsd.org/en/books/handbook/filesystems/

    • homebrewer 2 days ago
      Not all; Netflix continues to serve video from UFS:

      https://news.ycombinator.com/item?id=27575405

    • pjdesno 2 days ago
      There are a couple of generations of filesystems in between FFS and ZFS. And for modern filesystems it's also worth looking at btrfs, Microsoft's Resilient File System, and Apple APFS.

      Also note that ext2 is basically a re-implementation of FFS, without the interleaving because it's designed for disks with integrated controllers.

      • adrian_b 2 days ago
        Next in date after FFS, I consider as one of the most innovative and influential filesystems the HPFS (High-Performance File System), which was introduced in November 1989, together with the OS/2 version 1.2 operating system. Among other new features, HPFS has introduced extended file attributes and B-tree directories (both features are ubiquitous in modern filesystems).

        Two very important filesystems that have been heavily influenced by HPFS have been Microsoft NTFS and Silicon Graphics XFS, both introduced in 1993. XFS combined ideas from HPFS with some from the earlier Silicon Graphics filesystem, Extent FS, which had been introduced in October 1988 and which had much better performance for big files than the Berkeley FFS (because file blocks were allocated sequentially whenever possible, instead of being addressed indirectly and scattered on the disk).

        Journaling in filesystems was introduced by IBM, in JFS (Journaled File System, February 1990).

        An influential research report was "Beating the I/O Bottleneck: A Case for Log-Structured File Systems", by John K. Ousterhout & Fred Douglis, written in 1988-10 at Berkeley. A paper derived from it was presented at USENIX in 1990-06.

        • kragen 2 days ago
          Didn't the Oberon filesystem have B-tree directories before HPFS?

          And what do you mean by "extended file attributes"? Nowadays I interpret that phrase as meaning "any user-level file metadata that isn't part of 6th Edition Unix", such as ACLs or the immutability bit, but by that definition almost every pre-Unix filesystem had "extended file attributes".

          https://en.m.wikipedia.org/wiki/Extended_file_attributes has a definition which sounds like it might be what you mean? It also sounds like you could easily implement it as a library, though, instead of including it in the filesystem.

          • adrian_b 2 days ago
            Extended file attributes can contain arbitrary data associated with the file, whatever the user chooses, unlike file metadata with fixed content, like file name, file size, file owner, time of last modification and so on.

            I am not aware of any earlier filesystem with extended file attributes. What you have mentioned are just examples of fixed file metadata, which are defined by the operating system and which cannot be defined or extended by the user. The only earlier similar feature is the resource fork of the Mac OS files, but that had a different purpose and it was accessed in a different way than extended file attributes, which are a set of name + value pairs, so they are accessed with a special get/set API, not with file read/write functions.

            Once a method for storing arbitrary file metadata exists, it can also be used to implement features like access control lists or to implement any of the legacy file metadata.

            The Oberon file system was described only summarily in a paper published in 1989-09, i.e. 2 months before the commercial introduction of HPFS by IBM and Microsoft, who must have been working on HPFS for a few years before that.

            Therefore these 2 filesystems must have been developed completely independently and the idea of using B-tree directories, which both teams have presented as novel, must have occurred independently to them.

            While the Oberon filesystem has the merit of introducing the same improvement in directory implementation independently and simultaneously, its historical impact has been much smaller than that of HPFS, which had a significant user base and became well known immediately, all over the world, strongly influencing the filesystems developed after it.

            • kragen 2 days ago
              If you just want to store arbitrary name-value pairs associated with a file, it seems like you could use any filesystem to do that. Here's an example implementation:

                  import json

                  def get(filename, key):
                      try:
                          # metadata lives in a ".ea" sidecar file next to the target
                          with open(filename + ".ea") as f:
                              metadata = json.load(f)
                      except FileNotFoundError:
                          raise KeyError(key)
                      return metadata[key]
              
                  def set(filename, key, value):
                      ...
                      # left as an exercise to the reader
              
              A fully robust implementation of the facility requires handling race conditions between concurrent writers, eventual metadata removal when the underlying file is removed, some kind of thinking about how permissions to the file and permissions to the extended attributes relate, and keeping the extended-attribute storage files from cluttering up the usual user interfaces. Still, given how trivial it is to implement the facility if it isn't built into the filesystem, building it into the filesystem seems more like a tradeoff between obvious design alternatives and less like an innovation.

              There are lots of ways you can do things like this; Windows 95 started stashing long-filename metadata in the filesystem directory as extra invalid hidden system volume-name read-only filename entries, for example, and 4DOS (and Total Commander) stored timestamps and arbitrarily-long file descriptions in a file in every directory called DESCRIPT.ION.

              (With respect to the concurrency and robustness issues, keep in mind that HPFS would regularly lose data if you lost power, disk drives at the time would regularly suffer physical damage if you lost power unless the disk head was parked, common implementations of Pascal at the time would fail to flush buffered I/O data if you failed to close files, PCs mostly weren't multitasking, and users could remove floppy disks from drives at any time without warning, so even a pretty non-robust implementation would have been adequate.)

              As for Oberon's filesystem, it wasn't quite the same improvement in directory implementation; in Oberon there is one directory per disk, while in HPFS there is one directory per folder. This may sound like its filesystem was non-hierarchical, but if I recall correctly, it is in fact hierarchical; files are stored under their pathnames in that single directory.

              I thought that probably, if Oberon was described so late, its predecessor probably did the same thing, but I just checked out Svend Erik Knudsen's dissertation (ETH No. 7346) on Medos-2 4.2 https://oberoncore.ru/_media/library/knudsen_medos-2_a_modul... and p. 67, at the tail end of §6.2, describes the file name directory of Medos's DiskSystem. There is no mention at all of B-trees. (There is also a "file directory", but it's confusingly named; what it contains is basically inodes.)

              It would probably be nice if HPFS had strongly influenced all the filesystems developed after it, but in fact it seems to have had virtually no influence on Sprite-LFS (begun 01990), BSD LFS, ext2fs, FAT32, VFAT, NetApp's WAFL, GFS, HDFS, Ceph, yaffs, or jffs2. I'm sure it had a lot of influence on NTFS. I don't know enough about ZFS, Reiserfs, XFS, HFS+, or exFAT to say; do you? Can you think of any other filesystems that were strongly influenced by HPFS other than NTFS?

              Filesystem design seems to be largely a matter of standing on giants' feet.

              • adrian_b a day ago
                While you are right that it is possible to implement extended file attributes as you describe, even when this is not supported by the operating system, this is extremely inconvenient.

                The reason is that you must have complete control over all software run on such a computer and be able to build everything from source, to be sure that all file operations are done using some standard libraries modified by you.

                For the file attributes to be reliably associated with the file you must ensure that any file copying, moving, renaming, linking or deleting takes into account the file attributes. If you allow any such operations to be done by a non-modified legacy program, you break the file system. (The fact that even some implementations of extended file attributes in the operating system allow user programs to lose the extended file attributes when they are not aware of them is not a feature, but a serious bug.)

                Instead of modifying all file-handling libraries that may be invoked in user programs (e.g. for all programming languages that might be used) it is actually much easier to do the modifications in a single place, i.e. in the file system implementation in the operating system. This can also provide better performance for accessing the file attributes.

                Workarounds like those of Windows 95 have appeared after HPFS and its successors, with the purpose of adding extended file attributes to an existing file system, without changing its on-disk data representation.

                As I have already said, the immediate direct successors of HPFS have been both NTFS and Silicon Graphics XFS (which combined features of HPFS with features of the SGI Extent FS), and these two have been developed and launched almost simultaneously.

                While there have been many file systems that have been developed later than 1989 and which have ignored HPFS at their conception, all of those that have survived have added sooner or later extended file attributes and B-tree directories. This means that all later file systems have either been influenced by HPFS or they have disappeared.

                BSD LFS was started before HPFS (research report in 1988), FAT32 and VFAT were compatible extensions of the older FAT, and ext2fs was intended as just a compatible implementation of the old UNIX file systems, with no innovations. ZFS has included the relevant features of HPFS since the beginning, because it was intended to be better than the XFS used by the competition (but ZFS has never succeeded in approaching XFS in performance, even if it may be better in the features provided and in reliability).

                • kragen 13 hours ago
                  This is a very interesting conversation, thank you!

                  A couple of minor quibbles: FAT32 is not a compatible extension of the older FAT; ext2fs is not a compatible implementation of any old Unix filesystem; having fewer features does not always make software worse; the BSD LFS research report is from 01993 rather than 01988 https://www.usenix.org/legacy/publications/library/proceedin... (it credits Ousterhout with introducing the idea of LFSs in 01989); several of the other filesystems I listed that I don't think are significantly influenced by HPFS have not in fact disappeared; and Windows 95 does not, to my knowledge, support extended attributes on existing filesystems.

                  It's probably worth mentioning that, as you imply, ext2fs did eventually add both extended attributes and B-tree directories, just after being quasi-renamed ext4fs. And I do agree that ext2fs was initially intended to be as boring as possible (without suffering the annoyances of the Minix filesystem) rather than innovative. It just failed to be compatible with BSD FFS.

                  It's a good point that renaming and hardlinking doesn't automatically take into account such extended attributes implemented as a library; that's a plausible reason to implement them in the filesystem. But implementing them in the filesystem doesn't actually help you with respect to copying, unless copying files is an operation the filesystem supports intrinsically, like "reflinks" on some Linux filesystems such as XFS. You still have to modify your file-copying program to copy the extended attributes when that is desired, whether they're provided by the kernel or by a library. Even if the filesystem supports a copy operation, there are typically cases it doesn't handle, such as copying files onto removable media or network filesystems.

                  Moreover, whether the attributes are implemented in the kernel or in a library, you also have to modify other utility programs to handle the extended attributes if they are to be preserved; for example, backup programs (including things like rsync, zip, and tar), filesystem integrity checking programs like Tripwire, and network file server programs.

                  File operations that aren't affected by extended attributes wouldn't need to use the extended-attribute library. Depending on what you wanted to use them for, that could be most of them. And it isn't necessary to reimplement the library in multiple different languages; it's sufficient to implement it in C, Rust, or assembly. (All the other languages would need to have a way to call the C library, but that's also true of extended-attribute system calls.)

                  If you want to use xattrs for ACLs, you can't implement them in a library. But that seems like a backwards way to look at the situation. You don't need extended attributes in the kernel to implement ACLs; you can extend your filesystem to store ACLs in an ACL-specific place. But, if you do have extended attributes in the kernel, that's a candidate way to store ACLs.

                  Implementing features like xattrs as libraries (rather than in the kernel) has advantages as well as disadvantages. For example, depending on how you do it, you don't need to modify your backup programs. You can support them on all filesystems instead of just some filesystems. And getting or setting them doesn't necessarily incur system-call overhead, though that probably depends on caching.

                  I don't think VFAT or exFAT does in fact support either extended file attributes or B-tree directories. According to https://eclecticlight.co/2018/01/12/which-file-systems-and-c..., MacOS fakes xattr support on them using the approach I outlined upthread, but instead of suffixing the filename with .ea, it prefixes it with ._. And I think it does it in the kernel rather than in a library, but I don't really know.

                  BeOS's ability to find files by extended attributes seems like a reasonable justification for doing them in the kernel instead of in a library. Otherwise, though, it seems like an obvious but debatable design choice like case sensitivity, not an innovation.

      • AdieuToLogic 2 days ago
        > There are a couple of generations of filesystems in between FFS and ZFS.

        Quite true. I did not mean to imply ZFS was the immediate successor to FFS, but instead that it is a superior choice currently.

        > And for modern filesystems it's also worth looking at btrfs, Microsoft's Resilient File System, and Apple APFS.

        All good options and which to consider is applicable when running Linux, MS-Windows, and macOS respectively. If FreeBSD+ZFS is a possibility though... :-)

    • isotopp 2 days ago
      It is Sun ZFS.

      And there are many interesting stops between FFS/UFS and ZFS. And ZFS, even today, is not the best system to use in all scenarios, by some margin.

  • WaitWaitWha 2 days ago
    I am so confused. File systems or filesystems?

    I do not recall anything in my AT&T days referred to as "filesystems". When I read the article I was expecting mentions of CP/M, Files-11, TOPS-10, UFS, VSAM, ISAM, BDAM, FAT. MFS for Macs showed up in 1984, and Amiga's OFS followed.

    I mean, not even UFS, which showed up in the 70s, yet still had impact on ext and NTFS...

    I was trying to see if this article was about the history of file systemS from the 70s through the 80s, or just a single file system's evolution; alas, I could not tell.

    • lproven 2 days ago
      I think you did not read it closely enough.

      The article says right at the top that it is part 2 in a series, and that part 1 was 1974, with a link. That part 1 discussed UFS in depth. The point being that (1) by 1984 UFS was far from the state of the art, and (2) you need to understand the 50-year-old SOTA to understand what was different in the 40-year-old SOTA.

    • unwind 2 days ago
      The two spellings are the same. First sentence from the Wikipedia entry [1]:

      "In computing, a file system or filesystem (often abbreviated to FS or fs) governs file organization and access."

      [1]: https://en.wikipedia.org/wiki/File_system
