196 points by kaycebasques 3 days ago | 11 comments
  • teleforce 3 days ago
    Fun fact: back in 2021, in an interview for 30 years of Linux, Linus was asked what is most special about Linux, and he pointed to the Linux file system as the fastest in the business [1].

    [1] 30 Years of Linux: An Interview With Linus Torvalds: Linux and Git - Part 1:

    https://www.tag1consulting.com/blog/interview-linus-torvalds...

  • michelb 2 days ago
    What are the latest developments on this front? Still ZFS for stability, XFS for performance? Now that we have NAND storage, would we create filesystems differently? Are there ways to make a step-change in overall performance somewhere?

    Disclaimer: I know nothing about filesystems.

    • evil-olive 2 days ago
      > Now that we have NAND storage, would we create filesystems differently? Are there ways to make a step-change in overall performance somewhere?

      you might enjoy Allan Jude's "Scaling ZFS for NVMe" talk [0] from 2022. the tl;dw is that modern NVMe drives have gotten so fast that filesystem developers are having to revisit assumptions that have been true for decades about which operations will be slow vs. fast.

      flash storage that exposed a SATA interface and pretended to be a superbly fast hard drive was an evolutionary improvement of sorts. NVMe is more of a revolutionary improvement, it does away with the spinning-rust assumptions. for example, SATA limits a device to one command queue with 32 commands; NVMe makes it almost unlimited (64k queues with up to 64k commands) [1]

      there is also a corner of the NVMe spec [2] that allows you to treat an SSD directly as key-value storage instead of block storage with a filesystem implementing a key-value layer on top of it. this is promising but AFAIK no mainstream filesystem is using it yet, and I'm not sure if support has even trickled down into consumer drives or if it's market-segmented off for datacenter use only.

      0: https://www.youtube.com/watch?v=v8sl8gj9UnA

      1: https://en.wikipedia.org/wiki/NVM_Express#Comparison_with_AH...

      2: https://nvmexpress.org/specification/key-value-command-set-s...

    • avar 2 days ago
      If you know nothing about filesystems, then you're almost certainly limited by the underlying hardware, not the filesystem.

      And "performance" is meaningless in this context without specifying what you want to perform. Raw sequential read/write throughput? Random metadata operations on directories containing millions of files? Something in between?

      > Are there ways to make a step-change in overall performance somewhere?

      Yes, it's called O_DIRECT (or equivalent), and is very useful to those that actually need it.

      A general-use filesystem is inherently something that requires you to compromise on performance.
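      For the curious, a minimal sketch of what O_DIRECT looks like in practice (Python on Linux; the 4096-byte block size and the path are assumptions, and real code should query the device's logical block size):

```python
import mmap
import os

BLOCK = 4096  # assumed; real code should query the device's logical block size
path = "/var/tmp/direct_demo.bin"

# O_DIRECT requires the buffer address, length, and file offset to be
# block-aligned; an anonymous mmap gives us page-aligned memory.
buf = mmap.mmap(-1, BLOCK)
buf.write(b"A" * BLOCK)

try:
    # O_DIRECT bypasses the page cache: I/O goes straight to the device.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        os.pwrite(fd, buf, 0)
    finally:
        os.close(fd)
    print("wrote", BLOCK, "bytes, bypassing the page cache")
except OSError as e:
    # not every filesystem supports O_DIRECT (tmpfs, for example, does not)
    print("O_DIRECT not supported here:", e)
finally:
    if os.path.exists(path):
        os.remove(path)
```

      Whether this is faster depends entirely on the workload: O_DIRECT trades the page cache's read-ahead and write-behind for predictable, unbuffered I/O, which is why it mostly benefits applications (like databases) that do their own caching.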

    • chaostheo a day ago
      Still ZFS for stability?? Pull the power plug while writing and you'll see what that stability is worth in reality. I would say ZFS is only safe as long as you have a UPS in front of your server; otherwise, after a power outage, you had better have a good backup and be able to do a full restore.
    • 6SixTy 2 days ago
      A lot of NAND SSDs have controllers that take away much of the control the rest of the system has over the actual flash cells. Pretty much any flash-specific file systems are designed for embedded applications where there is no controller to perform wear leveling or TRIM operations.

      Most of the benefits that could still be gained are at the application level, maybe the CPU, not the OS or the filesystem.

    • mtillman 2 days ago
      XFS is a file system and ZFS is both a file system and a volume manager. ZFS also recently added RAIDZ expansion which I was really excited about.
  • pjdesno 3 days ago
    Note that there are some significant non-Unix filesystems along the way. In particular, a few years later: "Reimplementing the Cedar File System Using Logging and Group Commit", Robert Hagmann, Xerox PARC. SIGOPS Operating Systems Review, November 1987.

    That's the origin of the journaling approach used in post-FFS unix file systems through the 90s. (I don't count ext2 as "post-FFS", as it's basically FFS on a disk with LBA addressing)

    • convolvatron 3 days ago
      wow. I always thought this was due to M. Rosenblum in 1992. Looking at the paper now it doesn't reference this one. I wonder what other group ideation happened in the late 80s around this. I guess it was already well established in databases.
      • Upvoter33 2 days ago
        Journaling and log-structured file systems are different (but related) things
    • gjvc 3 days ago
      Xerox PARC strikes again
      • pjmlp 2 days ago
        Still waiting for the Mesa, Cedar, Smalltalk, and Interlisp-D set of approaches to software development, OSes, and systems programming to take over, but I guess we can't have everything at once, and should be happy with what we already got into the mainstream out of Xerox PARC.
        • gjvc 2 days ago
          They don't need to take over in order to be a wild success. I'm reminded of the 1997 speech from Steve Jobs where he said "We have to let go of this notion that for Apple to win, Microsoft has to lose".

          Perhaps AIs will recognise their value and convince people! :-)

  • kazinator 2 days ago
    This is kind of sparse, pardon the pun. The previous chapter covers one Unix filesystem and that's it, as if that's all there was in 1974-1984. The 1984 chapter is again just one BSD Unix filesystem.

    How about something lesser known, but excellent?

  • gjvc 3 days ago
    As the old saying goes: "You can tune a filesystem, but you can't tune a fish."
  • AdieuToLogic 2 days ago
    FFS earned its flowers, no doubt, and should be respected for its historic contributions.

    FreeBSD's ZFS[0] is really awesome and has proven to be a deserving successor in all scenarios when legacy needs[1] are not a concern.

    0 - https://docs.freebsd.org/en/books/handbook/zfs/

    1 - https://docs.freebsd.org/en/books/handbook/filesystems/

    • homebrewer 2 days ago
      Not all; Netflix continues to serve video from UFS:

      https://news.ycombinator.com/item?id=27575405

    • pjdesno 2 days ago
      There are a couple of generations of filesystems in between FFS and ZFS. And for modern filesystems it's also worth looking at btrfs, Microsoft's Resilient File System, and Apple APFS.

      Also note that ext2 is basically a re-implementation of FFS, without the interleaving because it's designed for disks with integrated controllers.

      • adrian_b 2 days ago
        Next in date after FFS, I consider as one of the most innovative and influential filesystems the HPFS (High-Performance File System), which was introduced in November 1989, together with the OS/2 version 1.2 operating system. Among other new features, HPFS has introduced extended file attributes and B-tree directories (both features are ubiquitous in modern filesystems).

        Two very important filesystems that have been heavily influenced by HPFS have been Microsoft NTFS and Silicon Graphics XFS, both introduced in 1993. XFS combined ideas from HPFS with some from the earlier Silicon Graphics filesystem, Extent FS, which had been introduced in October 1988 and which had much better performance for big files than the Berkeley FFS (because file blocks were allocated sequentially whenever possible, instead of being addressed indirectly and scattered on the disk).

        Journaling in filesystems was introduced by IBM, in JFS (Journaled File System, February 1990).

        An influential research report was "Beating the I/O Bottleneck: A Case for Log-Structured File Systems", by John K. Ousterhout & Fred Douglis, written in 1988-10 at Berkeley. A paper derived from it was presented at USENIX in 1990-06.

        • kragen 2 days ago
          Didn't the Oberon filesystem have B-tree directories before HPFS?

          And what do you mean by "extended file attributes"? Nowadays I interpret that phrase as meaning "any user-level file metadata that isn't part of 6th Edition Unix", such as ACLs or the immutability bit, but by that definition almost every pre-Unix filesystem had "extended file attributes".

          https://en.m.wikipedia.org/wiki/Extended_file_attributes has a definition which sounds like it might be what you mean? It also sounds like you could easily implement it as a library, though, instead of including it in the filesystem.

          • adrian_b 2 days ago
            Extended file attributes can contain arbitrary data associated with the file, whatever the user chooses, unlike file metadata with fixed content, like file name, file size, file owner, time of last modification and so on.

            I am not aware of any earlier filesystem with extended file attributes. What you have mentioned are just examples of fixed file metadata, which are defined by the operating system and which cannot be defined or extended by the user. The only earlier similar feature is the resource fork of the Mac OS files, but that had a different purpose and it was accessed in a different way than extended file attributes, which are a set of name + value pairs, so they are accessed with a special get/set API, not with file read/write functions.

            Once a method for storing arbitrary file metadata exists, it can also be used to implement features like access control lists or to implement any of the legacy file metadata.

            The Oberon file system was described only summarily in a paper published in 1989-09, i.e. 2 months before the commercial introduction of HPFS by IBM and Microsoft, who must have been working on HPFS for a few years before that.

            Therefore these 2 filesystems must have been developed completely independently and the idea of using B-tree directories, which both teams have presented as novel, must have occurred independently to them.

            While the Oberon filesystem has the merit of introducing the same improvement in directory implementation independently and simultaneously, its historical impact has been much smaller than that of HPFS, which had a significant user base and became well known immediately, all over the world, strongly influencing the filesystems developed after it.

            • kragen 2 days ago
              If you just want to store arbitrary name-value pairs associated with a file, it seems like you could use any filesystem to do that. Here's an example implementation:

                  import json

                  def get(filename, key):
                      try:
                          # metadata lives in a ".ea" sidecar file next to the target
                          with open(filename + ".ea") as f:
                              metadata = json.load(f)
                      except FileNotFoundError:
                          raise KeyError(key)
                      return metadata[key]
              
                  def set(filename, key, value):
                      ...
                      # left as an exercise to the reader
              
              A fully robust implementation of the facility requires handling race conditions between concurrent writers, eventual metadata removal when the underlying file is removed, some kind of thinking about how permissions to the file and permissions to the extended attributes relate, and keeping the extended-attribute storage files from cluttering up the usual user interfaces. Still, given how trivial it is to implement the facility if it isn't built into the filesystem, building it into the filesystem seems more like a tradeoff between obvious design alternatives and less like an innovation.

              There are lots of ways you can do things like this; Windows 95 started stashing long-filename metadata in the filesystem directory as extra invalid hidden system volume-name read-only filename entries, for example, and 4DOS (and Total Commander) stored timestamps and arbitrarily-long file descriptions in a file in every directory called DESCRIPT.ION.

              (With respect to the concurrency and robustness issues, keep in mind that HPFS would regularly lose data if you lost power, disk drives at the time would regularly suffer physical damage if you lost power unless the disk head was parked, common implementations of Pascal at the time would fail to flush buffered I/O data if you failed to close files, PCs mostly weren't multitasking, and users could remove floppy disks from drives at any time without warning, so even a pretty non-robust implementation would have been adequate.)

              As for Oberon's filesystem, it wasn't quite the same improvement in directory implementation; in Oberon there is one directory per disk, while in HPFS there is one directory per folder. This may sound like its filesystem was non-hierarchical, but if I recall correctly, it is in fact hierarchical; files are stored under their pathnames in that single directory.

              I thought that probably, if Oberon was described so late, its predecessor probably did the same thing, but I just checked out Svend Erik Knudsen's dissertation (ETH No. 7346) on Medos-2 4.2 https://oberoncore.ru/_media/library/knudsen_medos-2_a_modul... and p. 67, at the tail end of §6.2, describes the file name directory of Medos's DiskSystem. There is no mention at all of B-trees. (There is also a "file directory", but it's confusingly named; what it contains is basically inodes.)

              It would probably be nice if HPFS had strongly influenced all the filesystems developed after it, but in fact it seems to have had virtually no influence on Sprite-LFS (begun 01990), BSD LFS, ext2fs, FAT32, VFAT, NetApp's WAFL, GFS, HDFS, Ceph, yaffs, or jffs2. I'm sure it had a lot of influence on NTFS. I don't know enough about ZFS, Reiserfs, XFS, HFS+, or exFAT to say; do you? Can you think of any other filesystems that were strongly influenced by HPFS other than NTFS?

              Filesystem design seems to be largely a matter of standing on giants' feet.

              • adrian_b a day ago
                While you are right that it is possible to implement extended file attributes as you describe, even when this is not supported by the operating system, this is extremely inconvenient.

                The reason is that you must have complete control over all software run on such a computer and be able to build everything from source, to be sure that all file operations are done using some standard libraries modified by you.

                For the file attributes to be reliably associated with the file you must ensure that any file copying, moving, renaming, linking or deleting takes into account the file attributes. If you allow any such operations to be done by a non-modified legacy program, you break the file system. (The fact that even some implementations of extended file attributes in the operating system allow user programs to lose the extended file attributes when they are not aware of them is not a feature, but a serious bug.)

                Instead of modifying all file-handling libraries that may be invoked in user programs (e.g. for all programming languages that might be used) it is actually much easier to do the modifications in a single place, i.e. in the file system implementation in the operating system. This can also provide better performance for accessing the file attributes.

                Workarounds like those of Windows 95 have appeared after HPFS and its successors, with the purpose of adding extended file attributes to an existing file system, without changing its on-disk data representation.

                As I have already said, the immediate direct successors of HPFS have been both NTFS and Silicon Graphics XFS (which combined features of HPFS with features of the SGI Extent FS), and these two have been developed and launched almost simultaneously.

                While there have been many file systems that have been developed later than 1989 and which have ignored HPFS at their conception, all of those that have survived have added sooner or later extended file attributes and B-tree directories. This means that all later file systems have either been influenced by HPFS or they have disappeared.

                BSD LFS was started before HPFS (research report in 1988), FAT32 and VFAT were compatible extensions of the older FAT, and ext2fs was intended as just a compatible implementation of the old UNIX file systems, with no innovations. ZFS has included the relevant features of HPFS since the beginning, because it was intended to be better than the XFS used by the competition (but ZFS has never succeeded in approaching XFS in performance, even if it may be better in the features provided and in reliability).

                • kragen 13 hours ago
                  This is a very interesting conversation, thank you!

                  A couple of minor quibbles: FAT32 is not a compatible extension of the older FAT; ext2fs is not a compatible implementation of any old Unix filesystem; having fewer features does not always make software worse; the BSD LFS research report is from 01993 rather than 01988 https://www.usenix.org/legacy/publications/library/proceedin... (it credits Ousterhout with introducing the idea of LFSs in 01989); several of the other filesystems I listed that I don't think are significantly influenced by HPFS have not in fact disappeared; and Windows 95 does not, to my knowledge, support extended attributes on existing filesystems.

                  It's probably worth mentioning that, as you imply, ext2fs did eventually add both extended attributes and B-tree directories, just after being quasi-renamed ext4fs. And I do agree that ext2fs was initially intended to be as boring as possible (without suffering the annoyances of the Minix filesystem) rather than innovative. It just failed to be compatible with BSD FFS.

                  It's a good point that renaming and hardlinking doesn't automatically take into account such extended attributes implemented as a library; that's a plausible reason to implement them in the filesystem. But implementing them in the filesystem doesn't actually help you with respect to copying, unless copying files is an operation the filesystem supports intrinsically, like "reflinks" on some Linux filesystems such as XFS. You still have to modify your file-copying program to copy the extended attributes when that is desired, whether they're provided by the kernel or by a library. Even if the filesystem supports a copy operation, there are typically cases it doesn't handle, such as copying files onto removable media or network filesystems.

                  Moreover, whether the attributes are implemented in the kernel or in a library, you also have to modify other utility programs to handle the extended attributes if they are to be preserved; for example, backup programs (including things like rsync, zip, and tar), filesystem integrity checking programs like Tripwire, and network file server programs.

                  File operations that aren't affected by extended attributes wouldn't need to use the extended-attribute library. Depending on what you wanted to use them for, that could be most of them. And it isn't necessary to reimplement the library in multiple different languages; it's sufficient to implement it in C, Rust, or assembly. (All the other languages would need to have a way to call the C library, but that's also true of extended-attribute system calls.)

                  If you want to use xattrs for ACLs, you can't implement them in a library. But that seems like a backwards way to look at the situation. You don't need extended attributes in the kernel to implement ACLs; you can extend your filesystem to store ACLs in an ACL-specific place. But, if you do have extended attributes in the kernel, that's a candidate way to store ACLs.

                  Implementing features like xattrs as libraries (rather than in the kernel) has advantages as well as disadvantages. For example, depending on how you do it, you don't need to modify your backup programs. You can support them on all filesystems instead of just some filesystems. And getting or setting them doesn't necessarily incur system-call overhead, though that probably depends on caching.

                  I don't think VFAT or exFAT does in fact support either extended file attributes or B-tree directories. According to https://eclecticlight.co/2018/01/12/which-file-systems-and-c..., MacOS fakes xattr support on them using the approach I outlined upthread, but instead of suffixing the filename with .ea, it prefixes it with ._. And I think it does it in the kernel rather than in a library, but I don't really know.

                  BeOS's ability to find files by extended attributes seems like a reasonable justification for doing them in the kernel instead of in a library. Otherwise, though, it seems like an obvious but debatable design choice like case sensitivity, not an innovation.

      • AdieuToLogic 2 days ago
        > There are a couple of generations of filesystems in between FFS and ZFS.

        Quite true. I did not mean to imply ZFS was the immediate successor to FFS, but instead that it is a superior choice currently.

        > And for modern filesystems it's also worth looking at btrfs, Microsoft's Resilient File System, and Apple APFS.

        All good options and which to consider is applicable when running Linux, MS-Windows, and macOS respectively. If FreeBSD+ZFS is a possibility though... :-)

    • isotopp 2 days ago
      It is Sun ZFS.

      And there are many interesting stops between FFS/UFS and ZFS. And ZFS, even today, is not the best system to use in all scenarios, by some margin.

  • WaitWaitWha 2 days ago
    I am so confused. File systems or filesystems?

    I do not recall anything in my AT&T days referred to as "filesystems". When I read the article I was expecting mentions of CP/M, Files-11, TOPS-10, UFS, VSAM, ISAM, BDAM, FAT. MFS for Macs showed up in 1984, and Amiga's OFS followed.

    I mean, not even UFS, which showed up in the 70s, yet still had impact on ext and NTFS...

    I was trying to see if this article was about the history of file systemS from the 70s through the 80s, or just a single file system's evolution; alas, I could not tell.

    • lproven 2 days ago
      I think you did not read it closely enough.

      The article says right at the top that it is part 2 in a series, and that part 1 was 1974, with a link. That part 1 discussed UFS in depth. The point being that (1) by 1984 UFS was far from the state of the art, and (2) you need to understand the 50-year-old SOTA to understand what was different in the 40-year-old SOTA.

    • unwind 2 days ago
      The two spellings are the same. First sentence from the Wikipedia entry [1]:

      "In computing, a file system or filesystem (often abbreviated to FS or fs) governs file organization and access."

      [1]: https://en.wikipedia.org/wiki/File_system
