[1] 30 Years of Linux: An Interview With Linus Torvalds: Linux and Git - Part 1:
https://www.tag1consulting.com/blog/interview-linus-torvalds...
Disclaimer: I know nothing about filesystems.
you might enjoy Allan Jude's "Scaling ZFS for NVMe" talk [0] from 2022. the tl;dw is that modern NVMe drives have gotten so fast that filesystem developers are having to revisit assumptions that have been true for decades about which operations will be slow vs. fast.
flash storage that exposed a SATA interface and pretended to be a superbly fast hard drive was an evolutionary improvement of sorts. NVMe is more of a revolutionary improvement; it does away with the spinning-rust assumptions. for example, SATA limits a device to one command queue with 32 commands; NVMe makes it almost unlimited (64k queues with up to 64k commands each) [1]
there is also a corner of the NVMe spec [2] that allows you to treat an SSD directly as key-value storage instead of block storage with a filesystem implementing a key-value layer on top of it. this is promising but AFAIK no mainstream filesystem is using it yet, and I'm not sure if support has even trickled down into consumer drives or if it's market-segmented off for datacenter use only.
0: https://www.youtube.com/watch?v=v8sl8gj9UnA
1: https://en.wikipedia.org/wiki/NVM_Express#Comparison_with_AH...
2: https://nvmexpress.org/specification/key-value-command-set-s...
And "performance" is meaningless in this context without specifying what you want to perform. Raw sequential read/write throughput, random metadata operations on directories containing millions of files? Something in between?
> Are there ways to make a step-change in
> overall performance somewhere?
Yes, it's called O_DIRECT (or equivalent), and is very useful to those that actually need it. A general-use filesystem is inherently something that requires you to compromise on performance.
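For what it's worth, here is a minimal sketch (my illustration, not the commenter's) of what using O_DIRECT looks like from Python on Linux. The file name "data.bin" and the 4096-byte block size are assumptions; direct I/O requires the buffer, offset, and length to be aligned to the device's logical block size.

import mmap
import os

# Hedged sketch: read one 4096-byte block from "data.bin" while bypassing
# the page cache. Assumes Linux, a filesystem that supports O_DIRECT, and
# a 4096-byte logical block size; "data.bin" is a placeholder name.
fd = os.open("data.bin", os.O_RDONLY | os.O_DIRECT)
try:
    buf = mmap.mmap(-1, 4096)    # anonymous mappings are page-aligned
    nread = os.readv(fd, [buf])  # direct read into the aligned buffer
finally:
    os.close(fd)

The trade-off is visible even in this toy: the application takes on alignment and caching responsibilities that a general-use filesystem would otherwise handle for everyone.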
Pretty much all of the benefits still to be gained at this point are at the application level, or maybe the CPU, not the OS or the filesystem.
That's the origin of the journaling approach used in post-FFS unix file systems through the 90s. (I don't count ext2 as "post-FFS", as it's basically FFS on a disk with LBA addressing)
Perhaps AIs will recognise their value and convince people! :-)
https://blog.koehntopp.info/2023/05/17/50-years-in-filesyste...
https://blog.koehntopp.info/2023/05/06/50-years-in-filesyste...
https://blog.koehntopp.info/2023/05/12/50-years-in-filesyste...
https://blog.koehntopp.info/2023/05/15/50-years-in-filesyste...
https://blog.koehntopp.info/2023/05/17/50-years-in-filesyste...
How about something lesser known, but excellent?
Tune-a-fish
EDIT: looks like there’s a bit more to it https://unixhistory.livejournal.com/1808.html
FreeBSD's ZFS[0] is really awesome and has proven to be a deserving successor in all scenarios when legacy needs[1] are not a concern.
Also note that ext2 is basically a re-implementation of FFS, without the interleaving because it's designed for disks with integrated controllers.
Two very important filesystems heavily influenced by HPFS were Microsoft NTFS and Silicon Graphics XFS, both introduced in 1993. XFS combined ideas from HPFS with some from the earlier Silicon Graphics filesystem, Extent FS, which had been introduced in October 1988 and which had much better performance for big files than the Berkeley FFS (because file blocks were allocated sequentially whenever possible, instead of being addressed indirectly and scattered across the disk).
Journaling in filesystems was introduced by IBM, in JFS (Journaled FS, February 1990).
An influential research report was "Beating the I/O Bottleneck: A Case for Log-Structured File Systems", by John K. Ousterhout & Fred Douglis, written in 1988-10 at Berkeley. A paper derived from it was presented at USENIX in 1990-06.
And what do you mean by "extended file attributes"? Nowadays I interpret that phrase as meaning "any user-level file metadata that isn't part of 6th Edition Unix", such as ACLs or the immutability bit, but by that definition almost every pre-Unix filesystem had "extended file attributes".
https://en.m.wikipedia.org/wiki/Extended_file_attributes has a definition which sounds like it might be what you mean? It also sounds like you could easily implement it as a library, though, instead of including it in the filesystem.
I am not aware of any earlier filesystem with extended file attributes. What you have mentioned are just examples of fixed file metadata, which are defined by the operating system and which cannot be defined or extended by the user. The only earlier similar feature is the resource fork of Mac OS files, but that had a different purpose and was accessed in a different way than extended file attributes, which are a set of name + value pairs accessed with a special get/set API, not with the file read/write functions.
Once a method for storing arbitrary file metadata exists, it can also be used to implement features like access control lists or to implement any of the legacy file metadata.
The Oberon file system was described only summarily in a paper published in 1989-09, i.e. 2 months before the commercial introduction of HPFS by IBM and Microsoft, who must have been working on HPFS for a few years before that.
Therefore these 2 filesystems must have been developed completely independently, and the idea of using B-tree directories, which both teams presented as novel, must have occurred to each of them independently.
While the Oberon filesystem has the merit of introducing the same improvement in directory implementation independently and simultaneously, its historical impact was much smaller than that of HPFS, which had a significant user base, became well known immediately all over the world, and strongly influenced the filesystems developed after it.
import json

def get(filename, key):
    # Look up one extended attribute stored in a "<filename>.ea" sidecar
    # file containing a JSON object of name/value pairs.
    try:
        with open(filename + ".ea") as f:
            metadata = json.load(f)
    except FileNotFoundError:
        raise KeyError(key)
    return metadata[key]

def set(filename, key, value):
    ...
    # left as an exercise to the reader
A fully robust implementation of the facility requires handling race conditions between concurrent writers, eventual metadata removal when the underlying file is removed, some kind of thinking about how permissions to the file and permissions to the extended attributes relate, and keeping the extended-attribute storage files from cluttering up the usual user interfaces. Still, given how trivial it is to implement the facility if it isn't built into the filesystem, building it into the filesystem seems more like a tradeoff between obvious design alternatives and less like an innovation.

There are lots of ways you can do things like this; Windows 95 started stashing long-filename metadata in the filesystem directory as extra invalid hidden system volume-name read-only filename entries, for example, and 4DOS (and Total Commander) stored timestamps and arbitrarily-long file descriptions in a file in every directory called DESCRIPT.ION.
(With respect to the concurrency and robustness issues, keep in mind that HPFS would regularly lose data if you lost power, disk drives at the time would regularly suffer physical damage if you lost power unless the disk head was parked, common implementations of Pascal at the time would fail to flush buffered I/O data if you failed to close files, PCs mostly weren't multitasking, and users could remove floppy disks from drives at any time without warning, so even a pretty non-robust implementation would have been adequate.)
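To make the concurrency point concrete, here is a hedged sketch, mine rather than the original commenter's, of how a library-level setter could at least serialize concurrent writers with an advisory lock. The name set_locked and the ".ea" convention are assumptions carried over from the earlier sketch.

import fcntl
import json
import os

def set_locked(filename, key, value):
    # Read-modify-write the "<filename>.ea" sidecar under an exclusive
    # advisory lock (fcntl.flock), so two cooperating writers on the same
    # machine don't clobber each other's updates. Still not crash-safe:
    # there is no fsync and no atomic rename.
    fd = os.open(filename + ".ea", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        raw = os.read(fd, 1 << 20)        # assume the sidecar fits in 1 MiB
        metadata = json.loads(raw) if raw else {}
        metadata[key] = value
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, json.dumps(metadata).encode())
    finally:
        os.close(fd)                      # closing releases the lock

It still does nothing about cleaning up the sidecar when the underlying file is removed, or about how its permissions relate to the file's, which is exactly the kind of bookkeeping a filesystem-level implementation gets to centralize.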
As for Oberon's filesystem, it wasn't quite the same improvement in directory implementation; in Oberon there is one directory per disk, while in HPFS there is one directory per folder. This may sound like its filesystem was non-hierarchical, but if I recall correctly, it is in fact hierarchical; files are stored under their pathnames in that single directory.
I thought that, if Oberon was described so late, its predecessor probably did the same thing, but I just checked out Svend Erik Knudsen's dissertation (ETH No. 7346) on Medos-2 4.2 https://oberoncore.ru/_media/library/knudsen_medos-2_a_modul... and p. 67, at the tail end of §6.2, describes the file name directory of Medos's DiskSystem. There is no mention at all of B-trees. (There is also a "file directory", but it's confusingly named; what it contains is basically inodes.)
It would probably be nice if HPFS had strongly influenced all the filesystems developed after it, but in fact it seems to have had virtually no influence on Sprite-LFS (begun 01990), BSD LFS, ext2fs, FAT32, VFAT, NetApp's WAFL, GFS, HDFS, Ceph, yaffs, or jffs2. I'm sure it had a lot of influence on NTFS. I don't know enough about ZFS, Reiserfs, XFS, HFS+, or exFAT to say; do you? Can you think of any other filesystems that were strongly influenced by HPFS other than NTFS?
Filesystem design seems to be largely a matter of standing on giants' feet.
The reason is that you must have complete control over all software run on such a computer and be able to build everything from source, to be sure that all file operations are done using some standard libraries modified by you.
For the file attributes to be reliably associated with the file you must ensure that any file copying, moving, renaming, linking or deleting takes into account the file attributes. If you allow any such operations to be done by a non-modified legacy program, you break the file system. (The fact that even some implementations of extended file attributes in the operating system allow user programs to lose the extended file attributes when they are not aware of them is not a feature, but a serious bug.)
Instead of modifying all file-handling libraries that may be invoked in user programs (e.g. for all programming languages that might be used) it is actually much easier to do the modifications in a single place, i.e. in the file system implementation in the operating system. This can also provide better performance for accessing the file attributes.
Workarounds like those of Windows 95 appeared after HPFS and its successors, with the purpose of adding extended file attributes to an existing file system without changing its on-disk data representation.
As I have already said, the immediate direct successors of HPFS were NTFS and Silicon Graphics XFS (which combined features of HPFS with features of the SGI Extent FS), and these two were developed and launched almost simultaneously.
While many file systems developed after 1989 ignored HPFS at their conception, all of those that have survived have sooner or later added extended file attributes and B-tree directories. This means that all later file systems have either been influenced by HPFS or they have disappeared.
BSD LFS was started before HPFS (research report in 1988), FAT32 and VFAT were compatible extensions of the older FAT, and ext2fs was intended as just a compatible implementation of the old UNIX file systems, with no innovations. ZFS has included the features of HPFS since the beginning, because it was intended to be better than the XFS used by the competition (but ZFS has never managed to approach XFS in performance, even if it may be better in the features provided and in reliability).
A couple of minor quibbles: FAT32 is not a compatible extension of the older FAT; ext2fs is not a compatible implementation of any old Unix filesystem; having fewer features does not always make software worse; the BSD LFS research report is from 01993 rather than 01988 https://www.usenix.org/legacy/publications/library/proceedin... (it credits Ousterhout with introducing the idea of LFSs in 01989); several of the other filesystems I listed that I don't think are significantly influenced by HPFS have not in fact disappeared; and Windows 95 does not, to my knowledge, support extended attributes on existing filesystems.
It's probably worth mentioning that, as you imply, ext2fs did eventually add both extended attributes and B-tree directories, just after being quasi-renamed ext4fs. And I do agree that ext2fs was initially intended to be as boring as possible (without suffering the annoyances of the Minix filesystem) rather than innovative. It just failed to be compatible with BSD FFS.
It's a good point that renaming and hardlinking don't automatically take into account such extended attributes implemented as a library; that's a plausible reason to implement them in the filesystem. But implementing them in the filesystem doesn't actually help you with respect to copying, unless copying files is an operation the filesystem supports intrinsically, like "reflinks" on some Linux filesystems such as XFS. You still have to modify your file-copying program to copy the extended attributes when that is desired, whether they're provided by the kernel or by a library. Even if the filesystem supports a copy operation, there are typically cases it doesn't handle, such as copying files onto removable media or network filesystems.
Moreover, whether the attributes are implemented in the kernel or in a library, you also have to modify other utility programs to handle the extended attributes if they are to be preserved; for example, backup programs (including things like rsync, zip, and tar), filesystem integrity checking programs like Tripwire, and network file server programs.
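As a rough illustration of what that kernel-side interface looks like to such a tool, here is a small sketch that copies the extended attributes from one file to another using the Linux xattr calls exposed by Python's os module; the function name copy_xattrs and the file paths are placeholders of mine.

import os

def copy_xattrs(src, dst):
    # Copy every extended attribute from src to dst via the Linux
    # listxattr/getxattr/setxattr system calls (Python's os wrappers).
    # Some namespaces (e.g. security.*) may require extra privileges.
    for name in os.listxattr(src):
        os.setxattr(dst, name, os.getxattr(src, name))

A copy or backup tool that skips this step silently drops the attributes, which is the failure mode being discussed; rsync, for example, only preserves them when invoked with -X/--xattrs.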
File operations that aren't affected by extended attributes wouldn't need to use the extended-attribute library. Depending on what you wanted to use them for, that could be most of them. And it isn't necessary to reimplement the library in multiple different languages; it's sufficient to implement it in C, Rust, or assembly. (All the other languages would need to have a way to call the C library, but that's also true of extended-attribute system calls.)
If you want to use xattrs for ACLs, you can't implement them in a library. But that seems like a backwards way to look at the situation. You don't need extended attributes in the kernel to implement ACLs; you can extend your filesystem to store ACLs in an ACL-specific place. But, if you do have extended attributes in the kernel, that's a candidate way to store ACLs.
Implementing features like xattrs as libraries (rather than in the kernel) has advantages as well as disadvantages. For example, depending on how you do it, you don't need to modify your backup programs. You can support them on all filesystems instead of just some filesystems. And getting or setting them doesn't necessarily incur system-call overhead, though that probably depends on caching.
I don't think VFAT or exFAT does in fact support either extended file attributes or B-tree directories. According to https://eclecticlight.co/2018/01/12/which-file-systems-and-c..., MacOS fakes xattr support on them using the approach I outlined upthread, but instead of suffixing the filename with .ea, it prefixes it with ._. And I think it does it in the kernel rather than in a library, but I don't really know.
BeOS's ability to find files by extended attributes seems like a reasonable justification for doing them in the kernel instead of in a library. Otherwise, though, it seems like an obvious but debatable design choice like case sensitivity, not an innovation.
Quite true. I did not mean to imply ZFS was the immediate successor to FFS, but instead that it is a superior choice currently.
> And for modern filesystems it's also worth looking at btrfs, Microsoft's Resilient File System, and Apple APFS.
All good options; which one to consider depends on whether you are running Linux, MS-Windows, or macOS, respectively. If FreeBSD+ZFS is a possibility though... :-)
And there are many interesting stops between FFS/UFS and ZFS. And ZFS, even today, is not the best system to use in all scenarios, by some margin.
I do not recall anything in my AT&T days referred to as "filesystems". When I read the article I was expecting mentions of CP/M, Files-11, TOPS-10, UFS, VSAM, ISAM, BDAM, FAT? MFS for Macs showed up in 1984, and Amiga's OFS followed.
I mean, not even UFS, which showed up in the 70s, yet still had impact on ext and NTFS...
I was trying to see if this article was about the history of file systemS from the 70s through the 80s, or just a single file system's evolution; alas, I could not tell.
The article says right at the top that it is part 2 in a series, and that part 1 was 1974, with a link. That Part 1 discussed UFS in depth. The point being that (1) by 1984 UFS was far from state of the art and (2) you need to understand the 50 year old SOTA to understand what was different in the 40YO SOTA.
In computing, a file system or filesystem (often abbreviated to FS or fs) governs file organization and access.