* Databases reside on raw disks. There is no file system underneath the databases. If you want a flat file, it has to be in the database. Why? Because databases can be made with good reliability properties and made distributed and redundant.
* Processes can be moved from one machine to another. Much like the Xen hypervisor, which was a high point in that sort of thing.
* Hardware must have built-in fault detection. Everything had ECC, parity, or duplication. It's OK to fail, but not to make mistakes. IBM mainframes still have this, but few microprocessors do, even though the necessary transistors would not cost much today. (It's still hard to get ECC RAM on the desktop, even.)
* Most things are transactions. All persistent state is in the database. Think REST with CGI programs, but more efficient. That's what makes this work. A transaction either runs to successful completion, or fails and has no lasting effect. Database transactions roll back on failures.
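To make the "runs to completion, or fails and has no lasting effect" point concrete, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in database. It is not Tandem code; the `accounts` table, the `transfer` and `balances` helpers, and the names in it are all made up for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        # `with conn` commits if the block finishes, rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?",
                (src,)).fetchone()
            if balance < 0:
                # Both updates above are undone by the rollback.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current persistent state as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical starting state: all persistent state lives in the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
failed = transfer(conn, "alice", "bob", 500)  # fails and rolls back
print(ok, failed, balances(conn))
```

The failed transfer leaves the balances exactly as the successful one left them, which is the property the whole architecture leans on: a program can die at any point mid-transaction and the persistent state stays consistent.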
The Tandem concept lived on through several changes of ownership and hardware. Unfortunately, it ended up at HP in the Itanium era, where it seems to have died off.
It's a good architecture. The back ends of banks still look much like that, because that's where the money is. But not many programmers think that way.
The terminology of "filesystem" here is confusing. The original database system was/is called Enscribe, and was/is similar to VMS Record Management Services: it had different types of structured files, in addition to unstructured Unix/DOS/Windows stream-of-bytes "flat" files. Around 1987 Tandem added NonStop SQL files. They're all accessed through a path of the form Volume.SubVolume.Filename, but depending on the file type, there are different things you can do with them.
> If you want a flat file, it has to be in the database.
You could create unstructured files as well.
> Processes can be moved from one machine to another
Critical system processes are process-pairs, where a Primary process does the work, but sends checkpoint messages to a Backup process on another processor. If the Primary process fails, the Backup process transparently takes over and becomes the Primary. Any messages to the process-pair are automatically re-routed.
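A toy sketch of the process-pair idea, all within a single Python process. This is only an illustration of checkpoint-then-reply and takeover, not how the NonStop kernel actually implements it; the `Process` and `ProcessPair` classes, the integer "state", and the CPU numbers are all invented for the example.

```python
class Process:
    """One half of a process pair; its state is just a running total."""
    def __init__(self, cpu):
        self.cpu = cpu
        self.state = 0
        self.alive = True

    def apply_checkpoint(self, state):
        # In a real system this would arrive as a message from another CPU.
        self.state = state


class ProcessPair:
    """Callers talk to the pair; they never need to know who is Primary."""
    def __init__(self):
        self.primary = Process(cpu=0)
        self.backup = Process(cpu=1)

    def send(self, amount):
        # Re-route to the Backup if the Primary has failed.
        if not self.primary.alive:
            self._take_over()
        new_state = self.primary.state + amount
        # Checkpoint to the Backup before updating and replying, so a
        # takeover never loses work the caller has already seen.
        self.backup.apply_checkpoint(new_state)
        self.primary.state = new_state
        return new_state

    def _take_over(self):
        failed_cpu = self.primary.cpu
        # The Backup becomes the Primary, using its last checkpointed state.
        self.primary = self.backup
        # Assumption for this toy: a fresh Backup starts on the failed CPU.
        self.backup = Process(cpu=failed_cpu)
        self.backup.apply_checkpoint(self.primary.state)


pair = ProcessPair()
pair.send(5)
pair.send(7)
pair.primary.alive = False   # simulate the Primary's processor failing
total = pair.send(1)         # the caller just sends; the pair re-routes
print(total, pair.primary.cpu)
```

The caller's third `send` looks identical to the first two, which is the "transparently takes over" part: the failure is absorbed inside the pair.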
> Unfortunately, it ended up at HP in the Itanium era, where it seems to have died off.
It did get ported to Xeon processors around 10 years ago, and is still around. Unlike OpenVMS, it is still being worked on by HPE, but I don't think there is even a link to it on the HPE website*. It still runs on (standard?) HPE x86 servers connected to HPE servers running Linux to provide storage/networking/etc. Apparently it is also supported running under VMware of some kind.
* Something something GreenLake?
Right. Process migration was possible, but you're right in that it didn't work like Xen.
> It still runs on (standard?) HPE x86 servers connected to HPE servers running Linux to provide storage/networking/etc.
HP is apparently still selling some HPE gear. But it looks like all that stuff transitions to "mature support" at the end of 2025.[1] "Standard support for Integrity servers will end December 31, 2025. Beyond Standard support, HPE Services may provide HPE Mature Hardware Onsite Support, Service dependent on HW spares availability." The end is near.
[1] https://www.hpe.com/psnow/doc/4aa3-9071enw?jumpid=in_hpesite...
The HP NonStop systems, Xeon versions, are here.[1] The not-very-informative white paper is here.[2] Not much about how they do it. Especially since they talk about running "modern" software, like Java and Apache.
[1] https://www.hpe.com/us/en/compute/nonstop-servers.html
[2] https://www.hpe.com/psnow/doc/4aa6-5326enw?jumpid=in_pdfview...
Stratus was another interesting HA vendor, particularly the earlier VOS systems as their modern systems are a bit more pedestrian. http://www.teamfoster.com/stratus-computer
and the book "Reliable Computer Systems - Design and Evaluation"[1] which has general info on reliability, and specific looks at IBM Mainframe, Tandem, and Stratus, plus AT&T switches and spaceflight computers.
[0] https://pages.cs.wisc.edu/~remzi/Classes/838/Fall2001/Papers...
[1] https://archive.org/download/reliablecomputer00siew/reliable...
Also, I think Stratus was the first (only?) computer IBM re-badged at the time: IBM sold Stratus machines as the System/88, IIRC.
Because it's a small world, a former Tandem employee was attending the talk. Unfortunately it's been long enough that I don't remember much of our conversation, but it was impressive to hear how they moved a computer between data centers; IIRC, they simply turned it off, and when they powered it back on, the CPU resumed precisely where it had been executing before.
(I have no idea how they handled the system clock.)
Jim Gray's paper:
https://jimgray.azurewebsites.net/papers/TandemTR86.2_FaultT...
The last one I can find is for the NonStop Advanced Architecture (on Itanium), with ServerNet. I gather that this was replaced with the NonStop Multicore Architecture (also on Itanium), with InfiniBand. I assume the x86-64 version is basically the same design, just on x86-64 and running in pseudo big-endian.