Does Fedora use Debian's patch set for sshd, or a similar patch set that adds libsystemd?
Edit: It looks like Fedora wasn't affected because the backdoor triggered a valgrind test failure, so they shipped it with a flag that disabled the functionality that was backdoored. Seems like they lucked out. https://lists.fedoraproject.org/archives/list/devel@lists.fe...
If I recall correctly, the backdoor was set up to only activate on rpm and deb based systems, so it wouldn't have been trigged on Arch, Gentoo or NixOS, even if they linked systemd to ssh.
As the proportion of younger engineers contributing to open-source decreases (a reasonable choice, given the state of the economy), I see only two future possibilities:
1. Big corporations take ownership of key open-source libraries in an effort to continue their development.
2. Said key open-source libraries die, and corporations develop proprietary replacements for their own use. The open source scene remains alive, but with a much smaller influence.
In decades past companies you to pay for my license for Visual Studio (I think of a MSDN subscription), clear case, a dozen different issue/work trackers. However as soon as an open source alternative is used I don't know how to get the money that would have been spent to them.
Come to think of it I'm maintainer of a couple open source projects that I don't use anymore and I don't normally bother even looking at the project either. Either someone needs to pay me to continue maintaining it (remember I don't find them useful myself so I'm not doing it to scratch an itch), or someone needs to take them over from me - but given xz attacks I'm no longer sure how to hand maintenance over.
“We’re paying for contract development? But it’s not one of our products and we’ll have no rights to the software? They’ll fix all the bugs we find, right? Right?” This a hard conversation at most companies, even tech companies.
I thought that the idea of a funding manifest to advertise funding requests was a good idea: https://floss.fund/funding-manifest/ No idea if it works.
With AI and CV reference hunting, number of contributions is higher than ever. Open-source projects are basically spammed, with low quality contributions.
Public page is just a liability. I am considering to close public bugzilla, git repo and discussions. I would just take bug reports and patches from very small circle of customers and power users. Everything except release source tarball, and short changelog would be private!
Open-source means you get a source code, not free customer and dev support!
So far our core full time team of 3 gets to spend about half our time consulting/auditing and half our time contributing to our open projects that most of our clients use and depend on.
The key is for companies to have visibility into the current funding status of the software they depend on, and relationships with maintainers, so they can offer to fund features or fixes they need instead of being blocked.
Second thing is there are bunch of things corporations need to use but don't want to develop on their own like SSH.
There is already too much internal tooling inside of big corporations that is rotting there and a lot of times it would be much better if they give it out to a foundation - like Apache foundation where projects go to die or limp through.
  For the past 26 years, the speaker has been engaged in the design, implementation, technology transfer, and application of flexible Mandatory Access Control (MAC). In this talk, he describes the history and lessons learned from this body of work. The background and motivation for MAC is first presented, followed by a discussion of how a flexible MAC architecture was created and matured through a series of research systems. The work to bring this architecture to mainstream systems is then described, along with how the architecture and implementation evolved. The experience with applying this architecture to mobile platforms is examined. The role of MAC in a larger system architecture is reviewed in the context of a secure virtualization system. The state of MAC in mainstream systems is compared before and after our work. Work to bring MAC to emerging operating systems is discussed.
For other "package" managers (eg: CPAN, Debian) I can point to my own archive and be sure everything I manage down stream gets the blessed bits.
I basically have a huge archive/mirror for the supply chain for my perl, PHP, JavaScript, etc.
If anyone has pro tips on how to "lock" docker to one registry that would be cool.
> (OpenSSL is written in C, so this mistake was incredibly easy to make and miss; in a memory-safe language with proper bounds checking, it would have been nearly impossible.)
    package main
    import "fmt"
    type CmdType int
    const (
        WriteMsg CmdType = iota
        ReadMsg
    )
    type Cmd struct {
        t CmdType
        d []byte
        l int
    }
    var buffer [256]byte
    var cmds = []Cmd{
        Cmd{WriteMsg, []byte("Rain. And a little ice. It's a damn good thing he doesn't know how much I hate his guts."), 88},
        Cmd{WriteMsg, []byte("Rain. And a little ice."), 23},
        Cmd{ReadMsg, nil, 23},
        Cmd{ReadMsg, nil, 88}, // oops!
    }
    func main() {
        for c := range cmds {
            if cmds[c].t == WriteMsg {
                copy(buffer[:], cmds[c].d[:cmds[c].l])
            } else if cmds[c].t == ReadMsg {
                fmt.Println(string(buffer[:cmds[c].l]))
            }
        }
    }
It wasn't caught because OpenSSL had built its own buffer/memory management routines on top of the actual ones provided by the language (malloc, memcpy, realloc, free), and all sorts of unsafe manipulations were happening inside one big buffer. That buffer could be in a language with perfect memory safety, the same flaw would still be there.
I don't see anything that is going to block this from getting worse and worse. It became a pretty common issue that I first heard about with npm or node.js and their variants, maybe because people update software so much there and have lots of dependencies. I don't see a solution. A single program can have huge numbers of dependencies, even c++ or java programs now.
It's not new, here's one from 6 years ago on c++ - https://www.trendmicro.com/en_us/research/19/d/analyzing-c-c...
Don't forget log4j - https://www.infoworld.com/article/3850718/developers-apply-t..., points to this recent paper https://arxiv.org/pdf/2503.12192
  - CHERI compartmentalisation
  - LavaMoat (js)
  - Scala "capture checking"
  - Java "integrity by default"Security, when practiced, is a fundamentally practical discipline that needs to work with the world as is, not with dreams of putting people in basements in chains.
In our distro, Stagex, our threat model assumes at least one maintainer, sysadmin, or computer is compromised at all times.
This has resulted in some specific design choices and practices:
- 100% deterministic, hermetic, reproducible
- full source bootstrapped from 180 bytes of human-auditable machine code
- all commits signed by authors
- all reviews signed by reviewers
- all released artifacts are multi-party reproduced and signed
- fully OCI (container) native all the way down "FROM scratch"
- All packages easily hash-locked to give downstream software easy determinism as well
This all goes well beyond the tactics used in Nix and Guix.
As far as we know, Stagex is the only distro designed to strictly distrust maintainers.
It doesn't distrust the developers of the software though, so does not fix the biggest hole. Multiparty reproduction does not fix it either, that only distrusts the build system.
The bigger the project, the higher the chance something slips through, if even an exploitable bug. Maybe it's the developer themselves being compromised, or their maintainer.
Reviews are done on what, you have someone reviewing clang code? Binutils?
Only with this problem solved, can we prove the code humans ideally start spending a lot more time reviewing (working on it) is actually the code that is shipped in compiled artifacts.
In practice this is much more rare then a user downloading and running malware or visiting a site that exploits their browser. Compare the number of 0days chrome has had over the years versus the number of times bad actors have hacked Google and replaced download links with links to malware.
Non-web software distribution, particularly for developers, has failed to mature significantly here. Most developers today use brew, nix, alpine, dockerhub, etc. None are signed in a way that allows end users to automatically prove they got artifacts that were faithfully and deterministically built from the expected source code. Could be malware, could be anything. The typical blind trust contract from developers to CDNs that host final compiled artifacts baffles me. Of course you will get malware this way.
Stagex by contrast uses OCI standard signing, meaning you can optionally set a containers/policy.json file in docker or whatever container runtime you use that will cause it to refuse to run any stagex images without reproduction signatures by two or more maintainers.
If you choose to, you can automatically rule out any single developer or system in the stagex chain from injecting malware into your projects.
But an operating system can limit the blast radius. Proper sandboxing is much more important than securing the supply chain.
Maybe you mean sandboxes like secure enclaves. Almost every solution there builds non-deterministically with unsigned containers any of many maintainers can modify at any time, with minimal chance of detection. Maybe you have super great network monitoring, but if I compromise the CI/CD system to compile all binaries with a non-random RNG, then I can undermine any cryptography you use, and can re-create any sessions keys or secrets you can. Game over.
Qubes has the best sandboxing solution of any workstation OS, but that relies on Fedora which is not fully reproducible, and only signed via centralized single-party-controlled infrastructure. Threaten the right person and you can backdoor qubes and everyone that uses it.
I say this as a qubes user, because it is the least bad workstation sandboxing option we have. We must fix the supply chain to have server or workstation sandboxes we can trust.
By contrast, I help maintain airgapos, repros, and enclaveos which are each special purpose immutable appliance operating systems that function as sandboxes for cold key management, secure software builds, and remotely attestable isolated software respectively. All are built with stagex and deterministic so you should get the same hash from a local build any other maintainer has, proving your artifacts faithfully came from the easily reviewable sources.
Yes, you can as they are independent things.
>Maybe you mean sandboxes like secure enclaves.
No I mean sandbox as in applications are sandboxed from the rest of the system. If you just run an application it shouldn't be able to encrypt all of your files. The OS should protect the rest of the system from potentially badly behaving applications.
>but if I compromise the CI/CD system to compile all binaries with a non-random RNG, then I can undermine any cryptography you use, and can re-create any sessions keys or secrets you can
In practice this is a much rarer kind of an attack. Investing a ton in strengthening the front door is meaningless when the backdoor is completely open. Attackers will attack the weakest link.
>Qubes has the best sandboxing solution of any workstation OS
Qubes only offers sandboxing between qubes.questions. There isn't sandboxing within a qube.
>proving your artifacts faithfully came from the easily reviewable sources.
Okay, but as mentioned previously those sources could have vulnerabilities or be malicous. Or users could run other software they have downloaded separately or via a curl | sh.
I sandbox everything in hypervisors, I get it, but you cannot trust a sandbox some internet rando built for you is actually sandboxing. You have to full source bootstrap your sandbox to be guaranteed that the compromise of any of hundreds of dev machines in the usual supply chains did not backdoor your hypervisor.
You need both.
> Attackers will attack the weakest link.
Agreed, and today that is supply chain attacks. I have done them myself in the wild, multiple times. Often as easy as buying an expired email domain of an awal maintainer and doing a password reset for github, dockerhub, godaddy, etc until you control a package in piles of supply chains. Or in the case of most Linux distros just go submit a couple bugfixes and apply to be a maintainer and you have official god access to push any code to major Linux distro supply chains with little to no oversight.
Cheap and effective attacks.
> Qubes only offers sandboxing between qubes.questions. There isn't sandboxing within a qube.
You are expected to run a distinct kernel and VM for each security context. The linux kernel is pretty shit at isolating trusted code from untrusted code on its own. Hypervisors are the only reliable sandbox we have so spin up tiny VMs for every workload.
> Okay, but as mentioned previously those sources could have vulnerabilities or be malicous.
Yes of course, and we need a community wide push to review all this code (working on it) but most of the time supply chain attacks are not even in the repos where someone might notice. They are introduced covertly in the release process of the source code tarballs, or in the final artifact generation flows, or in the CDNs that host those final artifacts. Then people review code, and assume that code is what generated final artifacts.
> Or users could run other software they have downloaded separately or via a curl | sh
Some users will always shoot themselves in the foot if they are uneducated on security, so that is a separate education problem. Supply chain attacks however will hit even users doing everything right, and often burn thousands of people at once. Those of us that maintain and distribute software are obligated to give users safe methods to prove software artifacts are faithfully generated from publicly accountable source code, teach them to not to trust any maintainers including us.
Education is the biggest problem on all sides here. For my part, every "curl | sh" I have ever encouraged users to run in the wild is a troll to teach users to never run those.
There aren't random developers pushing commits to these codebases: these are used by virtually every Linux distro out there (OK, maybe not the Kubernetes one that ships only 12 binaries, forgot its name).
It seems obvious to me that GP is talking about protection against rogue distro maintainers, not fundamental packages being backdoored.
You're basically saying: "GP's work is pointless because Linus could insert a backdoor in the Linux kernel".
In addition to that determinism and 100% reproducibility brings another gigantic benefit: should a backdoor ever be found in clang or one of the binutils tool, it's going to be 100% reproducible. And that is a big thing: being able to reproduce a backdoor is a godsend for security.
You are likely thinking of Talos Linux, which incidentally also builds itself with stagex.
What does this mean? You have a C-like compiler in 180 bytes of assembler that can compile a C compiler that can then compile GCC?
> This is a set of manually created hex programs in a Cthulhu Path to madness fashion. Which only have the goal of creating a bootstrapping path to a C compiler capable of compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.
[1] https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...
We use an abbreviated and explicit stage0 chain here for easy auditing: https://codeberg.org/stagex/stagex/src/branch/main/packages/...
> stage0: < 190 byte x86 assembly seed is reproduced on multiple distros
> stage1: seed builds up to a tiny c compiler, and ultimately x86 gcc
> stage2: x86 gcc bootstraps target architecture cross toolchains
very impressive, I want to try this out now.
Do you all document how you got around system level sources of non-determinism? Filesystems, metadata, timestamps, tempfiles, etc? This would be a great thing to document for people aiming for the same thing.
What are you all using to verify commits? Are you guys verifying signatures against a public PKI?
Super interested as I manage the reproducibility program for a large software company.
> git clone https://codeberg.org/stagex/stagex
> cd stagex
> make
Several hours later your "out" directory will contain locally built OCI images for every package in the tree, and the index.json for each should contain the exact same digests we commit in the "digests" folder, and the same ones multiple maintainers sign in the OCI standard "signatures" folder.
We build with only a light make wrapper around docker today, though it assumes you have it configured to use the containerd image store backend, which allows for getting deterministic local digests without uploading to a registry.
No reason you cannot build with podman or kaniko etc with some tweaks (which we hope to support officially)
> Do you all document how you got around system level sources of non-determinism? Filesystems, metadata, timestamps, tempfiles, etc? This would be a great thing to document for people aiming for the same thing.
We try to keep our package definitions to "FROM scratch" in "linux from scratch" style with no magic to be self documenting to be easy to audit or reference. By all means crib any of our tactics. We use no global env, so each package has only the determinism tweaks needed (if any). We heavily referenced Alpine, Arch, Mirage, Guix, Nix, and Debian to arrive at our current patterns.
> What are you all using to verify commits? Are you guys verifying signatures against a public PKI?
We all sign commits, reviews, and releases with well published PGP keys maintained in smartcards, with expected public keys in the MAINTAINERS file. Most of us have keyoxide profiles as well making it easy to prove all our online presences agree with the expected fingerprints for us.
> Super interested as I manage the reproducibility program for a large software company.
By all means drop in our matrix room, #stagex:matrix.org . Not many people working on these problems. The more we can all collaborate to unblock each other the better!
https://reproducible-builds.org/ https://bootstrappable.org/ https://bootstrapping.miraheze.org/ https://lwn.net/Articles/983340/ https://lwn.net/Articles/985739/
Reproducibility is more like a security smell; a symptom you’re doing things right. Determinism is the correct target and subtly different.
The focus on supply chain is a distraction, a variant of The “trusting trust” attack Ken Thompson described in 1984 is still among the most elegant and devastating. Infected development toolchains can spread horizontally to “secure” builds.
Just because it’s open doesn’t mean anyone’s been watching closely. "50 years of security"? Important pillars of OSS have been touched by thousands of contributors with varying levels of oversight. Many commits predate strong code-signing or provenance tracking. If a compiler was compromised at any point, everything it compiled—including future versions of itself—could carry that compromise forward invisibly. This includes even "cleanroom" rebuilds.
Additionally the few dependencies you have should be well compensated to avoid 'alternative monetization'.
You can't have the cake (massive amounts of gratis software) and eat it too (security and quality warranties).
The 100 layers of signing and layer 4 package managers is a huge coping mechanism by those that are not ready to bite the tradeoff.
I'm not sure what the fallacy is called, but you say we have an excess of X and then the fallacy is "we can't live without X".
Modern projects especially in the javascript realm have like 10K dependencies. Having one dependency in an Operating System(even though it may itself have their own dependencies) is a huuuuuuuuuge difference.
You can pay cash money to Windows or Red Hat and have either a company that owns all of the deps, or a company that vets all of the dependencies, distributes some cash through donations, and provides a sensible base package.
It may sound extreme, but you don't need much more than a Base OS. If you reaaallly want something else, you can check the OS official package repository. Downloading some third party code is what's extreme to me.
There's no static or dynamic analysis deployed to enhance this level of trust.
The initial attempts are simulated execution like in valgrind, all the sanitizer work, perhaps difference on the functional level beyond the text of the source code where it's too easy to smuggle things through... (Like on an abstracted conditional graph.)
We cannot even compare binaries or executables right given differing compiler revisions.
The full source bootstrapped go compiler binaries in stagex exactly match the hashes of the ones Google releases, giving us as much confidence as we can get in the source->binary chain, which until very recently had no solution at all.
Go has unique compiler design choices that make it very self contained that make this possible, though we also can deterministically build rust, or any other language from any OCI compatible toolchain.
You are talking about one layer down from that, the source code itself, which is our next goal as well.
Our plan is this:
1. Be able to prove all released artifacts came from hash locked source code (done)
2. Develop a universal normalized identifier for all source code regardless of origin (treehash of all source regardless of git, tar file etc, ignoring/removing generated files, docs, examples, or anything not needed to build) (in progress)
3. Build distributed code review system to coordinate the work to multiple signed reviews by reputable security researchers for every source package by its universal identifier (planning stages)
We are the first distro to reach step 1, and have a reasonably clear path to steps 2 and 3.
We feel step 2 would be a big leap forward on its own, as it would have fully eliminated the xz attack where the attack hid in the tar archive, but not the actual git tree.
Pointing out these classes of problem is easy. I know, did it for years. Actually dramatically removing attack surface is a lot more rewarding.
Help welcome!
This won't protect against more complex attacks like RoP or unverified state. For that we need to implement simple artifacts that are verifiable and mapped. Return to more simple return states (pass/error). Do error handling external to the compiled binaries. Automate state mapping and combine with targeted fuzzing. Systemd is a perfect example of this kind of thing, what not to do: internal logs and error states being handled by a web of interdependent systems.
Fuzzing is good but probabilistic. It is unlikely to hit on a deliberate backdoor. Solid for finding bugs though.
There is unfortunately no substitute for a coordinated effort to document review by capable security researchers on our toolchain sources.
As far as I know Gentoo, even from their "stage0" still assumes you bring your own bootstrap compiler toolchain, and thus is not self bootstrapping.
Full source bootstrapping is our only way out of the trusting trust problem
>Full source bootstrapping is our only way out of the trusting trust problem
No, that is just deferring the trust to all the tools and scripts that fosslinux/live-bootstrap project provides.
To be able to do this, you must already have both the source for the compiler and what someone has told you is a binary compiled from it. But what if that someone was lying?
Anyway, so your means of forming trust in a compiler faithfully compiling code, is to trust a decompiler to faithfully generate human readable source code followed by a lot of manual review labor repeated by every user that wishes to distrust the maintainers.
Okay, but a decompiler could be backdoored as easily as a compiler to hide malicious code vs inject it .
How do you get a decompiler you trust more than the compiler you are reviewing? Do you decompile the decompiler with itself? Back at the trusting trust problem.
Decompilers are way more complex than anything in the hex0->tinycc bootstrap path.
No, it is to fully audit the binary of a compiler itself, if you don't trust a decompiler, learn to read machine code, the output from a simple C compiler tend to pretty predictable.
> manual review labor repeated by every user that wishes to distrust the maintainers.
Yes? What's wrong with that? Anyone wishes to distrust, you give them the tools and knowledge to verify the process, the more people able to do this the better.
We should expect only a few people will review code, if it is drive-by easy to do. That means proving the binaries for sure came from the published commented formatted code, and then go review that code.
But let's not focus too hard on the logic side of your argument. The part that really convinced everyone that you're right was your opening statement, "Not a programmer, are you?". From that moment it was clear that you were taking the discussion to a higher plane, far above boring everyday logic.
Like a superhero, really. At least, that's how I picture you.
If you can do that, you are the only one alive that can.
Nowadays there are so many microcontrolers in your PC an hardware vendor could simply infect: your SSD, HDD, Motherboard or part of the processor. Good luck bootstrapping from hand rolled NAND.