Rust Binder contains the following unsafe operation:
// SAFETY: A `NodeDeath` is never inserted into the death list
// of any node other than its owner, so it is either in this
// death list or in no death list.
unsafe { node_inner.death_list.remove(self) };
This operation is unsafe because when touching the prev/next pointers of
a list element, we have to ensure that no other thread is also touching
them in parallel. If the node is present in the list that `remove` is
called on, then that is fine because we have exclusive access to that
list. If the node is not in any list, then it's also ok. But if it's
present in a different list that may be accessed in parallel, then that
may be a data race on the prev/next pointers.
And unfortunately that is exactly what is happening here. In
Node::release, we:
1. Take the lock.
2. Move all items to a local list on the stack.
3. Drop the lock.
4. Iterate the local list on the stack.
Combined with threads using the unsafe remove method on the original
list, this leads to memory corruption of the prev/next pointers. This
leads to crashes like this one:
This is one/the first CVE caused by a mistake made using unsafe Rust. But it was revealed along with 159 new kernel CVEs found in C code.[0]
It may just be me, but it seems wildly myopic to draw conclusions about Rust, or even unsafe Rust, from one CVE. More CVEs will absolutely happen. But even true Rust haters have to recognize that the tide of CVEs in kernel C code runs something like 19+ CVEs per day? What kind of case can you make that "incautious and unverified unsafe {} blocks" is worse than that?
Your sense seems more than a little unrigorous. 1/160 = 0.00625. So, several orders of magnitude fewer CVEs per line of code.
And remember this is also the first Rust kernel CVE, and any fair metric would count both any new C kernel code CVEs, as well as those which have already accrued against the same C code, if comparing raw lines of code.
But taking a one week snapshot and saying Rust doesn't compare favorably to C, when Rust CVEs are 1/160, and C CVEs are 159/160 is mostly nuts.
This is incorrect. Chalk it up to the flu and fever! Sorry.
0.00625 == 0.625%, or about twice the proportion of Rust code. However, as stated above, these are just the metrics from one patch cycle.
The relevant question is whether it results in fewer and less severe CVEs than code written in C. So far the answer seems to be a resounding yes
Or another way to put it: clearly this is bad, and unsafe blocks deserve significant scrutiny. But it's unclear how this would have been made better by the code being entirely unsafe, rather than a particular source of unsafety being incorrect.
(I think the underlying philosophical disagreement here is this: I think software is always going to have bugs, and that Rust can't - and doesn't promise - to perfectly eliminate them. Instead, what Rust does promise - and deliver on - is that the entire class of memory safety bugs can be eliminated by construction in safe Rust, and localized when present to errors in unsafe Rust. Insofar as that's the promise, Rust has delivered here.)
That essay doesn't say that silver bullets are a panacea or cure all, instead they're a decimal order of magnitude improvement. The essay gives the example of Structured Programming, an idea which feels so obvious to us today that it's unspoken, but it's really true that once upon a time people wrote unstructured programs (today the only "language" where you even could do this is assembly and nobody does it) where you just jump arbitrarily to unrelated code and resume execution. The result is fucking chaos and languages where you never do that delivered a huge improvement even before I wrote my first line of code in the 1980s.
Google did find that sort of effect in Rust over C++.
The more useful question is, how many CVEs were prevented because unsafe {} blocks receive more caution and scrutiny?
If all of C is effectively "unsafe" then wouldn't it receive the _most_ scrutiny?
Since this didn't work then I don't understand Rust's overall strategy.
On top of that, there is something else they say. You have to uphold the invariants inside the unsafe blocks. Rust for Linux documents these invariants as well. The invariant was wrong in this case. The reason I mention this is because this practice has forced even C developers to rethink and improve their code.
Rust specifies very clearly what sort of error it eliminates and where it does that. It reduces the surface area of memory safety bugs to unsafe blocks, and gives you clear guidelines on what you need to ensure manually within the unsafe block to avoid any memory safety bugs. And even when you make a human error in that task, Rust makes it easy to identify them.
There are clear advantages here in terms of the effort required to prevent memory safety bugs, and in making your responsibilities explicit. This has been their claim consistently. Yet, I find that these have to be repeated in every discussion about Rust. It feels like some critics don't care about these arguments at all.
(The nuance being that sometimes there's a lot of unsafe Rust, because some domains - like kernel programming - necessitate it. But this is still a better state of affairs than having no code be correct by construction, which is the reality with C.)
But as the adjacent commenter notes: having unsafe is not inherently a problem. You need unsafe Rust to interact with C and C++, because they're not safe by construction. This is a good thing!
In other words: unsafe Rust is harder, but only in an apples-and-oranges sense. If you compare it to the same diligence you'd need to exercise in writing safer C, it would be about the same.
Ultimately every program depends on things beyond any compiler's ability to verify: for example, the calls to code not written in that language being correct, or even more fundamentally, if you're writing some embedded program that literally has no interfaces to foreign code at all, the silicon (both that which handles IO and that which does the computation) being correct.
The promise of rust isn't that it can make this fundamentally non-compiler-verifiable (i.e. unsafe) dependency go away, it's that you can wrap the dependency in abstractions that make it safe for users of the dependency if the dependency is written correctly.
In most domains Rust doesn't necessitate writing new unsafe code; you rely on the existing unsafe code in your dependencies that is shared, battle tested, and reasonably scoped. This is all Rust, or any programming language, can promise. The demand that the dependency tree has no unsafe isn't the same as the domain necessitating no unsafe, it's the impossible demand that the domain of writing the low level abstractions that every domain relies on doesn't need unsafe.
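To make the "wrap the dependency in an abstraction" point concrete, here is a minimal sketch. The helper name `first_or_none` is made up for illustration; the only real API used is the standard library's `slice::get_unchecked`. The unchecked read is sound only because the safe wrapper checks the precondition first, so callers never have to think about it.

```rust
/// Hypothetical helper: returns the first element, or `None` if the slice is empty.
fn first_or_none(items: &[u32]) -> Option<u32> {
    if items.is_empty() {
        return None;
    }
    // SAFETY: the emptiness check above guarantees that index 0 is in bounds.
    Some(unsafe { *items.get_unchecked(0) })
}

fn main() {
    // Callers only ever see the safe signature.
    assert_eq!(first_or_none(&[7, 8, 9]), Some(7));
    assert_eq!(first_or_none(&[]), None);
    println!("ok");
}
```

If the wrapper's check is wrong, the bug is in the wrapper, not in any of its callers, which is the whole point of scoping the unsafe part.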
Yes! Failure to uphold invariants of the underlying abstract model in an unsafe block breaks the surrounding code, including other crates! That's exactly consistent with what I said. There's nothing special about the stdlib. Like all software, it can have bugs.
What the proof states is that two independently correct blocks of unsafe code cannot, when used together, be incorrect. So the key value there is that you only have to reason about them in isolation, which is not true for C.
I sound like an apologist, but the Rust team stated that “memory safety is preserved as long as Rust's invariants are”. Feels really clear, people keep missing this point for some reason, almost as if it's a gotcha that unsafe Rust behaves in the same memory unsafe way as C/C++: when that's exactly the point.
Your verification surface is smaller and has a boundary.
If Rust doesn't live up to its lofty promises, then it changes the cost-benefit analysis. You might give up almost anything to eliminate all bugs, a lot to eliminate all memory bugs, but what would you give up to eliminate some bugs?
The cost-benefit argument for Rust has always been mediated by the fact that Rust will need to interact with (or include) unsafe code in some domains. Per above, that's an explicit goal of Rust: to provide sound abstractions over unsound primitives that can be used soundly by construction.
I have heard it and I've stated it before. It's never stated in absolute confidence. As I said in another thread, if it was actually true, then Rust wouldn't need an integrated unit testing framework.
It's referring to the experience that Rust learners have, especially when writing relatively simple code, that it tends to be hard to misuse libraries in a way that looks correct and compiles but actually fails at runtime. Rust cannot actually provide this guarantee, it's impossible in any language. However there are a lot of common simple tasks (where there's not much complex internal logic that could be subtly incorrect) where the interfaces provided by libraries they're depending on are designed to leverage the type system such that it's difficult to accidentally misuse them.
Like something like not initializing a HTTP client properly. The interfaces make it impossible to obtain an improperly initialized client instance. This is an especially distinct feeling if you're used to dynamic languages where you often have no assurances at all that you didn't typo a field name.
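A rough sketch of what that looks like in practice. `HttpClient` here is a made-up type, not any real crate's API; the point is only that private fields plus a validating constructor leave no way to get a half-initialized value.

```rust
mod client {
    /// Hypothetical client type. The fields are private, so code outside this
    /// module can only obtain a value through `HttpClient::new`.
    #[derive(Debug)]
    pub struct HttpClient {
        base_url: String,
        timeout_secs: u64,
    }

    impl HttpClient {
        /// Validates the configuration up front and refuses to build a bad client.
        pub fn new(base_url: &str, timeout_secs: u64) -> Result<HttpClient, String> {
            if !base_url.starts_with("http://") && !base_url.starts_with("https://") {
                return Err(format!("unsupported URL scheme: {base_url}"));
            }
            if timeout_secs == 0 {
                return Err("timeout must be non-zero".to_string());
            }
            Ok(HttpClient { base_url: base_url.to_string(), timeout_secs })
        }

        pub fn describe(&self) -> String {
            format!("{} ({}s timeout)", self.base_url, self.timeout_secs)
        }
    }
}

use client::HttpClient;

fn main() {
    let ok = HttpClient::new("https://example.com", 30).unwrap();
    println!("{}", ok.describe());
    assert!(HttpClient::new("ftp://example.com", 30).is_err());
    // let bad = HttpClient { base_url: String::new(), timeout_secs: 0 };
    // ^ does not compile outside `client`: the fields are private.
}
```

Nothing here stops the logic inside `new` from being wrong, of course. It just moves the failure to one place.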
I can't imagine anybody seriously making that claim as a property of the language.
(edit: fixed a comma and a forgotten word)
Safe Rust code is safe. You know where unsafe code is, because it's marked as unsafe. Yes, you will need some unsafe code in any notable project, but at least you know where it is. If you don't babysit your unsafe code, you get bad things. Someone didn't do the right thing here and I'm sure there will be a post-mortem and lessons learned.
To be comparable, imagine in C you had to mark potentially UB code with ub{} to compile. Until you get that, Rust is still a clear leader.
Thankfully, it doesn't. There are very few situations which require unsafe code, though a kernel is going to run into a lot of those by virtue of what it does. But the vast majority of the time, you can write Rust programs without ever once reaching for unsafe.
Rust is written in Rust, and we still want to be able to e.g. call C code from Rust. (It used to be the case that external C code was not always marked unsafe, but this was fixed recently).
The devs didn't write unsafe Rust to experience the thrills of living dangerously, they wrote it because the primitives were impossible to express in safe Rust.
If I were to write a program in C++ that has a thread-safe doubly linked list in it, I'd be able to bet that the linked list will have safety bugs, not because C++ is an unsafe language, but because multi-threading is hard. In fact, I believe most memory safety errors today occur in the presence of multi-threading.
Rust doesn't offer me any way of making sure my code is safe in this case, I have to do the due diligence of trying my best and still accept that bugs might happen because this is a hard problem.
The difference between Rust and C++ in this case, is that the bad parts of Rust are cordoned off with glowing red lines, while the bad parts of C++ are not.
This might help me in minimizing the attack surface in the future, but I suspect Rust's practical benefits will end up less impactful than advertised, even when the language is fully realized and at its best, because most memory safety issues occur in code that cannot be expressed in safe Rust and doing it in a safe Rust way is not feasible for some technical reason.
The short of it is that for fundamental computer science reasons the ability to always reject unsafe programs comes at the cost of sometimes being unable to verify that an actually-safe program is safe. You can deal with this either by accepting this tradeoff as it is and accepting that some actually-safe programs will be impossible to write, or you can add an escape hatch that the compiler is unable to check but allows you to write those unverifiable programs. Rust chose the latter approach.
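A small illustration of that tradeoff, using the standard library's `split_at_mut`. The function `swap_ends` is just a name picked for this sketch. Borrowing two distinct elements of a slice mutably at the same time is actually fine, but the borrow checker can't prove it, so std exposes a safe method whose implementation uses the escape hatch internally.

```rust
/// Swaps the first and last elements by borrowing two disjoint halves mutably.
fn swap_ends(values: &mut [i32]) {
    if values.len() < 2 {
        return;
    }
    // Rejected by the borrow checker, even though the two elements are distinct:
    // let (a, b) = (&mut values[0], &mut values[values.len() - 1]);

    // `split_at_mut` is a safe std API whose implementation uses unsafe code
    // to hand out two non-overlapping mutable slices.
    let mid = values.len() / 2;
    let (left, right) = values.split_at_mut(mid);
    let last = right.len() - 1;
    std::mem::swap(&mut left[0], &mut right[last]);
}

fn main() {
    let mut v = [1, 2, 3];
    swap_ends(&mut v);
    assert_eq!(v, [3, 2, 1]);
}
```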
> Kinda sounds a lock would make this safe?
There was a lock, but it looks like it didn't cover everything it needed to.
Here's what `List::remove` says on its safety requirements [0]:
/// Removes the provided item from this list and returns it.
///
/// This returns `None` if the item is not in the list. (Note that by the safety requirements,
/// this means that the item is not in any list.)
///
/// # Safety
///
/// `item` must not be in a different linked list (with the same id).
pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
At least if I'm understanding things correctly, I don't think that that invariant is something that locks can protect in general. I can't say I'm familiar enough with the code to say whether some other code organization would have eliminated the need for the unsafe block in this specific case.
[0]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
> Rust is not a "silver bullet" that can solve all security problems, but it sure helps out a lot and will cut out huge swatches of Linux kernel vulnerabilities as it gets used more widely in our codebase.
> That being said, we just assigned our first CVE for some Rust code in the kernel: https://lore.kernel.org/all/2025121614-CVE-2025-68260-558d@gregkh/ where the offending issue just causes a crash, not the ability to take advantage of the memory corruption, a much better thing overall.
> Note the other 159 kernel CVEs issued today for fixes in the C portion of the codebase, so as always, everyone should be upgrading to newer kernels to remain secure overall.
Same thing is true about modern C++, and as an added bonus, it's far more straightforward to rewrite legacy C code into modern C++ than it is to rewrite it in Rust, and you can do so without compromising existing support for any architectures thanks to GCC (unlike Rust, which relies on LLVM-based rustc), yet there's still a baffling blanket ban on C++ in the kernel. I say "baffling," since it's unjustifiable on technical grounds, but it's easily explained as a mix of Linus by proxy (Greg) being too proud to walk back his infamous vituperative denunciation of the language (even though it's since evolved considerably), and Rust zealots being so numerous, motivated, organized, and also mentally and hormonally unstable enough that it's just not worth the effort to resist them. If C++ developers were the types to threaten self-harm when they didn't get their way, C++ would probably be in the kernel too.
That indicates that Greg Kroah-Hartman has a very poor understanding of Rust and the _unsafe_ keyword. The bug can, in fact, exhibit undefined behavior and memory corruption.
His lack of understanding is unfortunate, to put it very mildly.
Or is this just a theoretical argument, "it is hypothetically possible to create a technically-spec-compliant Rust compiler that would compile this into dangerous machine code"? If so it should still be fixed of course, but if I'm patching my Linux kernel I'd rather know what the practical impact is.
To be fair, I'm not saying that Greg KH is definitely wrong; I'm only willing to claim that in the general case observing crashes due to corrupted pointers does not necessarily mean that there's no ability to actually exploit said corruption. Actual exploitability will depend on other factors as well, and I'm far from knowledgeable enough to say anything on the matter.
[0]: https://projectzero.google/2014/08/the-poisoned-nul-byte-201...
"All bugs" is a strawman typically used only by detractors. The correct claim is: safe Rust eliminates certain classes of bugs. I'd wager the design of std eliminates more (e.g. the different string types), but that doesn't really apply to the kernel.
Which is either 1) not true as evidenced by this bug or 2) a tautology whereby Rust eliminates all bugs that it eliminates.
> 1) not true as evidenced by this bug
Code used unsafe, putting us out of "safe" rust.
So arguably both camps are correct. Those who advocate Rust rewrites, and those who are against it too.
I would presume the ratio of safe to unsafe code leads to less unsafe code being written over time as the full ”kernel standard library” gets built out allowing all other parts to replace their hand rolled implementations with the standard one.
Then the safety comment can easily bias the reader into believing that the author has fully understood the problem and all edge cases.
The fact that this survived review is the worrying part. Unsafe blocks are intentionally small and localized in Rust precisely so the safety argument can be checked. If the stated safety argument is incomplete and still passes review, that suggests reviewers are relying on the comment as the proof, rather than rederiving the invariants themselves. Unless of course the wrong people are reviewing these changes. Why rewrite in Rust if we don't apply extreme scrutiny to the tiny subset (presumably) that should be scrutinized.
To be clear, I think this is a failure of process, not Rust of course.
I think the safety comment might have been more on-point than you think. If you look at the original code, it did something like:
- Take a lock
- Swap a `Node`'s `death_list` (i.e., a list of `NodeDeath`s) with an empty one
- Release the lock
- Iterate over the taken `death_list`
While in another thread, you have a `NodeDeath`:
- Take a lock
- Get its parent's `death_list`
- Remove itself from said list
- Release the lock
The issue is what happens when a `NodeDeath` from the original list tries to remove itself after the parent Node swapped its `death_list`. In that case, the `NodeDeath` grabs the replacement list from its parent node, and the subsequent attempt to remove itself from the replacement list violates the precondition in the safety comment.
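Here is a deliberately simplified, single-threaded model of that interleaving, written in safe Rust with `std::sync::Mutex` and a `Vec` of ids. None of these names (`Node`, `take_all`, `remove_from_owner`) are the kernel's; they are stand-ins I made up. In this safe model the mismatch just shows up as "not found". In the real intrusive list, the equivalent step would be touching prev/next pointers of an element that is linked into a list the lock no longer protects.

```rust
use std::mem;
use std::sync::Mutex;

/// Hypothetical stand-in for a node that owns a list of death-notification ids.
struct Node {
    death_list: Mutex<Vec<u32>>,
}

/// Models the pre-fix `release`: move every entry to a local list, then unlock.
fn take_all(node: &Node) -> Vec<u32> {
    let mut guard = node.death_list.lock().unwrap();
    // The guard is dropped when this function returns; the returned entries
    // now live in a list that the lock does not cover.
    mem::take(&mut *guard)
}

/// Models `set_cleared`: look for `id` in whatever list the node currently holds.
fn remove_from_owner(node: &Node, id: u32) -> bool {
    let mut guard = node.death_list.lock().unwrap();
    match guard.iter().position(|&x| x == id) {
        Some(index) => {
            guard.remove(index);
            true
        }
        None => false,
    }
}

fn main() {
    let node = Node { death_list: Mutex::new(vec![1, 2, 3]) };
    let local = take_all(&node);
    // Entry 2 is still in a list, but it is `local`, not the node's list.
    assert!(local.contains(&2));
    // The remover locks the node and searches the (now empty) replacement list.
    assert!(!remove_from_owner(&node, 2));
}
```

So "either in this death list or in no death list" stops being true the moment the entries are moved to the local list.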
> Why rewrite in Rust if we don't apply extreme scrutiny to the tiny subset (presumably) that should be scrutinized.
That "extreme scrutiny" was applied does not guarantee that all possible bugs will be found. Humans are only human, after all.
That's roughly 100% of unsafe code because a lint in the compiler asks for it.
Classic Motte and Bailey. Rust is often said "if it compiles it runs". When that is obviously not the case, Rust evangelicals claim nobody actually means that and that Rust just eliminates memory bugs. And when that isn't even true, they try to mischaracterize it as "all bugs" when, no, people are expecting it to eliminate all memory bugs because that's what Rust people claim.
That claim is overly broad, but it's a huge, huge part of it. There's no amount of computer science or verification that can prevent a human from writing the wrong software or specification (let plus_a_b = a - b or why did you give me an orange when I wanted an apple). Unsafe Rust is so markedly different from safe default Rust. This is akin to claiming that C is buggy or broken because people write broken inline ASM. If C can't deal with broken inline ASM, then why bother with C?
I write bugs, because I'm human, and Rust's compiler sure does catch a lot more of my bugs than GCC used to when I was writing C all day.
Stronger typing is a big part of why this happens. For example in C it's perfectly usual to use the "int" type for a file descriptor, a count of items in some container and a timeout (in seconds? milliseconds? who knows). We could do better, but we usually don't.
In idiomatic Rust everybody uses three distinct types OwnedFd, usize and Duration. As a result while arithmetic on ints must work in C, the Rust compiler knows that it's reasonable to add two Durations together, it's nonsense to add a Duration to a size, and all arithmetic is inappropriate for OwnedFd, further it's also not reasonable to multiply two Durations together, a Duration multiplied by an integer makes sense and the other way around likewise, but 5 seconds multiplied by 80 milliseconds is nonsense.
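For the `Duration` part, a quick sketch using only `std::time::Duration`. The function name `total_wait` is invented for the example. The operations that make sense have operator impls in std; the ones that don't simply fail to compile.

```rust
use std::time::Duration;

/// Hypothetical helper: total time spent if we wait `timeout + grace` per retry.
fn total_wait(timeout: Duration, grace: Duration, retries: u32) -> Duration {
    (timeout + grace) * retries
}

fn main() {
    let timeout = Duration::from_secs(5);
    let grace = Duration::from_millis(80);

    let total = timeout + grace;        // Duration + Duration
    let tripled = timeout * 3;          // Duration * integer
    let also_tripled = 3u32 * timeout;  // integer * Duration

    assert_eq!(total, Duration::from_millis(5080));
    assert_eq!(tripled, also_tripled);
    assert_eq!(total_wait(timeout, grace, 2), Duration::from_millis(10160));

    // Neither of these compiles:
    // let nonsense = timeout * grace;  // no `Mul<Duration>` impl for `Duration`
    // let mixed = timeout + 4usize;    // no `Add<usize>` impl for `Duration`
}
```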
For this to be a "classic motte and bailey" you will need to point us to instances where _the original poster_ suggested these (the "bailey", which you characterize as "rust eliminates all bugs") things.
It instead appears that you are attributing _other comments_ to the OP. This is not a fair argumentation technique, and could easily be turned against you to make any of your comments into a "classic motte and bailey".
The real question is "does it provide this greater value for _less_ effort?"
The answer seems to be: "No."
Have fun defining that in an open source project.
> Looking for greater value with less effort is looking for a free lunch
If you have to switch languages to get that value, then no, this has nothing to do with free lunches.
> and we all know TANSTAAFL.
The church of the acronym. I do not share your apparent faith. Engineering requires you to actually do the work and not rely on simple aphorisms to make decisions.
Kernels - and especially the Linux kernel - are high-performance systems that require lots of shared mutable state. Every driver is a glorified while loop waiting for an IRQ so it can copy a chunk of data from one shared mutable buffer to another shared mutable buffer. So there will need to be some level of unsafe in the code.
There's a fallacy that if 95% of the code is safe, and 5% is unsafe, then that code is only 5% as likely to contain memory errors as a comparable C program. But, to reiterate what another commenter said, and something I've predicted for a long time, the tendency for the "unsafe block" to become instrumented by the "safe block" will always exist. People will loosen the API contract between the "safe" and "unsafe" sides until an error in the "safe" side kicks off an error in the "unsafe" side.
This is so obviously false that I suspect that's the reason you don't see any Rust gurus agreeing with you.
Drivers do lots of resource and memory management, far more than just spinning on IRQs.
My anecdotal experience interviewing big tech engineers that used Rust reflects GP's hunch about this astonishing experience gap. Just this year, 4/4 candidates I interviewed couldn't give me the correct answer for what two bytes in base 2 represented in base 10. Not a single candidate asked me about the endianness of the system.
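For what it's worth, the endianness point is easy to show with the standard library's `u16::from_be_bytes` and `u16::from_le_bytes`. The byte values and the function name `decode_both` are picked arbitrarily for this sketch; the same two bytes give two different decimal values depending on byte order.

```rust
/// Interprets the same two bytes as a big-endian and as a little-endian `u16`.
fn decode_both(bytes: [u8; 2]) -> (u16, u16) {
    (u16::from_be_bytes(bytes), u16::from_le_bytes(bytes))
}

fn main() {
    // 0000_0001 followed by 0000_0010 in base 2.
    let (big, little) = decode_both([0b0000_0001, 0b0000_0010]);
    assert_eq!(big, 258);    // 0x0102 = 1 * 256 + 2
    assert_eq!(little, 513); // 0x0201 = 2 * 256 + 1
}
```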
Now that Rust in the kernel doesn't have an "experimental" escape hatch, these motte-and-bailey arguments aren't going to work. Ultimately, I think this is a good thing for Rust in the kernel. Once all of the idiots and buffoons have been sufficiently derided and ousted from public discourse (deservedly so), we can finally begin having serious and productive technical discussions about how to make C and Rust interoperate in the kernel.
I guess it makes sense you're having trouble hiring qualified candidates.
> Since it was in an unsafe block, the error for sure was way easier to find within the codebase than in C. Everything that's not unsafe can be ruled out as a reason for race conditions and the usual memory handling mistakes - that's already a huge win.
The benefit of Rust is you can isolate the possible code that causes an XYZ to an unsafe block*. But that doesn't necessarily mean the error shown is directly related to the unsafe block. Like C++, triggering undefined behavior can in theory cause the program to do anything, including fail spectacularly within seemingly unrelated safe code.
* Excluding cases where safe things are actually possibly unsafe (like some incorrectly marked FFI)
I believe their point was that they needed to audit only the unsafe blocks to find the actual root cause of the bug once they had an idea of the problematic area.
The author is thinking about "the error" as some source code that's incorrect. "Your error was not bringing gloves and a hat to the snowball fight" but you're thinking "the error" is some diagnostic result that shows there was a problem. "My error is that I'm covered in freezing snow".
Does that help?
When debugging, we care about where the assumptions we had were violated. Not where we observe a bad effect of these violated assumptions.
I think you get here yourself when you say:
> triggering undefined behavior can in theory cause the program to do anything, including fail spectacularly within seemingly unrelated safe code
The bug isn't where it failed spectacularly. It's where the C++ code triggered undefined behavior.
Put another way: if the undefined behavior _didn't_ cause a crash / corrupted data, the bug _still_ exists. We just haven't observed any bad effects from it.
As for why there is unsafe in the kernel? There are things, especially in a kernel, that cannot be expressed in safe Rust.
Still, having smaller sections of unsafe is a boon because you isolate these locations of elevated power, meaning they are auditable and obvious. Rust also excels at wrapping unsafe in safe abstractions that are impossible to misuse. A common comparison point is that in C your entire program is effectively unsafe, whereas in Rust it's a subset.
> Still, having smaller sections of unsafe is a boon because you isolate these locations of elevated power, meaning they are auditable and obvious.
The Rustonomicon makes it very clear that it is generally insufficient to only verify correctness of Rust-unsafe blocks. If the absence of UB in a Rust-unsafe block depends on Rust-not-unsafe code in the surrounding module, potentially the whole module has to be verified for correctness. And that assumes that the module has correct encapsulation, otherwise even more may have to be verified. And a single buggy change to Rust-not-unsafe code can cause UB, if a Rust-unsafe block somewhere depends on that code to be correct.
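A toy version of that situation, with a made-up `Small` buffer type (not from the kernel or the Rustonomicon). The unsafe block in `last` is only sound because of a check that lives in the entirely safe `truncate` method, so auditing the unsafe block alone is not enough; the module is the real boundary.

```rust
mod stack {
    /// Hypothetical fixed-capacity buffer. Invariant: `len <= 4`.
    pub struct Small {
        data: [u8; 4],
        len: usize,
    }

    impl Small {
        pub fn new() -> Self {
            Small { data: [0; 4], len: 0 }
        }

        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        /// Entirely safe code, yet the unsafe block in `last` relies on it:
        /// without the `new_len <= self.len` check, a caller could grow `len`
        /// past the capacity and `last` would read out of bounds.
        pub fn truncate(&mut self, new_len: usize) {
            if new_len <= self.len {
                self.len = new_len;
            }
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: `len` is in 1..=4 here, because `push` and `truncate` are
            // the only code that changes it and neither lets it exceed 4.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}

fn main() {
    let mut s = stack::Small::new();
    s.push(1);
    s.push(2);
    s.push(3);
    s.truncate(2);
    assert_eq!(s.last(), Some(2));
    s.truncate(10); // ignored: would violate the invariant
    assert_eq!(s.last(), Some(2));
}
```

Because `len` is private, code outside `stack` can't break the invariant, which is what "correct encapsulation" buys you.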
Strict aliasing in C roughly means that if you initialize memory as a particular type, you can only access it as that type or one of a list of aliasable types like char. Rust has no such restriction, and has no concept of strict aliasing like this. In Rust, "type aliasing" is allowed, so long as you respect size, alignment, and representability rules.
Aliasing safety in Rust roughly means that you can not have an exclusive reference to an object if any other reference is active for that object (reality is a little bit more involved than that, but not a lot). C has no such rule.
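A short sketch of both halves, with function names (`read_as_f32`, `bump_both`) made up for the example. Reading a `u32` through an `f32` pointer is fine in Rust because size, alignment and validity all line up, while two live exclusive references to the same object are rejected at compile time.

```rust
/// Reads the bytes of a `u32` as an `f32` through a pointer cast.
fn read_as_f32(value: &u32) -> f32 {
    // SAFETY: `u32` and `f32` have the same size and alignment, and every
    // 32-bit pattern is a valid `f32`. There is no type-based aliasing rule
    // in Rust that forbids this read.
    unsafe { *(value as *const u32 as *const f32) }
}

/// The two exclusive references are guaranteed not to refer to the same object.
fn bump_both(a: &mut i32, b: &mut i32) {
    *a += 1;
    *b += 1;
}

fn main() {
    assert_eq!(read_as_f32(&0x3f80_0000), 1.0);

    let mut x = 1;
    let mut y = 10;
    bump_both(&mut x, &mut y);
    assert_eq!((x, y), (2, 11));
    // bump_both(&mut x, &mut x); // rejected: second mutable borrow of `x`
}
```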
It's very unfortunate that such similar names were given to these different concepts.
The C "strict aliasing" rule is that with important exceptions the name for a thing of type T cannot also be an alias to a thing of type S, and char is an important exception. Linux deliberately switches off this rule.
Rust's rule is that there mustn't be mutable aliases. We will see why that's important in a moment.
Aliasing is an impediment to compiler optimisation. If you've been watching Matt's "Advent of Compiler Optimisation" videos (or reading the accompanying text), it's been covered a little bit there. Matt uses C and C++ in those videos, so if you're scared of Rust you needn't fear that in the AoCO.
But why mutation? Well, the optimisations concern modification. The optimiser does its job by rewriting what you asked for as something (possibly not something you could have expressed at all in your chosen language) that has the same effect but is faster or smaller. Rewrites which avoid "spilling" a register (writing its value to memory) often improve both size and speed of the software, but if there is aliasing then spilling will be essential because the other aliases are referring to the same memory. If there's no modification it doesn't matter, copies are all identical anyway.
 pub(crate) fn release(&self) {
     let mut guard = self.owner.inner.lock();
     while let Some(work) = self.inner.access_mut(&mut guard).oneway_todo.pop_front() {
         drop(guard);
         work.into_arc().cancel();
         guard = self.owner.inner.lock();
     }
-    let death_list = core::mem::take(&mut self.inner.access_mut(&mut guard).death_list);
-    drop(guard);
-    for death in death_list {
+    while let Some(death) = self.inner.access_mut(&mut guard).death_list.pop_front() {
+        drop(guard);
         death.into_arc().set_dead();
+        guard = self.owner.inner.lock();
     }
 }
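The shape of the fix, modelled outside the kernel with `std::sync::Mutex` and `VecDeque` (my stand-ins, not the kernel's list or lock types; `drain_one_by_one` is a made-up name). Entries stay in the lock-protected list until the moment they are popped, so at any point an entry is either in the owner's list or in no list at all, and the lock is released while each entry is processed.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Pops entries one at a time, releasing the lock while each one is processed.
fn drain_one_by_one(list: &Mutex<VecDeque<u32>>, mut process: impl FnMut(u32)) {
    let mut guard = list.lock().unwrap();
    while let Some(item) = guard.pop_front() {
        // `item` has been unlinked and is no longer in any list.
        drop(guard);
        process(item);
        guard = list.lock().unwrap();
    }
}

fn main() {
    let list = Mutex::new(VecDeque::from([1, 2, 3]));
    let mut seen = Vec::new();
    drain_one_by_one(&list, |id| {
        // The callback may take the lock itself, because it is not held here.
        let remaining = list.lock().unwrap().len();
        seen.push((id, remaining));
    });
    assert_eq!(seen, vec![(1, 2), (2, 1), (3, 0)]);
}
```

The explicit `guard` variable matters: writing `list.lock().unwrap().pop_front()` directly in the `while let` condition would keep the temporary guard alive for the whole loop body.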
And here is the unsafe block mentioned in the commit message with some more context [3]:
fn set_cleared(self: &DArc<Self>, abort: bool) -> bool {
    // <snip>
    // Remove death notification from node.
    if needs_removal {
        let mut owner_inner = self.node.owner.inner.lock();
        let node_inner = self.node.inner.access_mut(&mut owner_inner);
        // SAFETY: A `NodeDeath` is never inserted into the death list of any node other than
        // its owner, so it is either in this death list or in no death list.
        unsafe { node_inner.death_list.remove(self) };
    }
    needs_queueing
}
[0]: https://lore.kernel.org/linux-cve-announce/2025121614-CVE-20...
[1]: https://github.com/torvalds/linux/commit/3e0ae02ba831da2b707...
[2]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
[3]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
/// Removes the provided item from this list and returns it.
///
/// This returns `None` if the item is not in the list. (Note that by the safety requirements,
/// this means that the item is not in any list.)
///
/// # Safety
///
/// `item` must not be in a different linked list (with the same id).
pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
I think it'd be tricky at best to make this particular API safe since doing so requires reasoning across arbitrary other List instances. At the very least I don't think locks would help here, since temporary exclusive access to a list won't stop you from adding the same element to multiple lists.
[0]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
Otherwise there's the question of where exactly the API boundaries are. In the most general case, your unsafe boundary is going to be the module boundary; as long as what you publicly expose is safe modulo bugs, you're good. In this case the fix was in a crate-internal function, so I suppose one could argue that the public API was/is fine.
That being said, I'm not super-familiar with the code in question so I can't definitively say that there's no way to make internal changes to reduce the risk of similar errors.
Unsafe is the one escape hatch where Rust is more like C, but pragmatically it's an important escape hatch.
Oh no, what happened to Rust will save us from retarded legacy languages prone to memory corruption?
But then when you do you should really know what you're doing.
The fact that this bug is because of "unsafe" Rust usage actually affirms the language's safety when using "safe" code. Although with "memory safe" code you can of course still fuck up lots of other things.
If we're looking beyond one decade, then:
- As with all other utility, the future must be discounted compared to the present.
- A language that might look sterling today might fall behind tomorrow.
- Something something AI by 2035.