The article's title is somewhat misleading, since it makes it sound like this would also apply to desktop workloads. The article does say it is for datacenters, and that is true, but it would have been better had the title ended with the words “in datacenters” to avoid confusion.
It most likely won't. This patch set only affects applications that enable epoll busy poll using the EPIOCSPARAMS ioctl. It's a very specialized option that's not commonly used by applications. Furthermore, network routing in Linux happens in the kernel, not in user space, so this patch set doesn't apply to it at all.
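For reference, the opt-in looks roughly like the sketch below: it's an explicit ioctl on the epoll fd, not something an application picks up by accident. (The struct and ioctl names are from the uapi header added by that commit; the specific values here are arbitrary.)

```c
/* Minimal sketch: setting busy-poll parameters on an epoll fd via
 * EPIOCSPARAMS. Assumes kernel/glibc headers new enough to expose
 * struct epoll_params (otherwise it lives in <linux/eventpoll.h>). */
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>

int main(void)
{
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        perror("epoll_create1");
        return 1;
    }

    struct epoll_params params;
    memset(&params, 0, sizeof(params));
    params.busy_poll_usecs  = 64; /* how long epoll_wait may busy poll the NAPI */
    params.busy_poll_budget = 64; /* max packets handled per busy-poll pass */
    params.prefer_busy_poll = 1;  /* the "preferred busy poll" opt-in itself */

    if (ioctl(epfd, EPIOCSPARAMS, &params) == -1) {
        perror("ioctl(EPIOCSPARAMS)");
        return 1;
    }

    /* From here on, sockets added with epoll_ctl() and drained with
     * epoll_wait() take part in busy polling on their NAPI context. */
    return 0;
}
```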
Now, NAPI was already supposed to have some adaptiveness built in, so I guess it's possibly just a matter of optimizing that.
But my system is compiling right now, so I'll look at the article in more depth later :V
> If this [new] parameter is set to a non-zero value and a user application has enabled preferred busy poll on a busy poll context (via the EPIOCSPARAMS ioctl introduced in commit 18e2bf0edf4d ("eventpoll: Add epoll ioctl for epoll_params")), then application calls to epoll_wait for that context will cause device IRQs and softirq processing to be suspended as long as epoll_wait successfully retrieves data from the NAPI. Each time data is retrieved, the irq_suspend_timeout is deferred.
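In loop form, the behavior described there is roughly the following (just a sketch, assuming the epoll fd already has prefer_busy_poll enabled via EPIOCSPARAMS and that irq_suspend_timeout has been set on the NIC's NAPI instances):

```c
/* Sketch only: the application side is an ordinary epoll loop; the IRQ
 * suspension is driven entirely by the kernel, based on whether these
 * epoll_wait() calls keep returning data. */
#include <sys/epoll.h>

void run_event_loop(int epfd)
{
    struct epoll_event events[64];

    for (;;) {
        /* While successive calls retrieve data from the NAPI, device IRQs
         * and softirq processing stay suspended and each retrieval defers
         * the irq_suspend_timeout. Once a call comes up empty, the
         * suspension ends and interrupt-driven delivery resumes until the
         * next burst of traffic. */
        int n = epoll_wait(epfd, events, 64, -1);
        if (n < 0)
            break;
        for (int i = 0; i < n; i++) {
            /* recv() from events[i].data.fd and process the request */
        }
    }
}
```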
I expect there are many, but the vast majority are likely massive datacenters with hundreds of thousands of machines each running multiple instances; also, Android phones are probably more common than home equipment. Edit: Also IoT, as someone else points out.
I suggest you reread “software used in datacenters (e.g. by CDNs) that does use it”. This is not a reference to software in a datacenter. It is a reference to software in a datacenter that uses it, which is a subset of the former.
I confess I'm dubious about major savings for most home users, though, at least at an absolute level. 30 percent of less than five percent is still not that big a deal. No reason not to do it, but don't expect to really see the results there.
If you’re doing custom routing with a NUC or a basic Linux box, however, this would bring massive power savings, because that box pretty much only does networking.
I much prefer grassroots projects, made by and for people like me <3 That's why I moved to BSD (well, there were other reasons too, of course).
Also, no, you are entirely unaffected by this unless you use a very specific and uncommon epoll ioctl.
As others have mentioned, it's just a huge deal in the data center now. With that comes a lot of influence from corporate interests.
Whereas BSD has gone the opposite way: started by Berkeley but abandoned to the community. Business is not really interested in that, because anything they contribute can be used by anyone for anything (even their competitors can use it in closed-source code). Netflix was the biggest user, but I don't think they contribute anymore either. WhatsApp used it until Facebook acquired them. That leaves Netgate and iXsystems, which are small. Netgate pushed a really terrible WireGuard implementation once, but it was nipped in the bud, luckily. https://arstechnica.com/gadgets/2021/03/buffer-overruns-lice... It also highlighted many trust issues, which have since been improved.
Of course, whether this is an issue for you is very personal. For me it is, but clearly for a lot more people it isn't, as Linux is a lot more popular.
Arguably, the only somewhat mainstream operating systems today that deserve that label are the *BSDs. Haiku OS gets an honorable mention, but I wouldn't consider it to be mainstream.
> Arguably, the only somewhat mainstream operating systems today that deserve that label are the *BSDs.
I think the word you're looking for is 'sidelined'.
But yeah, grassroots was perhaps not the right term. I meant the status quo, not the origin. I don't know what the right word would be, though.
The article's title is “Data Centers Can Slash Power Needs With One Coding Tweak: Reworking 30 lines of Linux code could cut power use by up to 30 percent”.
The article says, “It is sort of a best case because the 30 percent applies to the network stack or communication part of it,” Karsten explains. “If an application primarily does that, then it will see 30 percent improvement. If the application does a lot of other things and only occasionally uses the network, then the 30 percent will shrink to a smaller value.”
It seems you only read the HN title? If so, why bother to critique the article's title?
They mix interrupts and polling depending on the load. The interrupt service routine and user/kernel context-switch overhead is computationally tiny, and hence tiny in power usage as well.
Also, most network hardware from the last twenty years has had interrupt coalescing (batching multiple packets per interrupt), reducing interrupt rates.
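For anyone unfamiliar with how that interrupt/polling mix works, the usual NAPI pattern in a driver looks roughly like the sketch below (the mydrv_* names are made-up placeholders, not a real driver): the interrupt handler masks further RX interrupts and hands off to a poll loop, and the poll loop only re-arms interrupts once traffic drops below its budget.

```c
/* Illustrative sketch of the standard NAPI pattern; the mydrv_* helpers
 * are hypothetical stand-ins for driver-specific code. */
#include <linux/netdevice.h>
#include <linux/interrupt.h>

struct mydrv_priv {
	struct napi_struct napi;
	/* ... rings, registers, stats ... */
};

static void mydrv_disable_rx_irq(struct mydrv_priv *priv);
static void mydrv_enable_rx_irq(struct mydrv_priv *priv);
static int mydrv_process_rx(struct mydrv_priv *priv, int budget);

/* Under load: the first packet's interrupt masks further RX interrupts
 * and schedules the poll loop, so later packets are picked up by polling
 * rather than by more interrupts. */
static irqreturn_t mydrv_irq(int irq, void *data)
{
	struct mydrv_priv *priv = data;

	mydrv_disable_rx_irq(priv);
	napi_schedule(&priv->napi);
	return IRQ_HANDLED;
}

/* The poll callback drains up to `budget` packets per pass; only when it
 * finds less work than the budget does it complete NAPI and re-enable
 * interrupts, switching back to interrupt-driven operation. */
static int mydrv_poll(struct napi_struct *napi, int budget)
{
	struct mydrv_priv *priv = container_of(napi, struct mydrv_priv, napi);
	int work = mydrv_process_rx(priv, budget);

	if (work < budget && napi_complete_done(napi, work))
		mydrv_enable_rx_irq(priv);

	return work;
}
```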
Who in their right mind would leave something so critical to the success of their product to a general purpose OS designed for mainframes and slightly redesigned for PCs?
(I typed this on a Linux PC.)
This is specifically a change to the Linux kernel, which is much, much more broadly successful.
Instead, they use DPDK, XDP, or userspace stacks like Onload or VMA, often with SmartNICs doing hardware offload. In those cases this patch wouldn't apply, since packet processing bypasses the regular kernel network stack (or, with full userspace stacks, the kernel entirely).
That doesn’t mean the patch isn’t valuable—it clearly helps in setups where the kernel is in the datapath (e.g., CDNs, ingress nodes, VMs, embedded Linux systems). But it probably won’t move the needle for workloads that already bypass the kernel for performance or latency reasons. So the 30% power reduction headline is likely very context-dependent.
The power efficiency seems to be limited to "network applications using epoll".
The 30% the article talks about seems to be benchmarked on memcached, and here is the ~30-line diff they're probably talking about: https://raw.githubusercontent.com/martinkarsten/irqsuspend/m...
For me, it feels like a moral imperative to make my code as efficient as possible, especially when a job will take months to run on hundreds of CPUs.
Also, would you share all new found efficiencies with your competitors?
I personally believe the majority is wasted. Any code that runs in an interpreted language, JIT/AOT or not, is at a significant disadvantage. In performance measurements it can be anywhere from 2x to 60x worse than the equivalent optimized compiled code.
> it feels like a moral imperative to make my code as efficient as possible
Although we're still talking about fractions of a Watt of power here.
> especially when a job will take months to run on hundreds of CPUs.
To the extent that I would say _only_ in these cases are the optimizations even worth considering.
It is unfortunate that many software engineers continue to dismiss this as "premature optimization".
But as soon as I see resource or server costs gradually rising every month (even at idle), running into the tens of thousands, which is a common occurrence as the system scales, it becomes unacceptable to ignore.
I was working with a peer on a click handler for a web button. The code ran in 5-10ms. You have a budget of nearly 200ms before a user notices sluggishness. My peer "optimized" the 10ms click handler to the point of absolute illegibility. It was doubtful the new implementation was faster.
Most commonly, if the costs increase as the users increase, it becomes an efficiency issue: the scaling is neither good nor sustainable, which can easily destroy a startup.
In this case, the Linux kernel is directly critical for applications in AI, real-time systems, networking, databases, etc., and performance optimizations there make a massive difference.
This article is a great example of properly using compiler optimizations to significantly improve the performance of a service. [0]
[0] https://medium.com/@utsavmadaan823/how-we-slashed-api-respon...
For the completely uninitiated, taking the most critical code paths uncovered via profiling and asking an LLM to rewrite them to be more efficient might give an average user some help with optimization. If your code takes more than a few minutes to run, you definitely should invest in learning how to profile, common optimizations, hardware latencies and bandwidths, etc.
With most everything I use at the consumer level these days, you can just feel the excessive memory allocations and network latency oozing out of it, signaling the inexperience or lack of effort of the developers.
The issue I was trying to resolve was sudden, dramatic changes in traffic. Think: a loop being introduced in the switching, and the associated packet storm. In that case, interrupts could start coming in so fast that the system couldn't get enough non-interrupted time to disable the interrupts, UNLESS you have more CPUs than busy networking interfaces. So my solution then was to make sure that the Linux routers had more cores than network interfaces.
I'm not au fait with network data centres, though; how similar are they in terms of their demands?
I expect you're right that GPU data centers are a particularly extreme example.
My current guess is that I heard it on a podcast (either a Dwarkesh interview or an episode of something else - maybe transistor radio? - featuring Dylan Patel).
I'll try to re-listen to the top candidates in the next two weeks (I'm a little behind on current episodes because I'm near the end of an audiobook) and will try to ping back if I find it.
If too long has elapsed, update your profile so I can find out how to message you!
Not so sexy, but as a force multiplier that's probably still a lot of carbon.
https://didgets.substack.com/p/finding-and-fixing-a-billion-...
The "up to 30%" figure is operative when you have a near-idle application that's busy polling, which is already dumb. There are several ways to save energy in that case.
That was my first thought, but it sounds like the OS kernel, not the application, has control over the polling behavior, right?
This is the sort of performance efficiency work I want to keep seeing on this site, from distinguished experts who have contributed to critical systems such as the Linux kernel.
Unfortunately, over the last 10-15 years we have been seeing the worst technologies paraded around due to cargo-cultish behaviour: asking candidates in interviews to implement the most efficient solution to a problem, but then choosing the most extremely inefficient technologies to solve certain problems, because so-called software shops are racing for that VC money. Money that goes to hundreds of k8s instances on many over-provisioned servers instead of a few.
Performance efficiency critically matters, and it is the difference between having enough runway for a sustainable business vs having none at all.
And nope, AI agents / vibe coders could not have come up with a solution as correct as the one in the article.
Typically saves 2-3 microseconds going through the kernel network stack.
“It will take one year,” said the master promptly.
“But we need this system immediately or even sooner! How long will it take if I assign ten programmers to it?”
The master programmer frowned. “In that case, it will take two years.”
“And what if I assign a hundred programmers to it?”
The master programmer shrugged. “Then the design will never be completed,” he said.
— Chapter 3.4 of The Tao of Programming, by Geoffrey James (1987)
I mean, I get it: you dislike GNOME for whatever reason and want to cast some shade - that part is clear. What I don't really understand is how you decided this context is somehow going to be received in a way that furthers your view... it's just so illogical that my reaction is "support the GNOME folks".
Actually - due to this comment I just donated $50 to the GNOME Foundation. Either the guerilla marketing worked, or your mission failed - in any case I hope this was an effective lesson in messaging.
This is basically 3-month-old news now.
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds...
If it isn't reported so that people know about it, it will do nothing.
That said, the opt-in here is only relevant when software has already opted into busy polling, so it likely does not apply to software used outside of datacenters.
Similarly, I prefer old books, old computer games, and old movies (or at least not ones that are currently hot/viral/advertised); this allows a lot of trash to self-filter out, including trash being breathlessly promoted and consumed.
6.13 is old news now, we're already on 6.14, and even 6.15 isn't far from release (we're already at rc3).
I may notice changes when they get adopted by my distro's maintainers, but that usually takes time...