Electricity use of AI coding agents (www.simonpcouch.com)

119 points by linolevan 18 days ago | 14 comments

simonw18 days ago
At first glance this looks like a credible set of calculations to me. Here's the conclusion:
> So, if I wanted to analogize the energy usage of my use of coding agents, it’s something like running the dishwasher an extra time each day, keeping an extra refrigerator, or skipping one drive to the grocery store in favor of biking there.
That's for someone spending about $15-$20 in a day on Claude Code, estimated at the equivalent of 4,400 "typical queries" to an LLM.
- Aurornis18 days ago
  Comparing it to running a refrigerator or the dishwasher is very relatable, as most people run at least one refrigerator without a second thought.
  This is for someone using a lot of LLM tokens relative to the average customer of these companies.
  - nasmorn17 days ago
    It also means I could offset my Claude usage with a single solar panel which costs basically nothing. A battery as well if I wanna code late
nospice18 days ago
I'm not sure I like this method of accounting for it. The critics of LLMs tend to conflate the costs of training LLMs with the cost of generation. But this makes the opposite error: it pretends that training isn't happening as a consequence of consumer demand. There are enormous resources poured into it on an ongoing basis, so it feels like it needs to be amortized on top of the per-token generation costs.
At some point, we might end up in a steady state where the models are as good as they can be and the training arms race is over, but we're not there yet.
- Aurornis18 days ago
  That's not really an error, that's a fundamental feature of unit economics.
  Fixed costs can't be rolled into the unit economics because the divisor is continually growing. The marginal costs of each incremental token/query don't depend on the training cost.
  - AndrewDucker17 days ago
    You can absolutely have a stab at it. Estimate how long models last for, amortise over that time/number of calls. We've seen enough models go out of fashion for that to be reasonably done.
- cortesoft18 days ago
  It would be really hard to properly account for the training, since that won't scale with more generation.
  The training is already done when you make a generative query. No matter how many consumers there are, the cost for training is fixed.
  - nospice18 days ago
    My point is that it isn't, not really. Usage begets more training, and this will likely continue for many years. So it's not a vanishing fixed cost, but pretty much just an ongoing expenditure associated with LLMs.
    bob102918 days ago
    No one doing this for money intends to train models that will never be amortized. Some will fail and some are niche, but the big ones must eventually pay for themselves or none of this works.
    The economy will destroy inefficient actors in due course. The environmental and economic incentives are not entirely misaligned here.
    quietbritishjim18 days ago
    > No one doing this for money intends to train models that will never be amortized.
    Taken literally, this is just an agreement with the comment you're replying to.
    Amortizing means that it is gradually written off over a period. That is completely consistent with the ability to average it over some usage. For example, if a printing company buys a big new printing machine every 5 years (because that's how long they last before they wear out), they would amortize its cost over the 5 years (actually it's depreciation, not amortization, because it's a physical asset, but the idea is the same). But it's 100% possible to look at the number of documents they print over that period and calculate the price of the print machine per document. And that's still perfectly consistent with the machine paying for itself.
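A minimal sketch of the per-document arithmetic in the printing-machine example above. The machine price and print volume are made-up numbers for illustration only; the same division would apply to a model's training cost spread over its lifetime queries, if those figures were known.

```python
def amortized_cost_per_unit(fixed_cost: float, units_over_lifetime: float) -> float:
    """Spread a one-time cost evenly over every unit produced in the asset's lifetime."""
    if units_over_lifetime <= 0:
        raise ValueError("units_over_lifetime must be positive")
    return fixed_cost / units_over_lifetime


# Hypothetical numbers, not taken from the thread.
machine_price = 500_000.0        # purchase price in dollars (assumed)
documents_per_year = 2_000_000   # print volume per year (assumed)
lifetime_years = 5               # replacement interval from the comment

per_document = amortized_cost_per_unit(
    machine_price, documents_per_year * lifetime_years
)
print(f"Machine cost per document: ${per_document:.3f}")
```

The per-unit figure shrinks as usage grows, which is why the estimate depends on how long the asset (or model) stays in service.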
- TSiege18 days ago
  The challenge with no longer developing new models is making sure your model is up to date, which as of today requires an entire training run. Maybe they can do that less or they’ll come up with a way to update a model after it’s trained. Maybe we’ll move on to something other than LLMs
- skybrian18 days ago
  The training cost is a sunk cost for the current LLM, and unknown for the next-generation LLM. Seems like it would be useful information but doesn't go here?
- robocat18 days ago
  The AI training data sets are also expensive... The cost is especially hard to estimate for data sets that are internal to businesses like Google. Especially if the model needs to be refreshed to deal with recent data.
  I presume historical internal datasets remain high value, since they might be cleaner (no slop) or maybe unavailable (copyright takedowns) and companies are getting better at hiding their data from spidering.
bramhaag18 days ago
Only tangentially related, but today I found a repo that appears to be developed using AI assistance, and the costs for running the agents are reported in the PRs. For example, 50 USD to remove some code: https://github.com/coder/mux/pull/1658
- ammario17 days ago
  Lol, this is my PR. That cost is misleading. That workspace did far more than that change. In reality I spend ~$1000/week in tokens for all of my development work, and I'm quite happy with the exchange.
- AnotherGoodName18 days ago
  This is the cost of Anthropic's pay-by-the-token plan.
  To give an analogy, Anthropic's pricing is $0.10 per grain of rice (pay by the token) or;
  $20 a month for a quarter cup of rice each day (claude pro)
  $100 a month for 10 cups of rice each day (claude max 100)
  $200 a month for a sack of rice delivered to your door each day.
  It's a rather insane pricing scale and here they are paying $50 there because they don't understand the pricing model (which is fair, Anthropic's pricing model is crazy). Never pay by the token with Anthropic! Only ever use the subscription plans.
  - xmcqdpt217 days ago
    At work we pay per token because we use a third party tool (amp) and it is very pricy. OTOH the subscription model is obviously not profitable. It's priced to capture market share, like Uber in 2010. (TBH the token pricing is probably not profitable either.)
    From the point of view of a large enterprise, it makes more sense to pay a third party that can itself swap out Claude for Gemini (for example) based on pricing, than to buy subscriptions to one specific tool that is likely to suddenly cost 10x as much when they run out of VC funds. The dynamic is going to be different for individuals and small companies.
- msephton18 days ago
  I'd like to suggest that dev lower their settings: sota model + high + thinking is definitely not needed to do this simple task. Lower settings could easily do it for less than $0.50, maybe even $0.05. I'd encourage people to operate on average to low settings and wind them up or down depending on the prompt task complexity.
  - samusiam17 days ago
    Also helps to use big models for planning, small for implementation.
- RestartKernel18 days ago
  That seems reasonable compared to an actual developer (depending on your region), but I had hoped for these models to make simple tasks like this fast and cheap so those developers can focus on the difficult stuff.
  - xmcqdpt217 days ago
    Yes, in practice I've done fairly simple tasks using AMP at work and ended up with bills of $100-150 for them, which is somewhat less than (but a similar order of magnitude to) contracting out to LCoL countries. Depending on how the technology (towards cheaper cost) and financial situation (towards profitability and higher costs) evolve, it wouldn't be shocking to me if the frontier models end up more expensive than human contractors.
linolevan18 days ago
Had a small discussion about this on an OP on bsky. A somewhat interesting discussion over there.
https://bsky.app/profile/simonpcouch.com/post/3mcuf3eazzs2c
matthewfcarlson18 days ago
I would like some real world comparisons. How much power does the laptop or desktop consume during these (likely multi hour) sessions? Assuming you’re using a large HDR monitor, 50-100 W isn’t unreasonable, and at 8 hours a day you’re talking about one to three days before you crack 1000 Wh like his sessions do. But then a personal desktop on a gaming session can easily pull 1000 W (cpu + gpu + peripherals). So comparing it to a gaming session seems fair.
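A quick sketch of the arithmetic behind that comparison: how many 8-hour days each device needs to use 1000 Wh (1 kWh). The wattages are the round figures from the comment, and the device labels are just illustrative names.

```python
def watt_hours(power_watts: float, hours: float) -> float:
    """Energy in Wh for a device drawing constant power for a number of hours."""
    return power_watts * hours


def days_to_reach(target_wh: float, power_watts: float, hours_per_day: float) -> float:
    """Days of use needed before cumulative energy reaches target_wh."""
    return target_wh / watt_hours(power_watts, hours_per_day)


TARGET_WH = 1000.0   # 1 kWh
HOURS_PER_DAY = 8.0  # session length assumed in the comment

# Wattages are the round numbers quoted in the comment, not measurements.
for label, watts in [("monitor (low)", 50.0),
                     ("monitor (high)", 100.0),
                     ("gaming desktop", 1000.0)]:
    days = days_to_reach(TARGET_WH, watts, HOURS_PER_DAY)
    print(f"{label:>15}: {watt_hours(watts, HOURS_PER_DAY):6.0f} Wh/day, "
          f"{days:.3f} days to reach 1 kWh")
```

Under these assumptions the monitor takes between 1.25 and 2.5 days to reach 1 kWh, while the gaming desktop gets there in one hour.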
throwerxyz18 days ago
"How much energy does it take to eat meat? (ignoring the cost to produce the meat into your hands)"
Do people even care about this?
How much energy does it take to download a video on YouTube versus the energy input to keep it all setup and running?
- none258517 days ago
  Re: eating meat, plenty of people opt out of eating meat (specifically beef) because of its environmental cost.
  - throwerxyz17 days ago
    I don't know what that has to do with my comment.
ggm18 days ago
As long as it's unaccounted for by users it's at best an externality. I think it may demand regulation to force this cost to the surface.
Electricity and cooling incur wider costs and consequences.
- simonw18 days ago
  That's hardly unique to data centers.
  I'm all for regulation that makes businesses pay for their externalities - I'd argue that's a key economic role that a government should play.
  - ggm18 days ago
    No disagreement. I think all enterprises that come with community borne costs (noise, heat, energy use, road use, construction & infrastructure, state incentives) and benefits (tax revenue, jobs) should have some level of accounting. It would be wrong to say the negs always outweigh the positives, that's not the point here. The point is that a bunch of cost in DC relating to power and cooling wind up having impact on the community at large.
    I've been told in other (non US) economies, decisions to site hyperscaler DCs have had downstream impacts on power costs and longterm power planning. The infra to make a lot of power appear at a site means the same capex and inputs cannot be used to supply power to towns and villages. There's a social opportunity loss in hosting the DC because the power supply doesn't magically make more transformers and wires and syncons appear on the market: Prices for these things are going up because of a worldwide shortage.
    It's like the power version of worldwide RAM pricing.
- Majromax18 days ago
  > As long as it's unaccounted for by users it's at best an externality.
  Why is it an externality? Anthropic (or other model provider) pays the electricity cost, then it's passed along in the subscription or API bill. The direct cost of the energy is fully internalized in the price.
  - ggm18 days ago
    Electricity supply is about more than supply cost. It has a build cost, with assumed inputs which are in turn priced by demand, and they're in short supply worldwide. DC costs of construction in power are pushing up the delay and cost for non-DC power projects.
- avalys18 days ago
  No! What a disaster. Enforce the costs of externalities as close to the source as possible. If electricity costs more money charge more money for electricity. Don’t add some “regulation” to force end users to pay more based on your estimate of how much more you think electricity should cost.
  - ggm18 days ago
    I said force the cost to the surface. Not set a fixed rate of return: I want the accounting on the build and supply cost at large, as well as the kWh charges, to be accounted for.
- jeffbee18 days ago
  I don't see how this follows. Data center operators buy energy and this is almost their only operating expense. Their products are priced to reflect this. The fact that basic AI features are free reflects the fact that they use almost no energy.
  - arrowleaf18 days ago
    I would be surprised if AI prices reflect their current cost to provide the service, even inference costs. With so much money flowing into AI the goal isn't to make money, it's to grow faster than the competition.
    simonw18 days ago
    I remain confident that most AI labs are not selling API access for less than it costs to serve the models.
    If that's so common then what's your theory as to why Anthropic aren't price competitive with GPT-5.2?
    JimDabell18 days ago
    I think it’s more instructive to look at providers like AWS than to compare with other AI labs. What’s the incentive for AWS to silently subsidise somebody else’s model when you run it on their infrastructure?
    AWS are quite happy to give service away for free in vast quantities, but they do it by issuing credits, not by selling below cost.
    I think it’s a fairly safe bet AWS aren’t losing money on every token they sell.
    Majromax18 days ago
    From this article:
    > For the purposes of this post, I’ll use the figures from the 100,000 “maximum”–Claude Sonnet and Opus 4.5 both have context windows of 200,000 tokens, and I run up against them regularly–to generate pessimistic estimates. So, ~390 Wh/MTok input, ~1950 Wh/MTok output.
    Expensive commercial energy would be 30¢ per kWh in the US, so the energy cost implied by these figures would be about 12¢/MTok input and 60¢/MTok output. Anthropic's API cost for Opus 4.5 is $5/MTok input and $25/MTok output, roughly 40 times higher than these figures.
    The direct energy cost of inference is still covered even if you assume that Claude Max/etc plans are offering a tenfold subsidy over the API cost.
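The calculation above can be reproduced in a few lines. All inputs are the figures quoted in the comment (the article's pessimistic Wh/MTok estimates, a 30¢/kWh electricity price, and the Opus 4.5 API prices); the function and variable names are only illustrative.

```python
PRICE_PER_KWH = 0.30  # dollars per kWh, the "expensive commercial" rate from the comment

# Pessimistic energy estimates quoted from the article, in Wh per million tokens.
ENERGY_WH_PER_MTOK = {"input": 390.0, "output": 1950.0}

# API prices quoted in the comment, in dollars per million tokens.
API_PRICE_PER_MTOK = {"input": 5.00, "output": 25.00}


def energy_cost_per_mtok(wh_per_mtok: float, price_per_kwh: float = PRICE_PER_KWH) -> float:
    """Convert Wh per million tokens into dollars per million tokens."""
    return wh_per_mtok / 1000.0 * price_per_kwh


for kind, wh in ENERGY_WH_PER_MTOK.items():
    cost = energy_cost_per_mtok(wh)
    ratio = API_PRICE_PER_MTOK[kind] / cost
    print(f"{kind:>6}: energy ${cost:.3f}/MTok, API ${API_PRICE_PER_MTOK[kind]:.2f}/MTok, "
          f"API price is {ratio:.1f}x the energy cost")
```

With these inputs the API price comes out at roughly 43 times the implied energy cost for both input and output tokens, so even a tenfold subscription discount would leave the direct energy cost covered.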
    ggm18 days ago
    Thank you for some good intel. That's very interesting. But, I wonder how this affects supply pricing to other customers. Not that you haven't shown the direct power costs have been borne, but the more indirect ones remain for me.
    Aurornis18 days ago
    > I would be surprised if AI prices reflect their current cost to provide the service, even inference costs.
    This has been covered a lot. You can find quotes from one of the companies saying that they'd be profitable if not for training costs. In other words, inference is a net positive.
    You have to keep in mind that the average customer doesn't use much inference. Most customers on the $20/month plans never come close to using all of their token allowance.
- pixl9718 days ago
  I mean if we really cared about this one bit we'd stop making a car based society in the US and save far more energy and pollution. That's not politically expedient and there are powerful vested interests in ensuring it doesn't happen.
  That's why I think most of this data center energy use, especially over longer terms, is a joke. Data centers can pretty easily run on solar and wind energy if we spend even a small amount of political capital to make it happen.
  - ggm18 days ago
    They're a lot harder to build out from those sources because they are costed to run 24/7 and the intermittency issue comes to the fore. Unlike things like aluminium smelters there isn't always a good load-shed or even supply-timing story in a DC, cooling aside (big chunks of cooling can be used for demand management)
    I am not in the DC business. If somebody who is says "that's bunkum" I'd pay attention to it.
thestructuralme16 days ago
A lot of the scary numbers come from agents being left in “always-on” loops: long context windows, tool calls, retries, and idle GPU time between steps. The right unit isn’t “watts per agent” but something like joules per accepted change (or per useful decision), because an agent that burns 10x energy but replaces 20 minutes of human iteration can still be a net win. What I’d love to see is a breakdown by (1) model/token cost, (2) orchestration overhead (retries, evaluation, tool latency), and (3) utilization (how much time the GPU is actually doing work vs waiting). That’s where the real waste usually hides.
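One way the "joules per accepted change" unit could be computed, as a rough sketch. The `AgentRun` record and the sample numbers are hypothetical; real energy figures would have to come from measurement or from per-token estimates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AgentRun:
    """One agent session: energy it used and how many of its changes were accepted."""
    energy_joules: float
    accepted_changes: int


def joules_per_accepted_change(runs: Iterable[AgentRun]) -> float:
    """Total energy divided by total accepted changes; runs with no accepted
    changes still count toward the energy, so wasted sessions raise the metric."""
    runs = list(runs)
    total_energy = sum(r.energy_joules for r in runs)
    total_accepted = sum(r.accepted_changes for r in runs)
    if total_accepted == 0:
        return float("inf")
    return total_energy / total_accepted


# Hypothetical sessions; 3.6e6 J equals 1 kWh.
runs = [
    AgentRun(energy_joules=3.6e6, accepted_changes=4),
    AgentRun(energy_joules=1.8e6, accepted_changes=1),
    AgentRun(energy_joules=0.9e6, accepted_changes=0),  # retries, nothing merged
]
metric = joules_per_accepted_change(runs)
print(f"{metric:.0f} J per accepted change ({metric / 3.6e6:.2f} kWh)")
```

Dividing by accepted changes rather than by requests means retries and abandoned sessions show up as a higher cost per useful result.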
scottcha18 days ago
That is a pretty good article. The one factor not mentioned that we see having a huge impact on energy is batch size, though that would be hard to estimate with the data he has.
We've only launched to friends and family but I'll share this here since it's relevant: we have a service which actually optimizes and measures the energy of your AI use: https://portal.neuralwatt.com if you want to check it out. We also have a tools repo we put together that shows some demonstrations of surfacing energy metadata into your tools: https://github.com/neuralwatt/neuralwatt-tools/
Our underlying technology is really about OS level energy optimization and datacenter grid flexibility, so if you are on the pay-by-kWh plan you get additional value as we continue to roll new optimizations out.
DM me with your email and I'd be happy to add some additional credits to you.
- ccgibson18 days ago
  To add a bit more to what @scottcha is saying: overall GPU load has a fairly significant impact on the energy per result. Energy per result is inversely related to load: since the idle power draw of these servers is significant, the more requests that energy gets spread across, the more efficient the system becomes. I imagine Anthropic is able to harness that efficiency since I imagine their servers are far from idle :)
  - Majromax18 days ago
    You can infer the discount from the pricing of the batch API, which is presumably arranged for minimum inference costs. Anthropic offers a 50% discount there, which is consistent with other model providers.
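A toy model of the load effect described in this subthread: a server with a fixed idle draw spends fewer joules per result as utilization rises. The linear power curve and all wattage and throughput numbers are assumptions chosen for illustration, not measurements of any real system.

```python
def server_power_watts(utilization: float, idle_watts: float, peak_watts: float) -> float:
    """Assumed linear power model: idle draw plus a load-proportional part."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (peak_watts - idle_watts) * utilization


def joules_per_result(utilization: float, idle_watts: float, peak_watts: float,
                      max_results_per_second: float) -> float:
    """Energy per result = power / throughput, with throughput proportional to load."""
    if utilization == 0.0:
        return float("inf")  # idle power is burned with nothing produced
    throughput = max_results_per_second * utilization
    return server_power_watts(utilization, idle_watts, peak_watts) / throughput


# Hypothetical server: 3 kW idle, 10 kW at full load, 100 results/s at full load.
IDLE_W, PEAK_W, MAX_RPS = 3000.0, 10000.0, 100.0

for u in (0.1, 0.25, 0.5, 1.0):
    print(f"utilization {u:4.0%}: {joules_per_result(u, IDLE_W, PEAK_W, MAX_RPS):6.1f} J/result")
```

In this model the idle draw is a fixed overhead, so energy per result falls steadily as utilization rises toward 100%.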
maxdo18 days ago
A US person does not consume 1600 liters a day
- ipnon18 days ago
  I think this number is derived from the water used to create all the goods they consume. Let's say every American eats 1lb of beef per day, and nothing else, to create a simple model. Then every American has a water footprint of about 1,850 gallons.
  [0] https://watercalculator.org/news/articles/beef-king-big-wate...
- sowbug17 days ago
  The almonds that an average person eats in the US account for about 70 liters of water daily. That's already 5% of 1,600, and more than a typical shower.
- NicuCalcea18 days ago
  It does seem high, most of the estimates I find are around half that.
- simonw18 days ago
  What number would you provide for that?
xk318 days ago
So less energy than a human brain uses...
- weiliddat17 days ago
  How did you arrive at that calculation?
  For an average day (incl. "non-working" hours) the brain uses far less at ~300Wh, and if you include the body the average person needs ~2.3 kWh.
  In the rest of the article, they estimate inference electricity use is only ~8% of overall datacenter use, and if we think of the datacenter as the "body" without which the GPU / brain wouldn't work, that's an overall median use of ~16 kWh for only 24 Claude Code requests.
  I'm more impressed with the human brain's energy efficiency + multimodality + long term context + malleability than anything after using LLMs a bunch, even though I learnt a lot about that in a neuropsych course a long time ago.
mikeaskew418 days ago
I have kids and a dishwasher (which with kids, runs quite often) but I’m not convinced I’m doing worse at energy consumption
renewiltord18 days ago
[flagged]
HNisCIS18 days ago
LLMs don't use much energy at all to run, they use it all at the beginning for training, which is happening constantly right now.
TLDR this is, intentionally or not, an industry puff piece that completely misunderstands the problem.
Also, even if everyone is effectively running a dishwasher cycle every day, this is still a problem that we can't just ignore, that's still a massive increase in ecological impact.
- jeffbee18 days ago
  Training is pretty much irrelevant in the scheme of global energy use. The global airline industry uses the energy needed to train a frontier model, every three minutes, and unlike AI training the energy for air travel is 100% straight-into-your-lungs fossil carbon.
  - pluralmonad18 days ago
    Not to mention doesn't aviation fuel still make heavy (heh) use of lead?
    TSiege18 days ago
    I think that's only true for propeller planes, which use leaded gasoline. Jet fuel is just kerosene.
    tialaramex18 days ago
    Pistons, rather than all propellers. Basically imagine a really old car engine, because simplicity is crucial for reliability and ease of maintenance so all those "fancy" features your car had by the 1990s aren't available, however instead of turning wheels the piston engine turns a propeller. Like really old car engines these piston engines tend to be designed for leaded fuel. Because this is relatively cheap to do, all the cheapest planes aimed at GA (General Aviation, ie you just like flying a plane, not for pay) are like this.
    Propellers are a very common means to make aeroplanes work though, instead of a piston engine, which is cheap to make but relatively unreliable and expensive to run, you can use turbine engines, which run on JetA aka kerosene, and the rotary motion of the turbine drives the propeller making a turboprop. In the US you won't see that many turboprop engines for passenger service, but in the rest of the world that's a very common choice for medium distance aeroplane routes, while the turbofan planes common everywhere in the US would in most places be focused on longer distances between bigger airfields because they deliver peak efficiency when they spend longer up in the sky.
    JetA, whether for a turbofan or turboprop does not have lead in it, so to a first approximation no actual $$$ commercial flights spew lead. They're bad for the climate, but they don't spew lead into the atmosphere.
- simonw18 days ago
  The training cost for a model is constant. The more individual use that model gets the lower the training-cost-per-inference-query gets, since that one-time training cost is shared across every inference prompt.
  It is true that there are always more training runs going, and I don't think we'll ever find out how much energy was spent on experimental or failed training runs.
  - dietr1ch18 days ago
    > The training cost for a model is constant
    Constant until the next release? The battle for the benchmark-winning model is driving cadence up, and this competition probably puts a higher cost on training and evaluation too.
    simonw18 days ago
    Sure. By "constant" there I meant it doesn't change depending on the number of people who use the model.
    dietr1ch18 days ago
    I got that part, it's just that it overlooks the power consumption of the AI race.
    I wish we'd reach the everything-is-equally-bad phase so we can start enjoying the more constant cost of the entire craze to build your own model with more data than the rest.
- kingstnap18 days ago
  You underestimate the amount of inference and very much overestimate what training is.
  Training is more or less the same as doing inference on an input token twice (forward and backward pass). But because it's offline and predictable it can be done fully batched with very high utilization (efficiently).
  Training is, as a guesstimate, maybe 100 trillion total tokens, but these guys apparently do inference at the quadrillion-tokens-per-month scale.
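A back-of-the-envelope version of that comparison, using only the comment's own guesses: 100 trillion training tokens, a training token costing about twice an inference token, and one quadrillion inference tokens per month. None of these are confirmed figures, and the sketch ignores differences in batching efficiency.

```python
TRAINING_TOKENS = 100e12          # comment's guesstimate: 100 trillion tokens
TRAINING_COST_FACTOR = 2.0        # comment's rule of thumb: forward + backward pass
MONTHLY_INFERENCE_TOKENS = 1e15   # comment's figure: one quadrillion tokens per month

# Express the training run in "inference-token equivalents".
training_equivalent_tokens = TRAINING_TOKENS * TRAINING_COST_FACTOR

# How many months of inference does one training run correspond to?
months_of_inference = training_equivalent_tokens / MONTHLY_INFERENCE_TOKENS
days_of_inference = months_of_inference * 30

print(f"Training is about {months_of_inference:.1f} months "
      f"(roughly {days_of_inference:.0f} days) of inference at this scale")
```

On these assumptions a full training run corresponds to about a fifth of a month of inference compute, which is the point the comment is making.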
- linolevan18 days ago
  I'm not convinced that LLM training is at such a high energy use that it really matters in the big picture. You can train a (terrible) LLM on a laptop[1], and frankly that's less energy efficient than just training it on a rented cloud GPU.
  Most of the innovation happening today is in post-training rather than pre-training, which is good for people concerned with energy use because post-training is relatively cheap (I was able to post-train a ~2b model in less than 6 hours on a rented cluster[2]).
  [1]: https://github.com/lino-levan/wubus-1 [2]: https://huggingface.co/lino-levan/qwen3-1.7b-smoltalk