I see RAM prices dropping to new lows in 3-5 years.
I assume they could scale faster with more machines using the older, better-understood lithography technology.
1. People who are currently buying AI services realizing it's not all that useful to them and discontinuing their subscriptions. Note that this can come from a changing ecosystem as much as anything to do with the products themselves. I know a couple people running AI propaganda operations where a single person can now do what previously took a major media conglomerate; this is great for them, but if I personally know a couple folks doing this, it indicates that there are probably hundreds of thousands worldwide, and people are simply going to stop trusting anything they read on the Internet.
2. Rising interest rates from the Iran war. Suddenly the cash flows needed to finance all this datacenter and AI model expansion are much higher, and combined with #1, may not be viable.
2. Historical precedent holds that governments are more likely to suppress rates to spur the economy during wartime.
Sudden end of overinvestment in hardware procurement by the big players. It's unclear whether Google, for example, will sustain $50B/year in investments.
And Linux runs better than ever on them; I'm running Debian 13 with almost no driver issues.
For $2k you can get 32 GB DDR5 RAM and 16 GB fast VRAM. Bump the RAM to 64 GB and you're still below $3k.
I've asked myself that question while looking at some of the models on this site: https://laptopparts4less.frl/index.php?route=common/home
This might not be enough to chew through a large code base, but for smaller projects it can easily fit much, if not all, of the code, which is enough to drive a good coding agent.
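To make "fits the code base" concrete, here's a rough back-of-envelope sketch. The ~4-characters-per-token ratio is a common heuristic for English-heavy source text, not an exact tokenizer count, and the 128k context size is just an example:

```python
def estimate_tokens(total_chars: int, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate for a body of source text.
    The 4 chars/token ratio is a heuristic, not a real tokenizer."""
    return int(total_chars / chars_per_token)

# e.g. a 2 MB project of source files:
project_chars = 2 * 1024 * 1024
tokens = estimate_tokens(project_chars)
print(tokens)                 # ~520k tokens
print(tokens <= 128_000)      # False -> won't fit a 128k context whole;
                              # a smaller project easily would
```

Anything over the window needs chunking or retrieval, which is why smaller projects are the sweet spot here.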
I don't recommend specific models or model providers given how much hype and BS there is around benchmarks etc. The easiest approach is to check the latest generation of open models and look for a dense one where a decent quant fits within the VRAM.
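A quick way to sanity-check whether "a decent quant fits" is simple arithmetic: parameters times bits-per-weight, plus some slack for embeddings and metadata. The 10% overhead factor and the ~4.5 effective bits for a Q4-style quant are my own rough assumptions, not exact figures:

```python
def quant_size_gb(params_b: float, bits_per_weight: float,
                  overhead: float = 1.1) -> float:
    """Rough quantized-model file size in GB.
    params_b: parameter count in billions.
    overhead: ~10% slack for embeddings/metadata (a guess)."""
    return params_b * bits_per_weight / 8 * overhead

# A dense ~14B model at a Q4-style quant (~4.5 effective bits/weight):
print(round(quant_size_gb(14, 4.5), 1))  # ~8.7 GB -> fits in 16 GB VRAM
                                         # with headroom for the KV cache
```

Remember the KV cache also lives in VRAM and grows with context length, so leave a few GB of headroom rather than filling the card with weights.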
Some models run fast enough that a portion of the weights can spill over from VRAM to system RAM while still maintaining usable prompt-processing and token-generation speed.
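In practice that spillover is usually controlled per-layer, e.g. llama.cpp's `--n-gpu-layers` flag. A minimal sketch of the split, assuming roughly uniform layer sizes and a made-up 2 GB reserve for the KV cache and compute buffers:

```python
def gpu_layers(model_gb: float, n_layers: int, vram_gb: float,
               reserve_gb: float = 2.0) -> int:
    """How many transformer layers fit on the GPU; the rest spill to
    system RAM. Assumes uniform layer sizes; reserve_gb is headroom
    for KV cache and compute buffers (a guess)."""
    per_layer = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable / per_layer))

# e.g. an 18 GB quant with 48 layers on a 16 GB card:
print(gpu_layers(18.0, 48, 16.0))  # 37 -> e.g. llama.cpp -ngl 37,
                                   # remaining 11 layers in system RAM
```

Since generation is memory-bandwidth-bound, each layer left in system RAM costs speed, so this trick works best when only a small fraction spills over.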