the answer is staring you right in the face:
> I fed my setup, budget, and constraints as context into Gemini CLI
> The commit message claimed “60% cost savings.”
don't outsource your critical thinking to a chatbot.
or, if you feel you simply must have the chatbot do this work for you, supervise it more closely. instead you ignored it for 6 weeks:
> From Nov 2 to Dec 14, Cloud Run accrued ~$4,676.
I’ve seen runaway cloud costs at my employer a few times, across a few different services, and it took a fair amount of time to figure out the monitoring/alerting. Providers may also change their service agreements or pricing structure in hard-to-decipher ToS, etc. If the cloud provider won’t refund or credit a company that has a representative, they aren’t going to pay any attention to a solo dev or small team.
I would build locally on your laptop and start with a $5/month VM until you get a paying customer and know what size your system needs.
Also, don't outsource your thinking to LLMs. They're a useful tool, but once you hand over the thinking, it's brainrot for programmers.
Very naive answer: before doing anything I would start by playing around in the GCP pricing calculator [1] to figure out how much it was going to cost during development and in production. Did you use the calculator tool? If so, were its estimates accurate?
I wouldn’t have expected that to go well.
First of all, thank you so much for obviously writing part of this via a Large Language Model. Second of all, what kind of argument is "The commit message claimed '60% cost savings'" - do you have any idea what you were actually doing? And lastly, addressing your question:
> Do you set hard budget caps and accept downtime?
If you have no clue what you're doing, yes! Especially for early prototyping, why not? IaaS offerings will also just create downtime for you if you need more resources than you've provisioned. It's normal. Either you set up a system where you can rely on dynamic scaling, or you don't and you set hard limits.
You asked your cloud provider to provision resources, and you were billed for them. If you can't handle working with a cloud provider, you might want to look into less scalable but, in turn, more cost-stable infrastructure solutions.
A little more context: I’ve been on GCP for 4 years, App Engine for the majority of it. Expensive but stable. I’ve used Gemini in the past to reduce costs successfully, so this wasn’t my first attempt at optimizing.
I take ownership of the outcome, but the config behavior still doesn’t match my mental model and Google support hasn’t been able to clarify how to properly scope this either, which is why I turned here.
Could you -learn- how to self-host a version of your app, and expand your mental model in doing so? You outsourced the thinking part to an LLM - a bag of words - and are surprised the outcome didn't just work?
> Google support hasn’t been able to clarify how to properly scope this either
More outsourcing of thinking, no? Is Plan A really asking the vendor selling you compute how to use less compute and make them less money, instead of figuring out how to use just enough of it yourself?
If you're taking ownership, who could have affected the outcome the most here? Maybe the person who keeps outsourcing thinking to LLMs, support requests, and forums? I'd argue ownership would look more like figuring out how to handle Top Cost #1 yourself and reduce burn rate, starting by doing less outsourcing.
But if I were doing a side project or starting a business, I would personally use a simple VPS. In my case I would use AWS Lightsail - it’s a simple fixed-price service with no surprises.
I’m not saying you should use AWS. But you definitely shouldn’t be taking advice from a chatbot when you don’t know what you’re doing.
Later, once you have more traffic and/or paying customers, it would be worth looking into cloud hosting. And even then, you probably don't need as much horsepower as they're trying to sell.
How is Cloud Run cost not predictable?
It is fairly simple arithmetic in a spreadsheet to estimate the upper bound: the max number of instances times the per-instance resources times the unit prices. Exactly the same as you would do with the App Engine standard and flexible environments.
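For anyone who wants that spreadsheet arithmetic spelled out, here is a rough Python sketch. It assumes instances are billed per vCPU-second and per GiB-second and that every instance stays up for the whole month (the worst case). The function name and the prices in the usage example are placeholders I made up for illustration, not quoted from any price list, so plug in the numbers from the current pricing page. It also ignores request fees, egress, and free tiers.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def upper_bound_monthly_cost(
    max_instances: int,
    vcpu_per_instance: float,
    gib_per_instance: float,
    price_per_vcpu_second: float,
    price_per_gib_second: float,
    seconds: int = SECONDS_PER_MONTH,
) -> float:
    """Worst-case compute cost if every instance runs for the whole period."""
    # Cost of keeping one instance alive for one second.
    per_instance_per_second = (
        vcpu_per_instance * price_per_vcpu_second
        + gib_per_instance * price_per_gib_second
    )
    # Scale by the instance cap and the length of the billing period.
    return max_instances * per_instance_per_second * seconds


if __name__ == "__main__":
    # Placeholder unit prices: replace with the real ones for your region.
    worst_case = upper_bound_monthly_cost(
        max_instances=10,
        vcpu_per_instance=1,
        gib_per_instance=0.5,
        price_per_vcpu_second=0.000024,
        price_per_gib_second=0.0000025,
    )
    print(f"Worst-case monthly compute cost: ${worst_case:,.2f}")
```

If the worst-case number is more than you can stomach, lower the instance cap or the per-instance resources before deploying, not after the bill arrives.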
> budget caps
They don’t exist in GCP or on most cloud providers. You can fix the usage or hard-cap the autoscaling, but you can’t cap the spend incurred by that usage.
> What guardrails work that don’t depend on constant manual billing checks?
Start conservatively with max instances and instance resource, and iterate based on the actual performance and needs. Say, you know, put the number 1 in everything.
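Since you can cap usage but not spend, one way to pick a conservative starting cap is to work backwards from the budget. A minimal sketch, assuming you already have a worst-case monthly cost for a single always-on instance (however you estimated it); the function name and the figures in the usage example are hypothetical:

```python
import math


def max_instances_within_budget(
    monthly_budget: float,
    worst_case_cost_per_instance: float,
) -> int:
    """Largest instance cap whose worst-case monthly cost fits the budget."""
    if worst_case_cost_per_instance <= 0:
        raise ValueError("per-instance cost must be positive")
    # Round down: a partial instance does not exist, and rounding up
    # would let the worst case exceed the budget.
    return max(0, math.floor(monthly_budget / worst_case_cost_per_instance))


if __name__ == "__main__":
    # Hypothetical figures: a $200/month budget and $65/month per instance.
    cap = max_instances_within_budget(200.0, 65.0)
    print(f"Set max instances to at most {cap}")
```

A result of 0 means even a single always-on instance does not fit the budget, so the per-instance resources need to shrink first.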
Do your capacity planning and cost estimation, and understand them. “Solo dev” or not, you need these things to run the business. The root cause was that you outsourced your business and budgeting decisions to an LLM without verifying or understanding them.