Taking the time to point a coding agent towards the public (or even private) API of a B2B SaaS app to generate a working (partial) clone is effectively "unblocking" the agent. I wouldn't be surprised if a "DTU-hub" eventually gains traction for publishing and sharing these digital twins.
I would love to hear more about your learnings from building these digital twins. How do you handle API drift? Also, how do you handle statefulness within the twins? Do you test for divergence? For example, do you compare responses from the live third-party service against the Digital Twin to check for parity?
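To make the parity question concrete, here is a minimal sketch of what a divergence check could look like. It assumes both the live service and the twin return JSON for the same request. `diff_responses` and the sample payloads are hypothetical, not anything the team has described; volatile fields such as timestamps are skipped via an ignore list.

```python
from __future__ import annotations

from typing import Any


def diff_responses(live: Any, twin: Any,
                   ignore: frozenset[str] = frozenset(),
                   path: str = "") -> list[str]:
    """Return the JSON paths where the twin's response differs from the live one."""
    if isinstance(live, dict) and isinstance(twin, dict):
        diffs: list[str] = []
        for key in sorted(set(live) | set(twin)):
            if key in ignore:
                continue  # volatile field (timestamp, request id, ...)
            child = f"{path}.{key}" if path else key
            if key not in live or key not in twin:
                diffs.append(child)  # field present on one side only
            else:
                diffs.extend(diff_responses(live[key], twin[key], ignore, child))
        return diffs
    if isinstance(live, list) and isinstance(twin, list):
        if len(live) != len(twin):
            return [path]  # report the list itself when lengths differ
        diffs = []
        for i, (a, b) in enumerate(zip(live, twin)):
            diffs.extend(diff_responses(a, b, ignore, f"{path}[{i}]"))
        return diffs
    return [] if live == twin else [path]


# Hypothetical payloads standing in for recorded live and twin responses.
live = {"id": "u_1", "name": "Ada",
        "created_at": "2026-02-07T10:00:00Z", "roles": ["admin"]}
twin = {"id": "u_1", "name": "Ada",
        "created_at": "2026-02-07T10:00:05Z", "roles": ["admin", "owner"]}

print(diff_responses(live, twin, ignore=frozenset({"created_at"})))  # ['roles']
```

Run on a schedule against recorded requests, a non-empty result would be one signal that the API has drifted or that the twin's state handling has diverged.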
I wrote a bunch more about that this morning: https://simonwillison.net/2026/Feb/7/software-factory/
This one is worth paying attention to. They're the most ambitious team I've seen exploring the limits of what you can do with this stuff. It's eye-opening.
> If you haven’t spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement
Seems to me like if this is true I'm screwed no matter if I want to "embrace" the "AI revolution" or not. No way my manager's going to approve me to blow $1000 a day on tokens, they budgeted $40,000 for our team to explore AI for the entire year.
Let alone from a personal perspective I'm screwed because I don't have $1000 a month in the budget to blow on tokens because of pesky things that also demand financial resources like a mortgage and food.
At this point it seems like damned if I do, damned if I don't. Feels bad man.
I didn't read that as you need to be spending $1k/day per engineer. That is an insane number.
EDIT: re-reading... it's ambiguous to me. But perhaps they mean per day, every day. This will only hasten the elimination of human developers, which I presume is the point.
As for me, we get Cursor seats at work, and at home I have a GPU, a cheap Chinese coding plan, and a dream.
I don't think you need to spend anything like that amount of money to get the majority of the value they're describing here.
I built a tool that writes (non shit) reports from unstructured data to be used internally by analysts at a trading firm.
It cost between $500 and $5000 per day per seat to run.
It could have cost a lot more but latency matters in market reports in a way it doesn't for software. I imagine they are burning $1000 per day per seat because they can't afford more.
At home on my personal setup, I haven't even had to move past the cheapest codex/claude code subscription because it fulfills my needs ¯\_(ツ)_/¯. You can also get a lot of mileage out of the higher tiers of these subscriptions before you need to start paying the APIs directly.
Takes like this are just baffling to me.
For one engineer that is ~260k a year.
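For anyone checking the math: the ~260k figure works out if you assume roughly 260 working days a year (52 weeks x 5 days, which is my assumption, not something stated in the thread). A quick sketch, with the $40,000 team budget mentioned upthread for comparison:

```python
DAILY_TOKEN_SPEND = 1_000     # USD per engineer per day, from the quoted claim
WORKING_DAYS_PER_YEAR = 260   # assumption: 52 weeks x 5 days, no holidays
TEAM_AI_BUDGET = 40_000       # USD per year, the team budget mentioned upthread

annual_per_engineer = DAILY_TOKEN_SPEND * WORKING_DAYS_PER_YEAR
engineer_days_covered = TEAM_AI_BUDGET / DAILY_TOKEN_SPEND

print(annual_per_engineer)    # 260000
print(engineer_days_covered)  # 40.0
```

So at that rate, the $40,000 budget would cover about 40 engineer-days in total.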
If everyone can do this, there won't be any advantage (or profit) to be had from it very soon. Why not buy your own hardware and run local models, I wonder.
No local model out there is as good as the SOTA right now.
THIS FRIGHTENS ME. Many of us sweng are either going to be FIRE millionaires, or living under a bridge, in two years.
I’ve spent this week performing SemPort; found a ts app that does a needed thing, and was able to use a long chain of prompts to get it completely reimplemented in our stack, using Gene Transfer to ensure it uses some existing libraries and concrete techniques present in our existing apps.
Now not only do I have an idiomatic Python port, which I can drop right into our stack, but I have an extremely detailed features/requirements statement for the origin typescript app along with the prompts for generating it. I can use this to continuously track this other product as it improves. I also have the “instructions infrastructure” to direct an agent to align new code to our stack. Two reusable skills, a new product, and it took a week.
I’m happy to answer any questions!
> Those of us building software factories must practice a deliberate naivete
This is a great way to put it, I've been saying "I wonder which sacred cows are going to need slaughtered" but for those that didn't grow up on a farm, maybe that metaphor isn't the best. I might steal yours.
This stuff is very interesting and I'm really interested to see how it goes for you, I'll eagerly read whatever you end up putting out about this. Good luck!
EDIT: oh also the re-implemented SaaS apps really recontextualize some other stuff I’ve been doing too…
Or a vegan or Hindu. Which ethics are you willing to throw away to run the software factory?
I eat hamburgers while aware of the moral issues.
Not just code review agents, but things like "find duplicated code and refactor it"?
* DRYing/Refactoring if needed
* Documentation compaction
* Security reviews