I use AI coding tools daily. They're genuinely useful for real work. But stunts like this make it harder to have honest conversations about what AI can and can't do. When executives see "AI built a browser in 3 million lines," they form expectations that set everyone up for disappointment.
The gap between AI demos and AI in production is wider than most people realize. We'd all be better off if people stopped optimizing for impressiveness and started optimizing for honesty.
> "So I agree this isn't just wiring up of dependencies, and neither is it copied from existing implementations: it's a uniquely bad design that could never support anything resembling a real-world web engine."
It hurts that it wasn't framed as an "experiment" or "look, we wanted to see how far AI can go - it kind of failed the bar." As it stands, it's grist to the mill of all the CEOs out there who have no clue about coding but wonder why their people are so expensive when: "AI can do it! D'oh!"
I feel like a lot of the AI articles and experiments like this one are producing "app-shaped objects" that look okay for making content (and indeed are fine for making earrings) but fall apart when pounded on by the real world.
They were making claims without the level of rigor to back them up. There was an opportunity to learn some difficult lessons, but—and I don’t think this was your intention—it came across to me as kind of access journalism; not wanting to step on toes while they get their marketing in.
The claims they made really weren't that extreme. In the blog post they said:
> To test this system, we pointed it at an ambitious goal: building a web browser from scratch. The agents ran for close to a week, writing over 1 million lines of code across 1,000 files. You can explore the source code on GitHub.
> Despite the codebase size, new agents can still understand it and make meaningful progress. Hundreds of workers run concurrently, pushing to the same branch with minimal conflicts.
That's all true.
On Twitter their CEO said:
> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.
> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.
> It kind of works! It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.
That's mostly accurate too, especially the "it kind of works" bit. You can take exception to the "from-scratch" claim if you like. It's a tweet, the lack of nuance isn't particularly surprising.
In the overall genre of CEOs over-hyping their companies' achievements, this is a pretty weak example.
I think the people making out that Cursor massively and dishonestly over-hyped this are arguing with a straw man version of what the company representatives actually said.
> In the overall genre of CEOs over-hyping their companies' achievements, this is a pretty weak example
I kind of agree, but kind of not. The tweet isn't too bad when read from an experienced engineer perspective, but if we're being real then the target audience was probably meant to be technically clueless investors who don't and can't understand the nuance.
It's far more dishonest to search for contrived interpretations of their statements in an attempt to frame them as "mostly accurate" when their statements are clearly misleading (and in my opinion, intentionally so).
You're giving them infinite benefit of the doubt where they deserve none, as this industry is well known for intentionally misleading statements, you're brushing off serious factual misrepresentations as simple "lack of nuance" and finally trying to discredit people who have an issue with all of this.
With all due respect, that's not the behavior of a neutral reporter but someone who's heavily invested in maintaining a certain narrative.
> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.
That tweet was seen by over 6 million people.
The follow-up tweet, which includes the link to the actual details, was seen by fewer than 200,000.
That's just how Twitter engagement works, and these companies know it. Over 6 million people were fed bullshit. I'm sorry, but it's actually a great example of CEOs over-hyping their products.
You only quoted the first line. The full tweet includes the crucial "it kind of works" line - that's not in the follow-up tweet, it's in the original.
Here's that first tweet in full:
> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.
> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.
> It kind of works! It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.
The second tweet, with only 225,000 views, was just the following text and a link to the GitHub repository:
> Excited to continue stress testing the boundaries of coding agents and report back on what we learn.
> Code here: https://github.com/wilsonzlin/fastrender
It's like claiming "my dog filed my taxes for me!" when in reality everything was filled out in TurboTax and your dog clicked the final submit button. Technically true, but clearly disingenuous.
I'm not saying an LLM using existing libraries is a bad thing--in fact I'd consider an LLM which didn't pull in a bunch of existing libraries for the prompt "build a web browser" to be behaving incorrectly--but the CEO is misrepresenting what happened here.
> "So I agree this isn't just wiring up of dependencies, and neither is it copied from existing implementations: it's a uniquely bad design that could never support anything resembling a real-world web engine."
It didn't use Servo, and it wasn't just calling dependencies. It was terribly slow and stupid, but your comment is more of a mischaracterization than anything the Cursor people have said.
[0] https://github.com/search?q=repo%3Awilsonzlin%2Ffastrender%2...
[1] https://github.com/search?q=repo%3Awilsonzlin%2Ffastrender+h...
https://github.com/DioxusLabs/taffy
Used here (I think): https://github.com/servo/servo/tree/c639bb1a7b3aa0fd5e02b40d...
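For anyone wondering what "uses Taffy" means in practice: it's a layout library you call per node, not a rendering engine you wrap. A minimal, hypothetical sketch of that kind of call (taffy 0.5-style API; not code from fastrender):

```rust
use taffy::prelude::*;

fn main() -> Result<(), taffy::TaffyError> {
    let mut tree: TaffyTree<()> = TaffyTree::new();

    // A fixed-size leaf node, roughly what a CSS engine would create per box.
    let child = tree.new_leaf(Style {
        size: Size { width: length(100.0), height: length(50.0) },
        ..Default::default()
    })?;

    // A flex container holding that box.
    let root = tree.new_with_children(
        Style { display: Display::Flex, ..Default::default() },
        &[child],
    )?;

    // Solve the layout, then read back the computed geometry.
    tree.compute_layout(root, Size::MAX_CONTENT)?;
    println!("child box: {:?}", tree.layout(child)?);
    Ok(())
}
```

The caller still owns the box tree, the cascade, and painting; Taffy only solves the flexbox/grid math.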
But it was accompanied by a link to the GitHub repo, so you can hardly claim that they were deliberately hiding the truth.
If anything, that proves the point that they weren't rigorous! They claimed a thing. The thing didn't accomplish what they said. I'm not saying that they hid it, but that they misrepresented the thing that they built. My point is that the interview didn't directly and firmly pressure them on this.
Generating a million lines of code in parallel isn't impressive. Burning a mountain of resources in parallel isn't noteworthy (see: the weekly post of someone with an out-of-control EC2 instance racking up $100k of charges).
It would have been remarkable if they'd built a browser from scratch, which they said they did, except they didn't. It was a 50 million token hackathon project that didn't work, dressed up as a groundbreaking example of their product.
As feedback, I hope in the future you'll push back firmly on these types of claims when given the opportunity, even if it makes the interviewee uncomfy. Incredible claims require incredible evidence. They didn't have it.
I don't think directly accusing them of being misleading about what they had done would have supported that goal, so I didn't do it.
Instead I made sure to dig into things like what QuickJS was doing in there and why it used Taffy as part of the conversation.
I believe in the UK the term for this is actually fraudulent misrepresentation:
https://en.wikipedia.org/wiki/Misrepresentation#English_law
And in this context it seems to go against The Consumer Protection from Unfair Trading Regulations 2008 and the Digital Markets, Competition and Consumers Act 2024.
Well, yes and no; we live in an era where people consume headlines, not articles, and certainly not links to Github repositories in articles. If VCs and other CEOs read the headline "Cursor Agents Autonomously Create Web Browser From Scratch" on LinkedIn, the project has served its purpose and it really doesn't matter if the code compiles or not.
You have a reputation. You don't need to carry water for people who are misleading others to raise VC money. What's the point of your language-lawyering about the precise meaning of what he said?
“No no, you don’t get it guys. I’m technically right if you look at the precise wording” is the kind of silly thing I do all the time. It’s not that important to be technically right. Let this one go.
The reason I won't let this one go is that I genuinely believe people are being unfair to the engineer who built this, because some people will jump on ANY opportunity to "debunk" stories about AI.
I won't stand for misleading rhetoric like "it's just a Servo wrapper" when that isn't true.
This level of outrage seems absent when it's misleading in the pro-"AI" direction.
https://github.com/wilsonzlin/fastrender/issues/98
A project that didn't compile at all counts as "kind of" working now?
> I won't stand for misleading rhetoric like "it's just a Servo wrapper" when that isn't true.
True, though at least if it were a wrapper it would actually kind of work, unlike this, which is the most obvious case of hyping up lies for investors I've witnessed in the last... well, week or so, considering how much bullshit spews out of the mouths of AI bros.
"This project is junk that doesn't even compile", for example.
I only know this because on occasion I'll notice there was a comment from them (I only check the name of the user if it's a hot take) and I ctrl-F their username to see 20-70 matches on the same thread. Exactly 0 of those comments present the idea that LLMs are seriously flawed in programming environments regardless of who's in the driver seat. It always goes back to operator error and "just you watch, in the next 3 months or years...".
I dunno, I manage LLM implementation consulting teams and I will tell you to your face that LLMs are unequivocally shit for the majority of use cases. It's not hard to directly criticize the tech without hiding behind deflections or euphemisms.
I was in a meeting recently where a director lauded Claude for writing "tens of thousands of lines of code in a day", as if that metric in and of itself was worth something. And don't even get me started on "What percentage of your code is written by AI?"
"I don't know, what percentage of your sweater is polyester?"
"I don't know, I think it's all cotton, why do you ask me such a random question?"
"Well surely you know that polyester can be made far cheaper in a plastics factory than cotton? Why do you use cotton?"
Being in a similar position to him now though... if it can be deleted it gets deleted.
If you write code in any capacity, you'll know that high LOC counts are usually a sign of a bad time, browsers and operating systems aside.
The rest is stuff like HarfBuzz for text shaping, which is an entirely cromulent dependency for a project like this.
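Shaping, for the curious, is the step that maps a string to positioned glyphs. A minimal sketch of what calling HarfBuzz looks like, assuming the harfbuzz_rs bindings and a stand-in font path - illustrative only, not the project's code:

```rust
use harfbuzz_rs::{shape, Face, Font, UnicodeBuffer};

fn main() -> std::io::Result<()> {
    // "DejaVuSans.ttf" is a placeholder path to any font file on disk.
    let face = Face::from_file("DejaVuSans.ttf", 0)?;
    let font = Font::new(face);

    // Shape a run of text into glyph IDs plus advances and offsets.
    let buffer = UnicodeBuffer::new().add_str("Hello, world");
    let output = shape(&font, buffer, &[]);

    for (info, pos) in output
        .get_glyph_infos()
        .iter()
        .zip(output.get_glyph_positions())
    {
        println!("glyph {} advance {}", info.codepoint, pos.x_advance);
    }
    Ok(())
}
```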
And the latter is what's driving the push for KPIs the most - "active" ETFs were already bad enough, because their managers would ask the companies they invested in to provide easy-to-grok KPIs (so that they could keep more of the yearly fee instead of having to pay analysts to dig down into a company's finances), and passive ETFs make that even worse because there is now barely any margin left to pay for more than a cursory review.
America's desire for stock-based pensions is frying the world's economy with its second and third order effects. Unfortunately, that rotten system will most probably only collapse when I'm already dead, so there is zero chance for most people alive today to ever see a world free of this BS.
The reality was that the AI made an uncompilable mess, adding 100+ dependencies, including importing an entire renderer from another browser (Servo), and it took a human software engineer to clean it all up.
Don't publish things like that. At the very least link to a transcript, but this is a very non-credible way of reporting those numbers.
I'd still be surprised if that added up to "trillions" of tokens. A trillion is a very big number.
Fully agree that the original authors made some unsubstantiated and unqualified claims about what was done - which is sad, because it was still a huge accomplishment as I see it.
> ...while far off from feature parity with the most popular production browsers today...
What a way to phrase it!
You know, I found a bicycle in the trash. It doesn't work great yet, but I can walk it down a hill. While far off from the level of the most popular supercars today, I think we have made impressive progress going down the hill.
We talked about dependencies, among a whole bunch of other things.
You can watch the full video on YouTube or read my extracted highlights here: https://simonwillison.net/2026/Jan/23/fastrender/
EDIT: I retract my claim. I didn't realize this had servo as a dependency.
They marketed as if we were really close to having agents that could build a browser on their own. They rightly deserve the blowback.
This is an issue that is very important because of how much money is being thrown at it, and that affects everyone, not just the "stakeholders". At some point, if it does become true that you can ask an agent to build a browser and it actually does, that is very significant.
At this point in time I personally can't predict whether that will happen or not, but the consequences of it happening seem pretty drastic.
Yes, every AI skeptic publicly doubted that right up until they started doing it.
And I'm an optimist, not one of the AI skeptics heavily present on HN.
From the post it sounds like the author would also doubt this when he talks about "glorified autocomplete and refactoring assistants".
You have the agents compile the code every single step of the way, which is what this project did.
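As a rough sketch of that kind of gate (my illustration, not the project's actual harness), the loop simply refuses to accept any step that leaves the tree in a non-compiling state:

```rust
use std::process::Command;

/// Returns true if the workspace currently compiles.
/// Sketch only: a real harness would also capture stderr and feed
/// the compiler errors back into the agent's context.
fn build_is_green() -> bool {
    Command::new("cargo")
        .args(["check", "--all-targets"])
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

fn main() {
    // Hypothetical gate applied after each agent edit.
    if build_is_green() {
        println!("edit accepted");
    } else {
        println!("edit rejected: send the build errors back to the agent");
    }
}
```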
Still, getting "something" to compile after a week of work is very different from getting the thing you wanted.
What is being sold, and invested in, is the promise that LLMs can accomplish "large things" unaided.
But as of yet they cannot, unless something is happening in one of the SOTA labs that we don't know about.
They can, however, accomplish small things unaided, though there is an upper bound, at least functionally.
I just wish everyone was on the same page about their abilities and their limitations.
To me they understand context well (e.g. the task "build a browser" doesn't need some huge specification, because specifications already exist).
They can write code competently (this is my experience anyway)
They can accomplish small tasks (my experience again, "small" is a really loose definition I know)
They cannot understand context that doesn't exist (they can't magically know what you mean, but they can bring to bear considerable knowledge of pre-existing work and conventions that helps them make good assumptions and the agentic loop prompts them to ask for clarification when needed)
They cannot accomplish large tasks (again my experience)
It seems to me there is something akin to the context window into which a task can fit. They have this compact feature, which I suspect is where this limitation lies. That is, a person can't hold an entire browser codebase in their head, but they can create a general top-level mapping of the whole thing, so they know where to reach, where areas of improvement are necessary, how things fit together, and what has and hasn't been implemented. I suspect this compaction doesn't work super well for agents because it is a best-effort, tacked-on feature.
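To make that concrete, here's a toy sketch of what compaction amounts to - my own illustration, not how any particular agent implements it; in a real harness the summary would be another model call rather than a placeholder string:

```rust
// When the transcript outgrows a budget, replace the oldest messages
// with a (lossy) summary and keep only the recent tail verbatim.
fn compact(history: &mut Vec<String>, byte_budget: usize, keep_recent: usize) {
    let total: usize = history.iter().map(|m| m.len()).sum();
    if total <= byte_budget || history.len() <= keep_recent {
        return;
    }
    let tail = history.split_off(history.len() - keep_recent);
    // Stand-in for a summarization call; detail is irretrievably lost here.
    let summary = format!("[summary of {} earlier messages]", history.len());
    *history = std::iter::once(summary).chain(tail).collect();
}

fn main() {
    let mut history: Vec<String> = (0..10).map(|i| format!("message {i}")).collect();
    compact(&mut history, 40, 3);
    println!("{history:?}");
}
```

That single summary line is carrying the whole "top-level mapping", which is why the quality of the compaction matters so much.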
I say all this speculatively, and I am genuinely interested in whether this next level of capability is possible. To me it could go either way.
It didn't have correctly configured GitHub Actions so the CI build was broken.
Even though I have no burden of proof here, since you have provided no evidence for your claims, I will point out that another commenter [1] indicates there were build errors, and the developer agrees there were build errors [2] that they resolved.
I take back the implication I inadvertently made here that it compiled cleanly the whole time - I know that's not the case, we discussed that in our interview: https://simonwillison.net/2026/Jan/23/fastrender/#intermitte...
I'm frustrated at how many people are carrying around a mental model that the project "didn't even compile", implying the code had never successfully compiled, which clearly isn't true.
I am frustrated at people loudly and proudly "releasing" a system they claim works when it does not. They could have pointed at a specific version that worked, but chose not to, indicating they are either intentionally deceptive or clueless. Arguing they had no opportunity for nuance and thus had no choice but to make false statements for their own benefit is ethical bankruptcy. If they had no opportunity for nuance, then they could have made a statement that erred against their benefit; that is ethical behavior.
I do not think Cursor's statements about this project were remotely misleading enough to justify this backlash.
Which of those things would you classify as "false statements"? The use of "from scratch"?
Absolutely.
And clueless managers seeing these headlines will almost certainly lead to people losing their jobs.
Take a look in the Cargo.toml: https://github.com/wilsonzlin/fastrender/blob/19bf1036105d4e...
Maybe there is a main Servo crate out there as well that fastrender doesn't depend on, but at least in my mind fastrender depends on some Servo browser functionality.
EDIT: fastrender also includes the Servo HTML parser, html5ever (https://github.com/servo/html5ever).
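For reference, here's what depending on html5ever looks like at the call site - it's a parser crate you invoke, with the DOM handled by the companion markup5ever_rcdom crate (illustrative sketch, not fastrender's code):

```rust
use html5ever::parse_document;
use html5ever::tendril::TendrilSink;
use markup5ever_rcdom::RcDom;

fn main() {
    // Parse markup into the reference-counted DOM type that ships
    // alongside the parser.
    let dom = parse_document(RcDom::default(), Default::default())
        .one("<html><body><p>Hello</p></body></html>");

    // dom.document is the root handle; a real engine would now walk the
    // tree, run the CSS cascade, and hand boxes off to layout.
    println!("root has {} children", dom.document.children.borrow().len());
}
```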
I do not think that makes it a "Servo wrapper", because calling it that implies it has no rendering code of its own.
It has plenty of rendering code of its own; that's why the rendered pages are slow and have visual glitches you wouldn't get with Servo!
In… terms of sheer volume of production of useless crap?
I've yet to see anyone in this space be negatively impacted by their outlandish claims.
They release a new model or add extra sub agents and the slate is wiped clean.
Management already doesn't trust developers in any way. Why would they believe you, who are clearly just trying to save your job, over a big company who clearly is the future!
Or do you trust your management to make the right decision?
Is entropy increasing or decreasing the longer agents work on a code base? If it's decreasing, no matter how slowly, theoretically you could just say "ok, start over and write version 2 using what you've learned on version 1." And eventually, $XX million and YY months of churning later, you'd get something pretty slick. And then future models would just further reduce X and Y. Right?
Maybe they just need to keep iterating.
I am an avid user of LLMs but I have not seen them remove entropy, not even once. They only add. It's all on the verge of tech debt, and it takes substantial human effort to keep entropy increases in check. Anyone can add 100 lines, but it takes genuine skill to do it in 10 (and I don't mean code golf).
And to truly remove entropy (cut useless tests, cut useless features, DRY up, find genuine abstractions, talk to PM to avoid building more crap, …) you still need humans. LLM built systems eventually collapse under their own chaos.
I think your analogy is quite fitting!
Significant typo I assume?
I'm also getting really tired of claims like "we are X% more productive with AI now!" (which I'm hearing day in and day out at work, and on LinkedIn of course). Didn't we, as an industry, agree that we _didn't_ know how to measure productivity? Why is everyone believing all of these sudden metrics that try to claim otherwise?
Look, I'm not against AI. I'm finding it quite valuable for certain scenarios -- but in a constrained environment and with very clear guidance. Letting it loose with coding is not one of them, and the hype is dangerous by how much it's being believed.
But how do you measure it? All the metrics I see being chased (metrics that were never accepted as productivity measurements before) can be gamed with slop, and so slop is what we'll get.
That's like the entire hype cycle: LLM builders see a bunch of hyper-specific language in fields they're not experts in and think "wow, AI is really smart!"
6 months ago with previous models this was absolutely impossible. One of the biggest limitations of LLMs is their difficulty with long tasks. This has been steadily improving and this experiment was just another milestone. It will be interesting a year from now to test how much better new models fare at this task.
project 1: build a text-based browser using ratatui and quickjs.
project 2: base it on project 1. convert to gui, pages should render pure html.
project 3: acid1 compliance. Use constraint based programming to output final render, no animation support.
etc. etc.

There was a story going around about LLMs making minesweeper clones, and they were all terrible in extremely dumb ways. The headline wasn't obvious, so I assumed the take people were getting from it was that AI is making the same dumb mistakes it was making a year ago. Nope. It was people ranting about how coders are going to be out of a job next week. Meanwhile, none of them can do a minesweeper clone - with something like 50 working examples online, maybe 8 things you have to get right to be perfect, and 9,000 articles about minesweeper and even mathematical papers about minesweeper making everything about the game and its purpose perfectly clear. And then the AI generates buttons that don't do anything and timers that don't stop.
Claude Opus 4.5: "Build minesweeper as an artifact, don't use react"
(Then "Fix it to work on mobile where right click isn’t a thing")
Play it here: https://tools.simonwillison.net/minesweeper
Transcript here: https://claude.ai/share/2d351b62-a829-4d81-b65d-8f3b987fba23
It doesn't matter to the people that were fired that the AI isn't as capable as promised. They're still job hunting in a shitty job market. When management does eventually figure out the AI underperforms they'll hire back staff at a fraction of the salary.
So executives and management look great no matter what and everyone else gets screwed.
> tools like Cursor can be genuinely helpful as glorified autocomplete and refactoring assistants
That suggests a fairly strong anti-AI bias by the author. Anyone who thinks that this is all AI coding tools are today is not actually using them seriously.
That's not to say that this exercise wasn't overhyped, but a more useful, less biased article that's not trying to push an agenda would look at what went right, as well as what went wrong.
If this is the first time you’ve encountered a hype bubble, it’s a good opportunity to learn so that you can navigate the next one more easily.
The obvious? Selling subscriptions to individuals, reaching higher-ups with bombastic headlines, reaching potential investors, perpetuating the bubble.