26 points by henrygarner 2 hours ago | 3 comments
  • amarble 8 minutes ago
    As a counterpoint, I also tried writing something with Claude last weekend: a Google Docs clone [1]. I spent $170 on Anthropic API credits and got something that did mostly what I asked but was basically useless. It seems that for simple interfaces with an exact specification, like the recent compiler and web browser examples, it's possible to produce bigger projects that "work" as demos, although not in a way that makes them viable alternatives. For anything that requires taste and judgment, we've still got a long way to go. There are lots of great demos out there, but few if any real examples of vibe-coded (or whatever you want to call it) software standing on its own as an alternative to software people wrote.

    [1] https://www.marble.onl/posts/this_cost_170.html

  • altmanaltman 27 minutes ago
    This blog post is a sophisticated piece of content marketing for a company called JUXT and their proprietary tool, "Allium." While the technical achievement is plausible, the framing is heavily distorted to sell a product.

    Here is the breakdown of the flaws and the "BS" in the narrative.

    1. The "I Didn't Write Code" Lie

    The author claims, "I didn't write a line of implementation code."

    The Flaw: He wrote 3,000 lines of "Allium behavioural specification."

    The BS: Writing 3,000 lines of a formal specification language is coding. It's just coding in a proprietary, high-level language instead of Kotlin.

    The Ratio is Terrible: The post admits the output was ~5,500 lines of Kotlin. That means for every 1 line of spec, he got roughly 1.8 lines of code.

    Why this matters: True "low-code" or "no-code" leverage is usually 1:10 or 1:100. If you have to write 3,000 lines of strict logic to get a 5,500-line program, you haven't saved much effort; you've just swapped languages.

    2. The "Weekend Project" Myth

    The post frames this as a casual project done "between board games and time with my kids."

    The Flaw: This timeline ignores the massive "pre-computation" done by the human.

    The BS: To write 3,000 lines of coherent, bug-free specifications for a Byzantine Fault Tolerant (BFT) system, you need to have the entire architecture fully resolved in your head before you start typing. The author is an expert (CTO level) who likely spent weeks or years thinking about these problems. The "48 hours" only counts the typing time, not the engineering time.

    3. The "Byzantine Fault Tolerance" (BFT) Bait-and-Switch

    The headline claims "Byzantine fault tolerance," which implies a system that continues to operate correctly even if nodes lie or act maliciously (extremely hard to build).

    The Flaw: A "Resolved Question" block in the text admits: "The system's goal is Byzantine fault detection, not classical BFT consensus."

    The BS: Real BFT consensus (like PBFT or Tendermint) is mathematically rigorous and keeps the system running even when some nodes misbehave. "Fault detection" just means "if the two copies don't match, stop." That is significantly easier to build. Calling it "BFT" in the intro is a massive overstatement of the system's resilience.
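
    To make the distinction concrete, here is a minimal Kotlin sketch (mine, not from the post; the Reply type and function names are invented for illustration). Detection only compares replicas and halts on mismatch; classical BFT consensus runs 3f + 1 replicas so that a quorum of 2f + 1 matching replies stays correct even when up to f nodes lie.

      // Hypothetical sketch: the quorum-counting idea only, not a full
      // protocol (real PBFT also needs signed messages, view changes, etc.).
      data class Reply(val nodeId: String, val payload: String)

      // Fault *detection*: compare two replicas, halt on any divergence.
      fun detectOnly(a: Reply, b: Reply): String {
          require(a.payload == b.payload) { "Divergence detected: halting" }
          return a.payload
      }

      // BFT-style quorum: with 3f + 1 replicas, a value backed by at least
      // 2f + 1 matching replies is correct even if up to f replicas lie.
      fun quorumValue(replies: List<Reply>, f: Int): String? =
          replies.groupingBy { it.payload }.eachCount()
              .entries.firstOrNull { it.value >= 2 * f + 1 }?.key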

    4. The "Maintenance Nightmare" (The Vendor Lock-in Trap)

    The post glosses over how this system is maintained.

    The Flaw: You now have 5,500 lines of Kotlin that no human wrote.

    The BS: This is the "Model Driven Architecture" (MDA) trap from the early 2000s.

    Scenario: You find a bug in the Kotlin code.

    Option A: You fix the Kotlin. Result: Your code is now out of sync with the Spec. You can never regenerate from the Spec again without losing your fix.

    Option B: You fix the Spec. Result: You hope the AI generates the exact Kotlin fix you need without breaking 10 other things.

    The Reality: You are now 100% dependent on the "Allium" tool and Claude. If you stop paying for Allium, you have a pile of unmaintainable machine-generated code.

    5. The Performance "Turning Point"

    The dramatic story about 10,000 requests per second (RPS) has a hole in it.

    The Flaw: The "bottleneck" wasn't the code; it was a Docker proxy setting (gvproxy).

    The BS: This is a standard "gotcha" for anyone using Docker on a Mac. Framing this as a triumph of AI debugging is a stretch; any senior engineer would check the network topology when seeing high latency but low CPU usage. 10k RPS is also not "ambitious" for a modern distributed system; a single well-optimized Node.js or Go process can handle that easily.
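
    To put that number in context, here is a minimal Kotlin sketch (mine, not from the post; plain JDK, no framework, and the port, path and thread-pool size are arbitrary) of the sort of trivial single-process server that claim is about. Benchmark it against the process directly and you are measuring the code; benchmark it through a user-space proxy such as gvproxy and you are mostly measuring the proxy.

      // Hypothetical sketch using only the JDK, no framework dependencies.
      import com.sun.net.httpserver.HttpServer
      import java.net.InetSocketAddress
      import java.util.concurrent.Executors

      fun main() {
          val server = HttpServer.create(InetSocketAddress(8080), 0)
          server.createContext("/ping") { exchange ->
              val body = "pong".toByteArray()
              exchange.sendResponseHeaders(200, body.size.toLong())
              exchange.responseBody.use { it.write(body) }
          }
          // Small fixed pool; tune for the target load.
          server.executor = Executors.newFixedThreadPool(8)
          server.start()
      }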

    • antonly 3 minutes ago
      Hello, could you please put your over-sensationalized, overly-long, AI-generated comments somewhere else? Thank you.

      Kindly, the HN Community.

  • henrygarner 2 hours ago
    Over a weekend, between board games and time with my kids, Claude and I built a distributed system with Byzantine fault tolerance, strong consistency and crash recovery under arbitrary failures. I described the behaviour I wanted in Allium, worked through the bugs conversationally and didn't write a line of implementation code.
    • SPICLK2 26 minutes ago
      I don't see why you need to bring your kids into this, and as a parent, any suggestion of being distracted by tech during time with the kids raises my suspicions.

      We have a strict no laptops and no phones rule when the kids are around (unless we're specifically doing something with them using those tools - looking at the weather forecast, or looking up some information).

      "I can prompt AI while playing with the kids" is not a future I want.