You sure? That’s basically all that’s being discussed.
There’s nothing in this article I haven’t heard 100 times before. Open any mainstream news article or HN/Reddit thread and you’ll find all of OP’s talking points about water, electricity, job loss, the intrinsic value of art, etc.
> It's common, if not inevitable, for people who feel strongly about $topic to conclude that the system (or the community, or the mods, etc.) are biased against their side. One is far more likely to notice whatever data points that one dislikes because they go against one's view and overweight those relative to others. This is probably the single most reliable phenomenon on this site. Keep in mind that the people with the opposite view to yours are just as convinced that there's bias, but they're sure that it's against their side and in favor of yours. [1]
Simplified comparisons like these rarely show the full picture [0]. They focus on electricity use only, not on heating, transport, or meat production, and certainly not on the CO2 emissions associated with New York’s airports. As a rough, back-of-the-envelope estimate, one seat on a flight from Los Angeles to New York comes to on the order of 1,000,000 small chat queries’ worth of CO2e.
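The arithmetic behind that estimate, as a minimal sketch (both inputs are assumed round numbers I'm using for illustration, not figures taken from the linked cheat sheet):

    # Back-of-the-envelope only; both inputs are assumed round numbers.
    flight_co2e_kg_per_seat = 500   # assumed: one-way LAX -> JFK economy seat, ~0.5 t CO2e
    query_co2e_g = 0.5              # assumed: one small chat query, ~0.5 g CO2e

    queries_per_seat = flight_co2e_kg_per_seat * 1000 / query_co2e_g
    print(f"{queries_per_seat:,.0f} small queries per seat")  # -> 1,000,000

Nudge either assumption by a factor of a few and you still land in the same order of magnitude.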
Of course we should care about AI’s electricity consumption, especially when we run 100 agents in parallel simply because we can. But it’s important to keep it in perspective.
[0] https://andymasley.substack.com/p/a-cheat-sheet-for-conversa...
It’s not just being discussed, it’s dominating the conversation on sites like Hacker News. It’s actually hard to find the useful and informative LLM content because it doesn’t get as many upvotes as the constant flow of anti-LLM thought pieces like this.
There was even the strange period when the rationalist movement was in the spotlight and getting mainstream media coverage for their AI safety warnings. They overplayed their hand with the whole call to drop bombs on data centers and the AI 2027 project with Scott Alexander that predicted the arrival of AGI and disastrous consequences in 2027. There was so much extreme doom and cynicism for a while that a lot of people just got tired of it and tuned out.
Mention anything about the water and electricity wastage and embrace the downvotes.
The more interesting questions are about psychology, productivity, intelligence, AGI risk, etc. Resource constraints can be solved, but we’re wrestling with societal constraints. Industrialization created modernism; we could see a similar movement in reaction to AI.
Well, that's just it. Those existential risks aren't even proven yet.
Meanwhile, threats to resources are already being felt today. "Just overhaul our infrastructure" isn't an actionable solution that will magically fix things today or next week. Even if these don't end up being big problems in the grand scheme of things, that doesn't mean they aren't problems now.
It's almost like humans evolved from prehistoric times eating meat. Wild discovery, right?
> And we have cheap ways of producing electricity, we just need to overhaul our infrastructure and regulations.
Why don't AI bros lobby for those changes? Much harder than selling snake oil and getting rid of programmers, I guess.
Wait until other countries jump on the bandwagon and see energy prices jump.
Currently it is mostly the US and China.
And RAM and GPU prices are already through the roof with SSDs to follow.
That is for now, when it's mostly ONLY the US market.
And those “net benefits” you talk about have very questionable data behind them.
It’s a net loss if you ask me.
RAM and GPU prices going up, sure, ok, but again: if you're claiming there is no net benefit from AI, what is your evidence for that? These contracts are going through legally, so what basis do you have to prevent them from happening? Again I say it's site-specific: there are plenty of instances where people have successfully prevented data centers in their area, and lots of problems do come up (especially because companies are secretive about details, so people may not be able to make informed judgements).
What are the massive problems besides RAM + GPU prices? And again, what is the societal impact of those?
I spent some time thinking of a better word than "evil" before typing this comment. I can't think of one. Doing something bad that harms more than it helps for the purposes of enrichment and power is simply put: evil.
Exactly. Strongly agree with that. This closed-world assumption never holds. We would only do less work if nothing else changed. But of course everything changes when you lower the price of creating software. It gets a lot cheaper. So, now you get a lot of companies suddenly considering doing things that previously would have been too expensive. This still takes skills and expertise they don't have. So, they get people involved doing that work. Maybe they'll employ some of those people, but the trend is actually to only employ for work that is core to your company.
And that's just software creation. Everything else is going to change as well. A lot of software we use is optimized for humans. Including all of our development tools. Replacing all that with tools more suitable for automatic driving by AI is an enormous amount of work.
And we have decades worth of actively used software that requires human operators currently. If you rent a car, some car rental companies still interface with stuff written before I was born. And I'm > 0.5 century old. Same with banks, airline companies, insurers, etc. There's a reason this stuff was never upgraded: doing so is super expensive. Now that just got a bit cheaper to do. Maybe we'll get around to doing some of that. Along with all the stuff for which the ambition level just went up by 10x. And all the rest.
I don’t think typing code was ever the problem here. I’m doubtful this got any cheaper.
It’s always this tired argument. “But it’s so much better than six months ago, if you aren’t using it today you are just missing out.”
I’m tired of the hype, boss.
I'm sure people were saying similar things about, say, aviation all through the first decades of the 20th century, "wow, those planes are getting better every few years"... "Until recently planes were just gimmicks, but now they can fly across the English channel!"... "I wouldn't have got in one of those death traps 5 years ago, but now I might consider it!" And different people were saying things like that at different times, because they had different views of the technology, different definitions of usefulness, different appetites for risk. It's just a wide range of voices talking in similar-sounding terms about a rapidly-developing technology over a span of time.
This is just how people are going to talk about rapidly-improving technologies for which different people have different levels of adoption at different times. It's not a terribly interesting point. You have to engage with the specifics, I'm afraid.
For what it is worth, I have also gone from a "this looks interesting" to "this is a regular part of my daily workflow" in the same 6 month time period.
Even I can see there has been a clear advancement in performance in the past six months. There will probably be another incremental step 6 months from now.
I use LLMs in a project that helps give suggestions for a previously manual data entry job. Six months ago the LLM suggestions were hit or miss. Using a recent model, it’s over 90% accurate. Everything is still manually reviewed by humans, but having a recent model handle the grunt work has been game changing.
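For anyone curious what that kind of suggest-then-review loop looks like, here is a minimal sketch assuming the OpenAI Python client; the model name, prompt, and review step are my own illustrative assumptions, not details from the project described above:

    # Sketch of an LLM-suggests / human-reviews data-entry loop.
    # Model name and prompt are illustrative assumptions only.
    from openai import OpenAI

    client = OpenAI()

    def suggest_fields(raw_text: str) -> str:
        # Ask the model to propose structured values for one record.
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name
            messages=[
                {"role": "system", "content": "Extract name, date, and amount as JSON."},
                {"role": "user", "content": raw_text},
            ],
        )
        return response.choices[0].message.content

    def review(suggestion: str) -> bool:
        # Every suggestion is still shown to a human before it is accepted.
        print(suggestion)
        return input("Accept? [y/n] ").strip().lower() == "y"

The point is the shape of the workflow: the model does the grunt work, the human stays the gate.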
If people are drinking a firehose of LinkedIn-style influencer hype posts, I could see why it’s tiresome. I ignore those and I think everyone else should too. There is real progress being made though.
The model providers should really start having LTS (at least 2 years) offerings that deliver consistent results regardless of load, IMO. Folks are tired of the treadmill and just want some stability here, and if the providers aren't going to offer it, llama.cpp will...
Yeah, exactly.
I don't deny that there's been huge improvements in LLMs over the last 6-12 months at all. I'm skeptical that the last 6 months have suddenly presented a 'category shift' in terms of the problems LLMs can solve (I'm happy to be proved wrong!).
It seems to me like LLMs are better at solving the same problems that they could solve 6 months ago, and the same could be said comparing 6 months to 12 months ago.
The argument I'd dismiss isn't the improvement, it's that there's a whole load of sudden economic factors, or use cases, that have been unlocked in the last 6 months because of the improvements in LLMs.
That's kind of a fuzzier point, and a hard one to know until we all have hindsight. But I think OP is right that people have been claiming "LLMs are fundamentally in a different category to where they were 6 months ago" for the last 2 years - and as yet, none of those big improvements have unlocked a whole new category of use cases for LLMs.
To be honest, it's a very tricky thing to weigh in on, because the claims being made around LLMs vary wildly, from "we're 2 months away from all disease being solved" to "LLMs are basically just a bit better than old-school Markov chains". I'd argue that clearly neither of those is true, but it's hard to get oriented when both of those positions are being claimed at the same time.
"Problem solving" (which definitely has improved, but maybe has a spikey domain improvement profile) might not be the best metric, because you could probably hand hold the models of 12 months ago to the same "solution" as current models, but you would spend a lot of time hand holding.
Yes, I agree in principle here, in some cases: I think there are certainly problems that LLMs are now better at but that don't reach the critical reliability threshold to say "it can do this". E.g. hallucinations, handling long context well (still best practice to reset context window frequently), long-running tasks etc.
> That's kind of a fuzzier point, and a hard one to know until we all have hindsight. But I think OP is right that people have been claiming "LLMs are fundamentally in a different category to where they were 6 months ago" for the last 2 years - and as yet, none of those big improvements have unlocked a whole new category of use cases for LLMs.
This is where I disagree (but again you are absolutely right for certain classes of capabilities and problems).
- Claude Code did not exist until 2025
- We have gone from e.g. people using coding agents for like ~10% of their workflow to like 90-100% pretty typically. Like code completion --> a reasonably good SWE (with caveats and pain points I know all too well). This is a big step change in what you can actually do; it's not like we're still doing only code completion and it's marginally better.
- Long-horizon task success rate has now gotten good enough that it basically also enables the above (good SWE work) for things like refactors, complicated debugging with competing hypotheses, etc., looping attempts until success
- We have nascent UI agents now, they are fragile but will see a similar path as coding which opens up yet another universe of things you can only do with a UI
- Enterprise voice agents (for like frontline support) now have a low enough bounce rate that you can actually deploy them
So we've gone from "this looks promising" to production deployment and very serious usage. This may kind of be, like you say, "same capabilities but just getting gradually better", but at some point that becomes a step change. Above a certain failure rate (which may be hard to pin down explicitly) it's not tolerable to deploy, but as evidenced by adoption alone we've crossed that threshold, especially for coding agents. Even Sonnet 4 -> Opus 4.5 has for me personally (beyond just benchmark numbers) made full project loops possible in a way that Sonnet 4 would have convinced you it could handle and then wasted like two whole days of your time banging your head against the wall. The same is true for Opus 4.5, but only for much larger tasks.
> To be honest, it's a very tricky thing to weigh in on, because the claims being made around LLMs vary wildly, from "we're 2 months away from all disease being solved" to "LLMs are basically just a bit better than old-school Markov chains". I'd argue that clearly neither of those is true, but it's hard to get oriented when both of those positions are being claimed at the same time.
Precisely. Lots and lots of hyperbole, some with varying degrees of underlying truth. But I would say: the true underlying reality here is somewhat easy to follow along with hard numbers if you look hard enough. Epoch.ai is one of my favorite sources for industry analysis, and e.g. Dwarkesh Patel is a true gift to the industry. Benchmarks are really quite terrible and shaky, so I don't necessarily fault people "checking the vibes", e.g. like Simon Willison's pelican task is exactly the sort of thing that's both fun and also important!
I don't doubt that the leading labs are lighting money on fire. Undoubtedly, it costs crazy amounts of cash to train these models. But hardware development takes time and it's only been a few years at this point. Even TODAY, one can run Kimi K2.5, a 1T-param open-source model, on two Mac Studios. It runs at 24 tokens/sec. Yes, it'll cost you $20k for the specs needed, but that's hobbyist and small business territory... we're not talking mainframe computer costs here. And certainly this price will come down? And it's hard to imagine that the hardware won't get faster/better?
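Rough numbers on why two machines is enough, as a sketch (the quantization level and 512 GB of unified memory per machine are my assumptions, not specs from the linked post):

    # Why ~1T parameters can fit on two high-memory desktops (assumed figures).
    params = 1e12            # 1T parameters
    bytes_per_param = 0.5    # assumed ~4-bit quantization
    model_gb = params * bytes_per_param / 1e9
    print(f"~{model_gb:.0f} GB of weights")                        # ~500 GB
    print(f"fits in 2 x 512 GB unified memory: {model_gb < 1024}") # True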
Yes... training the models can really only be done with NVIDIA and costs insane amounts of money. But it seems like even if we see just moderate improvement going forward, this is still a monumental shift for coding if you compare where we are at to 2022 (or even 2024).
[1] https://x.com/alexocheema/status/2016487974876164562?s=20
Memory costs are skyrocketing right now as everyone pivots to using HBM paired with moderate processing power. This is the perfect combination for inference. The current memory situation is obviously temporary. Factories will be built and scaled, and memory is not particularly power hungry; there's a reason you don't really need much cooling for it. As training becomes less of a focus and inference more of a focus, we will at some point be moving from the highest-end Nvidia cards to boxes of essentially power-efficient HBM memory attached to smaller, more efficient compute.
I see a lot of commentary like "AI companies are so stupid buying up all the memory" around the place atm. That memory is what's needed to run the inference cheaply. It's currently done on Nvidia cards and Apple M-series chips because those two are the first to utilise High Bandwidth Memory, but the raw compute of the Nvidia cards is really only useful for training; they are just using them for inference right now because there's not much on the market that has similar memory bandwidth. But this will be changing very soon. Everyone in the industry is coming along with their own dedicated compute using HBM memory.
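To make the bandwidth point concrete, a back-of-the-envelope sketch (every figure here is an assumption: a MoE model with ~32B active parameters per token at ~4-bit, and ~800 GB/s of usable memory bandwidth):

    # Decode speed is roughly bandwidth-bound: each token streams the active
    # weights out of memory. All numbers below are assumptions.
    active_params = 32e9     # assumed active params per token (MoE)
    bytes_per_param = 0.5    # assumed ~4-bit quantization
    bandwidth_gb_s = 800     # assumed usable memory bandwidth

    gb_per_token = active_params * bytes_per_param / 1e9   # ~16 GB per token
    print(f"decode ceiling ~{bandwidth_gb_s / gb_per_token:.0f} tokens/sec")  # ~50

Real deployments land below that ceiling once interconnect and overhead are counted, but it shows why bandwidth, not raw compute, sets the inference speed.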
And those are the EASY things for AI to "put out of work".
HARD systems like government legacy systems are not something you can slap 200 unit tests on and say "agent, re-write this in Rust". They're hard because of the complex interconnects, myriad patches, workarounds, and institutional data codified both in the codebase and outside of it. Bugs that stay bugs because the next 20 versions used that bug in some weird way. I started my career in that realm. I call bullshit on AI taking any jobs here if it can't even accelerate the pace of _quality_ OS and video game releases.
I'm not saying this won't happen eventually but that eventually is doing a heavy lift. I am skeptical of the 6-12 month timelines for broad job displacement that I see mentioned.
AIs (LLMs) are useful in this subtle way like "Google Search" and _not_ like "the internet". It's a very specific set of text heavy information domains that are potentially augmented. That's great but it's not the end of all work or even the end of all lucrative technical work.
It's a stellar tool for smart engineers to do _even_ more, and yet, the smart ones know you have to babysit and double-check -- so it's not remotely a replacement.
This is without even opening the can of worms that is AI Agency.
This has been the big one for me. Of all the best devs I know, only one is really big on AI, but he’s also the one that spent years in a relational database without knowing what isolation levels are. He was absolutely a “greenfield” kinda dev - which he was great at, tons of ideas but not big on the small stuff.
The rest use them judiciously and see a lot of value, but reading and understanding code is still the bottleneck. And reviewing massive PRs is not efficient.
"Software Engineers" isn't a single group, here. Some software engineers (making millions of dollars at top AI companies) are automating other software engineers' workflows and making those software engineers replaceable. The people who are going to be mostly negatively affected by these changes are not the people setting out to build them.
Do you really think we should be like ancient egyptian scribes and gatekeep the knowledge and skills purely for our sake?
I don't think there has been another field that is this open with trade secrets. I know a dentist who openly claims they should be paid for training dental interns (idk what the US terminology is) despite the interns being productive, simply because they will earn a lot of money in the future. I did not want and do not want software engineering to become that.
My strongly held belief is that anyone who think that way, also think that software engineering is reading tickets, searching for code snippets on stack overflow and copy-pasting code.
Good specifications are always written after a lot of prototypes, experiments and sample implementations (which may be production level). Natural language specifications exist after the concept has been formalized. Before that process, you only have dreams and hopes.
In the few apps I've built, progress is initially amazing. And then you get to a certain point and things slow down. I've built a list of things that are "not quite right" and then, as I work through each one all the strange architectural decisions the AI initially made start to become obvious.
Much like any software development, you have to stop adding features and start refactoring. That's the point at which not being a good software developer will really start to bite you, because it's only experience that will point you in the right direction.
It's completely incredible what the models can do. Both in speed of building (especially credible front ends), and as sounding boards for architecture. It's definitely a productivity boost. But I think we're still a long way off non-technical people being able to develop applications.
A while ago I worked on a non-trivial no-code application. I realised then that even though there's "no code", you still needed to apply careful thought about data structures and UI and all the other things that make an application great. Otherwise it turned into a huge mess. This feels similar.
I'm surprised I haven't seen anyone do a case study, having truly non-technical people build apps with these tools. Take a few moderately tech-savvy (can use MS office up to doing basic stuff in excel, understands a filesystem) people who work white collar jobs. Give them a one or two-day crash course on how Claude Code works. See what is the most complicated app which they can develop that is reasonably stable and secure.
LLMs will trivialize some subfields, be nearly useless in others, but will probably help to some degree in most of them. The range of opinions online about how useful LLMs are probably correlates with the subfields people work in.
I’ve done CRUD and the biggest proportion of the time was custom business rules and UI tweaking (updating the design system). And they were commits with small diffs. The huge commits were done by copy pasting, code generators and heavy use of the IDE refactoring tools.
I often find when I come up with the solution, these little autocompletes pretend they knew that all along. Or I make an observation they say something like "yes that's the core insight into this".
They're great at boilerplate. They can immediately spot a bug in a 1000 lines of code. I just wish they'd stop being so pretentious. It's us that are driving these things, it's our intelligence, intuition and experience that's creating solutions.
As long as code continues to need to be reviewed to (mostly) maintain a chain of liability, I don’t see SWE going anywhere like both hypebros and doomers seem to be intent on posting about.
Code review will simply become the bottleneck for SWE. In other words, reading and understanding code so that when SHTF the shareholders know who to hold accountable.
1. You know exactly what the code should look like ahead of time. Agents are handy here, but this is rare and takes no time at all as is
2. You don’t know exactly what it should be. Building the mental model by hand is faster than working backwards from finished work. Agents help with speeding up your knowledge here, but having them build the model is not good
Code review is and always was the bottleneck. You are doing code review when you write code.
Shareholders don’t want a “release” button that when they press it there’s a XX% chance that the entire software stack collapses because an increasingly complex piece of software is understood by no one but a black box called “AI” that can’t be fired or otherwise held accountable.
Like I'm sorry but if you couldn't see that this tech would be enormously useful for millions if not billions of people you really shouldn't be putting yourself out there opining on anything at all. Same vibes as the guys saying horseless carriages were useless and couldn't possibly do anything better than a horse which after all has its own mind. Just incredibly short sighted and lacking curiosity or creativity.
How to avoid being a Duryea, a Knox, a Marsh, a Maxwell-Briscoe, or a Pope-Toledo seems to be the real question.
Same thing pretty much happened in the early days of radio, with the addition of vicious patent wars. Which I'm sure we'll eventually see in the AI field, once the infinite money hose starts to dry up.
IDK this sounds a whole lot like paying for snippets
A snippet for a for loop in your editor, yeah, go ahead. A “snippet” for “make this spec into half-assed code” is a wholly different story.
Overpaid bubble snippets at best.