- In Context Learning (providing examples, AKA one shot or few shot vs zero shot)
- Chain of Thought (telling it to think step by step)
- Structured output (telling it to produce output in a specified format like JSON)
Maybe you could add what this article calls Role Prompting to that. And RAG is its own thing where you're basically just having the model summarize the context you provide. But really, everything else just boils down to telling it what you want in clear, plain language.
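For anyone who hasn't combined the first three, here's a minimal sketch of a single prompt that uses in-context examples, a chain-of-thought nudge, and a structured-output request. The sentiment task and the "sentiment" field are invented purely for illustration:

```typescript
// One prompt combining few-shot examples (in-context learning),
// a "think step by step" instruction, and a JSON output format.
// Task and field names are made up for the sake of the example.
const prompt = `
Classify the sentiment of the review. Think step by step, then answer
with JSON of the form {"sentiment": "positive" | "negative"}.

Review: "Battery died after two days."
Reasoning: The reviewer reports a defect, which is a complaint.
Answer: {"sentiment": "negative"}

Review: "Setup took five minutes and it just works."
Reasoning: The reviewer highlights ease of use, which is praise.
Answer: {"sentiment": "positive"}

Review: "The screen is gorgeous but the speakers crackle."
Reasoning:`;
```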
Start out with TypeScript and have it answer data science questions: it won't know its way around.
Start out with Python and ask the same question - great answers.
LLMs can't (yet) really transfer knowledge between domains, you have to prime them in the right way.
I’ve also found it’s very good at wrangling simple SQL, then analyzing the results in Bun.
I’m not doing heavy data processing, but so far, it’s remarkably good.
However, asking it to implement "that thing that groups data points into similar groups" needs a bit more context (I just tried it) as K-means is very much specific to machine learning.
initial prompt: Start a new typescript file. It will be used for data science purposes
second prompt: Implement "that thing that groups data points into similar groups"
The output was a full function implementing K-means (along with a Euclidean distance function).
https://chatgpt.com/share/68420bd4-113c-8006-a7fe-c2d0c9f91d...
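For reference, a K-means implementation in TypeScript is short enough to sketch here. This is my own minimal version of the general shape such a prompt tends to produce (nearest-centroid assignment plus a Euclidean distance helper), not the exact output from that chat:

```typescript
type Point = number[];

// Euclidean distance between two points of equal dimension.
function euclideanDistance(a: Point, b: Point): number {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}

// Basic K-means: assign each point to its nearest centroid, then move each
// centroid to the mean of its assigned points, repeating until assignments
// stop changing or we hit the iteration cap.
function kMeans(points: Point[], k: number, maxIterations = 100): { centroids: Point[]; assignments: number[] } {
  // Naive initialization: take the first k points as starting centroids.
  let centroids = points.slice(0, k).map((p) => [...p]);
  let assignments = new Array<number>(points.length).fill(0);

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: index of the nearest centroid for each point.
    const newAssignments = points.map((p) => {
      let best = 0;
      let bestDist = Infinity;
      centroids.forEach((c, i) => {
        const d = euclideanDistance(p, c);
        if (d < bestDist) {
          bestDist = d;
          best = i;
        }
      });
      return best;
    });

    // Update step: move each centroid to the mean of its cluster.
    const newCentroids = centroids.map((c, i) => {
      const members = points.filter((_, j) => newAssignments[j] === i);
      if (members.length === 0) return c; // leave empty clusters in place
      return c.map((_, dim) => members.reduce((s, m) => s + m[dim], 0) / members.length);
    });

    const converged = newAssignments.every((a, j) => a === assignments[j]);
    assignments = newAssignments;
    centroids = newCentroids;
    if (converged) break;
  }

  return { centroids, assignments };
}
```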
Doesn't this ruin / ignore the point we're discussing? I don't think anyone thought otherwise?
Every day, tech broism gets a little closer to a UFO cult.
Obviously how exactly they work still isn’t fully explained, but calling basic principles magical is too far in my opinion.
Be clear with your requirements. Add examples, if necessary. Check the outputs (or reasoning trace if using a reasoning model). If they aren't what you want, adjust and iterate. If you still haven't got what you want after a few attempts, abandon AI and use the reasoning model in your head.
“You are an expert doctor, help me with this rash I have all over” will result in a fairly useless answer, but using medical shorthand — “pt presents w bilateral erythema, need diff dx” — gets you exactly what you’re looking for.
Lay person’s description -> translate into medical shorthand -> get the expert response in shorthand -> translate back.
Liability and error are obviously problematic.
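That translate-in, translate-out loop is easy to wire up as a small pipeline. A rough sketch, where `callModel` is a hypothetical stand-in for whatever chat-completion API you actually use:

```typescript
// Hypothetical helper: sends one prompt to an LLM and returns its reply.
// Stand-in for whichever chat-completion API you actually call.
declare function callModel(prompt: string): Promise<string>;

async function askAsClinician(layDescription: string): Promise<string> {
  // 1. Lay description -> medical shorthand.
  const shorthand = await callModel(
    `Rewrite this patient complaint as terse clinical shorthand:\n${layDescription}`
  );

  // 2. Get the "expert" response in shorthand.
  const expertReply = await callModel(
    `pt note: ${shorthand}\nProvide differential dx and suggested workup, shorthand ok.`
  );

  // 3. Translate the shorthand answer back into plain language.
  return callModel(
    `Explain this clinical note in plain language for a patient:\n${expertReply}`
  );
}
```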
I wouldn't be surprised if, a few years from now, some sort of actual formalized programming language for "gencoding" AI emerges.
[1]https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
When the LLM thinks _you_ wrote something, it's nice about it, and deferential. When it thinks someone else wrote it, you're trying to decide how much to pay that person, and you need to know what edits to ask for, it becomes much more cut-throat and direct.
Make prompts focused, with explicit output formats and examples, and don't overload the context. Then basically the three you said.
The second phase can be with a cheap model if you need it to be.
Pseudo-prompt:
Prompt 1: Do the thing, describe it in detail, end with a clear summary of your answer that includes ${THINGS_YOU_NEED_FOR_JSON}.
Prompt 2: A previous agent said ${CONTENT}, structure as JSON according to ${SCHEMA}.
Ideally you use a model in Prompt 2 that supports JSON schemas, so you have a 100% guarantee that what you get back parses. Otherwise you can implement it yourself by validating locally and sending the errors back with a prompt to fix them.
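If the model doesn't support JSON schemas natively, the validate-locally-and-retry loop is only a few lines. A sketch using zod for the local validation; `callModel` and the `summary`/`confidence` fields are hypothetical stand-ins, not anything from the comment above:

```typescript
import { z } from "zod";

// Hypothetical helper standing in for your actual chat-completion call.
declare function callModel(prompt: string): Promise<string>;

// The shape we ultimately want back (invented fields for illustration).
const Answer = z.object({
  summary: z.string(),
  confidence: z.number().min(0).max(1),
});

async function twoPhase(question: string) {
  // Phase 1: let a capable model reason freely, ending with a clear summary.
  const freeform = await callModel(
    `${question}\nDescribe your reasoning in detail and end with a clear summary of your answer.`
  );

  // Phase 2: a cheap model just restructures the previous answer as JSON.
  let prompt =
    `A previous agent said:\n${freeform}\n` +
    `Return ONLY JSON like {"summary": string, "confidence": number between 0 and 1}.`;

  for (let attempt = 0; attempt < 3; attempt++) {
    const raw = await callModel(prompt);
    try {
      const parsed = Answer.safeParse(JSON.parse(raw));
      if (parsed.success) return parsed.data;
      // Validation failed: send the errors back and ask for a fix.
      prompt = `Your JSON had these problems: ${parsed.error.message}\nFix it and return only JSON:\n${raw}`;
    } catch {
      prompt = `That was not valid JSON. Return only JSON:\n${raw}`;
    }
  }
  throw new Error("Model never produced valid JSON");
}
```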
My usage has converged to making very simple and minimalistic prompts and doing minor adjustments after a few iterations.
I also feel the need to caution others: letting the AI write lots of code in your project makes it harder to advance it, evolve it, and just move on with confidence (code you didn't think through and write yourself doesn't stick as well in your memory).
My experience as well. I fear admitting this for fear of being labeled a luddite.
At the same time, I’ve seen the system prompts for a few agents (https://github.com/x1xhlol/system-prompts-and-models-of-ai-t...), and they are huge
How does that work?
I've found that with more advanced prompts, the generated code sometimes fails to compile, and tracing the issues backward can be more time consuming than starting clean.
Maybe a super salty take, but I personally haven't ever thought anything involving an LLM as "proper engineering". "Flailing around", yes. "Trial and error", definitely. "Confidently wrong hallucinations", for sure. But "proper engineering" and "LLM" are two mutually exclusive concepts in my mind.
This is even worse than "software engineering". The unfortunate thing is that there will probably be job postings for such things and people will call themselves prompt engineers for their extraordinary abilities for writing sentences.
What's proper and meaningful depends on a lot of variables. Testing these, keeping track of them, logging and versioning take it from "vibe prompting" to "prompt engineering" IMO.
There are plenty of papers detailing this work. Some things work better than others (positive instructions like "do this and this" work better than "don't do this", the pink-elephants thing). Structuring is important. Style is important. Order of information is important. Re-stating the problem is important.
Then there are quirks within model families. If you're running an API-served model you need internal checks to make sure a new version still behaves well on your prompts. These checks and tests are "prompt engineering".
I feel a lot of people have a knee-jerk reaction to the hype and miss critical aspects because they just want to dunk on it.
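A minimal sketch of what one of those checks might look like, written as a vitest regression test. `callModel` is a hypothetical stand-in for your production prompt-plus-model call, and the assertions target properties of the output rather than exact strings, since outputs aren't deterministic:

```typescript
import { describe, it, expect } from "vitest";

// Hypothetical helper standing in for your production prompt + model call.
declare function callModel(prompt: string): Promise<string>;

describe("summarization prompt still behaves after a model version bump", () => {
  it("returns parseable JSON with the expected keys", async () => {
    const out = await callModel(
      `Summarize the following ticket as JSON {"title": string, "priority": "low"|"high"}:\n` +
        `Customer cannot log in since this morning, all regions affected.`
    );
    const parsed = JSON.parse(out);
    // Check structural properties, not exact wording.
    expect(parsed).toHaveProperty("title");
    expect(["low", "high"]).toContain(parsed.priority);
  });
});
```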
On the other hand, prompt tweaking can be learned in a few days just by experimenting.
Not this one.
> (alt) a field of study or activity concerned with modification or development in a particular area. "software engineering"
This one ^^^
Too many people seem really triggered by this. I don't know why, but it's weird. It's just a term. It's well understood by now. The first 5 pages on google all state the same thing. Why bicker about something so trivial?
That could be said about ordering coffee at local coffee shop. Is there a "barista order engineering" we are all supposed to read?
> Re-stating the problem is important.
Maybe you can show us some examples?
Let's say you have two teams of contractors. One from your native country (I'm assuming US here), working remotely and one from India, located in India.
Would you communicate with both in the exact same manner? Wouldn't you adjust your messaging in any way?
Of course you would, that's exactly what "prompt engineering" is.
The language models are all a bit different and a bit fiddly at the moment, so getting quality output from each requires specific input.
You can try it yourself, ask each of the big free-tier models to write a simple script in a specific language for you, every single one will have a different output. They all have a specific "style" they fall into.
The way I see it, it's a bit like putting up a job posting for 'somebody who knows SSH'. While that is a useful skill, it's really not something you can specialize in since it's just a subset within linux/unix/network administration, if that makes sense.
I don't think you have to worry about that.
Without it, the chances of getting to a solution are slim. With it, the chances of getting to 90% of a solution and needing to fine tune the last mile are a lot higher but still not guaranteed. Maybe the phrase "prompt engineering" is bad and it really should be called "prompt crafting" because there is more an element of craft, taste, and judgment than there is durable, repeatable principles which are universally applicable.
You're not talking to managers here, you can use plain english.
> Maybe the phrase "prompt engineering" is bad and it really should be called "prompt crafting" because there is more an element of craft, taste, and judgment than there is durable, repeatable principles which are universally applicable.
Yes, the biggest problem with the phrase is that "engineering" implies a well-defined process with predictable results (think of designing a bridge), and prompting doesn't check either of those boxes.
I have been using LLMs very specifically to drive towards those goals, and prompt engineering (or crafting, if you dislike the word "engineering") is a crucial tool to get to those outcomes. And yes, sometimes that means writing my own code to interact with them, template prompts, post-process, and use them in workflows. And so, the more I think about it, the more I see patterns in how I create prompts (that probably could be automated with the same tools used for content templating) that make it feel somewhere in the middle of engineering and crafting.
My guess is that semantic ambiguity makes a lot of folks uncomfortable. If that's engineering, what isn't engineering, and doesn't it dilute engineering? Yet from the other angle, it's absolutely the case that LLM utilization is becoming a greater and greater part of how cutting-edge companies write code, so much so that the "code" humans must put effort and intention into writing is just as much prompts as it is the fine-tuned artifact created by running the prompt. If the reality of what actually is "code" is itself evolving, it's hard to imagine anything but that what constitutes software engineering must also be evolving in a very fundamental way.
For the uneducated, law engineers are members of the Congress / Parliament / Bundestag / [add for your own country]
I'd love to be wrong though. Please share if anyone has a different experience.
I think we'll also see ways of restructuring, organizing, and commenting code to improve interaction with LLMs. And also expect LLMs to get better at doing this, and maybe suggesting ways for programmers to break problems down that it is struggling with.
Many people just won't put any conscious thought into trying to get better on their own, though some of them will read or watch one thing on the topic. I will readily admit to picking up several useful tips from watching other people use these tools and from discussing them with peers. That's improvement that I don't think I achieve by solely using the tools on my own.
It may seem simple, but in my experience even brilliant developers can miss or misinterpret unstructured requirements, through no fault of their own.
And documenting the current state of this space as well. It's easy to have tried doing something a year ago and think they're still bad.
I also usually prefer researching an area before reinventing the wheel by trial and error myself. I appreciate when people share what they've discovered in their own time, as I don't always have all the time in the world to explore it as I would if I were still a teen.
Given input (…) and preconditions (…), write me Spark code that gives me postconditions (…). If you can formally specify the input, preconditions, and postconditions, you usually get good working code.
1. Science of Programming, David Gries
2. Verification of concurrent and sequential systems
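A concrete instance of that template, expressed as a prompt string. The example input, preconditions, and postconditions are invented for illustration:

```typescript
// Prompt template for formally specified code generation.
// The spec contents below are made up to show the shape.
const spec = {
  input: "a DataFrame `events` with columns (user_id: string, ts: timestamp, amount: double)",
  preconditions: "no null user_id; ts covers at most one calendar month",
  postconditions:
    "a DataFrame with one row per user_id containing total_amount = sum(amount), " +
    "ordered by total_amount descending",
};

const prompt = `Given input ${spec.input}
and preconditions ${spec.preconditions},
write me Spark code that gives me postconditions ${spec.postconditions}.`;
```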
I get by just fine with pasting raw code or errors and asking plain questions, the models are smart enough to figure it out themselves.
> Calling someone a prompt engineer is like calling the guy who works at Subway an artist because his shirt says ‘Sandwich Artist.’
All jokes aside, I wouldn't get too hung up on the title; the term engineer has long since been diluted to the point of meaninglessness.
They were also unaware of the fact that if you create a filename with spaces you must then escape/quote it for it to work.
They requested this important information to be included in the user manual (the users being sysadmins at very large companies).
When it comes to "prompt engineering", the argument is even less compelling. It's like saying typing in a search query is engineering.
There are prompts to be used with APIs and inside automated workflows, and more to it than that.
Many prompt engineers do measure and quantitatively compare.
Maybe this is why we disagreed on a previous comment. I think guess-and-check is generally the best way to solve problems, so long as your checks are well-designed (which, to be fair, does require understanding the problem), you incorporate the results from the check back into further guesses, and you can afford to brute-force it (which, statistically, is true for most of the problems, big and small, I've had to solve).
In a lot of ways, there's nothing fancy about that, it's just the scientific method -- the whole point of which is that you don't really need to "know" about the problem a priori to get to the truth, and that you can recompile all dependent knowledge from scratch through experimentation to get to it.
In practice it feels really expensive but it's also where the best insights come from -- experimentally re-deriving common knowledge often exposes where its fractures and exceptions are in clear enough detail to translate it into novel discovery.
Edit: I also love that the examples come with "AI’s response to the poor prompt (simulated)"
Like what agentic IDEs are starting to do. I don't copy-paste code in the correct way to optimize my prompt, I select the code I want; with MCPs picking up, you might not even have to paste input/output, the agent can run it and feed it to the LLM in an optimal way.
Of course, the quality of your instructions matters, but I think that falls outside of "prompt engineering".
-- https://www.theregister.com/2025/05/28/google_brin_suggests_...
Example: "Always look when you cross the road" is a snippet of common sense; failing to heed it can get you hit by a car. Even a 4-year-old wouldn't need the latter explanation, but most people could articulate it if they needed to. It's just a way of speeding up communication.
A colleague and I were lamenting a laughably bad outage that we thought showed a total lack of common sense resulting in an obvious issue. Then we both realized that the team had never had such an experience whereas the two of us had. Every member of that team now has that particular “common sense”.
Likewise, “don’t run in front of cars”. As a kid, a mate broke his leg running onto the street and getting hit. I think near misses happen a lot when we’re kids.
But far fewer have "common sense" about prompt engineering, because there's just much less experience with it.
There aren't enough software engineers to create the software the world needs.
I think you mean "to create the software the market demands." We've lost a generation of talented people to instagram filters and content feed algorithms.
Like writing out clear requirements.
It is fascinating that you can give instructions as if you were talking to someone, but part of me feels like doing it this way is imprecise.
Nevertheless, having these tools process natural language for instructions is probably for the best. It does make these tools dramatically more accessible. That being said I still feel silly writing prompts as if I was talking to a person.
You can also make the AI review itself. Have it modify code, then ask it to review the code, then ask it to address the review comments, and iterate until it has no more comments.
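That loop is simple to automate if you're already driving the model programmatically. A rough sketch, where `callModel` is a hypothetical stand-in for your actual chat-completion call:

```typescript
// Hypothetical helper standing in for your actual chat-completion call.
declare function callModel(prompt: string): Promise<string>;

// Have the model review its own output and address its own comments,
// iterating until it reports nothing left to fix (with a hard cap).
async function selfReview(task: string, maxRounds = 5): Promise<string> {
  let code = await callModel(`Modify the code as follows:\n${task}`);
  for (let round = 0; round < maxRounds; round++) {
    const review = await callModel(
      `Review this code and list concrete issues, or reply "NO COMMENTS":\n${code}`
    );
    if (review.trim().startsWith("NO COMMENTS")) break;
    code = await callModel(
      `Address these review comments and return the full revised code:\n${review}\n\n${code}`
    );
  }
  return code;
}
```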
Use an agentic tool like Claude Code or Amazon Q CLI. Then ask it to run tests after code changes and to address all issues until the tests pass. Make sure to tell it not to change the test code.
It also allows me to more easily control what the LLM will do, so I don't end up reviewing and throwing away 200 lines of code.
In a Next.js + vitest context, I try to really outline which tests I want and give it proper data examples so that it doesn't cheat by mocking fake objects.
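For instance, a prompt along these lines (the function name, test cases, and fixture data are invented for illustration); naming the cases up front and supplying realistic data leaves less room for the model to "pass" by inventing convenient mocks:

```typescript
// Invented example of a vitest-oriented prompt: the tests are named up front
// and realistic fixture data is supplied verbatim.
const prompt = `
Write vitest tests for formatInvoice() covering exactly these cases:
1. sums line items and applies 20% VAT
2. throws on an empty line-item list

Use this fixture as-is, do not mock it:
const invoice = {
  customer: "ACME GmbH",
  lines: [
    { description: "Consulting", qty: 2, unitPrice: 450 },
    { description: "Travel", qty: 1, unitPrice: 120 },
  ],
};
`;
```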
I don't buy into the whole "you're a senior dev" thing. Most people use Claude for coding, so I guess it's ingrained by default.
But I am skeptical of the idea that asking an LLM to be an expert will actually improve its expertise. I did a short test prompting ChatGPT to be an "average developer, just smart enough not to get fired", an "expert", and no persona. I got 3 different answers but I couldn't decide which one was the best. The first persona turned out quite funny, with "meh" comments and an explanation of what the code "barely" does, but the code itself is fine.
I fear that by asking an LLM to be an expert, it will get the confidence of an expert rather than the skills of one, and a manipulative AI is something I'd rather not have.
About the roles: Can you measure a difference in code quality between the "expert" and the "junior"?
Meanwhile, I just can't get over the cartoon implying that a React Dev is just a Junior Dev who lost their hoodie.
> The key is to view the AI as a partner you can coach – progress over perfection on the first try
This is not how to use AI. You cannot scale the ladder of abstraction if you are babysitting a task at one rung.
If you feel that it’s not possible yet, that may be a sign that your test environment is immature. If it is possible to write acceptance tests for your project, then trying to manually coach the AI is just a cost optimization, you are simply reducing the tokens it takes the AI to get the answer. Whether that’s worth your time depends on the problem, but in general if you are manually coaching your AI you should stop and either:
1. Work on your pipeline for prompt generation. If you write down any relevant project context in a few docs, an AI will happily generate your prompts for you, including examples and nice formatting etc. Getting better at this will actually improve
2. Set up an end-to-end test command (unit/integration tests are fine to add later but less important than e2e)
These processes are how people use headless agents like CheepCode[0] to move faster. Generate prompts with AI and put them in a task management app like Linear, then CheepCode works on the ticket and makes a PR. No more watching a robot work, check the results at the end and only read the thoughts if you need to debug your prompt.
[0] the one I built - https://cheepcode.com
What a world we live in.
Whether it's true or not is beside the point: it's getting kind of boring and it seems to be drowning out everything else. It feels, at least from the outside, like a profession that used to require intelligence, skill, and some creative problem solving just went downhill fast. The anti-SWE crowd (e.g. VCs, AI devs, CEOs, etc.) seems to be winning the argument at the moment as to what the future will bring in these forums. Every podcast I hear, it's always software engineering that is mentioned as the industry to be disrupted away.
Who would have thought that the job of the future only 5 years ago would be the first to go? Hope I'm wrong, but it seems to be the AI crowd's main target atm.
Regardless, I think that programmers are quite well-suited to the methods described in the article, but only in the same way that programmers are generally better at Googling things than the average person; they can imagine what the system needs to see in order to produce the result they want even if that isn't necessarily a natural description of their problem.
WHICH IS IT?