17 points by matsur a day ago | 5 comments
  • StefanJVA 2 hours ago
    Interesting read!
  • mmmehulll a day ago
    This is really interesting. I feel that if LLMs can handle this as they are, without finetuning, it could be huge.
    • spenczar5 a day ago
      Yes! This works really well from Sonnet 4.5 onwards, in our experience. Sonnet 4.0 was a little rocky - we had to give it tons of documentation - but by now it works without much effort.

      One thing that works very well is just giving it one or two example valid programs/statements in the custom language. It usually picks up what you're getting at very quickly.

      When it slips up, you get good signal you can capture for improving the language. If you're doing things in a standard agent-y loop, a good error message also helps it course-correct.
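
      The "examples + error message" loop described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the setup from the post: the filter mini-language, `ALLOWED_FIELDS`, `validate`, `generate`, and `fake_model` are all hypothetical names invented for illustration, and the model is any callable that maps a prompt string to a reply string.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical mini-language for this sketch: `<field> <op> <value>`.
EXAMPLES = ["status = 'open'", "age >= 21"]
ALLOWED_FIELDS = {"status", "age", "country"}
STATEMENT = re.compile(r"^(\w+)\s*(=|!=|>=|<=|>|<)\s*('[^']*'|\d+)$")


def validate(statement: str) -> Optional[str]:
    """Return None if the statement is valid, else an error message for the model."""
    match = STATEMENT.match(statement.strip())
    if match is None:
        return (f"Syntax error in {statement!r}: expected <field> <op> <value>, "
                f"for example {EXAMPLES[0]}")
    field = match.group(1)
    if field not in ALLOWED_FIELDS:
        return f"Unknown field {field!r}; allowed fields: {sorted(ALLOWED_FIELDS)}"
    return None


def build_prompt(task: str, feedback: List[str]) -> str:
    """Show a couple of valid statements, the task, and any earlier errors."""
    lines = ["Write one statement in our filter language.", "Valid examples:"]
    lines += [f"  {example}" for example in EXAMPLES]
    lines.append(f"Task: {task}")
    lines += [f"Your previous attempt failed: {message}" for message in feedback]
    return "\n".join(lines)


def generate(task: str, model: Callable[[str], str],
             max_attempts: int = 3) -> Tuple[str, List[str]]:
    """Ask the model, validate, and feed errors back until valid or out of attempts."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = model(build_prompt(task, feedback))
        error = validate(candidate)
        if error is None:
            # The collected errors are the signal worth logging for language design.
            return candidate.strip(), feedback
        feedback.append(error)
    raise ValueError(f"no valid statement after {max_attempts} attempts: {feedback}")


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call: wrong field first, corrected after the error."""
    return "status = 'open'" if "Unknown field" in prompt else "state = 'open'"


statement, errors = generate("find open tickets", fake_model)
```

      The returned `errors` list is the part worth keeping around: every entry is a place where the model's guess and the language disagreed.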

      • mmmehulll a day ago
        That’s really interesting. The “one or two examples + good error messages” part feels especially important. It suggests the limiting factor may be less finetuning and more whether the model is given a tight representation and a feedback loop it can recover from.
  • hwernetti a day ago
    Great read! We're running into a similar problem at my company: we've given agents the ability to query our databases but not enough guidance to write correct and efficient queries. I haven't tried solving this problem yet but I'm curious if you explored any code-to-sql approaches, something similar to SQLAlchemy but with your own guardrails and customizations?
    • spenczar5 a day ago
      That's a pretty interesting idea! I guess 160+ is sort of doing some of that for us - it compiles to SQL WHERE clauses, right - but generally, we found good results giving it a SQL dialect directly.

      I think some of the reason is that there's so much coverage of writing SQL in its training set.
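
      For the "compile to SQL WHERE clauses with guardrails" idea, here is a minimal sketch. It is not the tool mentioned above; `Condition`, `compile_where`, and the allow-lists are hypothetical names for illustration. The assumption is that the agent emits structured conditions, and the compiler only lets allow-listed columns and operators through while passing values as bound parameters (`?` placeholders, as used by Python's `sqlite3`).

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

# Hypothetical guardrails: only these columns and operators may appear.
ALLOWED_COLUMNS = {"status", "age", "country"}
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


@dataclass(frozen=True)
class Condition:
    column: str
    op: str
    value: Any


def compile_where(conditions: List[Condition]) -> Tuple[str, List[Any]]:
    """Compile conditions into a parameterized WHERE clause joined with AND."""
    if not conditions:
        raise ValueError("at least one condition is required")
    parts: List[str] = []
    params: List[Any] = []
    for cond in conditions:
        if cond.column not in ALLOWED_COLUMNS:
            raise ValueError(f"column {cond.column!r} is not allowed")
        if cond.op not in ALLOWED_OPS:
            raise ValueError(f"operator {cond.op!r} is not allowed")
        # Values never enter the SQL text; they travel as bound parameters.
        parts.append(f"{cond.column} {cond.op} ?")
        params.append(cond.value)
    return "WHERE " + " AND ".join(parts), params


clause, params = compile_where(
    [Condition("status", "=", "open"), Condition("age", ">=", 21)]
)
```

      The error messages raised here double as feedback for the agent, which ties back to the point about good error messages upthread.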

      • hwernetti 22 minutes ago
        Good point, it makes a lot of sense to use a tool that has plenty of sample usage data available.
  • spenczar5 a day ago
    Author here! I am pretty jazzed about these ideas and happy to dig into more detail than a blog post allows.