3 points by giancarlostoro 4 hours ago | 2 comments
  • vibe42 3 hours ago
    Building my own home lab for local AI inference and general-purpose servers. The purpose is to learn more about hardware, Linux, networking, and open-source AI tools.

    I decided, as a constraint, to use local AI exclusively! This made the first step fun: assembling a server able to run a small local model, which would then assist with everything else.

    After I got the first one running, I used it for almost everything, except it could not assemble the 42U steel server rack... (my shoulders hurt a bit now, probably good exercise!)

    The first thing I tried on the new servers after the first boot of Debian was feeding in the entire Linux dmesg log with one simple instruction: "Check all dmesg entries and provide recommendations for any errors, issues or other considerations".

    This was very helpful even with smaller local models, as a complement to just searching for the various errors (drivers etc.). I learned a lot of new things, like BMC network configs.
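    The dmesg-review step above can be sketched as a short script. This is a minimal sketch, not the commenter's actual setup: it assumes a local OpenAI-compatible server (e.g. llama.cpp's llama-server) listening on port 8080, and the chunk size is an arbitrary placeholder for whatever fits your model's context window.

    ```python
    # Sketch: feed dmesg output to a local model in chunks.
    # Assumes a local OpenAI-compatible endpoint on localhost:8080
    # (e.g. llama.cpp's llama-server); URL and chunk size are placeholders.
    import json
    import subprocess
    import urllib.request

    PROMPT = ("Check all dmesg entries and provide recommendations "
              "for any errors, issues or other considerations.")

    def chunk_log(text: str, max_chars: int = 8000) -> list[str]:
        """Split a log on line boundaries so each chunk fits the context window."""
        chunks, current, size = [], [], 0
        for line in text.splitlines(keepends=True):
            if size + len(line) > max_chars and current:
                chunks.append("".join(current))
                current, size = [], 0
            current.append(line)
            size += len(line)
        if current:
            chunks.append("".join(current))
        return chunks

    def ask_local_model(chunk: str,
                        url: str = "http://localhost:8080/v1/chat/completions") -> str:
        """Send one log chunk plus the instruction to the local endpoint."""
        body = json.dumps({
            "messages": [{"role": "user", "content": f"{PROMPT}\n\n{chunk}"}],
        }).encode()
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]

    # Usage (with a local server running):
    # log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    # for chunk in chunk_log(log):
    #     print(ask_local_model(chunk))
    ```

    Chunking on line boundaries matters here: dmesg entries are line-oriented, so splitting mid-line would hand the model truncated entries.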

    Home lab networking in general was incredible to work through using local AI. Being a bit rusty on various things like firewalls and local DNS, it was refreshing to ask questions so dumb that, given a history as a SWE, one might not want them in the logs of a hosted AI provider... lol

    And more complex things too, like how packets flow in MikroTik RouterOS.

    Some general findings:

    * The latest generation of local AI models are _way_ better than even just 6 months ago. In particular dense models 7B+ are surprisingly useful for anything Linux, network configs, small to medium sized scripts.

    * Latest-gen open models from small AI labs generally beat last-gen models of the same size from larger labs.

    * Don't trust recommendations for any specific model: try it on real stuff and get messy with it. Feed it system/app logs and mad half-spelled ramblings late at night, along with clearer, well-written instructions the next day...

    * Larger open models at a decent quant (Q5 and up) are now good enough that the bottleneck for many use cases is no longer the model, but your workflow.

    * Simpler workflows beat complex prompts, skills, AGENT.md etc. I run most things with the pi-mono coding agent with no extensions.

    * Have the same model verify a finding/claim in a fresh context. This drastically reduces false positives and improves the correctness of findings. Going further, run a third verification pass with a different model.

    * If you grew up with the sounds of floppy disks, 56k modems etc, you might just like the coil whine of local GPUs... it's oddly comforting and different models sound different when working on the same tasks.
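    The fresh-context verification bullet above can be sketched as a tiny wrapper. This is an illustration, not the commenter's tooling: the model call is abstracted as a callable so any local backend can be plugged in, and the function and prompt wording are made up for the example.

    ```python
    # Sketch of the "verify a finding in a fresh context" pattern.
    # `primary` produces the finding; `verifier` (a different model, or the
    # same one with no shared history) judges the claim independently.
    from typing import Callable, Optional

    def verified_finding(question: str,
                         primary: Callable[[str], str],
                         verifier: Optional[Callable[[str], str]] = None
                         ) -> tuple[str, str]:
        """Ask once, then re-check the claim with no shared context."""
        finding = primary(question)
        check_prompt = (f"Question: {question}\n"
                        f"Claimed answer: {finding}\n"
                        "Independently verify this claim. Reply VALID or "
                        "INVALID with a one-line reason.")
        # Fresh context: the verifier sees only the claim, not the chat history.
        verdict = (verifier or primary)(check_prompt)
        return finding, verdict
    ```

    Passing a different model as `verifier` gives the third-pass variant the bullet mentions; reusing `primary` still helps because the claim is re-judged without the original conversation steering it.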

  • nathan_douglas 3 hours ago
      What's your GPU setup like?

      I'm doing a vaguely similar thing - I have a 10" rack minilab [1] and I've vibe-coded an MCP server that runs in the cluster to introspect, etc., but the main long-term goal is to set up some ML pipelines and maybe work toward formal verification via TLA+ or something. (_Not_ vibe-coding that... I'm thinking of moving into AI formal verification or compliance automation as a career move.)

      I have a separate amd64 server with an RTX 2070 Super - which is obviously old and low-powered. Useful for some general ML stuff, but I don't think it's sufficient to run any non-trivial modern LLM.

      I'm thinking about upgrading that GPU, but haven't committed to it or even really thought that hard about it.

      [1] https://clog.goldentooth.net/

  • agentura 4 hours ago
    I'm building a tool to catch AI agent regressions. Behavior can silently shift for a number of reasons (a prompt tweak, model swap, context change, routing), and the impact on output won't be obvious until a few weeks later when refunds for edge cases spike!

    Agentura is like pytest for AI agents. It's 100% free.
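    A generic illustration of the idea (not Agentura's actual API): pin agent behavior on a fixed set of inputs so that a prompt tweak or model swap that changes outputs fails a check immediately instead of surfacing weeks later. The function names here are invented for the sketch.

    ```python
    # Sketch: snapshot-style regression check for an agent.
    # Run the agent over fixed cases, hash the outputs into a fingerprint,
    # and compare against the fingerprint pinned at the last known-good run.
    import hashlib
    import json

    def behavior_fingerprint(outputs: list[str]) -> str:
        """Hash a set of agent outputs into a short comparable ID."""
        blob = json.dumps(sorted(outputs)).encode()
        return hashlib.sha256(blob).hexdigest()[:16]

    def check_regression(agent, cases: list[str],
                         expected_fingerprint: str) -> bool:
        """True if the agent still behaves as it did when the fingerprint was pinned."""
        return behavior_fingerprint([agent(c) for c in cases]) == expected_fingerprint
    ```

    Exact-match fingerprints only work for deterministic (temperature 0) agents; for sampled outputs, a real tool would compare semantic properties rather than raw strings.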

    Try here: https://agentura.run