1 point by yatesdr 4 hours ago | 2 comments
  • yatesdr 4 hours ago
    I run a few different models on my compute nodes and was constantly editing JSON config files to keep track of which model was on which node. I built this to aggregate them in one place behind a public nginx reverse proxy. My goal was to hook it up to claude-code or qwen so I could fall back to minimax or glm-5 when I run out of tokens. It works great for that, and also for sharing those models with other people.

    MIT licensed, reasonably secure, maybe useful.
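
    To make the "which model is where" idea concrete, here is a minimal sketch of the kind of lookup involved. This is not the project's actual code: the node hostnames, the `BACKENDS` table, and the `resolve_upstream` helper are all hypothetical, and it assumes OpenAI-style requests that carry a `model` field in the JSON body.

```python
import json

# Hypothetical routing table: model name -> base URL of the node serving it.
# Hostnames and ports are made up for illustration.
BACKENDS = {
    "minimax": "http://node-a.internal:8000",
    "glm-5": "http://node-b.internal:8000",
}


def resolve_upstream(body: bytes, path: str = "/v1/chat/completions") -> str:
    """Pick the upstream URL for a request based on its "model" field."""
    model = json.loads(body).get("model")
    if model not in BACKENDS:
        raise KeyError(f"no backend configured for model {model!r}")
    return BACKENDS[model] + path


if __name__ == "__main__":
    print(resolve_upstream(b'{"model": "glm-5", "messages": []}'))
```

    With a single table like this, moving a model to another node means changing one entry instead of editing the config of every client.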

  • TZubiri 4 hours ago
    So, like litellm?
    • yatesdr 4 hours ago
      Pretty similar to litellm[proxy], but it supports the Responses API and also does some re-writing. It's mostly targeted at coding TUIs, but I also use it a lot for text embeddings and streaming inference in applications.