OSS ChatGPT WebUI – 530 Models, MCP, Tools, Gemini RAG, Image/Audio Gen (llmspy.org)

131 points by mythz 12 days ago | 9 comments

mdrzn12 days ago
Posted 5 times in the last 7 days, today it finally got 29 points with 0 comments? Weird.
- mythz12 days ago
  Most announcements slip through without notice, it only picks up votes when it hits the main page.
  v1 also took a while to make it to HN, v3 is a complete rewrite focused on extensibility with a lot more new features.
  - digiown12 days ago
    The few people looking at /new on HN are ridiculously overpowered. A few upvotes from them in the first few hours will get you to the front page, and just 1-2 downvotes will make your post never see the light of day.
    freedomben12 days ago
    You can't downvote a post, so that's not a factor.
    Also it's not as powerful as you think. In the past I have spent a lot of time looking at /new, and upvoting stories that I think should be surfaced. The vast majority of them still never hit near the front page.
    It's a real shame, because some of the best and most relevant submissions don't seem to make it.
    tuhgdetzhh12 days ago
    If you are in a company like e.g. ClickHouse and share a new HN Submission of ClickHouse via the internal Slack to #general, then you easily get enough upvotes for the front page.
    oceansweep12 days ago
    You can absolutely downvote posts. You have to have a certain amount of karma before the option becomes available.
    digiown12 days ago
    No I was wrong. You can't downvote posts. Flags are used instead, apparently.
    freedomben11 days ago
    Yes, and I will fully agree with you that flags are overpowered. That system does need to be re-worked IMHO.
    nebezb12 days ago
    freedomben has 28k karma. I don’t think the downvote button is coming.
    lukan12 days ago
    What is stopping you from joining those "ridiculously overpowered people"?
turblety12 days ago
This looks great. I've been using OpenWebUI for a while now and the weird licence and inability to just pay for branding has frustrated me.
This looks like it's not only a better license, but also much better features.
- mythz12 days ago
  Yep, Open WebUI's switch to a non-OSS license to inhibit competitive forks [1], in their own words [2], ensures I'll never use them. Happy to develop an OSS alternative that does the opposite: the rewrite's focus on extensibility means community extensions can replace built-in components and extensions, so it can easily be rebranded and extended with custom UI + Server features.
  The goal is for the core main.py to be a single file without requiring additional dependencies; anything that needs them can be loaded as an extension (i.e. just a folder with .py server and UI hooks). There's also a script + docs so you can mix n' match the single main.py file and repackage it with whatever extensions you want included [3].
  [1] https://www.reddit.com/r/opensource/comments/1kfhkal/open_we...
  [2] https://docs.openwebui.com/license/
  [3] https://llmspy.org/docs/deployment/custom-build
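For readers wondering what "an extension is just a folder with .py hooks" could look like in practice, here is a minimal sketch using only the standard library. It is not taken from llms.py: the `load_extensions` function, the `install(app)` hook name, and the dict-based `app` are all hypothetical, chosen only to illustrate the folder-per-extension idea described above.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_extensions(ext_dir, app):
    """Import every <ext_dir>/<name>/__init__.py and call its install(app) hook.

    The install(app) hook name is an assumption made for this sketch.
    """
    loaded = []
    for folder in sorted(Path(ext_dir).iterdir()):
        init_file = folder / "__init__.py"
        if not init_file.is_file():
            continue  # not an extension folder, skip it
        spec = importlib.util.spec_from_file_location(f"ext_{folder.name}", init_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        install = getattr(module, "install", None)
        if callable(install):
            install(app)  # let the extension register routes, tools, etc.
            loaded.append(folder.name)
    return loaded


def demo():
    """Create a throwaway extension folder and load it."""
    with tempfile.TemporaryDirectory() as tmp:
        ext = Path(tmp) / "hello"
        ext.mkdir()
        (ext / "__init__.py").write_text(
            "def install(app):\n"
            "    app['routes']['/hello'] = lambda: 'hi from extension'\n"
        )
        app = {"routes": {}}  # stand-in for a real server object
        names = load_extensions(tmp, app)
        return names, app["routes"]["/hello"]()
```

Calling `demo()` loads the single `hello` extension and invokes the route it registered. A core that only knows how to scan folders like this can stay dependency-free, while each extension brings its own imports.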
storystarling12 days ago
How are you handling the orchestration for the Computer Use agent? Is that running on LangGraph or did you roll a custom state machine? I've found managing state consistency in long-running agent loops to be the hardest part to get right reliably.
- mythz12 days ago
  No custom state machine or agent, it's only a copy of Anthropic's 3 computer use tools: run_bash, edit, computer.
  https://github.com/ServiceStack/llms/tree/main/llms/extensio...
  It runs in the same process, there are no long agent loops, everything's encapsulated within a single message thread.
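A rough illustration of what "same process, single message thread" could mean: tool calls are looked up in a registry, executed as plain Python functions, and their results appended to the same list of messages. Only the tool name `run_bash` comes from the comment above; `TOOLS`, `handle_tool_calls`, and the message dict shape are assumptions for this sketch, not the project's actual code.

```python
import subprocess


def run_bash(command):
    """Run a shell command in-process and return combined stdout/stderr."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr


# Hypothetical registry; "edit" and "computer" would be registered the same way.
TOOLS = {"run_bash": run_bash}


def handle_tool_calls(thread, tool_calls):
    """Execute each requested tool and append its result to the message thread."""
    for call in tool_calls:
        func = TOOLS.get(call["name"])
        if func is None:
            content = f"unknown tool: {call['name']}"
        else:
            content = func(**call["arguments"])
        thread.append({"role": "tool", "name": call["name"], "content": content})
    return thread
```

There is no separate state machine here: whatever state exists lives in the `thread` list, which is the property the comment above describes.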
tiahura12 days ago
Do people really use claude code or any other agent with a paid api key? Why? Why wouldn't you just get Claude Max?
- mythz12 days ago
  I wouldn't use Claude API Key pricing, but I also wouldn't get a Claude Max sub unless it was the only AI tool I used.
  Antigravity / Google AI Pro is much better value, been using it as my primary IDE assistant for a couple months and have yet to hit a quota limit on my $16/mo sub (annual pricing) which also includes a tonne of other AI perks inc. Nano Banana, TTS, NotebookLM, storage, etc.
  No need to use Anthropic's premium models for tool calling when Gemini/MiniMax are better value models that still perform well.
  I still have a Claude Pro plan, but I use it much less than Antigravity and thanks to Anthropic axing their sub usage, I no longer use it outside of CC.
  - esperent12 days ago
    Counterpoint: on the $20 monthly account I would hit my 5 hour limits within an hour on antigravity. I end up spending half my time managing my context and keeping conversations short.
    adam_patarino11 days ago
    Yeah, I’ve hit this too. Once you do real agentic work or TDD, you’re optimizing context instead of code. That frustration is why we built Cortex: flat cost, no turn limits, runs locally, and git-aware context so you can just keep going. cortex.build
- tgtweak12 days ago
  Rate limits mostly - plus claude code is a relatively recent thing but sonnet api has been around for a while with 3rd party apps (like cline). In those scenarios, it was only api.
thedevilslawyer12 days ago
Can this be used in a multi user scenario?
- mythz12 days ago
  Yep, but it only supports GitHub OAuth. i.e. Content is either saved under no user (anonymous) or the authenticated GitHub User.
  https://llmspy.org/docs/deployment/github-oauth
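To make the "anonymous or authenticated GitHub user" split concrete, here is a small sketch of how content could be keyed per user. The directory layout, the `storage_path` name, and the loose login check are all assumptions for illustration; the linked docs describe the real behaviour.

```python
import re
from pathlib import PurePosixPath


def storage_path(base, github_login=None):
    """Return where a user's content would live (hypothetical layout).

    Anonymous content gets its own top-level folder so it cannot collide
    with a GitHub account that happens to be called "anonymous".
    """
    if not github_login:
        return PurePosixPath(base) / "anonymous"
    # Loose approximation of GitHub's username rules; also blocks path tricks.
    if not re.fullmatch(r"[A-Za-z0-9-]{1,39}", github_login):
        raise ValueError("unexpected GitHub login")
    return PurePosixPath(base) / "users" / github_login.lower()
```

For example, `storage_path("/data")` gives `/data/anonymous`, while `storage_path("/data", "Mythz")` gives `/data/users/mythz`.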
  - thedevilslawyer12 days ago
    Thanks. Looks like this is purely to gatekeep internal access, but isn't ready for any OIDC or a DB-backed session store.
    All the best for the project, will check in later on these..
    hobofan12 days ago
    If you are looking for an open source Chat WebUI with support for OIDC, maybe you are interested in the one we are building?[0]
    We are leveraging oauth2-proxy for the login here, so it should support all OIDC-compliant IDPs, and there are some guides by oauth2-proxy on how to configure for all the bigger providers. We do have customers using it with e.g. Azure, Keycloak, Google Directory.
    [0]: https://erato.chat
    thedevilslawyer11 days ago
    I see you have a dockerfile.combined - is this built and served via gh artifacts? I can try it out.
    Pros: Open source, and focus on lightweight. This is good.
    Cons: "customers" - Ugh, no offense, but smells of going down the same path as "open" webui, with the services expanding to fill enterprise use cases, and simplicity lost.
    LLMs.py seems to be focussing purely on simplicity + OK with rewriting for it. this + 3bsd is solid ethos. Will await their story on multi-user, hosted app. They have most of the things sorted anyway, including RAG, extensions, etc.
    hobofan11 days ago
    > I see you have a dockerfile.combined - is this built and served via gh artifacts? I can try it out.
    Our recommended way of deploying is via Helm[0] with latest version listed here[1].
    > with the services expanding to fill enterprise use cases, and simplicity lost.
    TBH, I don't think that simplicity was lost for OpenWebUI because of trying to fill enterprise needs. Their product has felt like a mess of too many cooks and no consistent product vision from the start. That's also where part of our origin story comes from: We started out as freelancers in the space and got inquiries to set up a Chat UI for different companies, but didn't deem OpenWebUI and the other typical tools fit for the job, and too much of a mess internally to fork.
    We are a small team (no VC funding), our customers' end-users are usually on the low end of AI literacy and there is about 1 DevOps/sysadmin at each company where our tool is deployed, so we have many factors pushing us towards simplicity. Our main avenue of monetization is also via SLAs, so a simple product for which we can more easily have test coverage and feel comfortable about the stability is also in our best interest here.
    [0]: https://erato.chat/docs/deployment/deployment_helm
    [1]: https://artifacthub.io/packages/helm/erato/erato
augusteo12 days ago
Curious about the MCP integration. Are people using this for production workloads or mostly experimentation?
- mythz12 days ago
  MCP support is available via the fast_mcp extension: https://llmspy.org/docs/mcp/fast_mcp
  I use llms.py as a personal assistant and MCP is required to access tools available via MCP.
  MCP is a great way to make features available to AI assistants, here's a couple I've created after enabling MCP support:
  - https://llmspy.org/docs/mcp/gemini_gen_mcp - Give AI Agents ability to generate Nano Banana Images or generate TTS audio
  - https://llmspy.org/docs/mcp/omarchy_mcp - Manage Omarchy Desktop Themes with natural language
  I will say there's a noticeable delay in using MCP vs tools, which is why I ended up porting Anthropic's node filesystem MCP to Python [1] to speed up common AI Assistant tasks. So they're not ideal for frequent access of small tasks, but are great for long running tasks like Image/Audio generation.
  [1] https://github.com/ServiceStack/llms/blob/main/llms/extensio...
  - storystarling12 days ago
    Does the MCP implementation make it easy to swap out the underlying image provider? I've found Gemini is still a bit hit or miss for actual print-on-demand products compared to Midjourney. Since MJ still doesn't have a real API I've been routing requests to Flux via Replicate for higher quality automated flows. Curious if I could plug that in here without too much friction.
    mythz12 days ago
    MCP allows AI Models that don't support Image generation to generate images/audio via tool calling.
    But you can just select the Image Generation model you prefer to use directly [1]. Currently supports Google, OpenAI, OpenRouter, Chutes, Z.ai and Nvidia.
    I tried Replicate's MCP, but it looks like it does everything but generate images, which I didn't understand; surely image generation would be its most sought after feature?
    [1] https://llmspy.org/docs/v3#image-generation-support
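As a sketch of how a text-only model can "generate" images through tool calling, and how the provider behind the tool could be swapped: the model only sees a tool description, and the host decides which backend actually runs. The schema below follows the common OpenAI-style function-tool shape; the `generate_image` tool name, the backend registry, and the `fake` backend are hypothetical and not part of llms.py or any MCP server mentioned here.

```python
# Hypothetical registry of image backends, keyed by provider name.
IMAGE_BACKENDS = {}


def register_backend(name):
    """Decorator that registers a prompt -> image-reference function."""
    def wrap(fn):
        IMAGE_BACKENDS[name] = fn
        return fn
    return wrap


@register_backend("fake")
def fake_backend(prompt):
    # Stand-in for a real provider call; returns a made-up reference string.
    return f"fake-image://{prompt.replace(' ', '-')}"


# Tool description the model receives; it never talks to the provider itself.
GENERATE_IMAGE_TOOL = {
    "type": "function",
    "function": {
        "name": "generate_image",
        "description": "Generate an image from a text prompt.",
        "parameters": {
            "type": "object",
            "properties": {"prompt": {"type": "string"}},
            "required": ["prompt"],
        },
    },
}


def call_generate_image(arguments, backend="fake"):
    """Route the model's tool call to whichever backend is configured."""
    return IMAGE_BACKENDS[backend](arguments["prompt"])
```

Under this design, plugging in another provider would mean registering one more function and changing the `backend` argument; the tool description shown to the model stays the same.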
cyanydeez12 days ago
Why is ChatGPT used in the title when it's clearly a much more flexible UI?
- mythz12 days ago
  Couldn't think of a better title, do you have any suggestions?
chicagobuss12 days ago
Why not just use llm by Simon Willison?