Want to swap out your client for a different one? Good luck - it probably expects a completely different schema. Trying a new model? Hope you're ready to deal with a different chat template. It felt like every layer had its own way of doing things, which made understanding the flow pretty frustrating for a newbie.
So I sketched out a diagram that maps the (rough) schema in use at each step of the process - from the initial request, through Ollama's OpenAI-compatible endpoints, to an MCP server - showing what transformations occur where.
Figured I'd share it as it may help someone else.
https://moog.sh/posts/openai_ollama_mcp_flow.html
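To make the "different schema at every layer" point concrete, here's roughly the same logical tool call at two of the hops. This is a hand-written sketch - the field names follow the OpenAI chat-completions and MCP specs, but treat the exact shapes as illustrative; the diagram has the full detail:

```python
# Illustrative only: the same logical "get_weather" call at two hops.

# 1. OpenAI-style chat completion response (what the client receives):
openai_tool_call = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            # note: arguments arrive as a JSON *string*, not an object
            "arguments": '{"city": "Berlin"}',
        },
    }],
}

# 2. The equivalent MCP tools/call request (JSON-RPC) the client then issues:
mcp_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"city": "Berlin"},  # here it's a real object
    },
}
```

The subtle mismatches are exactly this kind of thing - e.g. `arguments` being a JSON string on one side and a real object on the other.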
Somewhat ironically, Claude built the JS hooks for my SVG with about five minutes of prompting.
I'm also getting into the lower-level side - LLM fine-tuning, training on custom chat templates, etc. - which is where I needed the diagram most.
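For anyone poking at that same layer, a minimal sketch (the model name is just an example) of rendering messages through a model's chat template with Hugging Face transformers - swap the model and the rendered prompt changes entirely:

```python
from transformers import AutoTokenizer

messages = [
    {"role": "system", "content": "You are a weather bot."},
    {"role": "user", "content": "What's the weather in Berlin?"},
]

# Example model; any chat model with a template works here.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Renders the messages into the raw prompt string the model actually sees.
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# Point the same code at a different model and the role markers and
# special tokens come out completely different.
```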
It's funny to think that all of this machinery exists to give the user the impression that the AI, for example, _knows_ the weather. It doesn't: it's just fetching the data from a weather API and wrapping some text around it.
Now, imagine being given a requirement 5 years ago like: "When the user asks, we need to be able to show them the weather from this API, and wrap some text around it." Imagine something like your diagram coming back as the proposed solution :| Not at all a criticism of any of your stuff, but it blows my mind how tech develops.
You don't really have to parse the output yourself - Python already ships a parser in the form of the ast library [1].
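A minimal sketch, assuming the model emits a single Python-syntax call as text:

```python
import ast

# A model emits a Python-syntax tool call as plain text:
raw = 'get_weather(city="Berlin", units="metric")'

# ast.parse builds a syntax tree without executing anything.
call = ast.parse(raw, mode="eval").body
assert isinstance(call, ast.Call)

name = call.func.id  # "get_weather"
kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
print(name, kwargs)  # get_weather {'city': 'Berlin', 'units': 'metric'}
```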
But I get your drift. Depending on your workflow this could seem like more work.
What I don't want to happen is for some shitty webdev who writes an AI client in JavaScript to be forced to write a custom parser for some bespoke tool call language (call it "MLML", the Machine Learning Markup Language, to be superseded by YAMLML and then YAYAMLML, ...), or god forbid, somehow embed a WASM build of Python in their project to be able to `import ast`, instead of just parsing JSON and looking at the fields of the resulting object.
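For contrast, the JSON path - sketched here in Python for symmetry with the ast example above, though the whole point is that every language already has this parser built in:

```python
import json

# The same call as a JSON tool-call payload: one standard parser, no custom grammar.
raw = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
call = json.loads(raw)
print(call["name"], call["arguments"]["city"])  # get_weather Berlin
```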
I got a good snicker out of the YAYAMLMLOLOL :D
Seems like it's tools calling tools all the way down heh
So, back to the credentials: that means the credentials are managed “client-side” and the LLM never needs to see any of them. Think of it like this: say you set up an MCP URL (my-mcp.com); the LLM knows nothing of this URL, or of which MCP server you use. If you instead called my-mcp.com/<some-long-string>/, the LLM still doesn't know. Now replace the URL parameter with a header (Authorization: Bearer <token>): the LLM still doesn't know, and you've accessed an OAuth endpoint.
ref: https://modelcontextprotocol.io/specification/2025-03-26/bas...
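A minimal sketch of what “client-side” means here (hypothetical URL and token, using Python's requests; headers are simplified relative to the full spec):

```python
import requests

# The token lives only in the client's process/config; it never appears
# in the prompt, so the LLM can't see or leak it.
MCP_URL = "https://my-mcp.example.com/mcp"   # hypothetical
TOKEN = "s3cr3t-oauth-access-token"          # hypothetical

def call_tool(name: str, arguments: dict) -> dict:
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    resp = requests.post(
        MCP_URL,
        json=payload,
        headers={"Authorization": f"Bearer {TOKEN}"},  # attached client-side
    )
    resp.raise_for_status()
    return resp.json()

# The model only ever produces {"name": ..., "arguments": ...};
# the client adds the credential on the way out.
```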
One of the many issues with Spring is that the abstractions it provides are extremely leaky [1]. They leak frequently, and when they do, an engineer is faced with having to comprehend the pile of technology [2] that was supposed to be abstracted away in the first place.
This is the best I can do for rationalizing Spring.
Also Spring is a kind of franchise or brand, and the individual projects under the umbrella vary a lot in quality.
Spring has more “synergy” in a sense than a bunch of separate libraries would, but because of that it's also a big ball of mud that your code sits on top of, yet is never really in control of.
But all in all, it's a great set of frameworks in the enterprise Java/Kotlin space. I'd say it's that synergy that makes it worth the while.
I'm curious, though. Is the use of dependency injection part of the portfolio of criticisms of Spring?