2 pointsby yatesdr7 hours ago1 comment
  • yatesdr7 hours ago
    Happy to report v0.3 released for go-llm-proxy!

    Great for connecting your local LLM coding and vision models to Claude Code and Codex.

    General improvements

    > Vision pipeline - images described by your vision model, transparent to the client

    > Dual OCR pipeline - smart routing for PDFs and tool output (text extraction first, vision fallback for scanned docs). Dedicated OCR models like

    > PaddleOCR-VL are ~17x faster than general vision models on document pages

    > Brave & Tavily search integration - native behavior for Claude Code and Codex when configured on the proxy

    > Per-model processor routing - override vision, OCR, and search settings per model

    > Context window auto-detection from backends SSE keepalive improvements during pipeline processing Full MCP SSE endpoint for web search on OpenCode, Qwen Code, Claw, and other MCP-compatible agents Docker update for easier deployment (limited testing so far)

    Codex-specific

    > Full Responses API translation - Chat Completions under the hood, your local backend doesn't need to support /v1/responses

    > Reasoning token display - reasoning_summary_text.delta events so Codex shows thinking natively

    > Native search UI - emits web_search_call output items so Codex renders "Searched N results" in its interface

    > Structured tool output - Codex's view_image returns arrays/objects, not strings. The proxy handles all three formats

    > mcp_tool_call_output and mcp_list_tools input types handled (Codex sends these, other backends choke on them)

    > Config generator produces config.toml with provider, reasoning effort, context window, and optional Tavily MCP

    Claude Code-specific:

    > Full Messages API translation - Anthropic protocol to Chat Completions, so Claude Code works with vLLM/llama-server

    > Thinking blocks - backend reasoning tokens wrapped as thinking/signature_delta content blocks so Claude Code renders them

    > web_search_20250305 server tool intercepted and executed proxy-side

    > PDF type: "document" blocks extracted to text before forwarding

    > Streaming search with server_tool_use + web_search_tool_result blocks so Claude Code shows "Did N searches"

    > /anthropic/v1/messages explicit route for clients that use the Anthropic base URL convention

    > Config generator produces settings.json with Sonnet/Opus/Haiku tier selectors, thinking toggles, and start scripts