53 points by zc2610 4 hours ago | 11 comments
  • neomantra an hour ago
    > MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window.

    I maintain an OSS SDK for Databento market data. A year ago, I naively wrapped the API and certainly felt this pain. Having an API call drop a firehose of structured data into the context window was not very helpful. The tool there was get_range and the data was lost to the context.

    Recently I updated the MCP server [1] to download the Databento market data into Parquet files onto the local filesystem and track those with DuckDB. So the MCP tool calls are fetch_range to fill the cache along with list_cache and query_cache to run SQL queries on it.

    I haven't promoted it at all, but it would probably pair well with a platform like this. I'd be interested in how people might use this and I'm trying to understand how this approach might generally work with LLMs and DuckLake.

    [1] https://github.com/NimbleMarkets/dbn-go/blob/main/cmd/dbn-go...

  • zc2610 4 hours ago
    Hi HN. We built LangAlpha because we wanted something like Claude Code but for investment research.

    It's a full stack open-source agent harness (Apache 2.0). Persistent sandboxed workspaces, code execution against financial data, and a complete UI with TradingView charts, live market data, and agent management. Works with any LLM provider, React 19 + FastAPI + Postgres + Redis.

    • zc2610 3 hours ago
      Some technical context on what we ran into building this.

      MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window. And data vendors pack dozens of tools into a single MCP server; the schemas alone can eat 50k+ tokens before the agent does anything useful. So we auto-generate typed Python modules from the MCP schemas at workspace init and upload them into the sandbox. The agent just imports them like a normal library, and only a one-line summary per server stays in the prompt. We have around 80 tools across our servers, and the prompt cost is the same whether a server has 3 tools or 30. This part isn't finance-specific; it works with any MCP server.
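A minimal sketch of that codegen step, assuming a simplified MCP tool schema and a hypothetical call_tool transport shim (neither is LangAlpha's actual code): each tool's JSON schema becomes a typed wrapper written into the sandbox, so the agent imports it instead of carrying the schema in the prompt.

```python
def _pytype(spec):
    """Map a JSON-schema type to a Python annotation."""
    return {"string": "str", "number": "float", "integer": "int",
            "boolean": "bool"}.get(spec.get("type"), "object")

def mcp_schema_to_module(server, tools):
    """Emit Python source: one typed wrapper function per MCP tool schema."""
    lines = [f'"""Auto-generated client for MCP server {server!r}."""',
             "from mcp_runtime import call_tool  # hypothetical transport shim",
             ""]
    for tool in tools:
        props = tool.get("inputSchema", {}).get("properties", {})
        args = ", ".join(f"{n}: {_pytype(s)}" for n, s in props.items())
        lines += [f"def {tool['name']}({args}):",
                  f'    """{tool.get("description", "")}"""',
                  f"    return call_tool({server!r}, {tool['name']!r}, locals())",
                  ""]
    return "\n".join(lines)

src = mcp_schema_to_module("prices", [{
    "name": "get_daily",
    "description": "Daily OHLCV bars.",
    "inputSchema": {"properties": {"symbol": {"type": "string"},
                                   "days": {"type": "integer"}}},
}])
print(src)  # written into the sandbox as prices.py; the agent does `import prices`
```

Only the one-line module summary needs to appear in the system prompt; the full signatures live in the sandbox filesystem.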

      The other big thing was making research actually persist across sessions. Most agents treat a single deliverable (a PDF, a spreadsheet) as the end goal. In investing that's day one. You update the model when earnings drop, re-run comps when a competitor reports, keep layering new analysis on old. But try doing that across agent sessions: files don't carry over, and you re-paste context every time. So we built everything around workspaces. Each one maps to a persistent sandbox, one per research goal. The agent maintains its own memory file with findings and a file index that gets re-read before every LLM call. Come back a week later, start a new thread, and it picks up where it left off.
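A sketch of that re-read step (file names and layout are hypothetical, not LangAlpha's actual structure): before every LLM call the harness re-reads the workspace memory file and rebuilds a file index, so a new thread in the same workspace starts from the prior session's state.

```python
from pathlib import Path

def workspace_context(root):
    """Rebuild the context block injected before each LLM call."""
    ws = Path(root)
    memory_file = ws / "MEMORY.md"  # hypothetical name for the agent's memory
    memory = memory_file.read_text() if memory_file.exists() else ""
    index = "\n".join(
        f"- {p.relative_to(ws)} ({p.stat().st_size} bytes)"
        for p in sorted(ws.rglob("*")) if p.is_file()
    )
    return f"## Memory\n{memory}\n\n## Files\n{index}"

# Example: a workspace left over from a previous session
ws = Path("demo_ws"); ws.mkdir(exist_ok=True)
(ws / "MEMORY.md").write_text("NVDA model updated after Q3 earnings.")
(ws / "comps.csv").write_text("ticker,ev_ebitda\nNVDA,35\n")
print(workspace_context("demo_ws"))
```

Because the context is rebuilt from disk on every call rather than carried in conversation history, a new thread a week later sees the same memory and file index.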

      We also wanted the agent to have real domain context the way Claude Code has codebase context. Portfolio, watchlist, risk tolerance, financial data sources, all injected into every call. Existing AI investing platforms have some of that but nothing close to what a proper agent harness can do. We wanted both and couldn't find it, so we built it and open-sourced the whole thing.

      • loumaciel an hour ago
        You can make MCP tools work for any type of data by using a proxy like https://github.com/lourencomaciel/sift-gateway/.

        It saves the payloads into SQLite, maps them, and exposes tools that let the model run Python against them. Works very well.

      • esafak 2 hours ago
        You shouldn't dump data in the context, only the result of the query.
        • zc2610 2 hours ago
          Yes, that's the idea and exactly what we did.
  • kolinko 2 hours ago
    Nice!

    What I missed from the writeup were some specific cases, and how you tested that all this orchestration delivers worthwhile data (actionable and full/correct).

    E.g. you have a screenshot of the AI supply chain - more of these would be useful, and also some info about how you tested that this supply chain agrees with reality.

    Unless the goal of the project was just to play with agent architecture - then congrats :)

  • D_R_Farrell an hour ago
    I've been wondering for a long time about when this more Bayesian approach would become available alongside an AI. Really excited to play around with this!

    Is this kind of like a Karpathy 2nd brain for investing then?

  • erdaniels 4 hours ago
    Then people would lose a lot of money
    • locusofself 2 hours ago
      Agreed. Unless this really helps people somehow make better trading decisions than existing tools, the vast majority of them are probably still better off index investing.
    • xydac an hour ago
      It's crazy how many similar threads exist today.
  • ForOldHack an hour ago
    Note: Never make angry the gods of code. Never. If you do, they will leave angry on Friday night, and come back with some *amazing* thing like this on Monday:

    Obligatory: Brilliant Work. Brilliant.

    "We wanted both and couldn't find it, so we built it and open-sourced the whole thing."

    \m/ \m/ /m\ /m\

  • zz07 2 hours ago
    The PTC architecture and the smart skills-loading strategy save a lot of tokens. More cache strategies may be implemented in the future to improve performance. Additionally, the static user-preference setting could be upgraded to be adaptive.
    • tornikeo 2 hours ago
      Forgive my senses, but this writing feels like a low-effort Claude response. What's the point of adding responses like this to a Show HN post? I don't think you're fooling anyone.
      • cbg0 2 hours ago
        They're trying to build up new accounts with karma to astroturf products/services.