    I built Clonar, a Node.js RAG pipeline inspired by Perplexity that performs explicit multi-hop reasoning over query intent, retrieval planning, and answer synthesis.

    Why multi-hop matters: most RAG systems do "retrieve → synthesize" in one shot. Clonar chains 8 reasoning stages, where each step conditions on the outputs of the previous ones: query rewrite → clarification gate → filter extraction → grounding decision → retrieval planning → execution → quality-aware synthesis → optional deep-mode critique (which triggers a second full retrieval pass if the answer is insufficient).
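
    A minimal sketch of that chaining (illustrative names only, not the actual orchestrator.ts API): each stage is an async function that reads and extends a shared context, so every step can condition on everything produced before it.

        // Illustrative types; the real pipeline context is richer.
        interface PipelineContext {
          query: string;
          rewrittenQuery?: string;
          needsClarification?: boolean; // set by the clarification gate
          filters?: Record<string, string>;
          retrievalPlan?: string[];
          documents?: string[];
          answer?: string;
        }

        type Stage = (ctx: PipelineContext) => Promise<PipelineContext>;

        // Each stage sees the accumulated outputs of all prior stages.
        async function runPipeline(stages: Stage[], query: string): Promise<PipelineContext> {
          let ctx: PipelineContext = { query };
          for (const stage of stages) {
            ctx = await stage(ctx);
            // The clarification gate (step 2) can short-circuit the run
            // and bounce a question back to the user instead.
            if (ctx.needsClarification) break;
          }
          return ctx;
        }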

    Technical approach:

    TypeScript/Node backend with REST and SSE streaming endpoints (minimal SSE handler sketched after this list)

    Orchestrator pattern: orchestrator.ts sequences the LLM calls, separating planning from synthesis

    Pluggable retrievers via pipeline-deps.ts (currently web search via the Perplexity API, extensible to vector DBs or SQL; interface sketch after this list)

    Session + user memory for cross-turn reasoning

    Quality-driven formatting: derives a confidence score from retrieval quality and adjusts the answer's UI hints accordingly
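
    On the SSE endpoint: a minimal handler sketch, assuming Express. The route path matches the post; the query parameter and payload shape are my guesses.

        import express from "express";

        const app = express();

        app.get("/api/query/stream", (_req, res) => {
          // Standard SSE headers: keep the connection open and unbuffered.
          res.setHeader("Content-Type", "text/event-stream");
          res.setHeader("Cache-Control", "no-cache");
          res.setHeader("Connection", "keep-alive");

          // In the real pipeline these chunks come from the synthesis stage.
          for (const token of ["partial ", "answer ", "tokens"]) {
            res.write(`data: ${JSON.stringify({ token })}\n\n`); // one SSE frame
          }
          res.write("data: [DONE]\n\n");
          res.end();
        });

        app.listen(3000);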
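
    And for the pluggable retrievers, a sketch of what the pipeline-deps.ts contract could look like (illustrative; the actual types may differ). The point is that web, vector-DB, and SQL backends all satisfy one interface, so the orchestrator stays retrieval-agnostic.

        // Illustrative contract; the real pipeline-deps.ts may differ.
        interface RetrievedDoc {
          content: string;
          source: string; // URL, table name, collection id, ...
          score?: number; // feeds the quality-aware synthesis step
        }

        interface Retriever {
          retrieve(
            query: string,
            filters?: Record<string, string>,
          ): Promise<RetrievedDoc[]>;
        }

        // A vector-DB backend only has to implement the same method.
        const vectorRetriever: Retriever = {
          async retrieve(query, filters) {
            // embed(query), run a similarity search, map hits to RetrievedDoc
            return [{ content: "...", source: "vectors", score: 0.87 }];
          },
        };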

    What's interesting:

    The model reasons about whether to clarify ambiguous queries before retrieval (step 2)

    The retrieval plan is generated dynamically from the extracted filters and the detected vertical (step 5), not hardcoded

    Deep mode (step 8): critique → expand prompt → second full 7-stage run when the first answer is weak (loop sketched below)

    No frontend needed: call /api/query or /api/query/stream from curl or Postman (client sketch below)
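
    The deep-mode loop, sketched with the runPipeline and Stage types from the first snippet; critiqueAnswer and expandQuery are hypothetical helpers standing in for the critique and prompt-expansion LLM calls.

        // Hypothetical helpers: one LLM call to critique the answer,
        // one to fold the critique's gaps back into the query.
        declare function critiqueAnswer(
          ctx: PipelineContext,
        ): Promise<{ sufficient: boolean; gaps: string[] }>;
        declare function expandQuery(query: string, gaps: string[]): string;

        async function answerDeep(stages: Stage[], query: string) {
          let ctx = await runPipeline(stages, query);
          const critique = await critiqueAnswer(ctx);
          if (!critique.sufficient) {
            // Second full 7-stage pass with the expanded prompt.
            ctx = await runPipeline(stages, expandQuery(query, critique.gaps));
          }
          return ctx;
        }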
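
    From the shell it's just curl with buffering off (curl -N against /api/query/stream). A Node 18 equivalent, with the URL and query parameter assumed to match the handler sketch above:

        // Node 18+: fetch is global and the response body is async-iterable.
        // Run as an ES module (e.g. a .mts file) for top-level await.
        const res = await fetch("http://localhost:3000/api/query/stream?q=best+laptops");

        const decoder = new TextDecoder();
        for await (const chunk of res.body!) {
          process.stdout.write(decoder.decode(chunk, { stream: true }));
        }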

    Feedback I'd love:

    Is the 8-stage flow overkill, or are there steps I'm missing?

    API design for plugging in custom retrievers (vector, SQL, etc.)

    Production observability: I have basic tracing and metrics; what else would you prioritize?

    Repo: https://github.com/clonar714-jpg/clonar

    Stack: Node 18+, TypeScript, OpenAI + Perplexity APIs, optional Redis/Postgres

    Happy to answer questions about the architecture or design tradeoffs.