4 pointsby botwork8 hours ago1 comment
  • botwork8 hours ago
    OP here. I spent the last few months building a local semantic search engine for my personal archive (1997-2025).

    The Problem: I have 28 years of digital history (journals, emails, notes) that I wanted to query to find patterns, but I didn't want to upload sensitive data to a cloud vector store.

    The Stack:

    Ingestion: Python scripts to parse chaotic formats (mbox, docx, json).

    Embeddings: Local FAISS index.

    Inference: Running Qwen 2.5 (32b) via Ollama on MBP. App on my NAS via Tailscale.

    UI: Simple React frontend.

    Full Architecture Notes: I wrote up the detailed breakdown (chunking strategy, PII redaction pipeline) here: https://botwork.com/trace