  • jimmyjonny, 8 hours ago
    This is the most critical post you will make. Hacker News (HN) can crash your server with traffic if you get to the front page, so be ready.

    The Golden Rule of HN: Do not "market." Explain how you built it. They care about the architecture, the code, and the hardware—not the "product benefits."

    Here is the exact template to use.

    The Submission Fields

    Title:

        Show HN: I built a public web interface that tunnels to my home RTX 3090 for inference
    
    Url:

        [Your Render Link]
    
    Text (The "First Comment"): (You must post this immediately after submitting the link. This is where you win them over.)

    Hello HN,

    I wanted to run a privacy-focused LLM service without paying for H100s or leaking data to OpenAI, so I built a hybrid architecture.

    The Architecture:

        Frontend/Gateway: Hosted on Render. This handles the public traffic, auth, and UI.
    
        Inference: Hosted under my desk on my personal rig (RTX 3090).
    
        The Tunnel: The Render server acts as a dumb WebSocket relay. It accepts user prompts, encrypts them, and tunnels them to my local Python worker (a sketch of both ends follows this list).
    
        Processing: My local worker uses Ollama for the LLM (Qwen 2.5 14B/32B) and Playwright for live web scraping/RAG.
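    To make the tunnel concrete, here is a stripped-down sketch of both ends. This is illustrative, not the production code: it assumes a single worker, a single client at a time, no auth, and placeholder names (the /chat and /worker endpoints and the <END> marker are invented for this sketch).

        # relay.py (runs on Render) -- sketch: one worker, no auth.
        import asyncio
        from typing import Optional
        from fastapi import FastAPI, WebSocket

        app = FastAPI()
        worker: Optional[WebSocket] = None  # single outbound connection from home

        @app.websocket("/worker")
        async def worker_conn(ws: WebSocket):
            # The home rig dials OUT to Render, so no ports open at home.
            global worker
            await ws.accept()
            worker = ws
            try:
                while True:
                    await asyncio.sleep(3600)  # just hold the socket open
            finally:
                worker = None

        @app.websocket("/chat")
        async def chat(ws: WebSocket):
            # Public endpoint: forward each prompt, stream tokens back.
            # Sketch assumes one client at a time.
            await ws.accept()
            while True:
                prompt = await ws.receive_text()
                if worker is None:
                    await ws.send_text("[worker offline]")
                    continue
                await worker.send_text(prompt)
                while True:
                    token = await worker.receive_text()
                    if token == "<END>":
                        break
                    await ws.send_text(token)

    The home side dials out and talks to Ollama's local HTTP API. Again a sketch; the relay URL and model tag below are placeholders:

        # worker.py (runs on the home rig) -- sketch.
        import asyncio, json, urllib.request
        import websockets  # pip install websockets

        RELAY_URL = "wss://your-app.onrender.com/worker"  # placeholder
        OLLAMA = "http://localhost:11434/api/generate"

        def generate(prompt):
            # Ollama streams newline-delimited JSON objects.
            req = urllib.request.Request(
                OLLAMA,
                data=json.dumps({"model": "qwen2.5:14b", "prompt": prompt}).encode(),
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:  # blocking; fine for one user
                for line in resp:
                    chunk = json.loads(line)
                    yield chunk.get("response", "")
                    if chunk.get("done"):
                        break

        async def main():
            async with websockets.connect(RELAY_URL) as ws:
                async for prompt in ws:
                    for token in generate(prompt):
                        await ws.send(token)
                    await ws.send("<END>")

        asyncio.run(main())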
    
    The Stack:

        Backend: FastAPI + Python
    
        Protocol: Secure WebSockets (WSS)
    
        Inference: Ollama (Local)
    
        RAG: PyMuPDF (for PDF analysis) + Playwright (for "Deep Research" browsing)
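    The "Deep Research" path is mostly these two libraries glued together. Roughly (simplified sketch: no chunking, ranking, or error handling):

        # rag.py (home rig) -- rough shape of the retrieval side.
        import fitz  # PyMuPDF: pip install pymupdf
        from playwright.sync_api import sync_playwright

        def pdf_text(path):
            # Pull the raw text out of every page of a PDF.
            with fitz.open(path) as doc:
                return "\n".join(page.get_text() for page in doc)

        def fetch_page(url):
            # Render the page in a real browser so JS-heavy sites work too.
            with sync_playwright() as p:
                browser = p.chromium.launch()
                page = browser.new_page()
                page.goto(url)
                text = page.inner_text("body")
                browser.close()
                return text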
    
    Why I did this: I wanted "Sovereign AI." The cloud server never stores the unencrypted chat logs; it just passes packets. The actual intelligence lives on my hardware. It’s essentially a free way to expose a local LLM to the web for personal use (and for friends).
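    For the skeptics: one way to back up the "just passes packets" claim is a pre-shared symmetric key between relay and worker. A sketch only (Fernet is my choice here for illustration, and it assumes the key sits in an env var on both boxes):

        # crypto.py -- shared by relay and worker. Sketch only.
        import os
        from cryptography.fernet import Fernet  # pip install cryptography

        f = Fernet(os.environ["TUNNEL_KEY"])  # generate once: Fernet.generate_key()

        def seal(text):      # relay: encrypt before forwarding
            return f.encrypt(text.encode())

        def unseal(blob):    # worker: decrypt before inference
            return f.decrypt(blob).decode()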