4 points by ludovicianul 4 hours ago | 4 comments
  • austin-cheney 3 hours ago
    I am going to write an original in-memory database in JavaScript. I hate SQL and believe I can write something that executes faster than existing solutions while also feeling natural to JavaScript: storage and search via objects and arrays.
    • Horos 3 hours ago
      Interesting project. A few questions that came to mind:

      How do you handle GC pressure at scale? V8's hidden classes make homogeneous object arrays fast, but the per-object overhead adds up: 100K entries is already 6-8 MB of metadata alone, and major GC pauses become unpredictable.

      What's the persistence story? The moment you serialize to IndexedDB or OPFS, the "native structures" advantage disappears. Have you looked at columnar formats to keep it fast?

      How do you handle compound queries without a planner? Something like "age > 30 AND city = 'Paris' ORDER BY name" needs an index selection strategy, otherwise you're back to full scans.

      The part I find most compelling is reactive queries: define a filter, then as objects land in the store (from DOM extraction, a WebSocket, whatever), results update incrementally via Proxy interception. No re-scan. That's not really a database, it's a live dataflow layer.

      Concrete example: a browser extension that extracts product data from whatever page you're on. Each page dumps heterogeneous objects into the store. A reactive query like "items where price < 50 and source contains 'amazon'" updates in real time as you browse. No server, no SQL, just JS objects flowing through live filters. That would be genuinely useful and hard to do well with existing tools.
  • Horos 3 hours ago
    Prompt injection detection library in Go, zero regex.

    Most injection evasion works by making text look different to a scanner than to the LLM. Homoglyphs, leet speak, zero-width characters, base64 smuggling, ROT13, Unicode confusables — the LLM reads through all of it, but pattern matchers don't.

    The project is two curated layers, not code:

    Layer 1 — what attackers say. ~35 canonical intent phrases across 8 categories (override, extraction, jailbreak, delimiter, semantic worm, agent proxy, rendering...), multilingual, normalized.

    Layer 2 — how they hide it. Curated tables of Unicode confusables, leet speak mappings, LLM-specific delimiters (<|system|>, [INST], <<SYS>>...), dangerous markup patterns. Each table is a maintained dataset that feeds a normalisation stage.
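
To make the Layer 2 idea concrete, here is a minimal sketch of a single normalisation stage. It is not the library's code: the tables are tiny hand-picked samples, and the names `confusables`, `leet`, and `Normalize` are hypothetical. It drops a few zero-width characters, lowercases, then folds confusables and leet substitutions to plain ASCII.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// confusables is an illustrative sample only; a maintained dataset
// would cover far more code points.
var confusables = map[rune]rune{
	'\u0430': 'a', // Cyrillic a
	'\u0435': 'e', // Cyrillic ie
	'\u043e': 'o', // Cyrillic o
	'\u0456': 'i', // Cyrillic Byelorussian-Ukrainian i
}

// leet is a deliberately lossy mapping: it also rewrites digits that
// are genuine numbers, which is acceptable for a scanner-side copy.
var leet = map[rune]rune{
	'0': 'o', '1': 'i', '3': 'e', '4': 'a',
	'5': 's', '7': 't', '@': 'a', '$': 's',
}

// isZeroWidth reports whether r is one of a few common invisible runes.
func isZeroWidth(r rune) bool {
	switch r {
	case '\u200b', '\u200c', '\u200d', '\u2060', '\ufeff':
		return true
	}
	return false
}

// Normalize reduces s to a canonical lowercase form for matching.
func Normalize(s string) string {
	var b strings.Builder
	for _, r := range s {
		if isZeroWidth(r) {
			continue
		}
		r = unicode.ToLower(r)
		if c, ok := confusables[r]; ok {
			r = c
		}
		if c, ok := leet[r]; ok {
			r = c
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	// Mixed leet, a zero-width space, and a Cyrillic o.
	fmt.Println(Normalize("1GN\u200b\u043eR3 pr3vious")) // ignore previous
}
```

Because the tables are plain data, adding a new evasion trick in this sketch means adding a map entry rather than changing the loop.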

    The engine itself is deliberately simple — a 10-stage normalisation pipeline that reduces evasion to canonical form, then strings.Contains + Levenshtein. Think ClamAV: the scan loop is trivial, the definitions are the product.

    Long term I'd like both layers to become community-maintained — one curated corpus of injection intents and one of evasion techniques, consumable by any scanner regardless of language or engine.

    Everything ships as go:embed JSON, hot-reloadable without rebuild. No regex (no ReDoS), no API calls, no ML in the loop. Single dependency (golang.org/x/text). Scans both inputs and LLM outputs.

    result := injection.Scan(text, injection.DefaultIntents())
    if result.Risk == "high" { ... }

    https://github.com/hazyhaar/pkg/tree/main/injection

  • steppacodes 4 hours ago
    [dead]