*The real story:* I built an entire building automation ecosystem from scratch. NexusBMS is the central platform (won an InfluxDB hackathon with it — runs InfluxDB 3.0 OSS alongside Aegis-DB). 16+ facilities including Taylor University, Element Labs, Byrna Ammunition, St. Jude Catholic School, Heritage Point Retirement Facilities (two cities), and more. Over 120 pieces of equipment — air handlers, boilers, cooling towers, pumps, DOAS units, natatorium pool units, exhaust fans, greenhouses.
The edge controllers are 50+ Raspberry Pi 4/5s running my custom NexusEdge software — Rust hardware daemons for I2C, BACnet, and Modbus communications, with direct HVAC equipment I/O: analog outputs, 24V triacs, 0-10V inputs, 10K/1K thermistor inputs, and dry contact inputs. Custom control logic per equipment type. Pi 5s have Hailo NPU chips running larger ML models for predictive maintenance; Pi 4s run smaller AxonML Rust inference models (my own ML framework — also open source).
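To make the sensing side concrete, here's a minimal sketch of the kind of conversion an edge daemon does for a 10K thermistor input — resistance to temperature via the simplified B-parameter equation. The constants (B = 3950 K, 10 kΩ at 25 °C) and function names are illustrative assumptions, not NexusEdge's actual code.

```rust
// Illustrative sketch: 10K NTC thermistor reading -> temperature.
// Constants (B = 3950 K, R0 = 10 kΩ at T0 = 25 °C) are typical
// datasheet values, not taken from NexusEdge.

/// Convert measured thermistor resistance (ohms) to degrees Celsius
/// using the simplified B-parameter (beta) equation.
fn thermistor_celsius(r_ohms: f64) -> f64 {
    const B: f64 = 3950.0;    // beta coefficient, kelvin
    const R0: f64 = 10_000.0; // nominal resistance at T0
    const T0: f64 = 298.15;   // 25 °C in kelvin
    let inv_t = 1.0 / T0 + (r_ohms / R0).ln() / B;
    1.0 / inv_t - 273.15
}

/// Recover resistance from a voltage-divider reading, assuming the
/// thermistor is the lower leg under a fixed pull-up resistor.
fn divider_resistance(v_out: f64, v_supply: f64, r_pullup: f64) -> f64 {
    r_pullup * v_out / (v_supply - v_out)
}

fn main() {
    // At exactly 10 kΩ the beta equation returns the nominal 25 °C.
    println!("{:.2} °C", thermistor_celsius(10_000.0));
    // Half-supply on the divider means R equals the 10K pull-up.
    let r = divider_resistance(1.65, 3.3, 10_000.0);
    println!("R = {:.0} Ω -> {:.2} °C", r, thermistor_celsius(r));
}
```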
Each Pi runs Aegis-DB locally for sensor data collection, time series storage, equipment state, and real-time alert streaming. Those edge instances replicate to the central Aegis-DB server using CRDTs for conflict-free synchronization. OTA rolling updates push new versions across the fleet without downtime.
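To show why CRDTs make that edge-to-central sync conflict-free, here's a minimal sketch of a grow-only counter (GCounter, the simplest of the eight CRDT types Aegis-DB ships) — illustrative only, not Aegis-DB's actual implementation. Each node increments only its own slot, so merging is a per-slot max and replicas converge regardless of sync order.

```rust
use std::collections::HashMap;

// Illustrative GCounter sketch (not Aegis-DB's actual code): each node
// only increments its own slot, so merge = per-slot max, and the result
// is identical no matter which order replicas exchange state in.
#[derive(Clone, Default)]
struct GCounter {
    counts: HashMap<String, u64>, // node id -> that node's increment count
}

impl GCounter {
    fn incr(&mut self, node: &str) {
        *self.counts.entry(node.to_string()).or_insert(0) += 1;
    }
    fn value(&self) -> u64 {
        self.counts.values().sum()
    }
    /// Merge another replica's state: take the max per node slot.
    fn merge(&mut self, other: &GCounter) {
        for (node, &n) in &other.counts {
            let e = self.counts.entry(node.clone()).or_insert(0);
            *e = (*e).max(n);
        }
    }
}

fn main() {
    // An edge Pi and the central server count events independently
    // while the network is down...
    let mut edge = GCounter::default();
    let mut central = GCounter::default();
    edge.incr("pi-01");
    edge.incr("pi-01");
    central.incr("central");
    // ...then sync in both directions; both converge with no conflict.
    central.merge(&edge);
    edge.merge(&central);
    assert_eq!(edge.value(), central.value());
    println!("converged value = {}", edge.value()); // 3
}
```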
The edge deployment is what drove the design, but Aegis-DB isn't just for Pis. It's the primary database for my PWAs, mobile apps, and server-side services too. The central NexusBMS server runs it. My laptop runs it for development. It's a general-purpose multi-paradigm database that happens to also scale down to a Raspberry Pi — which is a harder constraint to satisfy than scaling up.
*What it actually is:*
- Full SQL engine (sqlparser crate) with cost-based planner, volcano-model executor, B-tree/hash indexes, index-accelerated SELECT/UPDATE/DELETE, plan cache (1024-entry LRU), MVCC with snapshot isolation, WAL, VACUUM/compaction
- Direct execution API — closure-based indexed updates that bypass SQL parsing entirely (this is how the fund transfer benchmark hits 758K TPS)
- KV store on DashMap (12.3M reads/sec, 203K/sec over HTTP, optional TTL per key)
- Document store with MongoDB-style query operators ($eq, $gt, $in, $regex, $and, $or, etc.), collection-level hash/B-tree indexes, sort/skip/limit/projection
- Time series with Gorilla compression (delta-of-delta timestamps + XOR floats), retention policies, automatic downsampling, atomic persistence with crash recovery
- Graph engine with adjacency lists for O(degree) traversal, label and relationship indexes, property bags on nodes and edges
- Pub/sub streaming with persistent subscriptions, consumer groups, CDC with before/after images
- Raft consensus + 8 CRDT types (GCounter, PNCounter, GSet, TwoPSet, ORSet, LWWRegister, MVRegister, LWWMap) + vector clocks + hybrid clocks + 2-phase commit + consistent hashing (HashRing, JumpHash, Rendezvous)
- OTA rolling updates — followers first, leader last, SHA-256 binary verification, automatic rollback on health check failure
- Multi-database isolation — each app gets its own namespace, auto-provisioned on first query, separate persistence
- Query safety limits (max rows, query timeout) enforced at the executor level
- Bulk import (CSV/JSON) for SQL tables, document collections, and KV pairs
- Encrypted backups (AES-256-GCM) with restore and backup management
- Full web dashboard (Leptos/WASM) — cluster monitoring, data browsers for every paradigm, query builder, user/role management, activity feed, alerts
- Python SDK (async, aiohttp), JavaScript/TypeScript SDK (fetch-based), Grafana data source plugin
- CLI with interactive SQL shell, node registry with auto-discovery, multi-format output (table/JSON/CSV)
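As a small illustration of one item in that list — consistent hashing — here's a sketch of rendezvous (highest-random-weight) hashing, one of the three schemes named above. This is not Aegis-DB's implementation, and a real one would use a stronger hash than `DefaultHasher`; it just shows the core property: each key goes to the node with the highest hash(key, node) score, so removing a node only reassigns the keys that lived on it.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative rendezvous hashing sketch (not Aegis-DB's code).
// DefaultHasher is fine for a demo; production code would use a
// stronger, explicitly seeded hash.
fn score(key: &str, node: &str) -> u64 {
    let mut h = DefaultHasher::new();
    (key, node).hash(&mut h);
    h.finish()
}

/// Pick the owning node for a key: highest hash(key, node) wins.
fn pick_node<'a>(key: &str, nodes: &[&'a str]) -> Option<&'a str> {
    nodes.iter().copied().max_by_key(|n| score(key, n))
}

fn main() {
    let nodes = ["pi-01", "pi-02", "central"];
    let key = "ahu-3/supply-temp";
    let owner = pick_node(key, &nodes).unwrap();
    println!("{key} -> {owner}");
    // Removing any node *other* than the owner must not move the key:
    // only keys on the departed node get redistributed.
    for gone in nodes {
        if gone != owner {
            let remaining: Vec<&str> =
                nodes.iter().copied().filter(|n| *n != gone).collect();
            assert_eq!(pick_node(key, &remaining), Some(owner));
        }
    }
}
```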
*What makes it different from SurrealDB / other multi-model databases:*
- *Compliance engine.* Built-in GDPR, HIPAA, CCPA, SOC 2, and FERPA support with actual REST endpoints — not documentation about how you could do compliance. GDPR right to erasure with cryptographic deletion certificates. HIPAA PHI column-level classification (6 levels). Consent lifecycle management (12 purpose types, full audit trail). Breach detection with anomaly thresholds and an incident response workflow. Over 25 compliance endpoints under `/api/v1/compliance/`.
- *Edge-first design.* Runs on a Raspberry Pi at ~50 MB RSS. 8 CRDT types for conflict-free edge-to-central replication. OTA rolling updates across a fleet. Offline-first — Pis keep working when the network drops and sync when it returns.
- *Security from day one.* TLS 1.2/1.3 (rustls), Argon2id (19 MB memory-hard), RBAC with 25+ permissions, OAuth2/OIDC + LDAP/AD, MFA (TOTP with backup codes), HashiCorp Vault (Token/AppRole/Kubernetes auth), token bucket rate limiting (30/min login, 1000/min API), security headers (CSP, HSTS, X-Frame-Options), encrypted backups (AES-256-GCM), cryptographic audit log verification, request ID tracing.
- *Actually fast.* 758K TPS fund transfers (7x SpacetimeDB). 12.3M KV reads/sec. 203K KV ops/sec over HTTP. A direct execution API for hot paths that bypasses SQL entirely.
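The token-bucket limiting mentioned above is easy to sketch. This is not Aegis-DB's limiter (a real one tracks a bucket per client/IP behind concurrent access); it just shows the mechanic behind a "30/min login" style limit: tokens refill continuously, each request spends one, and an empty bucket means the request is rejected.

```rust
use std::time::Instant;

// Illustrative token-bucket sketch, not Aegis-DB's actual limiter.
struct TokenBucket {
    capacity: f64,       // max burst size
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // steady-state refill rate
    last: Instant,       // last refill timestamp
}

impl TokenBucket {
    fn new(per_minute: f64) -> Self {
        TokenBucket {
            capacity: per_minute,
            tokens: per_minute,
            refill_per_sec: per_minute / 60.0,
            last: Instant::now(),
        }
    }

    /// Try to take one token; returns false when the caller is limited.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        // Refill proportionally to elapsed time, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut login = TokenBucket::new(30.0); // 30 attempts/min per client
    // A 40-request burst: the first 30 pass, the rest are limited until
    // the bucket refills at 0.5 tokens/sec.
    let allowed = (0..40).filter(|_| login.try_acquire()).count();
    println!("{allowed} of 40 burst attempts allowed");
}
```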
*Performance (engine-level, single node):*
- SQL inserts: 223K rows/sec
- KV reads: 12.3M ops/sec | KV writes: 3.97M ops/sec | KV over HTTP: 203K ops/sec
- Fund transfers: 758K TPS zero contention (7x SpacetimeDB), 2.5M TPS high contention (24x SpacetimeDB)
- HTTP API: 80K SQL inserts/sec, 40K reads/sec, 245μs avg KV latency
*License:* BSL 1.1 (free for everything except reselling as a hosted DBaaS). Converts to Apache 2.0 in 2030.
13 Rust crates, ~60K LOC, 634 tests. Happy to answer questions about the edge deployment architecture, the CRDT replication, compliance features, or anything else.