Key Architectural Choices:
- Custom B-tree Variant: Unlike LSM-trees used in many disk-backed stores, our B-tree variant avoids the "compaction stalls" that typically cause high tail latency during heavy writes.
- Coroutines & io_uring: We use io_uring for asynchronous I/O and coroutines to manage thousands of concurrent I/O requests without the context-switching overhead of a thread-per-request design.
- Object Storage Integration (optional): When enabled, EloqStore uses object storage as the primary persistent layer, with NVMe acting as a high-speed cache tier, providing durability without sacrificing speed.
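
To make the last two bullets concrete, here is a minimal sketch of the general pattern: many coroutines on one thread reading through a bounded local cache that sits in front of a durable backend. It is not EloqStore's code. Python's asyncio stands in for io_uring-driven coroutines, a dict stands in for object storage, an LRU-ordered dict stands in for the NVMe tier, and every name in it (TieredStore, run_demo, backend_delay) is hypothetical.

```python
import asyncio
from collections import OrderedDict


class TieredStore:
    """Toy read-through store: a bounded local cache in front of a durable backend."""

    def __init__(self, cache_capacity: int, backend_delay: float = 0.001):
        self.backend = {}            # stand-in for object storage (durable tier)
        self.cache = OrderedDict()   # stand-in for the NVMe tier, kept in LRU order
        self.cache_capacity = cache_capacity
        self.backend_delay = backend_delay  # simulated backend round trip, in seconds
        self.backend_reads = 0       # counts cache misses that reached the backend

    def _cache_put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

    async def put(self, key, value):
        # Write to the durable tier first, then populate the cache.
        await asyncio.sleep(self.backend_delay)
        self.backend[key] = value
        self._cache_put(key, value)

    async def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # cache hit: no backend round trip
            return self.cache[key]
        self.backend_reads += 1
        await asyncio.sleep(self.backend_delay)  # simulated backend fetch
        value = self.backend[key]
        self._cache_put(key, value)
        return value


async def run_demo(num_keys: int = 2000, cache_capacity: int = 500):
    store = TieredStore(cache_capacity)
    # Thousands of requests are in flight at once on a single thread;
    # each coroutine yields while its simulated I/O is pending.
    await asyncio.gather(*(store.put(k, k * 2) for k in range(num_keys)))
    values = await asyncio.gather(*(store.get(k) for k in range(num_keys)))
    return store, values


if __name__ == "__main__":
    store, values = asyncio.run(run_demo())
    print(f"reads served: {len(values)}, backend fetches: {store.backend_reads}")
```

The point of the sketch is only the shape: suspended coroutines cost no OS thread while they wait, and the cache tier absorbs reads that would otherwise pay the backend round trip.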
We’ve reached a point where we can provide predictable P99.99 latency even when the working set is primarily on NVMe. We’d love to answer any questions about the storage internals or our benchmarking process.
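
For anyone less familiar with tail percentiles, the sketch below shows why P99.99 is a much stricter target than P99. It uses the nearest-rank definition on synthetic latencies; the numbers are made up for illustration and are not benchmark results, and the percentile helper is a hypothetical name, not part of any EloqStore tooling.

```python
import math
from fractions import Fraction


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Fraction(str(p)) keeps values like 99.99 exact, so the rank is not
    # pushed up by one through floating-point rounding.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Synthetic data: 100,000 requests at 100 microseconds, 20 of them stalled at 5,000.
latencies_us = [100] * 99_980 + [5_000] * 20

if __name__ == "__main__":
    for p in (50, 99, 99.9, 99.99):
        print(f"P{p}: {percentile(latencies_us, p)} us")
```

With these made-up numbers the 20 stalled requests (0.02% of traffic) are invisible at P99 and P99.9 but set the P99.99 value, which is why a meaningful P99.99 measurement needs well over 10,000 samples.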
[1] github.com/eloqdata/eloqstore
Disclaimer: I am the CEO of EloqData.