For benchmarks: I used real network (not loopback) and sync-majority writes in a 3-node Raft cluster. Happy to answer questions about tradeoffs vs Kafka / Redis Streams and what’s still missing.
VSR makes a lot of sense for their problem space: fixed schema, deterministic state machine, and very tight control over replication + execution order.
Ayder has a different set of constraints:

- append-only logs with streaming semantics
- dynamic topics / partitions
- external clients producing arbitrary payloads over HTTP
Raft here is a pragmatic choice: it’s well understood, easier to reason about for operators, and fits the “easy to try, easy to operate” goal of the system.
That said, I think VSR is a great example of what’s possible when you fully own the problem and can specialize aggressively. Definitely a project I’ve learned from.
I wish there were a standard protocol for consuming event logs, one where all the client-side tooling for processing them didn't care what server was behind it.
I was part of making this:
https://github.com/vippsas/feedapi-spec
https://github.com/vippsas/feedapi-spec/blob/main/SPEC.md
I hope some day there will be a widespread standard that looks something like this.
An ecosystem building on Kafka client libraries with various non-Kafka servers would work fine too, but we didn't figure out how to easily do that.
I’d love a world where “consume an event log” is a standard protocol and client-side tooling doesn’t care which broker is behind it.
Feed API is very close to the mental model I’d want: stable offsets, paging, resumability, and explicit semantics over HTTP. Ayder’s current wedge is keeping the surface area minimal and obvious (curl-first), but long-term I’d much rather converge toward a shared model than invent yet another bespoke API.
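To make the mental model concrete, here is a minimal sketch of a resumable consumer in that style: stable offsets, one page at a time, checkpoint after processing. The endpoint shape and response fields (`events`, `next_offset`) are invented for illustration, not taken from Ayder or feedapi-spec; see the spec links above for the real protocol.

```python
# Hedged sketch of "stable offsets + paging + resumability" over HTTP.
# FEED, the query params, and the JSON shape are all assumptions.
import json
import os
import tempfile
import urllib.request

CHECKPOINT = os.path.join(tempfile.gettempdir(), "consumer.offset")
FEED = "http://localhost:8080/feed/orders"  # hypothetical endpoint


def load_offset():
    """Resume from the last checkpoint, or start at offset 0."""
    try:
        with open(CHECKPOINT) as f:
            return int(f.read())
    except FileNotFoundError:
        return 0


def save_offset(offset):
    """Write the checkpoint atomically so a crash never corrupts it."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(offset))
    os.replace(tmp, CHECKPOINT)  # atomic rename on POSIX


def consume_page(offset, limit=100):
    """Fetch one page; assume {"events": [...], "next_offset": N}."""
    with urllib.request.urlopen(f"{FEED}?offset={offset}&limit={limit}") as r:
        return json.load(r)


def handle(event):
    print("processing", event)  # stand-in for real processing


def run_once():
    offset = load_offset()
    page = consume_page(offset)
    for event in page["events"]:
        handle(event)
    # Checkpoint only after processing: at-least-once delivery on crash.
    save_offset(page["next_offset"])
```

The key property is that the server only has to hand out stable offsets; everything else (checkpointing, resumption, delivery semantics) lives client-side, which is what lets the tooling stop caring which broker is behind the URL.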
If you’re open to it, I’d be very curious what parts of Feed API were hardest to standardize in practice and where you felt the tradeoffs landed in real systems.
But because there wasn't any official spec, it became a topic of organizational bikeshedding. That would have been avoided by having a more mature spec and client libs provided externally.
This spec is a bit complex, but it is complexity that is needed to support a wide range of backend/database technologies. Simpler specs are possible by making more assumptions about (or hardcoding) how the backend/DB works.
It has been a few years since I worked with this, but reading it again now I still like it in this version. (This spec was the 2nd iteration.)
The partition splitting etc. was a nice idea that wasn't actually implemented or needed in the end. I just felt it was important to have it in the protocol at the time.
I think there’s a similar philosophy around simplicity and operator experience. Where Ayder diverges is in durability and recovery semantics: NSQ intentionally trades some of that off to stay lightweight.
The goal here is to keep the “easy to run” feeling, but with stronger guarantees around crash recovery and replication.
Thank you for sharing this with us.
Classic HTTP Range is byte-oriented, but custom range units (e.g. `Range: offsets=…`) or using `Link` headers for pagination both fit log semantics well.
I kept the initial API explicit (`offset` / `limit`) to stay obvious for curl users, but offset-range via headers is something I want to experiment with, especially if it helps generic tooling.
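For comparison, here is what the two styles could look like side by side. The base URL and the `offsets` range unit are assumptions for illustration (RFC 9110 does permit range units other than `bytes`, but no server is implied to support this one):

```python
# Hypothetical sketch: two ways to ask for records 100..149 of a log.
# The endpoint and the "offsets" range unit are assumptions, not a
# published Ayder API.
import urllib.request

BASE = "http://localhost:8080/topics/orders/records"  # hypothetical

# 1. Explicit query parameters (curl-obvious, the current approach):
req_query = urllib.request.Request(f"{BASE}?offset=100&limit=50")

# 2. Custom Range unit (HTTP allows units other than "bytes"):
req_range = urllib.request.Request(BASE)
req_range.add_header("Range", "offsets=100-149")

# A server supporting style 2 would answer 206 Partial Content with
# something like: Content-Range: offsets 100-149/1200
print(req_range.get_header("Range"))  # offsets=100-149
```

The nice thing about the Range form is that generic HTTP tooling (caches, proxies) already understands partial responses; the cost is that it is far less discoverable than a query string in a curl example.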
> Numbers are real, not marketing.
I'm not questioning the actual benchmarks or anything, but this README is substantially AI generated, yeah?
The benchmarks, logs, scripts, and recovery scenarios are all real and hand-run; that's the part I care most about being correct.
For the README text itself: I did iterate on wording and structure (including tooling), but the system, measurements, and tradeoffs are mine.
If any part reads unclear or misleading, I’m very open to tightening it up. Happy to clarify specifics.
When I read this type of prose it makes me feel like the author is more worried about trying to sell me something than just describing the project.
For instance, you don't need to tell me the numbers are "real". You just have to show me they're covering real-world use-cases, etc. LLMs love this sort of "telling not showing" where it's constantly saying "this is what I'm going to tell you, this is what I'm telling you, this is what I told you" structure. They do it within sections and then again at higher levels. They have, I think, been overindexed on "five-paragraph essays". They do it way more than most human writers do.
You're welcome to make your substantive points thoughtfully, of course.
Also if you disapprove, modding down is enough, you don't need to start a meta-discussion thread, which is itself a discouraged practice.
There are infinitely many facts. They don't select themselves—humans do that, and we do it for reasons which are not particularly factual (https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...).
> If you don't want feedback, don't solicit it.
If you read the lower part of https://news.ycombinator.com/showhn.html you'll see that the site has specific rules around how to offer feedback.
> you don't need to start a meta-discussion thread, which is itself a discouraged practice
That's true in general. I'm a mod here (sorry if that wasn't clear) and part of my job is to post replies when people are breaking the site guidelines. You're right that such comments are off topic and tediously meta - but it's a form of out-of-band communication that is necessary for keeping the site on-kilter. If it helps at all, these comments are even more tedious to write than they are to read :)
The point of these numbers is the envelope: 3-node consensus (Raft), real network (not loopback), and sync-majority writes (ACK after 2/3 replicas) plus the crash/recovery semantics (SIGKILL → restart → offsets/data still there).
If you have a quick Python setup that does majority-acked replication + fast crash recovery with similar measurements, I’d honestly love to compare apples-to-apples; happy to share exact scripts/config and run the same test conditions.
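For anyone curious what the SIGKILL → restart → data-still-there check looks like in miniature, here is a toy simulation of the test shape (not Ayder's actual harness): a writer fsyncs each record before "acking" it, gets hard-killed mid-stream, and we then verify every acked record survived intact.

```python
# Toy durability check: records are "acked" only after fsync, the
# writer is SIGKILLed with no cleanup, and on "restart" we verify no
# acked record was lost or torn. Simulation only, not Ayder's harness.
import os
import subprocess
import sys
import tempfile
import time

log_path = os.path.join(tempfile.mkdtemp(), "segment.log")
open(log_path, "wb").close()  # ensure the file exists before we read it

writer = r"""
import os, sys
path = sys.argv[1]
with open(path, "ab") as f:
    i = 0
    while True:
        f.write(b"record-%08d\n" % i)
        f.flush()
        os.fsync(f.fileno())   # a record counts as acked only after this
        i += 1
"""

proc = subprocess.Popen([sys.executable, "-c", writer, log_path])
time.sleep(1.0)   # let it ack some records
proc.kill()       # SIGKILL: hard crash, no atexit, no flush
proc.wait()

# "Restart": reopen the log and check the surviving records.
with open(log_path, "rb") as f:
    data = f.read()
complete = data.split(b"\n")[:-1]  # drop anything after the last newline

for i, line in enumerate(complete):
    assert line == b"record-%08d" % i, (i, line)
print(f"{len(complete)} records survived SIGKILL intact")
```

A real harness additionally has to check replicated offsets across nodes and re-election after the kill, but the core assertion is the same: nothing acknowledged may disappear or come back corrupted.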
Some poor quality software with bad performance, but an established piece of tech regardless.