For example, what inspired this over a typical paginated API that lets you sort old to new with an afterId parameter?
If instead the client paginated by lastEventTimestamp, then a server that for any reason no longer had a particular event UUID could at least start at the following one.
I also think using UUID alone isn't the best way to make the ID number. If events only come from one source, then just using autoincrementing will work (like NNTP does for article numbers within a group); being able to request by time might also work (which is also something that NNTP does).
What is the intention there though. Is this for social media type feeds, or is this meant for synchronising data (at the extreme for DB replication for example!).
What if anything is expected of the producer in terms of how long to store events?
I believe cloud events are most common in Kafka-adjacent or event-driven architectures but I think they're used in some GCP serverless things, too
Yep, any opportunity to not reinvent stuff is a big win.
But I'm wary of layers upon layers. I'm thinking of how this could be combined with MQTT... it doesn't seem totally redundant.
Cursors are actually better because you can put any kind of sort order in there. This "lastEventId" seems to be strictly chronological.
Use HTTP server-sent events instead. Those can keep the connection open so you don't have to poll to get real-time updates and they will also let you resume from the last entry you saw previously.
https://developer.mozilla.org/en-US/docs/Web/API/Server-sent...
SSE works with normal load balancing the same as regular request/response. It's only stateful if you make your server stateful.
I would also rather not have a handful of long-polling loops pollute the network tab.
With SSE the server has to be stateful, for load balancing to work you need to be able to migrate connections between servers. Some proxies / load balancers don’t like long lasting connections and will tear them down if there has been no traffic so your need to constantly send a heart beat.
I have deployed SSE, I love the technology, I wouldn’t deploy it if I don’t control the end devices and everything in between, I would just do long polling.
> With SSE the server has to be stateful, for load balancing to work you need to be able to migrate connections between servers.
Weird take. SSE is inherently stateful, sure, in the sense that it generally expects there to be a single long-lived connection between the client and the server, thru which events are emitted. Purpose of that being that it's a more efficient way to stream data from server to client -- for specific use cases -- than having the client long-poll on an endpoint.
What would be a scalable alternative?
Simple edge-case why this is a reasonable approach. Load balancer sends request to server A, server A sends response and goes offline, now load balancer has to send all request to server B->Z until server A comes back online. If the session data was stored on server A all users who were previously communicating to server A now lost their session data, probably reprompting a sign-in etc
Theres some state you can store in a cookie, hopefully said state isn’t in any was mean to be trusted since rule 1 of web is you don’t trust the client. Simple case of a JWT for auth, you still need to validate the JWT is issued by you and hasn’t been invalidated, ie a DB lookup.
Moving the session data to a JWT stores some session data in the JWT but then you need to validate the JWT on each request which depending on your architecture might be less overhead but it still means you need some state stored in a KV/DB and it cannot be stored on server same as with a session, this might legitimately be less state, just a JWT id of some sort and whether it’s not revoke but it cannot exist on the server, it needs to be persistent.
And I too naively believed there won’t be a need for ping/pong, then my code hit the real world and ping/pong with aliveness checks was in the very next commit because not only do load balancers and proxies decide to kill your connection, they will do it without actually closing the socket for some timeout so your server and client is still blissfully unaware the connection is dead. This may be a bug, but it’s in some random device on the internet which means I have to work around it.
Long polling might run into the same issues but in my experience it hasn’t.
I really do encourage you to actually implement this kind of pattern in production for a reasonable number of users and time, there’s a reason so many people recommend just using long polling.
This also assumes long running servers, long polling would fall back to just boring old polling, SSE would be more expensive if your architecture involves “serverless”.
Realistically I still have SSE in production, on networks I can control all the devices in the chain because otherwise things just randomly break…
Last event ID is not mandatory. You may omit event IDs and not deal with last event ID headers at all.
More importantly, the client is sending the last event ID header. Not the server. The only state in the server is a list of events somewhere which you would have to have anyway if you want clients to receive events that occurred when they were not connected or if you allowed clients to fetch a subset of them like with long-polling.
So there is really no difference at all here with regards to long-polling
(Details in the previous thread on HTTP Feeds: https://news.ycombinator.com/item?id=30908492 )
If you have to use H1 for some reason you can easily prune connections with the browser visibility api.