71 points by sea-gold 3 days ago | 8 comments
  • hombre_fatal 3 days ago
    It could use a section on high level justification / inspiration.

    For example, what inspired this over a typical paginated API that lets you sort old to new with an afterId parameter?

  • philsnow 2 days ago
    Because the client requests pagination by lastEventId (a UUID), the server needs to remember every event forever in order to correctly catch up clients.

    If instead the client paginated by lastEventTimestamp, then a server that for any reason no longer had a particular event UUID could at least start at the following one.

    • tinodb 2 days ago
      That’s why the article suggests using a UUIDv6, which is time-orderable, or prefixing with an incrementing DB id. So indeed, if you intend to delete events, you might want to make sure you have orderable IDs of some sort.
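One way to get such orderable IDs is a time-prefixed string; this is a minimal sketch (the timestamp-plus-counter-plus-UUID format is an illustrative assumption, not the article's exact scheme):

```python
import time
import uuid
from itertools import count

_seq = count()  # tiebreaker for IDs minted in the same nanosecond

def make_event_id() -> str:
    # Zero-padded nanosecond timestamp first, so plain string comparison
    # (or a B-tree index on the column) yields chronological order; the
    # UUID suffix keeps IDs globally unique across producers.
    return f"{time.time_ns():020d}-{next(_seq):06d}-{uuid.uuid4()}"

a = make_event_id()
b = make_event_id()
assert a < b  # a later event always sorts after an earlier one
```

With IDs like these, a server that has compacted away the exact `lastEventId` a client sends can still resume at the first ID that sorts after it.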
  • sea-gold 3 days ago
    Previously discussed (April 2022; 95 comments): https://news.ycombinator.com/item?id=30904220
  • zzo38computer 2 days ago
    I think that HTTP is not the best way to do it, and that JSON is also not the best way to do it. (HTTP may work reasonably when you only want to download existing events and do not intend to continue polling.)

    I also think using UUID alone isn't the best way to make the ID number. If events only come from one source, then just using autoincrementing will work (like NNTP does for article numbers within a group); being able to request by time might also work (which is also something that NNTP does).

  • lud_lite 3 days ago
    What happens if you need to catch up? You keep calling in a loop with a new lastEventId?

    What is the intention there, though? Is this for social-media-type feeds, or is it meant for synchronising data (at the extreme, DB replication, for example)?

    What if anything is expected of the producer in terms of how long to store events?
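On the first question: yes, you loop with the last ID of each batch until the server returns an empty batch. A rough sketch of that catch-up loop, where `fetch_page` stands in for the HTTP GET and the empty-batch-means-caught-up convention is taken from the spec:

```python
from typing import Callable, Optional

def catch_up(fetch_page: Callable[[Optional[str]], list],
             last_event_id: Optional[str] = None) -> list:
    # Keep requesting ?lastEventId=<id> until a batch comes back empty,
    # which is the feed's signal that the client has caught up.
    events = []
    while True:
        batch = fetch_page(last_event_id)
        if not batch:          # empty batch: caught up, switch to polling
            return events
        events.extend(batch)
        last_event_id = batch[-1]["id"]
```

After `catch_up` returns, the client keeps the final `last_event_id` and polls (or long-polls) with it to pick up new events.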

    • dgoldstein0 3 days ago
      Sounds like it. But the compaction section has more details - basically you can discard events that are overwritten by later ones
  • DidYaWipe 2 days ago
    Never heard of "CloudEvents" before. How do people feel about those?
  • wackget 3 days ago
    Did someone just reinvent a GET API with cursor-based pagination?
    • hdjrudni 2 days ago
      Sure looks like it. I'm not getting what's new or interesting here.

      Cursors are actually better because you can put any kind of sort order in there. This "lastEventId" seems to be strictly chronological.

  • fefe23 3 days ago
    This is an astonishingly bad idea. Don't do this.

    Use HTTP server-sent events instead. Those can keep the connection open so you don't have to poll to get real-time updates and they will also let you resume from the last entry you saw previously.

    https://developer.mozilla.org/en-US/docs/Web/API/Server-sent...

    • montroser 3 days ago
      Yeah, but in real life, SSE error events are not robust, so you still have to do manual heartbeat messages and tear down and reestablish the connection when the user changes networks, etc. In the end, long-polling with batched events is not actually all that different from SSE with ping/pong heartbeats, and with long-polling you get the benefit of normal load balancing and other standard HTTP things.
      • andersmurphy a day ago
        Never had to use ping/pong with SSE. The reconnect is reliable. What you probably had happen was your proxy or server returning a 4XX or 5XX, which cancels the retry. Don't do that and you'll be fine.

        SSE works with normal load balancing the same as regular request/response. It's only stateful if you make your server stateful.

      • mikojan 2 days ago
        But SSE is a standard HTTP thing. Why would you not be able to do "normal load balancing"?

        I would also rather not have a handful of long-polling loops pollute the network tab.

        • jpc0 2 days ago
          “Normal load balancing” means “Request A goes to server A”, “Request B goes to server B”, and there is no state held in the server; if there is a session, it’s stored in a KV store or database which persists.

          With SSE the server has to be stateful; for load balancing to work you need to be able to migrate connections between servers. Some proxies / load balancers don’t like long-lasting connections and will tear them down if there has been no traffic, so you need to constantly send a heartbeat.

          I have deployed SSE, I love the technology, but I wouldn’t deploy it unless I controlled the end devices and everything in between; otherwise I would just do long polling.

          • kiitos 2 days ago
            Your description of "normal load balancing" is certainly one way to do load balancing, but in no way is it the presumptive default. Keeping session data in a shared source of truth like a KV store or DB, and expecting (stateless) application servers to do all their session stuff thru that single source of truth, is a fine approach for some use cases, but certainly not a general-purpose solution.

            > With SSE the server has to be stateful, for load balancing to work you need to be able to migrate connections between servers.

            Weird take. SSE is inherently stateful, sure, in the sense that it generally expects there to be a single long-lived connection between the client and the server, thru which events are emitted. Purpose of that being that it's a more efficient way to stream data from server to client -- for specific use cases -- than having the client long-poll on an endpoint.

            • jpc0 2 days ago
              > Keeping session data in a shared source of truth like a KV store or DB, and expecting (stateless) application servers to do all their session stuff thru that single source of truth

              What would be a scalable alternative?

              Simple edge-case why this is a reasonable approach: the load balancer sends a request to server A, server A sends the response and goes offline, and now the load balancer has to send all requests to servers B->Z until server A comes back online. If the session data was stored on server A, all users who were previously communicating with server A have now lost their session data, probably reprompting a sign-in, etc.

              There’s some state you can store in a cookie; hopefully said state isn’t in any way meant to be trusted, since rule 1 of the web is you don’t trust the client. Simple case of a JWT for auth: you still need to validate that the JWT was issued by you and hasn’t been invalidated, i.e. a DB lookup.

              • andersmurphy a day ago
                This is the same with request response. You need to auth on each request (unless you use a cookie).
                • jpc0 19 hours ago
                  Exactly: you use a cookie which stores an ID pointing to a session stored in the KV/DB.

                  Moving the session data to a JWT stores some of it in the token, but then you need to validate the JWT on each request, which depending on your architecture might be less overhead. It still means you need some state stored in a KV/DB that cannot live on the server, same as with a session. This might legitimately be less state (just a JWT ID of some sort and whether it has been revoked), but it cannot exist on the server; it needs to be persistent.

          • andersmurphy a day ago
            This take that SSE is stateful is so strange. If the server dies, the client reconnects to another server automatically (and no, you don't need ping/pong). It's only stateful if you make it stateful. It works with load balancing the same as anything else.
            • jpc0 19 hours ago
              The SSE spec has an event ID, and the spec states the client sends the last event ID on reconnection. That is by its nature stateful. Now you could store that in a DB/KV itself, but presumably you are already storing session data for auth and rate limiting, so now you have to implement a separate store for events.

              And I too naively believed there won’t be a need for ping/pong; then my code hit the real world, and ping/pong with aliveness checks was in the very next commit. Not only do load balancers and proxies decide to kill your connection, they will do it without actually closing the socket for some timeout, so your server and client are still blissfully unaware the connection is dead. This may be a bug, but it’s in some random device on the internet, which means I have to work around it.
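The aliveness check itself is small. A client-side sketch (the timeout value and the class shape are illustrative, not from any spec):

```python
import time

class Heartbeat:
    # Watchdog for silently-dead connections: the server sends a ping
    # every few seconds; if the client sees neither events nor pings for
    # `timeout` seconds, it assumes the socket is dead and reconnects.
    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def seen(self) -> None:        # call on every event or ping frame
        self.last_seen = self.clock()

    def is_dead(self) -> bool:
        return self.clock() - self.last_seen > self.timeout
```

The `clock` parameter is injected so the timeout logic can be tested without real sleeps; in production the default `time.monotonic` is the right choice because it never jumps backwards.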

              Long polling might run into the same issues but in my experience it hasn’t.

              I really do encourage you to actually implement this kind of pattern in production for a reasonable number of users and amount of time; there’s a reason so many people recommend just using long polling.

              This also assumes long running servers, long polling would fall back to just boring old polling, SSE would be more expensive if your architecture involves “serverless”.

              Realistically I still have SSE in production, on networks I can control all the devices in the chain because otherwise things just randomly break…

              • mikojan 4 hours ago
                > The SSE spec has an event id and the spec states sending last event id on reconnection.

                Last event ID is not mandatory. You may omit event IDs and not deal with last event ID headers at all.

                More importantly, the client is sending the last event ID header. Not the server. The only state in the server is a list of events somewhere which you would have to have anyway if you want clients to receive events that occurred when they were not connected or if you allowed clients to fetch a subset of them like with long-polling.

                So there is really no difference at all here with regard to long-polling.
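To make that concrete, here is a sketch of the server side of an SSE resume, assuming the server keeps an ordered event list (the framing follows the SSE wire format; the function and field names are made up):

```python
def sse_frames(events, last_event_id=None):
    # On reconnect the browser re-sends the Last-Event-ID header by
    # itself; the server just replays everything after that ID from the
    # same event list a long-polling endpoint would read.
    replaying = last_event_id is None
    for ev in events:
        if not replaying:
            # Skip up to and including the last ID the client saw.
            replaying = ev["id"] == last_event_id
            continue
        yield f"id: {ev['id']}\ndata: {ev['data']}\n\n"
```

Note that if `last_event_id` has been compacted away, this naive filter replays nothing, which is exactly the failure mode discussed upthread; time-orderable IDs let you replay from the first ID greater than the one the client sent instead.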

      • xyzzy_plugh 3 days ago
        Correct. In the end, mechanically, nothing beats long polling. Everything ends up converging at which point you may as well just long poll.
    • toomim 3 days ago
      Or use Braid-HTTP, which gives you both options.

      (Details in the previous thread on HTTP Feeds: https://news.ycombinator.com/item?id=30908492 )

    • Alifatisk 2 days ago
      Isn't SSE limited to like 12 tabs or something? I vividly remember reading about that hard limit being a huge limitation.
      • curzondax 2 days ago
        6 tabs is the limit on SSE. In my opinion, Server-Sent Events as a concept are therefore not usable in real-world scenarios because of this limitation and the lack of error detection around it. Just use WebSockets instead.