RACK makes Pion SCTP 71% faster with 27% less latency (pion.ly)

22 points by mosura a month ago | 1 comment

Veserv a month ago
While it is nice that it is faster, ~7 Gb/core-second over an in-process "virtual network" (which measures only the protocol implementation itself, not the rest of the network stack) is not exactly a fast network protocol implementation. That is ~500,000-700,000 full packets per second, or ~1.5-2 core-us/packet.
Under those same conditions, with proper protocol design and implementation, you can quite readily do ~100 Gb/core-second in software with feature parity (ignoring encryption, which will bottleneck you to 30-50 Gb/core-second on modern chips with AES acceleration instructions).
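The packet-rate figures above follow from a simple unit conversion. A minimal sketch in Go, assuming "full packets" means roughly 1200 to 1500 bytes (the comment does not state a size, and the function names are illustrative):

```go
package main

import "fmt"

// packetsPerSecond converts a throughput in gigabits per core-second into
// packets per core-second for an assumed packet size in bytes.
func packetsPerSecond(gbitPerCoreSec float64, packetBytes int) float64 {
	bitsPerPacket := float64(packetBytes * 8)
	return gbitPerCoreSec * 1e9 / bitsPerPacket
}

// coreMicrosPerPacket returns the CPU time spent per packet, in
// core-microseconds, at the given throughput and packet size.
func coreMicrosPerPacket(gbitPerCoreSec float64, packetBytes int) float64 {
	return 1e6 / packetsPerSecond(gbitPerCoreSec, packetBytes)
}

func main() {
	// Assumed packet sizes; the original comment only says "full packets".
	for _, size := range []int{1200, 1500} {
		fmt.Printf("%d-byte packets: %.0f packets/s, %.2f core-us/packet\n",
			size, packetsPerSecond(7, size), coreMicrosPerPacket(7, size))
	}
}
```

With these assumed sizes, 7 Gb/core-second works out to roughly 583,000 to 729,000 packets per second, or about 1.4 to 1.7 core-us per packet, which is in line with the estimate above.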
- JoTurk a month ago
  SCTP isn't just a UDP pipe. It's a message-oriented, congestion-controlled, reliable protocol with a bunch of other semantics.
  We measured:
  1. Association state plus per-path CC/RTO, timers, RTT tracking, cwnd, etc.
  2. Selective ACKs and retransmit logic.
  3. Chunk framing and TSN sequencing.
  4. Ordered vs. unordered delivery, and fragmentation/reassembly.
  And much more ...
  Also, our vnet-based implementation isn't just a dumb buffer: we have on-the-wire packet validation, SCTP parsing, CRC32c validation, and a deterministic network-conditions emulator, all under real-time conditions.
  Sure, you can get 100 Gb/core-second if you bypass all of that and just do huge batching.
  The blog post's claim is just that, under the same SCTP semantics and the same test harness, enabling RACK is a huge win, not that this is the absolute ceiling of in-process "virtual network" sockets :)
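The CRC32c validation mentioned above can be sketched with Go's standard library. This is an illustrative sketch, not Pion's actual code: it assumes the 12-byte SCTP common header with the checksum in its last 4 bytes, computed over the packet with that field treated as zero. Storing Go's `crc32` value in little-endian order is an assumption made here, and every function name is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const (
	commonHeaderSize = 12 // source port, destination port, verification tag, checksum
	checksumOffset   = 8  // the checksum occupies bytes 8..11 of the common header
)

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// packetChecksum computes CRC32c over the packet as if the checksum field
// were zero. The caller must pass at least commonHeaderSize bytes.
func packetChecksum(raw []byte) uint32 {
	var zeros [4]byte
	sum := crc32.Update(0, castagnoli, raw[:checksumOffset])
	sum = crc32.Update(sum, castagnoli, zeros[:])
	return crc32.Update(sum, castagnoli, raw[checksumOffset+4:])
}

// setChecksum writes the checksum into the header.
// Assumption: the value is stored in little-endian byte order.
func setChecksum(raw []byte) {
	binary.LittleEndian.PutUint32(raw[checksumOffset:], packetChecksum(raw))
}

// validChecksum reports whether the stored checksum matches the packet bytes.
func validChecksum(raw []byte) bool {
	if len(raw) < commonHeaderSize {
		return false
	}
	return binary.LittleEndian.Uint32(raw[checksumOffset:]) == packetChecksum(raw)
}

func main() {
	packet := make([]byte, commonHeaderSize+4)
	binary.BigEndian.PutUint16(packet[0:], 5000)       // source port
	binary.BigEndian.PutUint16(packet[2:], 5000)       // destination port
	binary.BigEndian.PutUint32(packet[4:], 0x12345678) // verification tag
	copy(packet[commonHeaderSize:], []byte{1, 2, 3, 4}) // placeholder bytes, not a real chunk

	setChecksum(packet)
	fmt.Println("intact packet valid:", validChecksum(packet))

	packet[len(packet)-1] ^= 0x01 // simulate a single-bit corruption on the wire
	fmt.Println("corrupted packet valid:", validChecksum(packet))
}
```

A harness that runs a check like this on every packet pays for a full pass over the payload each time, which is part of why such numbers are not a raw throughput ceiling.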
  - Veserv a month ago
    Yes, I meant all of that when I explicitly said feature parity at 100 Gb/core-second. Reliable delivery of multiple independent bytestreams (which is actually more than SCTP gives, since SCTP still suffers from head-of-line blocking due to SCTP SACKs being by TSN instead of a per-stream identifier) with dynamic stream count (again, more than SCTP gives) over an unreliable network that may reorder or lose packets.
    JoTurk a month ago
    Okay, I see your point, but our test harness isn't meant to be an absolute "max throughput" benchmark. Every packet is parsed, corrected (if needed), and validated in real time (CRC32c, on-the-wire checks, deterministic network emulation, etc.).
    If we ever want a true ceiling number, we could add a separate fast path (e.g., a dump-writer / sink that skips most validation) or validate after the run, but that's not in scope right now. Our scope was to (1) validate Pion/SCTP PRs and (2) compare performance against other branches and versions, so it is a relative benchmark under identical conditions.
    On head-of-line blocking: we have a pending RFC 8260 message interleaving (I-DATA) implementation, and we've tested with it; it helps reduce HoL blocking on the sender side (especially around fragmentation). Our benchmark tool has a flag to run with interleaving, and we tested it quite a bit. We plan to release it in Jan.
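The sender-side effect of message interleaving described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Pion's implementation or the RFC 8260 wire format: the `fragment` type, the function names, the 1200-byte fragment size, and the simple round-robin order are all hypothetical, and real I-DATA chunks carry additional fields such as message identifiers and fragment sequence numbers.

```go
package main

import "fmt"

// fragment is a hypothetical stand-in for one chunk of a fragmented message.
type fragment struct {
	streamID int
	index    int // position of this fragment within its message
}

// fragmentMessage splits a message of size bytes into fragments of at most
// maxFragment bytes each.
func fragmentMessage(streamID, size, maxFragment int) []fragment {
	count := (size + maxFragment - 1) / maxFragment
	frags := make([]fragment, 0, count)
	for i := 0; i < count; i++ {
		frags = append(frags, fragment{streamID: streamID, index: i})
	}
	return frags
}

// sendSequential models a sender without interleaving: every fragment of one
// message goes out before the next message starts.
func sendSequential(messages [][]fragment) []fragment {
	var wire []fragment
	for _, m := range messages {
		wire = append(wire, m...)
	}
	return wire
}

// sendInterleaved models a sender that takes one fragment from each pending
// message in turn (simple round-robin).
func sendInterleaved(messages [][]fragment) []fragment {
	var wire []fragment
	next := make([]int, len(messages))
	for {
		progressed := false
		for i, m := range messages {
			if next[i] < len(m) {
				wire = append(wire, m[next[i]])
				next[i]++
				progressed = true
			}
		}
		if !progressed {
			return wire
		}
	}
}

// firstPosition returns the wire index of the first fragment that belongs to
// streamID, or -1 if there is none.
func firstPosition(wire []fragment, streamID int) int {
	for i, f := range wire {
		if f.streamID == streamID {
			return i
		}
	}
	return -1
}

func main() {
	large := fragmentMessage(1, 64*1024, 1200) // large message on stream 1
	small := fragmentMessage(2, 100, 1200)     // small message on stream 2
	msgs := [][]fragment{large, small}

	fmt.Println("sequential position of stream 2:", firstPosition(sendSequential(msgs), 2))
	fmt.Println("interleaved position of stream 2:", firstPosition(sendInterleaved(msgs), 2))
}
```

In this toy run the 100-byte message leaves after a single fragment of the large message when interleaved, instead of waiting behind all 55 of its fragments.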