31 points by tdortman a day ago | 2 comments
  • dgacmu 16 hours ago
    Kudos!

    It would be interesting if in your performance analysis on the readme you also showed the false positive rate, assuming the memory use between the data structures you're comparing is identical.

    • tdortman 7 hours ago
      Sure thing, I added them for the two cases in the readme. There's a section in the thesis about the FPR for more fixed sizes if you're curious (spoiler: it's pretty much exactly in the middle, notably higher than the CPU Cuckoo Filter though because really small buckets are bad for performance)
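
For anyone who wants to sanity-check FPR numbers at equal memory, here is a rough sketch using the standard textbook approximations, not the thesis's measurements. It assumes a Bloom filter with the optimal number of hash functions and a cuckoo filter whose lookups probe two full buckets. The function names, the 16-bit fingerprint, the 0.95 load factor and the bucket sizes are all illustrative assumptions and are not taken from the repository.

```python
import math


def bloom_fpr(bits_per_item: float) -> float:
    """Approximate FPR of a Bloom filter with the optimal hash count."""
    return 2.0 ** (-bits_per_item * math.log(2))


def cuckoo_fpr(fingerprint_bits: int, bucket_size: int) -> float:
    """Upper bound on cuckoo filter FPR.

    A lookup compares against up to 2 * bucket_size stored fingerprints,
    each matching a non-member with probability 2^-fingerprint_bits.
    """
    return 1.0 - (1.0 - 2.0 ** -fingerprint_bits) ** (2 * bucket_size)


def cuckoo_bits_per_item(fingerprint_bits: int, load_factor: float) -> float:
    """Memory cost per stored item at a given table occupancy."""
    return fingerprint_bits / load_factor


if __name__ == "__main__":
    f, load = 16, 0.95  # hypothetical parameters, for illustration only
    budget = cuckoo_bits_per_item(f, load)
    print(f"bits per item: {budget:.2f}")
    print(f"bloom FPR at that budget: {bloom_fpr(budget):.2e}")
    for b in (4, 16):  # hypothetical bucket sizes
        print(f"cuckoo FPR, bucket size {b}: {cuckoo_fpr(f, b):.2e}")
```

Under these assumptions the bound grows roughly linearly with bucket size (about 2b / 2^f), which is consistent with larger buckets buying throughput at the cost of a higher FPR.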
  • shetaye a day ago
    Very interesting! Nice work on your thesis. I am curious: if the data is not resident on the GPU (e.g. multi-TB datasets, line-rate packet inspection, etc.), is this approach bottlenecked by the PCIe bus?

    (You may have addressed this in your thesis, feel free to tell me to go RTFD ;)

    • tdortman a day ago
      I haven't tested this but I would be very surprised if the PCIe bus wasn't a severe bottleneck in that case, unless you can somehow amortize the cost of the memcpy.

      Though that being said, with such massive datasets you'll already be bottlenecked by the necessary communication between GPUs (sadly even with NVLink) since the queried data always lives on the GPU.
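
A back-of-the-envelope way to see the PCIe point above: if keys have to cross the bus before they can be queried, the link caps the query rate no matter how fast the filter is. The sketch below is only an estimate under stated assumptions. The function name, the 8-byte key size, the 32 GB/s link figure (roughly the nominal per-direction rate of a PCIe 4.0 x16 link) and the 10 billion queries/s filter rate are all hypothetical placeholders, not measurements from the thesis. It also ignores the results travelling back to the host.

```python
def max_query_rate(key_bytes: int,
                   pcie_gb_per_s: float,
                   gpu_queries_per_s: float,
                   overlapped: bool = True) -> float:
    """Estimate end-to-end queries/s when keys must be copied to the GPU.

    overlapped=True models copies hidden behind kernel execution, so the
    slower stage sets the rate. overlapped=False models copy-then-compute
    in series, so the per-query times add up.
    """
    transfer_rate = pcie_gb_per_s * 1e9 / key_bytes  # keys/s over the bus
    if overlapped:
        return min(transfer_rate, gpu_queries_per_s)
    return 1.0 / (1.0 / transfer_rate + 1.0 / gpu_queries_per_s)


if __name__ == "__main__":
    # All numbers below are hypothetical placeholders.
    for overlapped in (True, False):
        rate = max_query_rate(8, 32.0, 1e10, overlapped)
        print(f"overlapped={overlapped}: {rate:.3e} queries/s")
```

With these placeholder numbers the bus, not the filter, sets the ceiling in both cases; overlapping the copy with the kernel only removes the serialization penalty.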