2 points by fatihturker 2 hours ago | 1 comment
  • fatihturker 2 hours ago
    One question I'm interested in exploring:

    If models become heavily compressed and streamed from SSD, where do people think the real bottleneck moves: storage bandwidth, memory bandwidth, or kernel efficiency?
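
    To make the question concrete, here is a rough back-of-envelope sketch, not a measurement. It assumes every compressed weight byte is streamed once per token with no caching, and that each stage runs in isolation with no overlap. The function names and all the numbers (4 GB per token, 7 GB/s SSD, 100 GB/s memory, 10 TFLOPS peak, 30% kernel efficiency) are hypothetical placeholders; substitute your own hardware figures.

```python
def per_token_seconds(bytes_per_token, flops_per_token,
                      ssd_gbps, mem_gbps, peak_tflops, kernel_eff):
    """Time each stage would need per token if it ran alone (seconds)."""
    return {
        # Reading the compressed weights from SSD.
        "storage": bytes_per_token / (ssd_gbps * 1e9),
        # Moving the same bytes through main memory.
        "memory": bytes_per_token / (mem_gbps * 1e9),
        # Arithmetic, scaled by the fraction of peak the kernels reach.
        "compute": flops_per_token / (peak_tflops * 1e12 * kernel_eff),
    }


def bottleneck(times):
    """Name of the slowest stage, which bounds throughput."""
    return max(times, key=times.get)


# Hypothetical inputs, chosen only for illustration.
times = per_token_seconds(
    bytes_per_token=4e9,    # assumed: 4 GB of compressed weights per token
    flops_per_token=14e9,   # assumed: roughly 2 FLOPs per parameter, 7B params
    ssd_gbps=7.0,           # assumed SSD sequential read bandwidth
    mem_gbps=100.0,         # assumed memory bandwidth
    peak_tflops=10.0,       # assumed peak compute
    kernel_eff=0.3,         # assumed fraction of peak actually achieved
)

for stage, seconds in times.items():
    print(f"{stage}: {seconds:.4f} s/token")
print("bottleneck:", bottleneck(times))
```

    With these placeholder numbers the storage stage dominates by a wide margin. Heavier compression shrinks bytes_per_token but adds decompression work to the compute term, which is where kernel efficiency would start to matter.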