5 pointsby tesserato2 hours ago1 comment
  • tesserato2 hours ago
    A framework that decouples representational width from backbone width, offering the benefits of wider LLMs without the quadratic computational costs.