1 point by fblgit 5 hours ago | 1 comment
  • fblgit 5 hours ago
    A one-of-a-kind model built on a single transformer block layer, with high throughput. Could this be the new generation of lightweight transformer-based models for common NLP tasks?