71 points by reqo 9 months ago | 2 comments
  • Centigonal 9 months ago
    There's something that tickles me about this paper's title. The thought that everyone should know these three things. The idea of going to my neighbor, who's a retired K-12 teacher, and telling her about how adding MLP-based patch pre-processing layers improves BERT-like self-supervised training based on patch masking.
    • woopwoop 9 months ago
      Clickbait titles are something of a tradition in this field by now. Some important paper titles include "One weird trick for parallelizing convolutional neural networks", "Attention is all you need", and "A picture is worth 16x16 words". Personally I still find it kind of irritating, but to each their own I guess.
      • minimaxir 9 months ago
        Only the first one is clickbait in the style of blogs that incentivize you to click on the headline (i.e. the information gap); the last two are just fun puns.
        • janalsncm 9 months ago
          Honestly I took the first one as making fun of that trope. Usually the “one weird trick to” ends in some tabloid-style thing like lose 15 pounds or find out if your husband is loyal. So “parallelizing CNNs” is a joke, as if that’s something you’d see in a checkout aisle.
        • woopwoop 9 months ago
          In what sense is "Attention is all you need" a pun?
          • minimaxir 9 months ago
            It's a reference to the lyric "love is all you need" from the Beatles song "All You Need Is Love"; the pun swaps in "attention", which also happens to be the name of the mechanism the paper is built on.
      • adultSwim 9 months ago
        "Attention is all you need" is an outlier. They backed up their bold claim with breakthrough results.

        For modest incremental improvements, I greatly prefer boring technical titles. Not everything needs to be a stochastic parrot. We see this dynamic with building luxury condos: on any individual project, making that choice helps juice profit, but when the whole city follows the same pattern, it leads to a less desirable outcome.

      • throwaway_x031 9 months ago
        "Time and Space Are Not What You Think — Introducing the Special Theory of Relativity"
    • pixl97 9 months ago
      Hey, when the AI-powered T-rex is chasing you down, you'll wish you'd paid attention to the fact that the vision transformer's perception is based on movement!

      Had to throw some Jurassic Park humor in here.

    • guerrilla 9 months ago
      Yeah, I guess today was the day that I learned I am not part of "everyone". I feel so left out now.
  • i5heu 9 months ago
    I put this paper into 4o so I could check whether it's relevant. So that you don't have to do the same, here are the bullet points:

    - Vision Transformers can be parallelized to reduce latency and improve optimization without sacrificing accuracy.

    - Fine-tuning only the attention layers is often sufficient for adapting ViTs to new tasks or resolutions, saving compute and memory (see the sketch after this list).


    - Using MLP-based patch preprocessing improves performance in masked self-supervised learning by preserving patch independence.
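
    As a rough illustration of the second point, here is a minimal sketch of "fine-tune only the attention layers" on a pretrained ViT. It assumes PyTorch and timm; the ".attn." parameter-name filter, the model choice, and the fresh classification head are timm conventions picked for the example, not a recipe from the paper.

    ```python
    # Sketch only: freeze everything in a pretrained ViT except the attention
    # layers (and the new classification head), then fine-tune as usual.
    # Assumes torch and timm; names like "blocks.N.attn.*" are timm's naming.
    import timm

    model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)

    for name, param in model.named_parameters():
        # Train attention weights and the new head; freeze patch embedding,
        # MLP blocks, norms, positional embeddings, etc.
        param.requires_grad = (".attn." in name) or name.startswith("head")

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable / 1e6:.1f}M of {total / 1e6:.1f}M parameters")
    ```

    Freezing most of the network also shrinks optimizer state, which is where the memory saving in that bullet comes from.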

    • Jamesoncrate 9 months ago
      just read the abstract
      • jmugan 9 months ago
        You would think. I don't know about this paper in particular, but I'm continually surprised about how much more I get out of LLM summaries of papers than the abstracts of papers written by the authors.
        • mananaysiempre 9 months ago
          Paper abstracts are not optimized for drive-by readers like you and me. They are optimized for active researchers in the field reading their daily arXiv digest that lists all the new papers across the categories they work in, and needing to make the read/don't-read decision for each entry as efficiently as possible.

          If you’ve already decided you’re interested in the paper, then the Introduction and/or Conclusion sections are what you’re looking for.

          • andai 9 months ago
            Wouldn't a more comprehensive, digestible bullet point summary be even more helpful to actual researchers choosing which papers to read?
        • tough 9 months ago
          This would be an interesting metric to track: how different an LLM-generated abstract (given the full paper as source) is from the actual abstract, and whether that difference correlates at all with the overall quality of the paper.
        • kridsdale3 9 months ago
          Same. I don't think GP deserves the downvotes.