66 points by adunk 9 hours ago | 9 comments
  • bot4035 minutes ago
    Now do the equivalent of just-in-time compilation: Claude sees that we need to respond to a lot of pings and writes a program to compute the replies instead of thinking about each one.
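For illustration, the "program to compute it" could look roughly like the sketch below. It assumes the pings are ICMP echo requests (type 8) and that something else, such as a user space IP stack, has already stripped the IP header. The function names are hypothetical and not taken from the linked project.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo(icmp_type: int, ident: int, seq: int, payload: bytes) -> bytes:
    """Assemble an ICMP echo message (type 8 = request, type 0 = reply)."""
    header = struct.pack("!BBHHH", icmp_type, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", icmp_type, 0, checksum, ident, seq) + payload


def echo_reply(request: bytes) -> bytes:
    """Turn an ICMP echo request into the matching echo reply."""
    if len(request) < 8:
        raise ValueError("ICMP echo message is shorter than its 8-byte header")
    icmp_type, code, _checksum, ident, seq = struct.unpack("!BBHHH", request[:8])
    if icmp_type != 8 or code != 0:
        raise ValueError("not an ICMP echo request")
    if internet_checksum(request) != 0:  # a valid message checksums to zero
        raise ValueError("bad ICMP checksum")
    # Same identifier, sequence number and payload; only type and checksum change.
    return build_echo(0, ident, seq, request[8:])
```

Usage: `echo_reply(build_echo(8, 0x1234, 1, b"ping"))` returns the bytes of the reply. Sending them still needs a raw socket or an IP layer, which is left out here.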
  • ShinyLeftPad42 minutes ago
    How quickly does Claude respond when it acts like a user space LLM chatbot?
  • fouc 3 hours ago
    Think about how much faster it would've been with a small local model!
  • twoodfin 2 hours ago
    Modulo Anthropic messing with the model for load mitigation, I wonder how stable this result is.

    1,000 pings, how many correctly ponged?

  • ValdikSS 6 hours ago
    That's why LLMs will eventually be used only for the initial interaction with the user in their own language, to prepare the data for a specialized model.

    Imagine face recognition to work like a text chat, where the PC gets the frame from the camera and writes in the chat: "Who's that? Here's the RGB888 image in hex: ...".

    • FeepingCreature 2 hours ago
      That's actually how vision language models already work, pretty much.
      • stingraycharles 2 hours ago
        Huh? Images are tokenized the same way language is, and everything is just fed into one single model, not multiple smaller expert models.

        The image gets split into small patches (e.g. 4x4 pixels) and each of those is assigned a token, similar to how text is broken up into tokens. And the whole thing is fed into a single model.
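A toy sketch of the patch-splitting step described above, in plain Python. The nested-list image layout, the function name `split_into_patches` and the 4x4 patch size are assumptions for illustration. In real vision language models each flattened patch is typically then projected to an embedding vector by a learned layer, which is omitted here.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b)


def split_into_patches(image: List[List[Pixel]], patch: int) -> List[List[int]]:
    """Cut an image into patch x patch tiles, row-major, flattening each tile."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # One "token" per tile: all of its channel values in reading order.
            flat = [
                channel
                for row in image[top:top + patch]
                for pixel in row[left:left + patch]
                for channel in pixel
            ]
            patches.append(flat)
    return patches
```

With these assumptions an 8x8 RGB image and `patch=4` gives 4 patches of 48 values each, which play the role that text tokens play for language.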

        • FeepingCreature 19 minutes ago
          Yes, I'm saying

          > Imagine face recognition to work like a text chat, where the PC gets the frame from the camera and writes in the chat: "Who's that? Here's the RGB888 image in hex: ...".

          that's pretty much how it works.

    • stingraycharles 2 hours ago
      Do you know that MoE is a thing?
      • jampekka 2 hours ago
        The experts in MoEs aren't specialized in any meaningful task sense. At the level of what we would think of as tasks, MoE experts are selected essentially arbitrarily, per token and per block.
        • stingraycharles an hour ago
          It’s unsupervised, yes, but “unspecialized in any meaningful task sense” is incorrect; that’s the whole point. It’s just not in the sense of “this is a legal expert, this is a software developer”.
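To make the "per token" selection in this subthread concrete, here is a minimal sketch of top-k expert routing for a single token. The router weights are made up for the example, and applying softmax over only the selected logits is one common variant rather than a description of any particular model. In a real MoE the router is learned and there is a separate one in each MoE block.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(
    token: List[float],
    router_weights: List[List[float]],  # one row of weights per expert
    top_k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (expert_index, gate_weight) pairs for one token vector."""
    # One router logit per expert: dot product of its weight row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Keep the top_k highest-scoring experts for this token only.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalise the gate weights over the chosen experts.
    gates = softmax([logits[i] for i in chosen])
    return list(zip(chosen, gates))
```

Two different tokens can land on different experts under the same router, which is why the choice tracks token-level features rather than a task label such as "legal" or "software".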
  • westurner 5 hours ago
    Wouldn't this be faster with an agent skill that has code?

    /skill-creator [or /create-skill] Write an agent skill with code script(s) that use an existing user space IP library that works with your agent runtime, to [...]

    ComposioHQ/awesome-claude-skills: https://github.com/ComposioHQ/awesome-claude-skills

    anthropics/skills//skill-creator/SKILL.md: https://github.com/anthropics/skills/blob/main/skills/skill-...

    /.agents/skills/skill-name/SKILL.md, scripts/{script_name.py,__init__.py}

    https://agentskills.io/what-are-skills

    • trollbridge 5 hours ago
      Well, yeah, of course it would be.

      Even faster would be to just use code in the first place!

  • brcmthrowaway 6 hours ago
    Next up: Claude replacement to handle simdjson processing.
  • jeremyjh 3 hours ago
    Perhaps one day, all network services will be provided by LLMs natively. Truly, that would be a day in the future.
    • pastage an hour ago
      You could read about that in the 1992 novel "A Fire Upon the Deep" by Vernor Vinge. It has prompt injection in communication: in the book, certain protocols for information exchange cannot be deterministic, so if someone is too smart you get hacked.
    • vrighter 2 hours ago
      Why? We already have more efficient specialized hardware.
    • codezero 3 hours ago
      I mean, we did decades of JavaScript, so... I mean... anything is possible, right? :)