2 points by steveharing13 hours ago | 1 comment
  • throw310822 3 hours ago
    If I understand correctly, this is based on the "RYS" architecture (or findings) by David Ng? (https://dnhkng.github.io/posts/rys/)

    And, relatedly: if small subsets of layers can be looped inside LLMs to improve their reasoning, and if the best layers to loop change depending on the competencies the LLM needs in a given context, has anyone tried to build and train an LLM that decides for itself which layers to loop and how many times?
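
    To make the question concrete, here is a minimal toy sketch of what "deciding which layers to loop and how many times" could look like mechanically. Everything in it is an assumption for illustration: `make_layer`, `layer_schedule`, `forward`, and `toy_router` are hypothetical names, the "layers" are simple arithmetic stand-ins rather than transformer blocks, and the router is a hand-written rule, where a real model would presumably have to learn it. It is not taken from the linked post.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Layer:
    """Toy stand-in for a block: a residual-style update on each element."""
    def layer(x: Vector) -> Vector:
        return [v + scale * v + shift for v in x]
    return layer


def layer_schedule(n_layers: int, start: int, end: int, repeats: int) -> List[int]:
    """Order of layer indices when layers[start:end] run `repeats` times in total.

    repeats=1 reproduces the ordinary single pass through all layers.
    """
    if not (0 <= start <= end <= n_layers) or repeats < 1:
        raise ValueError("invalid loop span or repeat count")
    looped = list(range(start, end)) * repeats
    return list(range(start)) + looped + list(range(end, n_layers))


def forward(layers: List[Layer], x: Vector, schedule: List[int]) -> Vector:
    """Apply the layers in the order given by the schedule."""
    for i in schedule:
        x = layers[i](x)
    return x


def toy_router(x: Vector, n_layers: int) -> Tuple[int, int, int]:
    """Hypothetical hand-written policy, NOT a learned one.

    Always picks the middle third of the stack and loops it more often
    when the input's mean magnitude is larger (capped at 4 passes).
    """
    mean_abs = sum(abs(v) for v in x) / max(1, len(x))
    repeats = 1 + min(3, int(mean_abs))
    return n_layers // 3, (2 * n_layers) // 3, repeats


if __name__ == "__main__":
    layers = [make_layer(0.0, 1.0) for _ in range(6)]  # each layer adds 1
    x = [2.5, -2.5]
    start, end, repeats = toy_router(x, len(layers))
    schedule = layer_schedule(len(layers), start, end, repeats)
    print(schedule)
    print(forward(layers, x, schedule))
```

    The split between `toy_router` and `layer_schedule` is the point of the sketch: the schedule is just a list of indices, so the open question is entirely in how the routing decision would be made and trained, since a discrete choice of span and repeat count is not differentiable as written.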