SubQ: a sub-quadratic LLM with 12M-token context(subq.ai)

32 pointsby mitchwainer6 hours ago8 comments

2001zhaozhao2 minutes ago
Assuming this is real, not launching with a technical report is a big miss.
pstorm5 hours ago
I’m very surprised this isn’t getting more attention. Am I missing something?
It seems at or above SOTA on the given benchmarks, doesn’t have context rot, is orders of magnitude faster, and uses less compute that current transformer models. I suppose it’s just an announcement and we can’t test it ourselves yet.
- alexsubq2 hours ago
  We are SOTA in some ways and not in others, continuously working to make it better! We need a little more time to scale, as we are working on things like disaggregated prefill, etc., the norms of large-scale model infra.
  I am happy to answer any questions!
  - supern0va29 minutes ago
    This seems super cool if as described, but I'm sure you can understand the skepticism.
    Do you anticipate having any kind of public accessible chat interface for testing in the near future?
    Also, what, if any, benefits are there for smaller context windows? Is there still a material improvement in cost to serve under say 256k? I'm curious about the broader implications for the space beyond improvements for very large context windows.
- jakevoytko5 hours ago
  The proof is in the pudding. At this point, there have been plenty of models that overperformed on benchmarks and underperformed on real work. So my stance is that I'm curious, I'm excited to see where it goes, and I don't believe it until I can try it.
- shdhan hour ago
  no one has access to it yet
  no published benchmarks
  no paper
  no demonstrations of capabilities
- remaximize5 hours ago
  I agree, it's a real architectural breakthrough if true
creamyhorror4 hours ago
Whether this is real or not, multiple commenters here look like astroturfers - created in the past year (or hours) with very low karma
- GorbachevyChase2 hours ago
  There are some comments which are identical to comments on X as well. That is not the say the frontier labs do not engage in highly unethical marketing, but this is a little bit too obvious.
remaximize6 hours ago
This is pretty remarkable. We've spent a lot of time finding workarounds for LLMs reading long docs. Now that's gone.
williamimoh6 hours ago
Looks like long context isn’t a problem anymore
- tamarru5 hours ago
  Neither is cost, and latency, in the long-term. LLMs ultimately become more economically viable than they are now, and broaden the scope of every existing LLM-driven application (particularly STS, conversational AI, etc, etc.)
tuandin5 hours ago
if it's true then it's a breakthrough.
wilddolphin6 hours ago
optimizing AI in general. How cool is that?
- 5 hours ago
  undefined
thlt5 hours ago
[dead]