3 points by zacharyozer 4 hours ago | 1 comment

zacharyozer 4 hours ago
> According to Dangel, it costs $2,600 to run Anthropic’s LLM Opus 4.6 through RULER 128, a test developed by Nvidia to assess a model’s ability to retrieve information from large data sets. And SubQ? “It cost us eight dollars,” he says.
> SubQ does seem to be able to handle a lot of text at once. The model has a context window (roughly akin to a working memory) up to 12 million tokens long. Most top models today have context windows one million tokens long. In a demo that Whedon ran for me, he asked SubQ to perform a task that required it to reason about information contained in 400 documents. It responded in seconds. When he gave Perplexity—a popular LLM-powered search engine—the same task, it failed to load all 400 documents.