Modern language models are impressive at generating text, yet they're fragile in ways human cognition isn't: they lack persistent reasoning, genuine goals, and long-horizon planning.
The intuition: what if language isn't the foundation of thought, but just an interface to it? What if reasoning happens in a latent, continuous, non-linguistic space we rarely inspect?
We've optimized prediction of the next token. But token prediction isn't cognition—it's an output modality. Researchers like Yann LeCun and Karl Friston have been pointing at this for years: biological intelligence emerges from continuous, predictive, embodied interaction with the world.
Our AI still treats language as the primary substrate of thought. Maybe that's the bottleneck.
I've been exploring this informally—nothing validated, no claims of novelty, just thinking out loud with some structure. I collected rough ideas, open questions, and a sketch of what this might mean for architecture design.
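To make the "language as interface" framing concrete, here's a toy sketch of the shape I have in mind. Everything is hypothetical and untrained (random linear maps, names like `encode`/`think`/`decode` are mine): tokens are only used at the boundary, and the "reasoning" loop iterates a continuous latent state without ever emitting a token.

```python
import numpy as np

# Toy illustration of "language as interface, reasoning in latent space".
# All components are random, untrained linear maps -- purely a sketch.

rng = np.random.default_rng(0)
VOCAB, D = 50, 16                          # vocab size, latent dimensionality

E = rng.normal(size=(VOCAB, D))            # encoder table: token id -> latent vector
W = rng.normal(size=(D, D)) / np.sqrt(D)   # latent dynamics (one "thought" step)
U = rng.normal(size=(D, VOCAB))            # decoder: latent state -> token logits

def encode(token_ids):
    """Language in: pool token embeddings into one continuous state."""
    return E[token_ids].mean(axis=0)

def think(h, steps):
    """Reasoning happens here: iterate latent dynamics, no tokens involved."""
    for _ in range(steps):
        h = np.tanh(W @ h)                 # continuous, non-linguistic update
    return h

def decode(h):
    """Language out: project back to the token space only at the end."""
    return int(np.argmax(U.T @ h))

prompt = [3, 14, 15]                       # arbitrary token ids
h = encode(prompt)
h = think(h, steps=8)                      # 8 internal steps, zero tokens emitted
out = decode(h)
print(out)                                 # some token id in [0, VOCAB)
```

The point of the sketch isn't the math (it's trivial); it's the separation of concerns: `think` could run for a variable number of steps, carry state across queries, or be trained with objectives that never touch next-token likelihood, while `encode`/`decode` stay a thin linguistic shell.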
Curious what HN thinks. Is this direction interesting? Obviously wrong? Am I just redescribing what transformers already do?
GitHub: https://github.com/stramanu/latent-cognitive-arch-exploratio...