One question: to what extent have you dug into, or considered, oversampling? One of the core hypotheses we've converged on is that nearly all models are optimized for source coding rather than channel coding. The implication is that the path to AGI likely involves oversampling to capture channel-coding gains, which would also resolve phase errors, etc.
Random sampling naturally does this, albeit inefficiently. Curious whether you do something more structured than random oversampling, especially partially overlapped samples (think supersaturated subspaces / subchannels, etc.).
Structured Temporal Oversampling: Our stream-based approach effectively performs high-density oversampling in the time domain. Instead of random sampling, the theta-phase (hippocampal rhythm) in our MultiGate architecture creates structured, overlapping "integration windows" to capture temporal context.
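To make the windowing concrete, here is a minimal sketch of phase-anchored overlapping windows. The function name and parameters (`theta_freq`, `windows_per_cycle`, `window_frac`) and the sinusoidal toy signal are illustrative assumptions, not the actual MultiGate interface.

```python
import numpy as np

def theta_windows(signal, fs, theta_freq=8.0, windows_per_cycle=4, window_frac=0.5):
    """Slice a 1-D signal into overlapping integration windows anchored to theta phase.

    Each theta cycle (1/theta_freq s) contributes `windows_per_cycle` onsets,
    and each window spans `window_frac` of a cycle, so neighbouring windows
    overlap rather than tile the signal: structured temporal oversampling.
    """
    cycle_len = int(fs / theta_freq)            # samples per theta cycle
    win_len = int(cycle_len * window_frac)      # samples per window
    hop = cycle_len // windows_per_cycle        # onset spacing (< win_len, so windows overlap)
    onsets = np.arange(0, len(signal) - win_len, hop)
    phases = (onsets % cycle_len) / cycle_len * 2 * np.pi   # theta phase at each onset
    windows = np.stack([signal[o:o + win_len] for o in onsets])
    return windows, phases

# Toy usage: 1 s of a 40 Hz tone at 1 kHz -> overlapping, phase-tagged windows
sig = np.sin(2 * np.pi * 40 * np.arange(1000) / 1000.0)
wins, phis = theta_windows(sig, fs=1000)
print(wins.shape, phis[:4])
```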
Phase Error Resolution: Phase errors are resolved not by averaging (as in L2 models), but by NMDA-gating. The gate opens only when the anchor velocity and theta-phase align, physically "locking" the signal to a specific codebook vertex. This is a computational implementation of theta-gamma coupling.
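A toy illustration of the coincidence-gating idea follows; the alignment measure, thresholds, and hard nearest-vertex snap are placeholder assumptions, not the actual NMDA-gate implementation.

```python
import numpy as np

def nmda_gate(x, anchor_velocity, theta_phase, preferred_phase, codebook,
              phase_tol=0.3, vel_threshold=0.1):
    """Coincidence gate: the update passes only when BOTH conditions hold
    (an AND, not an average):
      1) the anchor is moving (velocity magnitude above threshold), and
      2) theta phase is within `phase_tol` radians of the preferred phase.
    When the gate opens, x is snapped ("locked") to its nearest codebook
    vertex rather than blended across vertices.
    """
    phase_err = np.angle(np.exp(1j * (theta_phase - preferred_phase)))   # wrap to (-pi, pi]
    gate_open = (np.linalg.norm(anchor_velocity) > vel_threshold
                 and abs(phase_err) < phase_tol)
    if not gate_open:
        return None                                   # gate closed: nothing propagates
    dists = np.linalg.norm(codebook - x, axis=1)      # distance to each vertex
    return codebook[np.argmin(dists)]                 # hard lock to one vertex

# Toy usage: aligned phase + moving anchor -> x locks to the nearest vertex
codebook = np.array([[1., 0.], [0., 1.]])
print(nmda_gate(np.array([0.8, 0.3]), anchor_velocity=np.array([0.5, 0.0]),
                theta_phase=0.1, preferred_phase=0.0, codebook=codebook))
```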
Supersaturated Subspaces: Our Simplex constraint (L1) naturally handles what you call "supersaturated subspaces" by enforcing non-negative competition. This ensures that even with overlapping temporal samples, the resulting internal representation remains discrete and grounded within the convex hull.
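One way to illustrate the constraint is the standard Euclidean projection onto the probability simplex (Duchi et al., 2008); this generic routine is a stand-in for our actual constraint, but it shows how non-negative, sum-to-one competition pulls overlapping samples back inside the convex hull.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                                  # sort descending
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - tau, 0.0)

# Overlapping temporal samples can push raw coefficients outside the simplex
# ("supersaturation"); projecting back yields sparse, non-negative weights,
# so the decoded point stays inside the convex hull of the codebook vertices.
codebook = np.array([[0., 0.], [1., 0.], [0., 1.]])       # toy vertices
raw = np.array([0.9, 0.7, -0.2])                          # over-driven weights
w = project_to_simplex(raw)                               # -> [0.6, 0.4, 0.0]
print(w, w.sum(), w @ codebook)
```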
By treating cognition as a communication channel between an "Anchor" and a "Codebook," we prioritize the stability of the compositional mapping over the mere efficiency of representation.