1 point by felipemayamuniz 10 hours ago | 1 comment
  • felipemayamuniz 10 hours ago
    I trained a 354M-parameter decoder-only LLM called AletheionLLM-v2 on 1 billion tokens using a geometric cognitive architecture called ATIC (Adaptive Turing Intelligence Cognition), which adds per-token uncertainty quantification via a 5D Riemannian manifold and 14 loss functions, including epistemic tomography.

    Evaluated on WikiText-103 (OOD, not seen during training):
    - ECE: 0.0176 vs 0.0236 (GPT-2 Medium) and 0.0241 (OPT-350M)
    - Brier score: 0.1528 vs 0.1618 (GPT-2 Medium) and 0.1595 (OPT-350M)

    The perplexity is higher, which is expected on OOD data; the calibration improvement is the point.

    Trained solo on 5x H200 via RunPod. AGPL-3.0.
    Repo: github.com/gnai-creator/aletheion-llm-v2
    Paper: 10.13140/RG.2.2.11471.14241
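    For context on the two metrics being compared: a minimal numpy sketch of how ECE and Brier score are typically computed on top-1 token predictions. This is an illustration of the standard definitions, not the repo's implementation; the function names and the choice of 10 equal-width bins are assumptions.

    ```python
    import numpy as np

    def ece(confidences, correct, n_bins=10):
        """Expected Calibration Error: bin top-1 confidences into
        equal-width bins and average |accuracy - confidence| per bin,
        weighted by the fraction of samples in each bin."""
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        total = len(confidences)
        err = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if mask.any():
                acc = correct[mask].mean()   # empirical accuracy in bin
                conf = confidences[mask].mean()  # mean confidence in bin
                err += (mask.sum() / total) * abs(acc - conf)
        return err

    def brier(probs, labels):
        """Multiclass Brier score: mean squared distance between the
        predicted distribution and the one-hot target."""
        probs = np.asarray(probs, dtype=float)
        onehot = np.zeros_like(probs)
        onehot[np.arange(len(labels)), labels] = 1.0
        return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))
    ```

    Lower is better for both: a perfectly calibrated model scores 0 ECE, and a model that always puts probability 1 on the correct token scores 0 Brier.
    
    
    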