The orthogonal LoRA constraint is interesting. Have you thought about whether orthogonality conflicts with the timestamped training? If two temporally adjacent observations should produce similar LoRA updates, an orthogonality constraint would actively push them apart. Maybe you want similarity for recency, and orthogonality only for distinct episode types?
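
To make the question concrete, here is a rough sketch of what a blended penalty could look like. It is not your method, just one way to write it down. Everything in it is an assumption on my part: the name `time_gated_penalty`, the exponential gate, the time scale `tau`, and treating each LoRA update as a flattened vector. It also uses the timestamp gap as a stand-in for "distinct episode type", which may not be the right proxy.

```python
import numpy as np

def time_gated_penalty(updates, timestamps, tau=1.0):
    """Hypothetical regularizer: pull temporally close updates together,
    push temporally distant ones toward orthogonality.

    updates:    (n, d) array, one flattened LoRA update per observation (assumed layout)
    timestamps: (n,) array of observation times
    tau:        assumed time scale controlling how fast the gate decays
    """
    updates = np.asarray(updates, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    if len(t) < 2:
        return 0.0  # no pairs to compare

    # Pairwise cosine similarity between updates.
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    unit = updates / np.maximum(norms, 1e-12)
    cos = unit @ unit.T

    # Gate is 1 for simultaneous observations and decays to 0 as the gap grows.
    gate = np.exp(-np.abs(t[:, None] - t[None, :]) / tau)

    # Near in time: penalize dissimilarity. Far in time: penalize overlap.
    pair_loss = gate * (1.0 - cos) ** 2 + (1.0 - gate) * cos ** 2

    # Average over distinct pairs (strict upper triangle).
    iu = np.triu_indices(len(t), k=1)
    return float(pair_loss[iu].mean())
```

With a gate like this, two identical updates cost nothing when they are adjacent in time and cost the maximum when they are far apart, which is the tension I'm asking about. A gate keyed on an episode-type label instead of the time gap would be closer to the second half of the suggestion.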