Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • Paper • 2511.08577 • Published 26 days ago • 104
Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning • Paper • 2510.14095 • Published Oct 15 • 5
Muon Outperforms Adam in Tail-End Associative Memory Learning • Paper • 2509.26030 • Published Sep 30 • 19
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders • Paper • 2506.14002 • Published Jun 16 • 5