Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion · 6 authors · submitted by Sylvestre
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models · 6 authors · submitted by philschmid
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages · 5 authors · submitted by zuom