@etemiz on Hugging Face: "Today's winner is Ling 1T with a score of 38! Btw AHA2 is in the works, with…"

etemiz

posted an update 14 days ago

Today's winner is Ling 1T with a score of 38!

Btw AHA2 is in the works, with more domains, better comparison LLMs and questions, overall better signal.

AmosTipton

12 days ago

Nice update — Ling 1T’s consistency here is impressive.

This kind of work is what pushed me to think more about when systems should refuse to report results instead of reporting early with caveats.

I’ve been building a small verifier that enforces “no result until durable” as a hard rule. It’s interesting how much trust behavior changes when refusal is allowed.

In this post

etemiz Emin Temiz
AmosTipton Founder & Chief Architect