Leaderboard benchmark?

#5
by djuna - opened

I'm curious how it compares to the original Mistral Small.

@ehartford I evaluated the MATH500 score for this:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Dolphin org

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed
