Leaderboard benchmark?

by djuna - opened Feb 8

djuna

Feb 8

I'm curious how it compares to the original Mistral Small.

PSM24

Feb 10

@ehartford I evaluated the MATH500 score for this:
Dolphin3-R1: 87%
Mistral-Small3: 70%

Dolphin org Feb 10

I'm not engaging with the Hugging Face leaderboard.
They can eval it, or not. Not my concern.

ehartford changed discussion status to closed Feb 10
