Add model-index with instruction benchmark evaluations

#45
by davidlms - opened

Added structured evaluation results from the README benchmark table; a sketch of the resulting frontmatter follows the list below:

Instruction Model Benchmarks (No Extended Thinking):

  • AIME 2025 (High school math): 9.3
  • GSM-Plus (Math problem-solving): 72.8
  • LiveCodeBench v4 (Competitive programming): 15.2
  • GPQA Diamond (Graduate-level reasoning): 35.7
  • IFEval (Instruction following): 76.7
  • MixEval Hard (Alignment): 26.9
  • BFCL (Tool calling): 92.3
  • Global MMLU (Multilingual Q&A): 53.5

Total: 8 benchmarks covering reasoning, math, coding, instruction-following, alignment, tool use, and multilingual capabilities.
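
For reference, a minimal sketch of what the added frontmatter could look like, following the Hub's model-index metadata schema. The model name, dataset `type` identifiers, and metric names below are illustrative placeholders, not necessarily the exact values in this PR, and only two of the eight entries are shown:

```yaml
model-index:
- name: Your-Model-Name            # placeholder; the PR uses the actual repo's model name
  results:
  - task:
      type: text-generation
    dataset:
      name: AIME 2025              # high school math benchmark
      type: aime-2025              # illustrative dataset identifier
    metrics:
    - type: pass@1                 # assumed metric type
      value: 9.3
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: IFEval                 # instruction-following benchmark
      type: ifeval                 # illustrative dataset identifier
    metrics:
    - type: accuracy               # assumed metric type
      value: 76.7
      name: accuracy
  # ...the remaining six benchmarks follow the same task/dataset/metrics pattern
```

The Hub reads this block from the model card's YAML frontmatter, between the `---` delimiters at the top of the README.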

This enables the model to appear on leaderboards and makes it easier to compare with other models.

Note: This PR adds benchmark metadata to the model card frontmatter and should not conflict with existing PRs #43, #32, and #16, which only modify the chat template.

