chejames committed commit 0736a8a (verified) · 1 parent: f3d1ab8

Update README.md

Files changed (1): README.md (+38 −2)
README.md CHANGED
@@ -36,6 +36,42 @@ https://github.com/omniomni-ai/omni-0-preview-models/raw/refs/heads/main/compute
 </figure>
 </p>
 
+# Benchmarks
+**Omni-0-mini-preview Benchmarks** (vs. base and alternative models)
+| **Benchmark**                 | **Omni**  | **Base**  | **Llama 3.2 3B** | **Gemma 3 4B** | **Llama 3.1 8B** |
+|-------------------------------|-----------|-----------|------------------|----------------|------------------|
+| MMLU STEM (4-shot CoT)        | **35.02** | 26.59     | 33.28            | 40.82          | 52.22            |
+| MMLU Science (4-shot CoT)     | **34.44** | 28.03     | 33.47            | 42.93          | 52.54            |
+| MMLU Technology (4-shot CoT)  | **41.07** | 30.86     | 45.28            | 46.74          | 63.72            |
+| MMLU Engineering (4-shot CoT) | **37.50** | 25.93     | 34.65            | 43.66          | 55.58            |
+| MMLU Math (4-shot CoT)        | **35.54** | 23.86     | 39.51            | 35.31          | 45.84            |
+| HumanEval (pass@1)            | **31.71** | 29.88     | 51.83            | 57.32          | 57.93            |
+| SciQ (0-shot)                 | **87.30** | 76.10     | 93.30            | 87.50          | 91.80            |
+| MATH (4-shot)                 | 15.66     | **16.12** | 28.44            | 26.38          | 29.56            |
+| ARC-Challenge (0-shot)        | **43.00** | 40.10     | 46.16            | 44.11          | 54.18            |
+| ARC-Easy (0-shot)             | **66.67** | 58.54     | 67.93            | 63.01          | 75.80            |
+| **Average**                   | **37.91** | 30.25     | 38.33            | 43.91          | 54.22            |
+| **Improvement over Base**     | **+25.32%** | --      | --               | --             | --               |
+
+<br>
+
+**Expert Model Benchmarks**
+| Benchmark                     | Science   | Technology | Engineering | Math      |
+|-------------------------------|-----------|------------|-------------|-----------|
+| MMLU Science (4-shot CoT)     | 26.69     | --         | --          | --        |
+| SciQ (0-shot)                 | 85.80     | --         | --          | --        |
+| ARC-Challenge (0-shot)        | 42.41     | --         | --          | --        |
+| ARC-Easy (0-shot)             | 66.96     | --         | --          | --        |
+| MMLU Technology (4-shot CoT)  | --        | 35.30      | --          | --        |
+| HumanEval (pass@1)            | --        | 32.93      | --          | --        |
+| MMLU Engineering (4-shot CoT) | --        | --         | 32.07       | --        |
+| MMLU Math (4-shot CoT)        | --        | --         | --          | 30.83     |
+| MATH (4-shot)                 | --        | --         | --          | 18.76     |
+| Expert Average                | **36.28** | **34.83**  | **32.07**   | **28.82** |
+| Base Average                  | 35.59     | 29.79      | 30.86       | 22.57     |
+| Improvement                   | 1.94%     | 16.92%     | 3.92%       | 27.69%    |
+> Note: "Expert Average" is the expert model's average over the benchmarks for its STEM domain; "Base Average" is the base model's average over the same benchmarks.
+
 # Models
 Omni comes in a total of 5 models:
 
@@ -111,7 +147,7 @@ uv pip install vllm --torch-backend=auto
 
 After that, run this command to start a server with Omni via vLLM
 ```bash
-vllm serve omniomni/omni-0-mini-preview
+vllm serve omniomni/omni-0-mini-preview
 ```
 
 To use Omni with vLLM without creating a server, run this code to generate outputs within a Python file
@@ -123,7 +159,7 @@ from vllm import LLM, SamplingParams
 prompts = [
 "Hello, my name is",
 "The president of the United States is",
-"The answer to x^2 + 2x + 1 is",
+"The capital of France is",
 "The future of AI is",
 ]
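
The "Improvement" rows added in this commit are relative gains of the tuned model's average over the base model's average. A quick sketch of that arithmetic, using only the Average rows from the tables above:

```python
def relative_improvement(tuned_avg: float, base_avg: float) -> float:
    """Percentage gain of the tuned model's average over the base average."""
    return (tuned_avg - base_avg) / base_avg * 100

# Omni-0-mini-preview vs. its base (Average row of the first table)
print(round(relative_improvement(37.91, 30.25), 2))  # -> 25.32

# Per-domain expert models vs. base (Expert/Base Average rows of the second table)
for domain, expert, base in [
    ("Science", 36.28, 35.59),
    ("Technology", 34.83, 29.79),
    ("Engineering", 32.07, 30.86),
    ("Math", 28.82, 22.57),
]:
    print(domain, round(relative_improvement(expert, base), 2))
```

This reproduces every Improvement figure in the diff (25.32%, 1.94%, 16.92%, 3.92%, 27.69%) from the stated averages.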
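
Once `vllm serve` is running, it exposes an OpenAI-compatible REST API. A minimal sketch of querying it, assuming vLLM's defaults (host `localhost`, port `8000`, the `/v1/completions` route); the sampling parameters here are illustrative, not from the README:

```python
import json
import urllib.request

# vLLM's OpenAI-compatible endpoint (default host/port; adjust as needed)
URL = "http://localhost:8000/v1/completions"

payload = {
    "model": "omniomni/omni-0-mini-preview",
    "prompt": "The capital of France is",
    "max_tokens": 32,
    "temperature": 0.8,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Requires the `vllm serve` command above to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```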