Update README.md
</figure>
</p>

# Benchmarks

**Omni-0-mini-preview Benchmarks**

| **Benchmark** | **Omni** | **Base** | **Alternative Models** | **Llama 3.2 3B** | **Gemma 3 4B** | **Llama 3.1 8B** |
|---------------------------------|-----------|-----------|------------------------|------------------|----------------|------------------|
| MMLU STEM (4-shot CoT)          | **35.02** | 26.59     |                        | 33.28            | 40.82          | 52.22            |
| MMLU Science (4-shot CoT)       | **34.44** | 28.03     |                        | 33.47            | 42.93          | 52.54            |
| MMLU Technology (4-shot CoT)    | **41.07** | 30.86     |                        | 45.28            | 46.74          | 63.72            |
| MMLU Engineering (4-shot CoT)   | **37.50** | 25.93     |                        | 34.65            | 43.66          | 55.58            |
| MMLU Math (4-shot CoT)          | **35.54** | 23.86     |                        | 39.51            | 35.31          | 45.84            |
| HumanEval (pass@1)              | **31.71** | 29.88     |                        | 51.83            | 57.32          | 57.93            |
| SciQ (0-shot)                   | **87.30** | 76.10     |                        | 93.30            | 87.50          | 91.80            |
| MATH (4-shot)                   | 15.66     | **16.12** |                        | 28.44            | 26.38          | 29.56            |
| ARC-Challenge (0-shot)          | **43.00** | 40.10     |                        | 46.16            | 44.11          | 54.18            |
| ARC-Easy (0-shot)               | **66.67** | 58.54     |                        | 67.93            | 63.01          | 75.80            |
| **Average**                     | **37.91** | 30.25     |                        | 38.33            | 43.91          | 54.22            |
| **Improvement**                 | **25.32%**| Base      |                        |                  |                |                  |

<br>

**Expert Model Benchmarks**

| Benchmark                     | Science   | Technology | Engineering | Math      |
|-------------------------------|-----------|------------|-------------|-----------|
| MMLU Science (4-shot CoT)     | 26.69     | --         | --          | --        |
| SciQ (0-shot)                 | 85.80     | --         | --          | --        |
| ARC-Challenge (0-shot)        | 42.41     | --         | --          | --        |
| ARC-Easy (0-shot)             | 66.96     | --         | --          | --        |
| MMLU Technology (4-shot CoT)  | --        | 35.30      | --          | --        |
| HumanEval (pass@1)            | --        | 32.93      | --          | --        |
| MMLU Engineering (4-shot CoT) | --        | --         | 32.07       | --        |
| MMLU Math (4-shot CoT)        | --        | --         | --          | 30.83     |
| MATH (4-shot)                 | --        | --         | --          | 18.76     |
| Expert Average                | **36.28** | **34.83**  | **32.07**   | **28.82** |
| Base Average                  | 35.59     | 29.79      | 30.86       | 22.57     |
| Improvement                   | 1.94%     | 16.92%     | 3.92%       | 27.69%    |

> Note: "Expert Average" is each expert model's average score on the benchmarks for its own STEM domain; "Base Average" is the base model's average on those same benchmarks.

# Models
Omni comes in a total of 5 models:

After that, run this command to start a server with Omni via vLLM:

```bash
vllm serve omniomni/omni-0-mini-preview
```

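Once the server is up, it exposes an OpenAI-compatible HTTP API, by default at `http://localhost:8000/v1`. A minimal sketch of a completion request against it, assuming that default port and the model name above:

```python
import json

# Build a request for vLLM's OpenAI-compatible /v1/completions endpoint.
# Assumes `vllm serve omniomni/omni-0-mini-preview` is running on the
# default port 8000; adjust the URL and model name for your setup.
url = "http://localhost:8000/v1/completions"
payload = {
    "model": "omniomni/omni-0-mini-preview",
    "prompt": "The future of AI is",
    "max_tokens": 64,
    "temperature": 0.7,
}
body = json.dumps(payload).encode()

# To actually send the request (requires the server to be running):
# import urllib.request
# req = urllib.request.Request(
#     url, data=body, headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
print(body.decode())
```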
To use Omni with vLLM without creating a server, run this code to generate outputs within a Python file:
```python
prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
```
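The prompt list above is a fragment of a larger script. Per the vLLM quickstart (a sketch under the assumption that the README follows it, not its exact code), the surrounding generation step looks like:

```python
def generate(prompts):
    """Run offline generation with vLLM over a list of prompt strings."""
    # Lazy import: vllm is a heavy dependency that pulls in torch.
    from vllm import LLM, SamplingParams

    sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    llm = LLM(model="omniomni/omni-0-mini-preview")
    outputs = llm.generate(prompts, sampling_params)
    # Each RequestOutput holds one or more completions; take the first.
    return [out.outputs[0].text for out in outputs]
```

Calling `generate(prompts)` downloads the model weights on first use and returns one completion string per prompt.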