chejames committed commit 0736a8a (verified) · 1 parent: f3d1ab8

Update README.md

Files changed (1): README.md (+38 −2)
README.md CHANGED
@@ -36,6 +36,42 @@ https://github.com/omniomni-ai/omni-0-preview-models/raw/refs/heads/main/compute
 </figure>
 </p>
 
+# Benchmarks
+**Omni-0-mini-preview Benchmarks** (vs. base and alternative models)
+| **Benchmark**                 | **Omni**  | **Base**  | **Llama 3.2 3B** | **Gemma 3 4B** | **Llama 3.1 8B** |
+|-------------------------------|-----------|-----------|------------------|----------------|------------------|
+| MMLU STEM (4-shot CoT)        | **35.02** | 26.59     | 33.28            | 40.82          | 52.22            |
+| MMLU Science (4-shot CoT)     | **34.44** | 28.03     | 33.47            | 42.93          | 52.54            |
+| MMLU Technology (4-shot CoT)  | **41.07** | 30.86     | 45.28            | 46.74          | 63.72            |
+| MMLU Engineering (4-shot CoT) | **37.50** | 25.93     | 34.65            | 43.66          | 55.58            |
+| MMLU Math (4-shot CoT)        | **35.54** | 23.86     | 39.51            | 35.31          | 45.84            |
+| HumanEval (pass@1)            | **31.71** | 29.88     | 51.83            | 57.32          | 57.93            |
+| SciQ (0-shot)                 | **87.30** | 76.10     | 93.30            | 87.50          | 91.80            |
+| MATH (4-shot)                 | 15.66     | **16.12** | 28.44            | 26.38          | 29.56            |
+| ARC-Challenge (0-shot)        | **43.00** | 40.10     | 46.16            | 44.11          | 54.18            |
+| ARC-Easy (0-shot)             | **66.67** | 58.54     | 67.93            | 63.01          | 75.80            |
+| **Average**                   | **37.91** | 30.25     | 38.33            | 43.91          | 54.22            |
+| **Improvement over Base**     | **+25.32%** | --      | --               | --             | --               |
+
+<br>
+
+**Expert Model Benchmarks**
+| Benchmark                     | Science   | Technology | Engineering | Math      |
+|-------------------------------|-----------|------------|-------------|-----------|
+| MMLU Science (4-shot CoT)     | 26.69     | --         | --          | --        |
+| SciQ (0-shot)                 | 85.80     | --         | --          | --        |
+| ARC-Challenge (0-shot)        | 42.41     | --         | --          | --        |
+| ARC-Easy (0-shot)             | 66.96     | --         | --          | --        |
+| MMLU Technology (4-shot CoT)  | --        | 35.30      | --          | --        |
+| HumanEval (pass@1)            | --        | 32.93      | --          | --        |
+| MMLU Engineering (4-shot CoT) | --        | --         | 32.07       | --        |
+| MMLU Math (4-shot CoT)        | --        | --         | --          | 30.83     |
+| MATH (4-shot)                 | --        | --         | --          | 18.76     |
+| Expert Average                | **36.28** | **34.83**  | **32.07**   | **28.82** |
+| Base Average                  | 35.59     | 29.79      | 30.86       | 22.57     |
+| Improvement                   | 1.94%     | 16.92%     | 3.92%       | 27.69%    |
+> Note: "Expert Average" is the expert model's average over the benchmarks for its STEM domain; "Base Average" is the base model's average over the same benchmarks.
+
 # Models
 Omni comes in a total of 5 models:
 
@@ -111,7 +147,7 @@ uv pip install vllm --torch-backend=auto
 
 After that, run this command to start a server with Omni via vLLM
 ```bash
-vllm serve omniomni/omni-0-mini-preview
+vllm serve omniomni/omni-0-mini-preview
 ```
 
 To use Omni with vLLM without creating a server, run this code to generate outputs within a Python file
@@ -123,7 +159,7 @@ from vllm import LLM, SamplingParams
 prompts = [
 "Hello, my name is",
 "The president of the United States is",
-"The answer to x^2 + 2x + 1 is",
+"The capital of France is",
 "The future of AI is",
 ]
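
The "Improvement" rows added in this commit are relative gains of the tuned model's average over the base model's average. A quick sketch of that arithmetic, using only the Average rows from the tables above:

```python
def relative_improvement(tuned_avg: float, base_avg: float) -> float:
    """Percentage gain of the tuned model's average over the base average."""
    return (tuned_avg - base_avg) / base_avg * 100

# Omni-0-mini-preview vs. its base (Average row of the first table)
print(round(relative_improvement(37.91, 30.25), 2))  # -> 25.32

# Per-domain expert models vs. base (Expert/Base Average rows of the second table)
for domain, expert, base in [
    ("Science", 36.28, 35.59),
    ("Technology", 34.83, 29.79),
    ("Engineering", 32.07, 30.86),
    ("Math", 28.82, 22.57),
]:
    print(domain, round(relative_improvement(expert, base), 2))
```

This reproduces every Improvement figure in the diff (25.32%, 1.94%, 16.92%, 3.92%, 27.69%) from the stated averages.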
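
Once `vllm serve` is running, it exposes an OpenAI-compatible REST API. A minimal sketch of querying it, assuming vLLM's defaults (host `localhost`, port `8000`, the `/v1/completions` route); the sampling parameters here are illustrative, not from the README:

```python
import json
import urllib.request

# vLLM's OpenAI-compatible endpoint (default host/port; adjust as needed)
URL = "http://localhost:8000/v1/completions"

payload = {
    "model": "omniomni/omni-0-mini-preview",
    "prompt": "The capital of France is",
    "max_tokens": 32,
    "temperature": 0.8,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Requires the `vllm serve` command above to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```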