patricklifixie committed on
Commit 17a79bc · verified · 1 Parent(s): 247f9a4

Update README.md

  1. README.md +0 -6
README.md CHANGED
@@ -126,12 +126,6 @@ Supervised speech instruction finetuning via knowledge-distillation. For more in
  - **Training regime:** BF16 mixed precision training
  - **Hardware used:** 8x H100 GPUs
 
- #### Speeds, Sizes, Times
-
- The current version of Ultravox, when invoked with audio content, has a time-to-first-token (TTFT) of approximately 150ms, and a tokens-per-second rate of ~50-100 when using an A100-40GB GPU, all using a text-based LLM (Llama, Gemma, or Qwen) backbone.
-
- Check out the audio tab on [TheFastest.ai](https://thefastest.ai/?m=audio) for daily benchmarks and a comparison with other existing models.
-
  ## Evaluation
 
  Evaluations are conducted on covost2 (speech translation measured in BLEU), fleurs and ultravox_calls (speech recognition measured in WER), big bench audio (audio reasoning measured in accuracy), as well as musan and ultravox_unintelligible (noise/unintelligible speech detection measured in recall).
 