How to use jonatasgrosman/wav2vec2-large-english with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="jonatasgrosman/wav2vec2-large-english")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForCTC

    processor = AutoProcessor.from_pretrained("jonatasgrosman/wav2vec2-large-english")
    model = AutoModelForCTC.from_pretrained("jonatasgrosman/wav2vec2-large-english")
Commit · 7fee82d
Parent(s): e3bbe4d
update README
README.md CHANGED
@@ -159,7 +159,7 @@ print(f"CER: {cer.compute(predictions=predictions, references=references, chunk_
 
 **Test Result**:
 
-In the table below I report the Word Error Rate (WER) and the Character Error Rate (CER) of the model. I ran the evaluation script described above on other models as well (on 2021-05-20). Note that the table below may show different results from those already reported, this may have been caused due to some specificity of the other evaluation scripts used..
+In the table below I report the Word Error Rate (WER) and the Character Error Rate (CER) of the model. I ran the evaluation script described above on other models as well (on 2021-05-20). Note that the table below may show different results from those already reported, this may have been caused due to some specificity of the other evaluation scripts used. Initially, I've tested the model only using the Common Voice dataset. Later I've also tested the model using the LibriSpeech and TIMIT datasets, which are better-behaved datasets than the Common Voice, containing only examples in US English extracted from audiobooks.
 
 ---
 
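
For readers unfamiliar with the two metrics named in the changed paragraph, the following is a minimal sketch of how WER and CER can be computed with a plain Levenshtein distance. It is an illustration only, not the evaluation script the README refers to: the function names (`edit_distance`, `wer`, `cer`) are hypothetical, errors are aggregated over the whole corpus, and spaces are counted as characters in the CER, which may differ from what a given evaluation library does.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(references, predictions):
    """Word Error Rate: word-level edits divided by the number of reference words."""
    errors = sum(edit_distance(r.split(), p.split())
                 for r, p in zip(references, predictions))
    total_words = sum(len(r.split()) for r in references)
    return errors / total_words


def cer(references, predictions):
    """Character Error Rate: character-level edits divided by reference characters."""
    errors = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return errors / total_chars


if __name__ == "__main__":
    # Hypothetical transcripts, chosen only to exercise the functions.
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print(f"WER: {wer(refs, hyps) * 100:.2f}%")
    print(f"CER: {cer(refs, hyps) * 100:.2f}%")
```

Because both rates are normalised by the reference length, differences in text normalisation (casing, punctuation, handling of spaces) between evaluation scripts can shift the reported numbers, which is consistent with the caveat in the paragraph above.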