Typhoon-isan-asr-whisper
Typhoon Isan ASR Whisper is a specialized, fine-tuned version of the biodatlab/whisper-th-medium-combined model, optimized specifically for the Isan dialect of the Thai language. Built for high-accuracy offline transcription, it delivers state-of-the-art performance on Isan speech. This enables users to host their own ASR service for Isan dialect recognition, reducing costs and avoiding the need to send sensitive data to third-party cloud services.
The model is based on OpenAI's Whisper architecture, utilizing the robust Thai-enhanced foundation from biodatlab to ensure superior understanding of regional tones and vocabulary.
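For self-hosted transcription, a minimal sketch using the Hugging Face `transformers` speech-recognition pipeline might look like the following. The repository name `scb10x/typhoon-isan-asr-whisper` is taken from this card. Everything else is an assumption rather than the authors' documented recipe: the helper names `build_transcriber` and `transcribe`, the 30-second chunk length, the `language="th"` decoding hint, and the file name in the usage note.

```python
MODEL_ID = "scb10x/typhoon-isan-asr-whisper"  # repository name from this model card


def build_transcriber(device: str = "cpu"):
    """Create an ASR pipeline for the model (downloads the weights on first use)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        device=device,
        chunk_length_s=30,  # assumption: split long recordings into 30 s windows
    )


def transcribe(audio_path: str, transcriber=None) -> str:
    """Return the transcript for one audio file."""
    if transcriber is None:
        transcriber = build_transcriber()
    # Assumption: decode as Thai, since the base model is a Thai Whisper fine-tune.
    result = transcriber(
        audio_path,
        generate_kwargs={"language": "th", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("isan_sample.wav")` (a hypothetical local file) would load the model once and return the transcript as a string. Reading audio from a file path typically requires `ffmpeg` to be installed, and passing a pre-built `transcriber` avoids reloading the model for every file.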
Try our demo available on Demo
Code / Examples available on Github
Release Blog available on OpenTyphoon Blog
Performance
Note on Baseline: The scb10x/whisper-medium-slscu-nectec model included in the comparison is one we fine-tuned specifically for this benchmark using existing dialect data from NECTEC and SLSCU. It serves as a representative baseline for what can be achieved with public data, and is distinct from the SLSCU_korat_model (the prominent previous work for Isan dialect ASR). Including it makes it possible to measure the gap between capabilities derived from previously available resources and the new Typhoon Isan ASR.
Key Findings
- Outperforming Proprietary State-of-the-Art: The typhoon-isan-asr-whisper model achieves a Character Error Rate (CER) of 0.0885, lower than the 0.1020 of Gemini-2.5-pro. This result indicates that a specialized, fine-tuned open model can deliver better accuracy on the Isan dialect than a much larger, general-purpose proprietary system.
- The Definitive Choice for Accuracy: Among all architectures tested, including latency-optimized realtime models and historical baselines, the Whisper-based model has the lowest CER. While our typhoon-isan-asr-realtime model is highly competitive (0.1065), typhoon-isan-asr-whisper is the most accurate, making it the preferred choice for offline transcription where precision is paramount.
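To make the CER figures above concrete, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. The function names are illustrative. The exact text normalization behind the reported numbers (whitespace handling, treatment of Thai combining marks) is not stated in this card, so this version simply counts Unicode code points and should not be read as the benchmark's scoring script.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_char != hyp_char),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Fraction by which new_cer is lower than baseline_cer."""
    return (baseline_cer - new_cer) / baseline_cer


# Figures reported above: 0.0885 (Whisper), 0.1020 (Gemini-2.5-pro), 0.1065 (realtime).
print(round(relative_reduction(0.1020, 0.0885), 3))  # 0.132
print(round(relative_reduction(0.1065, 0.0885), 3))  # 0.169
```

Under these definitions, the reported 0.0885 corresponds to roughly 13% fewer character errors than Gemini-2.5-pro and roughly 17% fewer than the realtime model.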
Follow us
https://twitter.com/opentyphoon
Support
Model tree for scb10x/typhoon-isan-asr-whisper
Base model
openai/whisper-medium