Typhoon-isan-asr-whisper


Typhoon Isan ASR Whisper is a specialized, fine-tuned version of the biodatlab/whisper-th-medium-combined model, optimized specifically for the Isan dialect of the Thai language. Built for high-accuracy offline transcription, it delivers state-of-the-art performance on Isan speech. This enables users to host their own ASR service for Isan dialect recognition, reducing costs and avoiding the need to send sensitive data to third-party cloud services.

The model is based on OpenAI's Whisper architecture and builds on the Thai fine-tuned checkpoint from biodatlab, which gives it a stronger starting point for regional tones and vocabulary.
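Since the model is a Whisper fine-tune, one plausible way to run it offline is through the Hugging Face `transformers` speech-recognition pipeline. The sketch below is an illustration, not the authors' reference code: it assumes the checkpoint loads with the standard pipeline under the repository name `scb10x/typhoon-isan-asr-whisper`, and the helper names `build_transcriber` and `transcribe` are hypothetical.

```python
# Minimal sketch, assuming the checkpoint works with the standard
# transformers ASR pipeline. Helper names are illustrative only.

MODEL_ID = "scb10x/typhoon-isan-asr-whisper"  # repository name from the model page


def build_transcriber(model_id: str = MODEL_ID, device: int = -1):
    """Create a speech-recognition pipeline (device=-1 selects the CPU)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=device,
        chunk_length_s=30,  # Whisper operates on 30-second audio windows
    )


def transcribe(audio_path: str, transcriber=None) -> str:
    """Transcribe one audio file and return the recognized text."""
    if transcriber is None:
        transcriber = build_transcriber()
    result = transcriber(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()
```

Calling `transcribe("sample.wav")` (a placeholder file name) would download the weights on first use and then run locally, which matches the self-hosting use case described above. Building the pipeline once and passing it as `transcriber` avoids reloading the model for every file.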

Try our demo available on Demo

Code / Examples available on Github

Release Blog available on OpenTyphoon Blog


Performance

Figure: CER comparison across models

Note on Baseline: The scb10x/whisper-medium-slscu-nectec model included in the comparison is one we fine-tuned specifically for this benchmark using existing dialect data from NECTEC and SLSCU. It serves as a representative baseline for what public data can achieve, and it is distinct from the SLSCU_korat_model (the prominent previous work on Isan dialect ASR). Including it shows the gap between models built on previously available resources and the new Typhoon Isan ASR.

Key Findings

  • Outperforming Proprietary State-of-the-Art: The typhoon-isan-asr-whisper model achieves a Character Error Rate (CER) of 0.0885, compared with 0.1020 for Gemini-2.5-pro. That is an absolute difference of 0.0135, or roughly 13% fewer character errors. The result indicates that a specialized, fine-tuned open model can be more accurate on the Isan dialect than a much larger general-purpose proprietary system.
  • The Most Accurate Option Tested: Among all architectures in the comparison, including latency-optimized realtime models and historical baselines, the Whisper-based model has the lowest CER. Our typhoon-isan-asr-realtime model is competitive (0.1065), but typhoon-isan-asr-whisper is the more accurate of the two, making it the preferred choice for offline transcription where precision matters more than latency.
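For readers unfamiliar with the metric, CER is commonly defined as the character-level edit distance between the reference and the model output, divided by the reference length. The sketch below follows that common definition and also reproduces the relative gap between the two reported scores. It is not the evaluation script used for this benchmark; any text normalization applied there is unknown, and treating each Unicode code point as one character is an assumption.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fraction by which `candidate` lowers the error rate of `baseline`."""
    return (baseline - candidate) / baseline


# One wrong character out of four gives a CER of 0.25.
print(cer("abcd", "abxd"))
# Relative gap between the reported scores 0.1020 and 0.0885 (about 0.13).
print(round(relative_reduction(0.1020, 0.0885), 2))
```

Because CER is normalized by reference length, corpus-level scores are usually computed by summing edit distances and reference lengths over all utterances rather than averaging per-utterance values; which convention the benchmark used is not stated here.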

Follow us

https://twitter.com/opentyphoon

Support

https://discord.gg/us5gAYmrxw

Safetensors · Model size: 0.8B params · Tensor type: F32