Whisper Medium Pruned (29.8% sparsity)

This is a pruned version of islomov/navaistt_v2_medium with 29.8% of parameters removed using magnitude-based structured pruning.

Model Details

  • Base Model: islomov/navaistt_v2_medium
  • Pruning Method: L1-norm magnitude-based structured pruning
  • Sparsity Level: 29.8%
  • Original Parameters: 763,857,920
  • Parameters After Pruning: 536,531,512
  • Size Reduction: 29.8%
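The size reduction can be checked directly from the two parameter counts listed above. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# Parameter counts as stated in the Model Details list.
original_params = 763_857_920
remaining_params = 536_531_512

# Parameters removed by pruning, and the reduction as a percentage.
removed_params = original_params - remaining_params
reduction_pct = 100 * removed_params / original_params

print(f"removed: {removed_params:,} parameters ({reduction_pct:.1f}%)")
```

The ratio comes out just under 29.8%, which matches the figure reported in this card after rounding to one decimal place.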

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

# Load the pruned model and its processor
processor = WhisperProcessor.from_pretrained("your-username/whisper-medium-pruned")
model = WhisperForConditionalGeneration.from_pretrained("your-username/whisper-medium-pruned")

# Use for inference
def transcribe_audio(audio_path):
    # Load the file as mono audio resampled to the 16 kHz rate Whisper expects.
    # librosa is an assumption here; any loader that returns a 1-D float array works.
    audio, _ = librosa.load(audio_path, sr=16000)

    # Convert the waveform to log-mel input features
    audio_input = processor(audio, sampling_rate=16000, return_tensors="pt")

    # Generate transcription
    with torch.no_grad():
        predicted_ids = model.generate(audio_input.input_features)

    # Decode transcription
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
    return transcription[0]

Performance

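The pruned model is intended to retain most of the base model's accuracy with 29.8% fewer parameters. It was fine-tuned after pruning to recover accuracy lost in the process. This card does not report word error rate or latency measurements, so the size of any accuracy or speed difference should be measured on your own data.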

Training Details

  • Pruning was applied to linear layers whose input and output dimensions are both larger than 64
  • Model was fine-tuned after pruning to recover performance
  • Calibration dataset: Mozilla Common Voice 11.0 (English)
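As an illustration of the idea behind L1-norm structured pruning, the sketch below ranks the rows of a weight matrix by their L1 norm and removes the lowest-ranked fraction. It is a minimal pure-Python sketch, not the procedure used to produce this model: the function names, the choice of pruning rows (output units) rather than columns, and the per-matrix `amount` are all assumptions, since the card does not specify them.

```python
def l1_row_norms(weight):
    """L1 norm (sum of absolute values) of each row of a 2-D weight matrix."""
    return [sum(abs(w) for w in row) for row in weight]

def is_eligible(weight, min_dim=64):
    """Size filter from the list above: both dimensions must exceed min_dim."""
    return len(weight) > min_dim and len(weight[0]) > min_dim

def prune_rows(weight, amount):
    """Remove the `amount` fraction of rows with the smallest L1 norm."""
    n_drop = int(round(amount * len(weight)))
    norms = l1_row_norms(weight)
    # Row indices in ascending order of norm; the first n_drop are removed.
    drop = set(sorted(range(len(weight)), key=lambda i: norms[i])[:n_drop])
    return [row for i, row in enumerate(weight) if i not in drop]

# Toy 4x3 matrix, far below the >64 threshold, used only to show the ranking.
toy = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.5, -0.5],
    [-0.1, 0.0, 0.1],
]
pruned = prune_rows(toy, amount=0.5)
print(len(toy), "rows ->", len(pruned), "rows")
```

Removing whole rows shrinks the matrix itself, which is what distinguishes structured pruning from unstructured pruning, where individual weights are zeroed and the tensor shape stays the same. In a real network, dropping output units from one layer also requires trimming the matching inputs of the next layer.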

Limitations

  • May have slightly reduced accuracy compared to the original model
  • Performance may vary depending on the specific audio domain
  • Recommended to evaluate on your specific use case
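One common way to evaluate a speech recognition model on your own data is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python version with no text normalization; the function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical example: one substitution and one deletion over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.3f}")
```

Running the same test set through both the base model and the pruned model, then comparing the two WER values, gives a direct measure of how much accuracy pruning cost in your domain.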

Citation

If you use this model, please cite the original Whisper paper:

@inproceedings{radford2023robust,
  title={Robust speech recognition via large-scale weak supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  booktitle={International Conference on Machine Learning},
  pages={28492--28518},
  year={2023},
  organization={PMLR}
}