Whisper Medium Pruned (29.8% sparsity)

This is a pruned version of islomov/navaistt_v2_medium with 29.8% of parameters removed using magnitude-based structured pruning.

Model Details

Base Model: islomov/navaistt_v2_medium
Pruning Method: L1-norm magnitude-based structured pruning
Sparsity Level: 29.8%
Original Parameters: 763,857,920
Parameters After Pruning: 536,531,512
Size Reduction: 29.8%
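
The sparsity figure can be checked from the two parameter counts. The short calculation below assumes the second count is the number of parameters that remain after pruning, not the number removed; the variable names are illustrative only.

```python
# Parameter counts taken from the Model Details list.
original_params = 763_857_920
remaining_params = 536_531_512  # assumed to be the count left after pruning

# Sparsity is the fraction of the original parameters that were removed.
removed_params = original_params - remaining_params
sparsity = removed_params / original_params

print(f"removed:  {removed_params:,}")   # removed:  227,326,408
print(f"sparsity: {sparsity:.1%}")       # sparsity: 29.8%
```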

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa  # assumption: any loader that returns 16 kHz mono float audio works
import torch

# Load the pruned model
processor = WhisperProcessor.from_pretrained("your-username/whisper-medium-pruned")
model = WhisperForConditionalGeneration.from_pretrained("your-username/whisper-medium-pruned")

# Use for inference
def transcribe_audio(audio_path):
    # Load the file as mono audio resampled to 16 kHz, the rate Whisper expects
    audio, _ = librosa.load(audio_path, sr=16000)

    # Convert the waveform to log-mel input features
    audio_input = processor(audio, sampling_rate=16000, return_tensors="pt")

    # Generate transcription
    with torch.no_grad():
        predicted_ids = model.generate(audio_input.input_features)

    # Decode transcription
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
    return transcription[0]

Performance

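The pruned model is intended to retain most of the base model's accuracy with roughly 30% fewer parameters. It was fine-tuned after pruning to recover accuracy lost in the pruning step. This card does not report word error rate or latency figures, so measure both on your own data before relying on the model.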

Training Details

Pruning was applied to linear layers whose input and output dimensions are both larger than 64
The model was fine-tuned after pruning to recover performance
Calibration dataset: Mozilla Common Voice 11.0 (English)
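
The pruning step described above can be sketched with PyTorch's built-in pruning utilities. This is a minimal illustration, not the script used to produce this model: the helper name `prune_linear_layers`, the toy network, and the `amount` value are assumptions. Note that `torch.nn.utils.prune.ln_structured` zeroes whole rows of a weight matrix and does not shrink the tensor itself.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune


def prune_linear_layers(model, amount=0.3, min_dim=64):
    """Zero the output rows with the smallest L1 norm in each large linear layer.

    Hypothetical helper: only nn.Linear layers whose input and output sizes
    both exceed `min_dim` are touched. Returns the names of the pruned layers.
    """
    pruned = []
    for name, module in list(model.named_modules()):
        if not isinstance(module, nn.Linear):
            continue
        if module.in_features <= min_dim or module.out_features <= min_dim:
            continue
        # n=1 selects the L1 norm; dim=0 prunes whole output rows.
        prune.ln_structured(module, name="weight", amount=amount, n=1, dim=0)
        # Bake the mask into the weight and drop the reparametrization.
        prune.remove(module, "weight")
        pruned.append(name)
    return pruned


# Toy stand-in for a real model: the second layer is too small to qualify.
toy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 32))
pruned_names = prune_linear_layers(toy, amount=0.25)

# Count output rows of the first layer that are now entirely zero.
zero_rows = int((toy[0].weight.abs().sum(dim=1) == 0).sum())
print(pruned_names, zero_rows)
```

In a real workflow the pruned model would then be fine-tuned, as noted in the list above, since zeroing rows without retraining usually costs accuracy.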

Limitations

Accuracy may be slightly lower than that of the original model
Performance may vary depending on the audio domain
Evaluate the model on your own data before deploying it
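
One common way to run such an evaluation is to compare transcriptions against reference text using word error rate (WER). The function below is a small self-contained sketch, assuming whitespace tokenization and case-insensitive matching; the name `word_error_rate` is illustrative, and established evaluation libraries apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One word missing out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Computing this score for both the base model and the pruned model on the same audio gives a direct measure of how much accuracy the pruning cost in a given domain.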

Citation

If you use this model, please cite the original Whisper paper:

@inproceedings{radford2022robust,
  title={Robust speech recognition via large-scale weak supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  booktitle={International Conference on Machine Learning},
  pages={28492--28518},
  year={2023},
  organization={PMLR}
}
