OpusMT-En-Zh: Optimized for Qualcomm Devices
OpusMT-En-Zh is a neural machine translation model that translates English text into Chinese. It is based on the Marian transformer architecture and has been optimized for edge inference by splitting it into separate encoder and decoder components with modified attention mechanisms. It is intended for practical, real-world translation tasks. The model accepts input sequences of up to 256 tokens and generates Chinese translations of up to 256 tokens.
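The encoder/decoder split described above can be sketched as a greedy decoding loop: the encoder runs once on a fixed-length input, and the decoder is then called once per output token. This is a minimal illustration, not the library's implementation. The names `encode`, `decode_step`, `bos_id`, `eos_id`, and `pad_id` are hypothetical, and the toy stand-ins below replace the real exported models, whose actual input and output signatures (for example, any key/value cache tensors) are not described in this document.

```python
from typing import Callable, List, Sequence

MAX_LEN = 256  # maximum input and output length stated for this model


def pad_to_max(token_ids: Sequence[int], pad_id: int, max_len: int = MAX_LEN) -> List[int]:
    """Truncate or right-pad token ids to the fixed length the encoder expects."""
    ids = list(token_ids[:max_len])
    return ids + [pad_id] * (max_len - len(ids))


def greedy_translate(
    src_ids: Sequence[int],
    encode: Callable[[List[int]], object],
    decode_step: Callable[[object, List[int]], int],
    bos_id: int,
    eos_id: int,
    pad_id: int,
    max_new_tokens: int = MAX_LEN,
) -> List[int]:
    """Run the encoder once, then call the decoder one token at a time."""
    memory = encode(pad_to_max(src_ids, pad_id))  # single encoder pass
    out: List[int] = [bos_id]
    for _ in range(max_new_tokens):
        next_id = decode_step(memory, out)  # one decoder pass per output token
        if next_id == eos_id:
            break
        out.append(next_id)
    return out[1:]  # drop the start token


# Toy stand-ins (assumptions for illustration only): the "encoder" returns its
# input unchanged and the "decoder" copies the source token at each position.
PAD, EOS, BOS = 0, 1, 2


def toy_encode(ids: List[int]) -> List[int]:
    return ids


def toy_decode_step(memory: List[int], prefix: List[int]) -> int:
    tok = memory[len(prefix) - 1]
    return EOS if tok == PAD else tok


result = greedy_translate([5, 6, 7], toy_encode, toy_decode_step, BOS, EOS, PAD)
print(result)
```

In a real deployment, `encode` and `decode_step` would wrap calls into the chosen runtime, and the token ids would come from the model's tokenizer.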
This model is based on the implementation of OpusMT-En-Zh found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export the model with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | Download |
| QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 | QAIRT 2.42 | Download |
| QNN_DLC | float | Universal | QAIRT 2.43 | Download |
| TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | Download |
For more device-specific assets and performance metrics, visit OpusMT-En-Zh on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See the OpusMT-En-Zh repository on GitHub for usage instructions.
Model Details
Model Type: Text generation
Model Stats:
- Model checkpoint: Helsinki-NLP/opus-mt-en-zh
- Max input sequence length: 256 tokens (English text)
- Max output sequence length: 256 tokens
- Number of parameters (OpusMTEncoder): ~74M
- Model size (OpusMTEncoder) (float): ~280 MB
- Number of parameters (OpusMTDecoder): ~74M
- Model size (OpusMTDecoder) (float): ~280 MB
- Number of encoder layers: 6
- Number of decoder layers: 6
- Attention heads: 8
- Hidden dimension: 512
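The parameter counts and model sizes above can be cross-checked with simple arithmetic. The sketch below assumes that "float" precision means 32-bit weights (4 bytes per parameter) and that the reported size is close to raw weight storage; neither assumption is stated in this document, and the helper name `float_model_size_mib` is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: "float" precision means 32-bit weights


def float_model_size_mib(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Approximate weight storage in MiB (2**20 bytes), ignoring file-format overhead."""
    return num_params * bytes_per_param / 2**20


approx_params = 74_000_000  # "~74M" from the stats above
size_mib = float_model_size_mib(approx_params)
print(f"{size_mib:.0f} MiB")
```

Under these assumptions the estimate comes out at roughly 282 MiB, which is consistent with the listed size of about 280 MB for each component.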
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| OpusMTDecoder | ONNX | float | Snapdragon® X2 Elite | 1.737 ms | 161 - 161 MB | NPU |
| OpusMTDecoder | ONNX | float | Snapdragon® X Elite | 3.2 ms | 160 - 160 MB | NPU |
| OpusMTDecoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 2.775 ms | 0 - 399 MB | NPU |
| OpusMTDecoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 3.94 ms | 12 - 14 MB | NPU |
| OpusMTDecoder | ONNX | float | Qualcomm® QCS9075 | 4.239 ms | 12 - 27 MB | NPU |
| OpusMTDecoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.349 ms | 0 - 429 MB | NPU |
| OpusMTDecoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.233 ms | 1 - 378 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Snapdragon® X2 Elite | 2.071 ms | 6 - 6 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Snapdragon® X Elite | 2.965 ms | 6 - 6 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 2.353 ms | 6 - 193 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 6.844 ms | 2 - 167 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 3.04 ms | 1 - 2 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® SA8775P | 3.956 ms | 2 - 168 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® QCS9075 | 3.367 ms | 8 - 16 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 3.986 ms | 4 - 173 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® SA7255P | 6.844 ms | 2 - 167 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Qualcomm® SA8295P | 4.138 ms | 6 - 157 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.13 ms | 0 - 167 MB | NPU |
| OpusMTDecoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.909 ms | 0 - 171 MB | NPU |
| OpusMTDecoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 2.773 ms | 0 - 377 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 6.721 ms | 0 - 230 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 3.424 ms | 0 - 3 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® SA8775P | 4.321 ms | 0 - 229 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® QCS9075 | 4.205 ms | 0 - 186 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 4.709 ms | 0 - 354 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® SA7255P | 6.721 ms | 0 - 230 MB | NPU |
| OpusMTDecoder | TFLITE | float | Qualcomm® SA8295P | 4.918 ms | 0 - 212 MB | NPU |
| OpusMTDecoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.52 ms | 0 - 406 MB | NPU |
| OpusMTDecoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.132 ms | 0 - 374 MB | NPU |
| OpusMTEncoder | ONNX | float | Snapdragon® X2 Elite | 2.22 ms | 108 - 108 MB | NPU |
| OpusMTEncoder | ONNX | float | Snapdragon® X Elite | 5.061 ms | 108 - 108 MB | NPU |
| OpusMTEncoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 3.397 ms | 0 - 266 MB | NPU |
| OpusMTEncoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 4.755 ms | 0 - 123 MB | NPU |
| OpusMTEncoder | ONNX | float | Qualcomm® QCS9075 | 6.263 ms | 16 - 18 MB | NPU |
| OpusMTEncoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.777 ms | 0 - 290 MB | NPU |
| OpusMTEncoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.032 ms | 0 - 257 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Snapdragon® X2 Elite | 1.973 ms | 0 - 0 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Snapdragon® X Elite | 3.889 ms | 0 - 0 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 2.542 ms | 0 - 89 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 12.607 ms | 0 - 52 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 3.521 ms | 0 - 2 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® SA8775P | 19.246 ms | 0 - 52 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® QCS9075 | 4.675 ms | 0 - 5 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 4.768 ms | 0 - 86 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® SA7255P | 12.607 ms | 0 - 52 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Qualcomm® SA8295P | 5.349 ms | 0 - 54 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.139 ms | 0 - 53 MB | NPU |
| OpusMTEncoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.612 ms | 0 - 58 MB | NPU |
| OpusMTEncoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 2.663 ms | 0 - 260 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 12.774 ms | 6 - 99 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 3.682 ms | 0 - 3 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® SA8775P | 4.836 ms | 6 - 100 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® QCS9075 | 5.137 ms | 6 - 120 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 5.08 ms | 6 - 258 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® SA7255P | 12.774 ms | 6 - 99 MB | NPU |
| OpusMTEncoder | TFLITE | float | Qualcomm® SA8295P | 5.662 ms | 6 - 95 MB | NPU |
| OpusMTEncoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.237 ms | 0 - 279 MB | NPU |
| OpusMTEncoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.774 ms | 0 - 243 MB | NPU |
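Because the model is split into an encoder and a decoder, a rough end-to-end estimate can be built from the two rows for a given runtime and chipset. The sketch below assumes the encoder runs once per sentence and that the decoder time in the table is the cost of one decoding step, so it is paid once per generated token. The table does not state this, and the estimate ignores tokenization and host-side overhead. The function name `estimate_latency_ms` is hypothetical.

```python
def estimate_latency_ms(encoder_ms: float, decoder_ms: float, output_tokens: int) -> float:
    """One encoder pass plus one decoder pass per generated token."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return encoder_ms + decoder_ms * output_tokens


# QNN_DLC figures for Snapdragon 8 Elite Gen 5 Mobile, taken from the table above.
encoder_ms = 1.612
decoder_ms = 1.909

for n in (16, 64, 256):
    print(n, round(estimate_latency_ms(encoder_ms, decoder_ms, n), 1))
```

Under this model the decoder dominates for all but the shortest outputs, so the per-step decoder time is the more useful figure when comparing runtimes.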
License
- The license for the original implementation of OpusMT-En-Zh can be found here.
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
