This SDK enables efficient open-vocabulary object detection using YOLO-Worldv2 Large. It is optimized for Axera's NPU-based SoC platforms, including the AX650, AX630C, and AX8850 series, and for Axera's dedicated AI accelerator.
For those who are interested in model conversion, you can try to export axmodel through
| Model | Input Shape | Latency (ms) | CMM Usage (MB) |
|---|---|---|---|
| yolo_u16_ax650.axmodel | 1 x 640 x 640 x 3 | 9.522 | 21 |
| clip_b1_u16_ax650.axmodel | 1 x 77 | 2.997 | 137 |
| yolo_u16_ax630c.axmodel | 1 x 640 x 640 x 3 | 43.450 | 31 |
| clip_b1_u16_ax630c.axmodel | 1 x 77 | 10.703 | 134 |
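To put the latencies in the table above in perspective, the following sketch converts them into an upper bound on detector throughput and an approximate cost for encoding a vocabulary. It rests on two assumptions that the table does not state: the figures cover NPU inference only (no pre- or post-processing), and the text encoder runs once per class name. The function and dictionary names are illustrative and not part of the SDK.

```python
# Latencies in milliseconds, copied from the table above.
LATENCY_MS = {
    "ax650": {"yolo": 9.522, "clip": 2.997},
    "ax630c": {"yolo": 43.450, "clip": 10.703},
}


def detector_fps(chip: str) -> float:
    """Upper bound on frames per second from detector NPU latency alone."""
    return 1000.0 / LATENCY_MS[chip]["yolo"]


def vocab_encode_ms(chip: str, num_classes: int) -> float:
    """Time to encode a vocabulary, ASSUMING one text-encoder run per class."""
    return num_classes * LATENCY_MS[chip]["clip"]


if __name__ == "__main__":
    for chip in LATENCY_MS:
        print(
            f"{chip}: at most {detector_fps(chip):.1f} FPS, "
            f"4-class vocabulary about {vocab_encode_ms(chip, 4):.1f} ms"
        )
```

Real end-to-end throughput will be lower once image resizing, result decoding, and any display or network overhead are included.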
Download all files from this repository to the device. The directory should then look like this:
root@ax650 ~/root/YOLO-World-V2 # tree -L 2
.
|-- README.md
|-- config.json
|-- football.jpg
|-- install
| |-- bin
| `-- lib
|-- models
| |-- clip_b1_u16_ax630c.axmodel
| |-- clip_b1_u16_ax650.axmodel
| |-- yolo_u16_ax630c.axmodel
| `-- yolo_u16_ax650.axmodel
|-- pyyoloworld
| |-- __pycache__
| |-- example.py
| |-- gardio_example.jpg
| |-- gradio_example.py
| |-- host.jpg
| |-- libyoloworld.so
| |-- pyaxdev.py
| |-- pyyoloworld.py
| |-- requirements.txt
| `-- result_host.jpg
|-- result.png
`-- vocab.txt
6 directories, 18 files
pip install -r pyyoloworld/requirements.txt
root@ax650 ~/root/YOLO-World-V2 # cp install/lib/host_650/libyoloworld.so ./pyyoloworld/
root@ax650 ~/root/YOLO-World-V2 # cd pyyoloworld/
root@ax650 ~/root/YOLO-World-V2/pyyoloworld # python3 gradio_example.py --yoloworld ../models/yolo_u16_ax650.axmodel --tenc ../models/clip_b1_u16_ax650.axmodel --vocab ../vocab.txt --dev_type host
Trying to load: /root/root/YOLO-World-V2/pyyoloworld/aarch64/libyoloworld.so
❌ Failed to load: /root/root/YOLO-World-V2/pyyoloworld/aarch64/libyoloworld.so
/root/root/YOLO-World-V2/pyyoloworld/aarch64/libyoloworld.so: cannot open shared object file: No such file or directory
🔍 File not found. Please verify that libclip.so exists and the path is correct.
Trying to load: /root/root/YOLO-World-V2/pyyoloworld/libyoloworld.so
open libaxcl_rt.so failed
unsupport axcl
✅ Successfully loaded: /root/root/YOLO-World-V2/pyyoloworld/libyoloworld.so
sh: line 1: axcl-smi: command not found
input size: 2
name: images [unknown] [unknown]
1 x 640 x 640 x 3 size: 1228800
name: txt_feats [unknown] [unknown]
1 x 4 x 512 size: 8192
output size: 3
name: stride8
1 x 80 x 80 x 68 size: 1740800
name: stride16
1 x 40 x 40 x 68 size: 435200
name: stride32
1 x 20 x 20 x 68 size: 108800
[I][ yw_create][ 408]: num_classes: 4, num_features: 512, input w: 640, h: 640
is_output_nhwc: 1
input size: 1
name: text_token [unknown] [unknown]
1 x 77 size: 308
output size: 1
name: 2202
1 x 1 x 512 size: 2048
[I][ load_text_encoder][ 44]: text feature len 512
[I][ load_tokenizer][ 60]: text token len 77
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
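The ❌ message near the top of the log is not fatal. The loader appears to probe an `aarch64/` subdirectory first and then fall back to the copy of `libyoloworld.so` placed next to the Python scripts, which loads successfully. The sketch below illustrates that probe-and-fall-back pattern with `ctypes`. It is an illustration under that assumption, not the actual code in `pyyoloworld.py`, and the function names are hypothetical.

```python
import ctypes
import os


def candidate_paths(base_dir):
    """Search order inferred from the log: aarch64/ subdirectory, then base_dir."""
    return [
        os.path.join(base_dir, "aarch64", "libyoloworld.so"),
        os.path.join(base_dir, "libyoloworld.so"),
    ]


def load_first_available(candidates):
    """Return (path, handle) for the first library that loads, else (None, None)."""
    for path in candidates:
        print(f"Trying to load: {path}")
        try:
            handle = ctypes.CDLL(path)
        except OSError as err:
            # A missing or incompatible file is reported, then the next path is tried.
            print(f"Failed to load: {path} ({err})")
            continue
        print(f"Successfully loaded: {path}")
        return path, handle
    return None, None
```

If every candidate fails, the most likely cause is that `libyoloworld.so` has not yet been copied from `install/lib/` into `pyyoloworld/`.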
Input: the class names person, dog, car, horse, together with the test image.
Result:
What is an M.2 accelerator card? This demo is shown on a Raspberry Pi 5.
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libstdc++.so.6
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cp install/lib/axcl_aarch64/libyoloworld.so pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cd pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg/pyyoloworld $ python gradio_example.py --yoloworld ../models/yolo_u16_ax650.axmodel --tenc ../models/clip_b1_u16_ax650.axmodel --vocab ../vocab.txt --dev_type axcl
Trying to load: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/aarch64/libyoloworld.so
✅ Successfully loaded: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/libyoloworld.so
[I][ run][ 31]: AXCLWorker start with devid 0
input size: 2
name: images [unknown] [unknown]
1 x 640 x 640 x 3 size: 1228800
name: txt_feats [unknown] [unknown]
1 x 4 x 512 size: 8192
output size: 3
name: stride8
1 x 80 x 80 x 68 size: 1740800
name: stride16
1 x 40 x 40 x 68 size: 435200
name: stride32
1 x 20 x 20 x 68 size: 108800
[I][ yw_create][ 408]: num_classes: 4, num_features: 512, input w: 640, h: 640
is_output_nhwc: 1
input size: 1
name: text_token [unknown] [unknown]
1 x 77 size: 308
output size: 1
name: 2202
1 x 1 x 512 size: 2048
[I][ load_text_encoder][ 44]: text feature len 512
[I][ load_tokenizer][ 60]: text token len 77
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
If your Raspberry Pi 5's IP address is 192.168.1.100, open http://192.168.1.100:7860 in a browser to use the web app.
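The `size` values printed in the startup logs can be checked against the tensor shapes. Dividing each size by the number of elements gives the element width in bytes. The log reports the data types as `[unknown]`, so reading a 1-byte width as an 8-bit image and a 4-byte width as a 32-bit float or integer is an inference, not something the SDK states.

```python
from math import prod

# (tensor name, shape, size in bytes) as printed in the startup logs.
TENSORS = [
    ("images", (1, 640, 640, 3), 1228800),
    ("txt_feats", (1, 4, 512), 8192),
    ("stride8", (1, 80, 80, 68), 1740800),
    ("stride16", (1, 40, 40, 68), 435200),
    ("stride32", (1, 20, 20, 68), 108800),
    ("text_token", (1, 77), 308),
    ("2202", (1, 1, 512), 2048),
]


def bytes_per_element(shape, size_bytes):
    """Element width implied by a shape and a total byte size."""
    count = prod(shape)
    if size_bytes % count != 0:
        raise ValueError(f"size {size_bytes} is not a multiple of {count} elements")
    return size_bytes // count


WIDTHS = {name: bytes_per_element(shape, size) for name, shape, size in TENSORS}

if __name__ == "__main__":
    for name, width in WIDTHS.items():
        print(f"{name}: {width} byte(s) per element")
```

The same check is a quick way to confirm that a buffer handed to the model has the expected length before running inference.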
Input: the class names man, shoes, ball, person, together with the test image.
Result: