qqc1989 committed on
Commit f741ef6 · verified · 1 Parent(s): f752476

Update README.md

Files changed (1)
  1. README.md +151 -3
README.md CHANGED
@@ -1,3 +1,151 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ base_model:
+ - google/siglip-so400m-patch14-384
+ pipeline_tag: zero-shot-image-classification
+ tags:
+ - siglip
+ - Int8
+ ---
+
+ # SigLIP (shape-optimized model)
+
+ SigLIP model pre-trained on WebLI at resolution 384x384. It was introduced in the paper Sigmoid Loss for Language Image Pre-Training by Zhai et al. and first released in this repository.
+
+ The original repository is https://huggingface.co/google/siglip-so400m-patch14-384.
+
+ This SigLIP model has been converted to run on the Axera NPU using **w8a16** quantization.
+
+ This model has been optimized with the following LoRA:
+
+ Compatible with Pulsar2 version: 3.4
+
+ ## Conversion tool links
+
+ If you are interested in model conversion, you can export the axmodel yourself using the following resources:
+
+
+ - [The AXera platform repo](https://github.com/AXERA-TECH/SigLIP.axera), which contains the detailed conversion guide
+
+ - [Pulsar2 Link, How to Convert ONNX to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/pulsar2/introduction.html)
+
+
+ ## Supported Platforms
+
+ - AX650
+   - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
+   - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
+
+
+ | Model (inference time) | Raspberry Pi 5 (CPU only) | Intel i7-13700 | Raspberry Pi 5 + M.2 Card |
+ | ---------------------- | ------------------------- | -------------- | ------------------------- |
+ | Image Encoder          | 8.3 s                     | 1.2 s          | 0.19 s                    |
+ | Text Encoder           | 1.3 s                     | 0.3 s          | 0.05 s                    |
+
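As a rough reading of the table above, the speedup of the M.2 card over each CPU-only baseline is simply the ratio of the two timings. The sketch below only does that arithmetic on the published numbers; the dictionary layout, the column keys, and the `speedup` function are illustrative names and are not part of this repository.

```python
# Inference times in seconds, copied from the table above.
# The keys ("pi5_cpu", "i7_13700", "pi5_m2") are made-up labels for the columns.
LATENCY_S = {
    "Image Encoder": {"pi5_cpu": 8.3, "i7_13700": 1.2, "pi5_m2": 0.19},
    "Text Encoder": {"pi5_cpu": 1.3, "i7_13700": 0.3, "pi5_m2": 0.05},
}


def speedup(model: str, baseline: str, target: str = "pi5_m2") -> float:
    """Return how many times faster `target` is than `baseline` for `model`."""
    row = LATENCY_S[model]
    return row[baseline] / row[target]


if __name__ == "__main__":
    for model in LATENCY_S:
        for baseline in ("pi5_cpu", "i7_13700"):
            ratio = speedup(model, baseline)
            print(f"{model}: M.2 card is {ratio:.1f}x faster than {baseline}")
```

These ratios compare single timings from one table, so they are best treated as an order-of-magnitude indication rather than a benchmark.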
+ ## How to use
+
+ Download all files from this repository to the device.
+
+ ```
+ (axcl) axera@raspberrypi:~/samples/siglip $ tree -L 2
+ .
+ ├── 000000039769.jpg
+ ├── ax650
+ │   ├── siglip_text_u16.axmodel
+ │   └── siglip_vision_u16_fcu8.axmodel
+ ├── config.json
+ ├── onnx
+ │   ├── siglip-so400m-patch14-384_text.onnx
+ │   └── siglip-so400m-patch14-384_vision.onnx
+ ├── python
+ │   ├── inference_axmodel.py
+ │   ├── inference_onnx.py
+ │   └── requirements.txt
+ └── tokenizer
+     ├── config.json
+     ├── preprocessor_config.json
+     ├── special_tokens_map.json
+     ├── spiece.model
+     ├── tokenizer_config.json
+     └── tokenizer.json
+
+ 5 directories, 15 files
+ ```
+
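Before running inference it can help to confirm that the download is complete. The following is a minimal sketch that checks a directory against the `tree -L 2` listing above; the `missing_files` helper is a hypothetical name, not a script shipped in this repository.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the `tree -L 2` listing above (15 files).
EXPECTED_FILES = [
    "000000039769.jpg",
    "ax650/siglip_text_u16.axmodel",
    "ax650/siglip_vision_u16_fcu8.axmodel",
    "config.json",
    "onnx/siglip-so400m-patch14-384_text.onnx",
    "onnx/siglip-so400m-patch14-384_vision.onnx",
    "python/inference_axmodel.py",
    "python/inference_onnx.py",
    "python/requirements.txt",
    "tokenizer/config.json",
    "tokenizer/preprocessor_config.json",
    "tokenizer/special_tokens_map.json",
    "tokenizer/spiece.model",
    "tokenizer/tokenizer_config.json",
    "tokenizer/tokenizer.json",
]


def missing_files(root: str = ".") -> List[str]:
    """Return the expected files that are not present under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All 15 expected files are present.")
```

Run it from the directory that contains `ax650/`, `onnx/`, `python/`, and `tokenizer/`.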
+ ### Python environment requirements
+
+ #### pyaxengine
+
+ https://github.com/AXERA-TECH/pyaxengine
+
+ ```
+ wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3rc0/axengine-0.1.3-py3-none-any.whl
+ pip install axengine-0.1.3-py3-none-any.whl
+ ```
+
+ #### Other dependencies
+
+ ```
+ pip install -r python/requirements.txt
+ ```
+
+ ## Inputs
+
+ **Text**
+ ```
+ "a photo of 2 cats", "a photo of 2 dogs"
+ ```
+
+ **Image**
+ ![](000000039769.jpg)
+
+ ## Inference with AX650 Host, such as M4N-Dock(爱芯派Pro)
+
+ ```
+ root@ax650:/mnt/qtang/inner/SigLIP.axera# python3 python/inference_axmodel.py
+ [INFO] Available providers: ['AxEngineExecutionProvider']
+ [INFO] Using provider: AxEngineExecutionProvider
+ [INFO] Chip type: ChipType.MC50
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Engine version: 2.7.2a
+ [INFO] Model type: 2 (triple core)
+ [INFO] Compiler version: 3.4-dirty 739e2b35-dirty
+ Model loading time: 3.86 seconds
+ [INFO] Using provider: AxEngineExecutionProvider
+ [INFO] Model type: 2 (triple core)
+ [INFO] Compiler version: 3.4-dirty 739e2b35-dirty
+ Model loading time: 3.22 seconds
+ Total model loading time: 7.08 seconds
+ Model inference time: 0.19 seconds
+ Model inference time: 0.05 seconds
+ Total inference time: 0.24 seconds
+ 49.4% that image 0 is 'a photo of 2 cats'
+ root@ax650:/mnt/qtang/inner/SigLIP.axera#
+ ```
+
+ ## Inference with M.2 Accelerator card
+
+ [What is the M.2 Accelerator card?](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html) This demo runs on a Raspberry Pi 5.
+
+ ```
+ (axcl) axera@raspberrypi:~/samples/siglip $ python python/inference_axmodel.py
+ [INFO] Available providers: ['AXCLRTExecutionProvider']
+ [INFO] Using provider: AXCLRTExecutionProvider
+ [INFO] SOC Name: AX650N
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4-dirty 739e2b35-dirty
+ Model loading time: 12.31 seconds
+ [INFO] Using provider: AXCLRTExecutionProvider
+ [INFO] SOC Name: AX650N
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4-dirty 739e2b35-dirty
+ Model loading time: 12.37 seconds
+ Total model loading time: 24.68 seconds
+ Model inference time: 0.19 seconds
+ Model inference time: 0.05 seconds
+ Total inference time: 0.24 seconds
+ 52.5% that image 0 is 'a photo of 2 cats'
+ (axcl) axera@raspberrypi:~/samples/siglip $
+ ```
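
The final log line reports a per-text percentage. In the upstream SigLIP usage example, each image-text pair is scored independently with a sigmoid rather than a softmax, so the scores for several prompts need not sum to 100%. The sketch below illustrates that scoring step in plain Python. It assumes `inference_axmodel.py` follows the same scheme, which the logs do not confirm. The function names and the embedding, scale, and bias values are made up for illustration and are not taken from the model.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def match_probabilities(
    image_emb: Sequence[float],
    text_embs: Sequence[Sequence[float]],
    scale: float,
    bias: float,
) -> List[float]:
    """Score one image embedding against each text embedding independently.

    `scale` is the already-exponentiated logit scale and `bias` the logit bias;
    both are learned parameters in SigLIP, passed in here as plain floats.
    """
    img = l2_normalize(image_emb)
    probs = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        probs.append(sigmoid(scale * cosine + bias))
    return probs


if __name__ == "__main__":
    # Toy 2-D embeddings, for illustration only.
    texts = ["a photo of 2 cats", "a photo of 2 dogs"]
    probs = match_probabilities([1.0, 0.2], [[0.9, 0.3], [-0.5, 1.0]], 10.0, -9.0)
    for text, p in zip(texts, probs):
        print(f"{p:.1%} that image 0 is '{text}'")
```

Because each prompt is scored on its own, a value near 50% for the best prompt, as in the logs above, is not by itself a sign of a weak match.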