English
rk-transformers
bert
exbert
rknn
rockchip
npu
rk3588
eacortes committed · Commit 93aaaa6 · verified · 1 Parent(s): 4113b79

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +15 -19
  2. config.json +366 -0
  3. model_b4_s256.rknn +1 -1
  4. model_b4_s512.rknn +1 -1
  5. rknn/model_w8a8.rknn +2 -2
README.md CHANGED
@@ -8,6 +8,8 @@ tags:
 - rk-transformers
 - rk3588
 license: apache-2.0
+ datasets:
+ - sentence-transformers/natural-questions
 model_name: bert-base-uncased
 base_model: google-bert/bert-base-uncased
 library_name: rk-transformers
@@ -23,7 +25,7 @@ library_name: rk-transformers
 - **Original Model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
 - **Target Platform:** rk3588
 - **rknn-toolkit2 Version:** 2.3.2
- - **rk-transformers Version:** 0.1.0
+ - **rk-transformers Version:** 0.3.0

 ### Available Model Files

@@ -42,40 +44,32 @@ library_name: rk-transformers

 ### Installation

- Install `rk-transformers` to use this model:
+ Install `rk-transformers` with inference dependencies to use this model:

 ```bash
- pip install rk-transformers
+ pip install rk-transformers[inference]
 ```

- #### RKTransformers API
+ #### RK-Transformers API

 ```python
- from rktransformers import RKRTModelForFeatureExtraction
+ from rktransformers import RKModelForMaskedLM
 from transformers import AutoTokenizer

- # Load tokenizer and model
 tokenizer = AutoTokenizer.from_pretrained("rk-transformers/bert-base-uncased")
- model = RKRTModelForFeatureExtraction.from_pretrained(
+ model = RKModelForMaskedLM.from_pretrained(
     "rk-transformers/bert-base-uncased",
     platform="rk3588",
     core_mask="auto",
 )

- # Tokenize and run inference
- inputs = tokenizer(
-     ["Sample text for encoding"],
-     padding="max_length",
-     max_length=256,
-     truncation=True,
-     return_tensors="np"
- )
-
+ inputs = tokenizer("The capital of France is [MASK].", return_tensors="np")
 outputs = model(**inputs)
- print(outputs.shape)
+ logits = outputs.logits
+ print(logits.shape)

 # Load specific optimized/quantized model file
- model = RKRTModelForFeatureExtraction.from_pretrained(
+ model = RKModelForMaskedLM.from_pretrained(
     "rk-transformers/bert-base-uncased",
     platform="rk3588",
     file_name="rknn/model_w8a8.rknn"
@@ -84,10 +78,12 @@ model = RKRTModelForFeatureExtraction.from_pretrained(

 ## Configuration

- The full configuration for all exported RKNN models is available in the [rknn.json](./rknn.json) file.
+ The full configuration for all exported RKNN models is available in the [config.json](./config.json) file.

 </details>

+ ---
+
 # BERT base model (uncased)

 Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in
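The updated README example stops at printing the logits shape. As a minimal follow-on sketch (not part of this commit; it assumes the `RKModelForMaskedLM` output exposes `.logits` as a NumPy array, as the snippet above implies), the `[MASK]` prediction can be decoded with the tokenizer alone:

```python
import numpy as np
from rktransformers import RKModelForMaskedLM  # import path as shown in the updated README
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rk-transformers/bert-base-uncased")
model = RKModelForMaskedLM.from_pretrained(
    "rk-transformers/bert-base-uncased",
    platform="rk3588",
    core_mask="auto",
)

# Mirrors the README call; depending on the exported graph, padding to the
# model's fixed max_seq_length may be required.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="np")
logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Find the [MASK] position and take the highest-scoring vocabulary id there.
mask_pos = int(np.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0][0])
predicted_id = int(logits[0, mask_pos].argmax())
print(tokenizer.decode([predicted_id]))  # typically "paris" for bert-base-uncased
```

The same post-processing applies to the quantized variant; only the `file_name="rknn/model_w8a8.rknn"` argument changes.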
config.json CHANGED
@@ -17,6 +17,372 @@
   "num_hidden_layers": 12,
   "pad_token_id": 0,
   "position_embedding_type": "absolute",
+   "rknn": {
+     "model.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "model_b1_s256.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 256,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "model_b4_s256.rknn": {
+       "batch_size": 4,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 256,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "model_b4_s512.rknn": {
+       "batch_size": 4,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "rknn/model_o1.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 1,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "rknn/model_o2.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 2,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "rknn/model_o3.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 3,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     },
+     "rknn/model_w8a8.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask",
+         "token_type_ids"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": [
+           "answer"
+         ],
+         "dataset_name": "sentence-transformers/natural-questions",
+         "dataset_size": 1024,
+         "dataset_split": [
+           "train"
+         ],
+         "dataset_subset": null,
+         "do_quantization": true,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "auto",
+       "task_kwargs": null
+     }
+   },
   "torch_dtype": "float32",
   "transformers_version": "4.55.4",
   "type_vocab_size": 2,
model_b4_s256.rknn CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:abad409b4f884e6a49c4d1a2144dd90f5a70ec9a615014da2fc62747fb755457
+ oid sha256:38f32b5d5c75fe48a8ec7e6b2e40797ba3c2f31268ab4fc45f3a6c743eb6fa44
 size 283424646
model_b4_s512.rknn CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:0c777bf44463b5738af0693ce5a5fe54cf4f4516c6bff68e7979f89d8c11ee80
+ oid sha256:7b36b02fc89525575f596032dcba6150ad8eafcc3eb97f1f0f16a0b871d98eb0
 size 294075782
rknn/model_w8a8.rknn CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:2b35b9cf8db6760f45b2fd9cc700e5bd88f8af73a69ecc87fe70799aa8ecd5b0
- size 140070675
+ oid sha256:695fef86da5879cbe7359404db7a2549ab8e9fdaaaa80bf91dc67d82247cfe9a
+ size 140071443
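The three `.rknn` entries above only update Git LFS pointers (`oid`/`size`). After pulling the actual binaries, a local file can be checked against its pointer with plain `hashlib`; a hedged sketch using the new `rknn/model_w8a8.rknn` values from this commit:

```python
import hashlib
import os

def verify_lfs_object(path: str, expected_oid: str, expected_size: int) -> bool:
    """Check that a downloaded file matches the sha256 oid and size from its LFS pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid

# Values copied from the rknn/model_w8a8.rknn pointer in this commit.
print(verify_lfs_object(
    "rknn/model_w8a8.rknn",
    "695fef86da5879cbe7359404db7a2549ab8e9fdaaaa80bf91dc67d82247cfe9a",
    140071443,
))
```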