Shreetej committed
Commit 1d42126 · verified · 1 Parent(s): d9d241b

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,165 @@
1
+ ---
2
+ base_model:
3
+ - tiiuae/Falcon-H1-0.5B-Instruct
4
+ library_name: transformers
5
+ tags:
6
+ - bnb-my-repo
7
+ - falcon-h1
8
+ license: other
9
+ license_name: falcon-llm-license
10
+ license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
11
+ inference: true
12
+ ---
13
+ # tiiuae/Falcon-H1-0.5B-Instruct (Quantized)
14
+
15
+ ## Description
16
+ This model is a quantized version of the original model [`tiiuae/Falcon-H1-0.5B-Instruct`](https://huggingface.co/tiiuae/Falcon-H1-0.5B-Instruct).
17
+
18
+ It was quantized to 4-bit with the bitsandbytes library, using the [bnb-my-repo](https://huggingface.co/spaces/bnb-community/bnb-my-repo) space.
19
+
20
+ ## Quantization Details
21
+ - **Quantization Type**: int4
22
+ - **bnb_4bit_quant_type**: nf4
23
+ - **bnb_4bit_use_double_quant**: True
24
+ - **bnb_4bit_compute_dtype**: bfloat16
25
+ - **bnb_4bit_quant_storage**: uint8
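The settings listed above correspond to keyword arguments of `transformers.BitsAndBytesConfig`. The following is a minimal sketch, assuming `transformers`, `torch`, `bitsandbytes`, and `accelerate` are installed and a GPU supported by bitsandbytes is available, of how the same configuration could be rebuilt to quantize the base model at load time. The names `QUANT_KWARGS`, `build_quant_config`, and `load_quantized` are illustrative and not part of any library.

```python
# Settings copied from the "Quantization Details" list above.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_storage": "uint8",
}


def build_quant_config():
    """Build a BitsAndBytesConfig from the settings above.

    transformers is imported lazily so this module can be imported without it.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QUANT_KWARGS)


def load_quantized(model_id="tiiuae/Falcon-H1-0.5B-Instruct"):
    """Quantize the base model while loading it (assumes a supported GPU)."""
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=build_quant_config(),
        device_map="auto",
    )
```

A checkpoint that was already saved in 4-bit, such as the one added in this commit, records these settings under `quantization_config` in `config.json`, so loading it should not require passing the configuration explicitly.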
26
+
27
+
28
+
29
+ # 📄 Original Model Information
30
+
31
+
32
+
33
+ <img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/falcon_mamba/falcon-h1-logo.png" alt="drawing" width="800"/>
34
+
35
+ # Table of Contents
36
+
37
+ 0. [TL;DR](#tldr)
38
+ 1. [Model Details](#model-details)
39
+ 2. [Training Details](#training-details)
40
+ 3. [Usage](#usage)
41
+ 4. [Evaluation](#evaluation)
42
+ 5. [Citation](#citation)
43
+
44
+ # TL;DR
45
+
46
+ # Model Details
47
+
48
+ ## Model Description
49
+
50
+ - **Developed by:** [https://www.tii.ae](https://www.tii.ae)
51
+ - **Model type:** Causal decoder-only
52
+ - **Architecture:** Hybrid Transformers + Mamba architecture
53
+ - **Language(s) (NLP):** English
54
+ - **License:** Falcon-LLM License
55
+
56
+ # Training Details
57
+
58
+ For more details about the training protocol of this model, please refer to the [Falcon-H1 technical blogpost](https://falcon-lm.github.io/blog/falcon-h1/).
59
+
60
+ # Usage
61
+
62
+ To use this model, you can currently rely on the Hugging Face `transformers`, `vLLM`, or `llama.cpp` libraries.
63
+
64
+ ## Inference
65
+
66
+ Make sure to install the latest version of `transformers` or `vllm`; if necessary, install these packages from source:
67
+
68
+ ```bash
69
+ pip install git+https://github.com/huggingface/transformers.git
70
+ ```
71
+
72
+ For vLLM, make sure to install `vllm>=0.9.0`:
73
+
74
+ ```bash
75
+ pip install "vllm>=0.9.0"
76
+ ```
77
+
78
+ ### 🤗 transformers
79
+
80
+ Refer to the snippet below to run H1 models using 🤗 transformers:
81
+
82
+ ```python
83
+ import torch
84
+ from transformers import AutoModelForCausalLM, AutoTokenizer
85
+
86
+ model_id = "tiiuae/Falcon-H1-0.5B-Instruct"
87
+
88
+ model = AutoModelForCausalLM.from_pretrained(
89
+ model_id,
90
+ torch_dtype=torch.bfloat16,
91
+ device_map="auto"
92
+ )
93
+
94
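+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Perform text generation
+ inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=32)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))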
95
+ ```
96
+
97
+ ### vLLM
98
+
99
+ For vLLM, simply start a server by executing the command below:
100
+
101
+ ```bash
102
+ # pip install "vllm>=0.9.0"
103
+ vllm serve tiiuae/Falcon-H1-0.5B-Instruct --tensor-parallel-size 2 --data-parallel-size 1
104
+ ```
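Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server listens on vLLM's default address, `http://localhost:8000`; the names `BASE_URL`, `build_chat_request`, and `ask` are hypothetical helpers, and the `model` argument must match the model name passed to `vllm serve`.

```python
import json
import urllib.request

# Assumption: the server was started with default host/port settings.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt, model, max_tokens=64):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,  # must equal the model name given to `vllm serve`
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, model)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Separating request construction from the network call keeps the payload easy to inspect before anything is sent.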
105
+
106
+ ### `llama.cpp`
107
+
108
+ You can find all GGUF files compatible with `llama.cpp` under [our official collection](https://huggingface.co/collections/tiiuae/falcon-h1-6819f2795bc406da60fab8df).
109
+
110
+ # Evaluation
111
+
112
+ The Falcon-H1 series performs very well on a variety of tasks, including reasoning tasks.
113
+
114
+ | Tasks | Falcon-H1-0.5B | Qwen3-0.6B | Qwen2.5-0.5B | Gemma3-1B | Llama3.2-1B | Falcon3-1B |
115
+ | --- | --- | --- | --- | --- | --- | --- |
116
+ | **General** | | | | | | |
117
+ | BBH | **42.91** | 32.95 | 33.26 | 35.86 | 33.21 | 34.47 |
118
+ | ARC-C | 37.8 | 31.06 | 33.28 | 34.13 | 34.64 | **43.09** |
119
+ | TruthfulQA | 44.12 | **51.65** | 46.19 | 42.17 | 42.08 | 42.31 |
120
+ | HellaSwag | 51.93 | 42.17 | 52.38 | 42.24 | 55.3 | **58.53** |
121
+ | MMLU | **53.4** | 42.98 | 46.07 | 40.87 | 45.93 | 46.1 |
122
+ | **Math** | | | | | | |
123
+ | GSM8k | **68.39** | 42.61 | 38.51 | 42.38 | 44.28 | 44.05 |
124
+ | MATH-500 | **58.4** | 46.0 | 27.8 | 45.4 | 13.2 | 19.8 |
125
+ | AMC-23 | **33.13** | 27.97 | 12.5 | 19.22 | 7.19 | 6.87 |
126
+ | AIME-24 | **3.75** | 2.71 | 0.62 | 0.42 | 1.46 | 0.41 |
127
+ | AIME-25 | **4.38** | 1.67 | 0.21 | 1.25 | 0.0 | 0.21 |
128
+ | **Science** | | | | | | |
129
+ | GPQA | **29.95** | 26.09 | 26.85 | 28.19 | 26.59 | 26.76 |
130
+ | GPQA_Diamond | 27.95 | 25.08 | 24.24 | 21.55 | 25.08 | **31.31** |
131
+ | MMLU-Pro | **31.03** | 16.95 | 18.73 | 14.46 | 16.2 | 18.49 |
132
+ | MMLU-stem | **54.55** | 39.3 | 39.83 | 35.39 | 39.16 | 39.64 |
133
+ | **Code** | | | | | | |
134
+ | HumanEval | **51.83** | 41.46 | 36.59 | 40.85 | 34.15 | 22.56 |
135
+ | HumanEval+ | **45.12** | 37.19 | 32.32 | 37.2 | 29.88 | 20.73 |
136
+ | MBPP | 42.59 | 56.08 | 46.83 | **57.67** | 33.6 | 20.63 |
137
+ | MBPP+ | 33.07 | 47.08 | 39.68 | **50.0** | 29.37 | 17.2 |
138
+ | LiveCodeBench | 7.05 | **9.78** | 2.94 | 5.09 | 2.35 | 0.78 |
139
+ | CRUXEval | **25.75** | 23.63 | 14.88 | 12.7 | 0.06 | 15.58 |
140
+ | **Instruction Following** | | | | | | |
141
+ | IFEval | **72.07** | 62.16 | 32.11 | 61.48 | 55.34 | 54.26 |
142
+ | Alpaca-Eval | 10.79 | 9.59 | 3.26 | **17.87** | 9.38 | 6.98 |
143
+ | MTBench | **7.06** | 5.75 | 4.71 | 7.03 | 6.37 | 6.03 |
144
+ | LiveBench | 20.8 | **27.78** | 14.27 | 18.79 | 14.97 | 14.1 |
145
+
146
+ You can find more detailed benchmarks in [our release blogpost](https://falcon-lm.github.io/blog/falcon-h1/).
147
+
148
+ # Useful links
149
+
150
+ - View [our release blogpost](https://falcon-lm.github.io/blog/falcon-h1/).
151
+ - Feel free to join [our Discord server](https://discord.gg/trwMYP9PYm) if you have any questions or want to interact with our researchers and developers.
152
+
153
+ # Citation
154
+
155
+ If the Falcon-H1 family of models was helpful to your work, feel free to cite us.
156
+
157
+ ```
158
+ @misc{tiifalconh1,
159
+ title = {Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance},
160
+ url = {https://falcon-lm.github.io/blog/falcon-h1},
161
+ author = {Falcon-LLM Team},
162
+ month = {May},
163
+ year = {2025}
164
+ }
165
+ ```
chat_template.jinja ADDED
@@ -0,0 +1,17 @@
1
+ {{bos_token}}
2
+ {%- if tools %}
3
+ {{- '<|im_start|>system\n' }}
4
+ {%- if messages[0].role == 'system' %}
5
+ {{- messages[0].content + '\n\n' }}
6
+ {%- endif %}
7
+ {{- "You are a function calling AI model. You are provided with function signature within <tools> </tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.\n<tools>\n" }}
8
+ {%- for tool in tools %}[{{- tool | tojson }}]{%- endfor %}
9
+ {{- "\n</tools>\nFor each function call, return a json object with function name and arguments within <tool_call> </tool_call> tags with the following schema:\n<tool_call>\n{'arguments': <args-dict>, 'name': <function-name>}\n</tool_call>\n" }}
10
+ {%- else %}
11
+ {%- if messages[0].role == 'system' %}
12
+ {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
13
+ {%- endif %}
14
+ {%- endif %}{% for message in messages %}{%- if message.role != 'system' %}{{'<|im_start|>' + message['role'] + '
15
+ ' + message['content'] + '<|im_end|>' + '
16
+ '}}{%- endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
17
+ ' }}{% endif %}
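To make the prompt layout produced by this template easier to see, here is a plain-Python sketch of its branch without `tools`. It is an approximation written for illustration, not the code `transformers` runs: the function name `render_prompt` is hypothetical, the default `bos_token` is taken from `special_tokens_map.json` in this commit, and the guard against an empty message list is an addition that the template itself does not have.

```python
def render_prompt(messages, bos_token="<|begin_of_text|>", add_generation_prompt=True):
    """Mirror the no-tools branch of chat_template.jinja as string concatenation."""
    parts = [bos_token]

    # Only a system message in first position is emitted, as in the template.
    if messages and messages[0]["role"] == "system":
        parts.append("<|im_start|>system\n" + messages[0]["content"] + "<|im_end|>\n")

    # Every non-system message becomes one <|im_start|>role ... <|im_end|> block.
    for message in messages:
        if message["role"] != "system":
            parts.append(
                "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
            )

    # The trailing header asks the model to continue as the assistant.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")

    return "".join(parts)


if __name__ == "__main__":
    print(render_prompt([
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ]))
```

In practice the tokenizer's `apply_chat_template` method renders the real template; the sketch is only meant as a reading aid.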
config.json ADDED
@@ -0,0 +1,79 @@
1
+ {
2
+ "architectures": [
3
+ "FalconH1Model"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "attention_in_multiplier": 1.0,
8
+ "attention_out_multiplier": 0.9375,
9
+ "attn_layer_indices": null,
10
+ "bos_token_id": 1,
11
+ "embedding_multiplier": 5.656854249492381,
12
+ "eos_token_id": 11,
13
+ "head_dim": 64,
14
+ "hidden_act": "silu",
15
+ "hidden_size": 1024,
16
+ "initializer_range": 0.02,
17
+ "intermediate_size": 2048,
18
+ "key_multiplier": 0.39062499999999994,
19
+ "lm_head_multiplier": 0.0390625,
20
+ "mamba_chunk_size": 128,
21
+ "mamba_conv_bias": true,
22
+ "mamba_d_conv": 4,
23
+ "mamba_d_head": 64,
24
+ "mamba_d_ssm": 1536,
25
+ "mamba_d_state": 128,
26
+ "mamba_expand": 2,
27
+ "mamba_n_groups": 1,
28
+ "mamba_n_heads": 24,
29
+ "mamba_norm_before_gate": false,
30
+ "mamba_proj_bias": false,
31
+ "mamba_rms_norm": false,
32
+ "mamba_use_mlp": true,
33
+ "max_position_embeddings": 16384,
34
+ "mlp_bias": false,
35
+ "mlp_expansion_factor": 8,
36
+ "mlp_multipliers": [
37
+ 0.8838834764831844,
38
+ 0.5859375
39
+ ],
40
+ "model_type": "falcon_h1",
41
+ "num_attention_heads": 8,
42
+ "num_hidden_layers": 36,
43
+ "num_key_value_heads": 2,
44
+ "num_logits_to_keep": 1,
45
+ "pad_token_id": 0,
46
+ "projectors_bias": false,
47
+ "quantization_config": {
48
+ "_load_in_4bit": true,
49
+ "_load_in_8bit": false,
50
+ "bnb_4bit_compute_dtype": "bfloat16",
51
+ "bnb_4bit_quant_storage": "uint8",
52
+ "bnb_4bit_quant_type": "nf4",
53
+ "bnb_4bit_use_double_quant": true,
54
+ "llm_int8_enable_fp32_cpu_offload": false,
55
+ "llm_int8_has_fp16_weight": false,
56
+ "llm_int8_skip_modules": null,
57
+ "llm_int8_threshold": 6.0,
58
+ "load_in_4bit": true,
59
+ "load_in_8bit": false,
60
+ "quant_method": "bitsandbytes"
61
+ },
62
+ "rms_norm_eps": 1e-05,
63
+ "rope_scaling": null,
64
+ "rope_theta": 100000000000.0,
65
+ "ssm_in_multiplier": 1.25,
66
+ "ssm_multipliers": [
67
+ 0.3535533905932738,
68
+ 0.25,
69
+ 0.3535533905932738,
70
+ 0.5,
71
+ 0.3535533905932738
72
+ ],
73
+ "ssm_out_multiplier": 0.23570226039551587,
74
+ "tie_word_embeddings": false,
75
+ "torch_dtype": "bfloat16",
76
+ "transformers_version": "4.53.1",
77
+ "use_cache": true,
78
+ "vocab_size": 32784
79
+ }
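A few of the dimensions in this config can be related to each other with simple arithmetic. The sketch below copies the relevant values from the file above and derives the attention width, the grouped-query attention group size, and the SSM width; the interpretation of each product follows the usual conventions for these field names and is an assumption rather than something stated in the diff.

```python
# Values copied from the config.json shown in this diff.
config = {
    "hidden_size": 1024,
    "head_dim": 64,
    "num_attention_heads": 8,
    "num_key_value_heads": 2,
    "mamba_n_heads": 24,
    "mamba_d_head": 64,
    "mamba_d_ssm": 1536,
}

# Total query width: number of attention heads times the per-head dimension.
attn_width = config["num_attention_heads"] * config["head_dim"]

# Grouped-query attention: how many query heads share one key/value head.
gqa_group_size = config["num_attention_heads"] // config["num_key_value_heads"]

# Total SSM width: number of Mamba heads times the per-head dimension.
ssm_width = config["mamba_n_heads"] * config["mamba_d_head"]

print(attn_width, gqa_group_size, ssm_width)  # prints: 512 4 1536
```

The SSM width matches `mamba_d_ssm`, while the attention width (512) is half of `hidden_size` (1024), so the attention branch here is narrower than the residual stream.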
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0855331660331d59b01c8a4a05b1f9a284970243a54a28cb72148841d70104a0
3
+ size 302732604
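The three lines above are a Git LFS pointer rather than the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. As a small illustration, the following sketch parses the pointer text and converts the size to MiB; `parse_lfs_pointer` is a hypothetical helper written for this example.

```python
# Pointer text copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0855331660331d59b01c8a4a05b1f9a284970243a54a28cb72148841d70104a0
size 302732604
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line, then separate the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_mib = info["size"] / 2**20  # bytes to mebibytes
print(f"{info['algo']} digest, {size_mib:.1f} MiB")
```

A downloaded `model.safetensors` could be checked against this pointer by hashing the file with `hashlib.sha256` and comparing the hex digest and the file size.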
special_tokens_map.json ADDED
@@ -0,0 +1,356 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|pad|>",
4
+ ">>ABSTRACT<<",
5
+ ">>INTRODUCTION<<",
6
+ ">>SUMMARY<<",
7
+ ">>COMMENT<<",
8
+ ">>ANSWER<<",
9
+ ">>QUESTION<<",
10
+ ">>DOMAIN<<",
11
+ ">>PREFIX<<",
12
+ ">>SUFFIX<<",
13
+ ">>MIDDLE<<",
14
+ "<|finetune_right_pad_id|>",
15
+ "<|start_header_id|>",
16
+ "<|end_header_id|>",
17
+ "<|eom_id|>",
18
+ "<|eot_id|>",
19
+ "<|begin_of_text|>",
20
+ ">>TITLE<<",
21
+ "<tool_response>",
22
+ "</tool_response>",
23
+ "<tool_call>",
24
+ "</tool_call>",
25
+ "<schema>",
26
+ "</schema>",
27
+ "<scratch_pad>",
28
+ "</scratch_pad>",
29
+ "<thinking>",
30
+ "</thinking>",
31
+ "<explanation>",
32
+ "</explanation>",
33
+ "<file_sep>",
34
+ "<repo_name>",
35
+ "<|im_end|>",
36
+ "<|im_start|>",
37
+ ">>UNUSED_119<<",
38
+ ">>UNUSED_120<<",
39
+ "<|system|>",
40
+ ">>UNUSED_218<<",
41
+ ">>UNUSED_219<<",
42
+ ">>UNUSED_220<<",
43
+ ">>UNUSED_221<<",
44
+ ">>UNUSED_222<<",
45
+ ">>UNUSED_223<<",
46
+ ">>UNUSED_224<<",
47
+ ">>UNUSED_225<<",
48
+ ">>UNUSED_226<<",
49
+ ">>UNUSED_227<<",
50
+ ">>UNUSED_228<<",
51
+ ">>UNUSED_229<<",
52
+ ">>UNUSED_230<<",
53
+ ">>UNUSED_231<<",
54
+ ">>UNUSED_232<<",
55
+ ">>UNUSED_233<<",
56
+ ">>UNUSED_234<<",
57
+ ">>UNUSED_235<<",
58
+ ">>UNUSED_236<<",
59
+ ">>UNUSED_237<<",
60
+ ">>UNUSED_238<<",
61
+ ">>UNUSED_239<<",
62
+ ">>UNUSED_240<<",
63
+ ">>UNUSED_241<<",
64
+ ">>UNUSED_242<<",
65
+ ">>UNUSED_243<<",
66
+ ">>UNUSED_244<<",
67
+ ">>UNUSED_245<<",
68
+ ">>UNUSED_246<<",
69
+ ">>UNUSED_247<<",
70
+ ">>UNUSED_248<<",
71
+ ">>UNUSED_249<<",
72
+ ">>UNUSED_250<<",
73
+ ">>UNUSED_251<<",
74
+ ">>UNUSED_252<<",
75
+ ">>UNUSED_253<<",
76
+ ">>UNUSED_254<<",
77
+ ">>UNUSED_255<<",
78
+ ">>UNUSED_256<<",
79
+ ">>UNUSED_257<<",
80
+ ">>UNUSED_258<<",
81
+ ">>UNUSED_259<<",
82
+ ">>UNUSED_260<<",
83
+ ">>UNUSED_261<<",
84
+ ">>UNUSED_262<<",
85
+ ">>UNUSED_263<<",
86
+ ">>UNUSED_264<<",
87
+ ">>UNUSED_265<<",
88
+ ">>UNUSED_266<<",
89
+ ">>UNUSED_267<<",
90
+ ">>UNUSED_268<<",
91
+ ">>UNUSED_269<<",
92
+ ">>UNUSED_270<<",
93
+ ">>UNUSED_271<<",
94
+ ">>UNUSED_272<<",
95
+ ">>UNUSED_273<<",
96
+ ">>UNUSED_274<<",
97
+ ">>UNUSED_275<<",
98
+ ">>UNUSED_276<<",
99
+ ">>UNUSED_277<<",
100
+ ">>UNUSED_278<<",
101
+ ">>UNUSED_279<<",
102
+ ">>UNUSED_280<<",
103
+ ">>UNUSED_281<<",
104
+ ">>UNUSED_282<<",
105
+ ">>UNUSED_283<<",
106
+ ">>UNUSED_284<<",
107
+ ">>UNUSED_285<<",
108
+ ">>UNUSED_286<<",
109
+ ">>UNUSED_287<<",
110
+ ">>UNUSED_288<<",
111
+ ">>UNUSED_289<<",
112
+ ">>UNUSED_290<<",
113
+ ">>UNUSED_291<<",
114
+ ">>UNUSED_292<<",
115
+ ">>UNUSED_293<<",
116
+ ">>UNUSED_294<<",
117
+ ">>UNUSED_295<<",
118
+ ">>UNUSED_296<<",
119
+ ">>UNUSED_297<<",
120
+ ">>UNUSED_298<<",
121
+ ">>UNUSED_299<<",
122
+ ">>UNUSED_300<<",
123
+ ">>UNUSED_301<<",
124
+ ">>UNUSED_302<<",
125
+ ">>UNUSED_303<<",
126
+ ">>UNUSED_304<<",
127
+ ">>UNUSED_305<<",
128
+ ">>UNUSED_306<<",
129
+ ">>UNUSED_307<<",
130
+ ">>UNUSED_308<<",
131
+ ">>UNUSED_309<<",
132
+ ">>UNUSED_310<<",
133
+ ">>UNUSED_311<<",
134
+ ">>UNUSED_312<<",
135
+ ">>UNUSED_313<<",
136
+ ">>UNUSED_314<<",
137
+ ">>UNUSED_315<<",
138
+ ">>UNUSED_316<<",
139
+ ">>UNUSED_317<<",
140
+ ">>UNUSED_318<<",
141
+ ">>UNUSED_319<<",
142
+ ">>UNUSED_320<<",
143
+ ">>UNUSED_321<<",
144
+ ">>UNUSED_322<<",
145
+ ">>UNUSED_323<<",
146
+ ">>UNUSED_324<<",
147
+ ">>UNUSED_325<<",
148
+ ">>UNUSED_326<<",
149
+ ">>UNUSED_327<<",
150
+ ">>UNUSED_328<<",
151
+ ">>UNUSED_329<<",
152
+ ">>UNUSED_330<<",
153
+ ">>UNUSED_331<<",
154
+ ">>UNUSED_332<<",
155
+ ">>UNUSED_333<<",
156
+ ">>UNUSED_334<<",
157
+ ">>UNUSED_335<<",
158
+ ">>UNUSED_336<<",
159
+ ">>UNUSED_337<<",
160
+ ">>UNUSED_338<<",
161
+ ">>UNUSED_339<<",
162
+ ">>UNUSED_340<<",
163
+ ">>UNUSED_341<<",
164
+ ">>UNUSED_342<<",
165
+ ">>UNUSED_343<<",
166
+ ">>UNUSED_344<<",
167
+ ">>UNUSED_345<<",
168
+ ">>UNUSED_346<<",
169
+ ">>UNUSED_347<<",
170
+ ">>UNUSED_348<<",
171
+ ">>UNUSED_349<<",
172
+ ">>UNUSED_350<<",
173
+ ">>UNUSED_351<<",
174
+ ">>UNUSED_352<<",
175
+ ">>UNUSED_353<<",
176
+ ">>UNUSED_354<<",
177
+ ">>UNUSED_355<<",
178
+ ">>UNUSED_356<<",
179
+ ">>UNUSED_357<<",
180
+ ">>UNUSED_358<<",
181
+ ">>UNUSED_359<<",
182
+ ">>UNUSED_360<<",
183
+ ">>UNUSED_361<<",
184
+ ">>UNUSED_362<<",
185
+ ">>UNUSED_363<<",
186
+ ">>UNUSED_364<<",
187
+ ">>UNUSED_365<<",
188
+ ">>UNUSED_366<<",
189
+ ">>UNUSED_367<<",
190
+ ">>UNUSED_368<<",
191
+ ">>UNUSED_369<<",
192
+ ">>UNUSED_370<<",
193
+ ">>UNUSED_371<<",
194
+ ">>UNUSED_372<<",
195
+ ">>UNUSED_373<<",
196
+ ">>UNUSED_374<<",
197
+ ">>UNUSED_375<<",
198
+ ">>UNUSED_376<<",
199
+ ">>UNUSED_377<<",
200
+ ">>UNUSED_378<<",
201
+ ">>UNUSED_379<<",
202
+ ">>UNUSED_380<<",
203
+ ">>UNUSED_381<<",
204
+ ">>UNUSED_382<<",
205
+ ">>UNUSED_383<<",
206
+ ">>UNUSED_384<<",
207
+ ">>UNUSED_385<<",
208
+ ">>UNUSED_386<<",
209
+ ">>UNUSED_387<<",
210
+ ">>UNUSED_388<<",
211
+ ">>UNUSED_389<<",
212
+ ">>UNUSED_390<<",
213
+ ">>UNUSED_391<<",
214
+ ">>UNUSED_392<<",
215
+ ">>UNUSED_393<<",
216
+ ">>UNUSED_394<<",
217
+ ">>UNUSED_395<<",
218
+ ">>UNUSED_396<<",
219
+ ">>UNUSED_397<<",
220
+ ">>UNUSED_398<<",
221
+ ">>UNUSED_399<<",
222
+ ">>UNUSED_400<<",
223
+ ">>UNUSED_401<<",
224
+ ">>UNUSED_402<<",
225
+ ">>UNUSED_403<<",
226
+ ">>UNUSED_404<<",
227
+ ">>UNUSED_405<<",
228
+ ">>UNUSED_406<<",
229
+ ">>UNUSED_407<<",
230
+ ">>UNUSED_408<<",
231
+ ">>UNUSED_409<<",
232
+ ">>UNUSED_410<<",
233
+ ">>UNUSED_411<<",
234
+ ">>UNUSED_412<<",
235
+ ">>UNUSED_413<<",
236
+ ">>UNUSED_414<<",
237
+ ">>UNUSED_415<<",
238
+ ">>UNUSED_416<<",
239
+ ">>UNUSED_417<<",
240
+ ">>UNUSED_418<<",
241
+ ">>UNUSED_419<<",
242
+ ">>UNUSED_420<<",
243
+ ">>UNUSED_421<<",
244
+ ">>UNUSED_422<<",
245
+ ">>UNUSED_423<<",
246
+ ">>UNUSED_424<<",
247
+ ">>UNUSED_425<<",
248
+ ">>UNUSED_426<<",
249
+ ">>UNUSED_427<<",
250
+ ">>UNUSED_428<<",
251
+ ">>UNUSED_429<<",
252
+ ">>UNUSED_430<<",
253
+ ">>UNUSED_431<<",
254
+ ">>UNUSED_432<<",
255
+ ">>UNUSED_433<<",
256
+ ">>UNUSED_434<<",
257
+ ">>UNUSED_435<<",
258
+ ">>UNUSED_436<<",
259
+ ">>UNUSED_437<<",
260
+ ">>UNUSED_438<<",
261
+ ">>UNUSED_439<<",
262
+ ">>UNUSED_440<<",
263
+ ">>UNUSED_441<<",
264
+ ">>UNUSED_442<<",
265
+ ">>UNUSED_443<<",
266
+ ">>UNUSED_444<<",
267
+ ">>UNUSED_445<<",
268
+ ">>UNUSED_446<<",
269
+ ">>UNUSED_447<<",
270
+ ">>UNUSED_448<<",
271
+ ">>UNUSED_449<<",
272
+ ">>UNUSED_450<<",
273
+ ">>UNUSED_451<<",
274
+ ">>UNUSED_452<<",
275
+ ">>UNUSED_453<<",
276
+ ">>UNUSED_454<<",
277
+ ">>UNUSED_455<<",
278
+ ">>UNUSED_456<<",
279
+ ">>UNUSED_457<<",
280
+ ">>UNUSED_458<<",
281
+ ">>UNUSED_459<<",
282
+ ">>UNUSED_460<<",
283
+ ">>UNUSED_461<<",
284
+ ">>UNUSED_462<<",
285
+ ">>UNUSED_463<<",
286
+ ">>UNUSED_464<<",
287
+ ">>UNUSED_465<<",
288
+ ">>UNUSED_466<<",
289
+ ">>UNUSED_467<<",
290
+ ">>UNUSED_468<<",
291
+ ">>UNUSED_469<<",
292
+ ">>UNUSED_470<<",
293
+ ">>UNUSED_471<<",
294
+ ">>UNUSED_472<<",
295
+ ">>UNUSED_473<<",
296
+ ">>UNUSED_474<<",
297
+ ">>UNUSED_475<<",
298
+ ">>UNUSED_476<<",
299
+ ">>UNUSED_477<<",
300
+ ">>UNUSED_478<<",
301
+ ">>UNUSED_479<<",
302
+ ">>UNUSED_480<<",
303
+ ">>UNUSED_481<<",
304
+ ">>UNUSED_482<<",
305
+ ">>UNUSED_483<<",
306
+ ">>UNUSED_484<<",
307
+ ">>UNUSED_485<<",
308
+ ">>UNUSED_486<<",
309
+ ">>UNUSED_487<<",
310
+ ">>UNUSED_488<<",
311
+ ">>UNUSED_489<<",
312
+ ">>UNUSED_490<<",
313
+ ">>UNUSED_491<<",
314
+ ">>UNUSED_492<<",
315
+ ">>UNUSED_493<<",
316
+ ">>UNUSED_494<<",
317
+ ">>UNUSED_495<<",
318
+ ">>UNUSED_496<<",
319
+ ">>UNUSED_497<<",
320
+ ">>UNUSED_498<<",
321
+ ">>UNUSED_499<<",
322
+ ">>UNUSED_500<<",
323
+ ">>UNUSED_501<<",
324
+ ">>UNUSED_502<<",
325
+ ">>UNUSED_503<<",
326
+ ">>UNUSED_504<<",
327
+ ">>UNUSED_505<<",
328
+ ">>UNUSED_506<<",
329
+ ">>UNUSED_507<<",
330
+ ">>UNUSED_508<<",
331
+ ">>UNUSED_509<<",
332
+ ">>UNUSED_510<<",
333
+ ">>UNUSED_511<<"
334
+ ],
335
+ "bos_token": {
336
+ "content": "<|begin_of_text|>",
337
+ "lstrip": false,
338
+ "normalized": false,
339
+ "rstrip": false,
340
+ "single_word": false
341
+ },
342
+ "eos_token": {
343
+ "content": "<|end_of_text|>",
344
+ "lstrip": false,
345
+ "normalized": false,
346
+ "rstrip": false,
347
+ "single_word": false
348
+ },
349
+ "pad_token": {
350
+ "content": "<pad>",
351
+ "lstrip": false,
352
+ "normalized": false,
353
+ "rstrip": false,
354
+ "single_word": false
355
+ }
356
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff