lwoollett committed
Commit 82b0c22 · verified · 1 Parent(s): 42ffcde

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
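The pooling configuration enables only `pooling_mode_mean_tokens`, so each sentence vector is the average of its 768-dimensional token vectors. The following is a minimal NumPy sketch of masked mean pooling, not the library's own code. It assumes padded positions are marked with 0 in an attention mask; the function name `mean_pool` and the toy 2-dimensional inputs are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts


# Toy input: two real tokens followed by one padding token (hypothetical values).
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(tokens, mask)
print(pooled)  # [[2. 3.]]
```

The padding vector is excluded from the average, which is why the result is the mean of the first two token vectors only.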
README.md ADDED
@@ -0,0 +1,689 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:10217
+ - loss:CachedMultipleNegativesRankingLoss
+ base_model: nomic-ai/modernbert-embed-base
+ widget:
+ - source_sentence: What integer value is assigned to the global constant SDS_SecondaryType
+     in JADE?
+   sentences:
+   - '#### drawWidth
+
+
+     **Type:** - Integer
+
+
+     **Availability:** - Read or write at run time only
+
+
+     The **drawWidth** property of the [Window](../window_class/window_class.htm) class
+     contains the line width for output from graphics methods on a form or control.
+
+
+     Set the **drawWidth** property to a value in the range **1** through **32,767**. This
+     value represents the width of the line in pixels. The default value is **1**
+     pixel wide.
+
+
+     Increase the value of the **drawWidth** property to increase the width of the
+     line.'
+   - '#### JadeDynamicObjectTypes Category Global Constants
+
+
+     The global constants listed in the following table define symbolic names for the
+     values of the [JadeDynamicObject](../../encyclosys1/jadedynamicobject_class/jadedynamicobject_class.htm#jadedynamicobjectclass)
+     class [type](../../encyclosys1/jadedynamicobject_class/type.htm#typejadedynamicobject)
+     attribute of dynamic objects returned from [JadeDatabaseAdmin](../../encyclosys1/jadedatabaseadmin_class/jadedatabaseadmin_class.htm#jadedatabaseadminclass)
+     class query methods.
+
+
+     | Global Constant | Integer Value |
+
+     | ---- | ---- |
+
+     | SDS_PrimaryType | 1 |
+
+     | SDS_SecondaryProxyType | 2 |
+
+     | SDS_SecondaryType | 3 |
+
+     | SDS_TransactionType | 4 |'
+   - "#### sortOrder\n\n**Type:** - Integer\n\n**Availability:** - Read or write at\
+     \ run time only\n\nThe **sortOrder** property of the [JadeTableColumn](jadetablecolumn_class.htm)\
+     \ class contains the precedence of the column referenced by this object when sorting,\
+     \ in the range **1** through **3**, or it contains zero (**0**) to remove sorting\
+     \ on the current column.\n\nFor a description of this property, see the [Table](../../encyclowin/control_class/table_class.htm#tableclass)\
+     \ control [sortColumn](../../encyclowin/window__form__and_control_properties/sortcolumn.htm#sortcolumnwin)\
+     \ property. See also the [JadeTableColumn](jadetablecolumn_class.htm) class [sortAsc](sortasc.htm),\
+     \ [sortCased](sortcased.htm), and [sortType](sorttype.htm) properties, which are\
+     \ dependent on the column already being recorded as a sort column by the **sortOrder**\
+     \ property.\n\nThe code fragment in the following example shows the use of the\
+     \ **sortOrder** property.\n\n```\ntable1.accessColumn(2).sortOrder := 1; //\
+     \ first column in sort\r\ntable1.accessColumn(4).sortOrder := 2; // second column\r\
+     \ntable1.accessColumn(5).sortOrder := 3; // third column\n```"
+ - source_sentence: How are values in the ByteArray referenced?
+   sentences:
+   - "#### findAllElementsByNameNS\n\n```\nfindAllElementsByNameNS(namespaceURI: String;\r\
+     \n localName: String;\r\n elements:\
+     \ JadeXMLElementArray input);\n```\nThe **findAllElementsByNameNS** method\
+     \ of the [JadeXMLElement](jadexmlelement_class.htm) class fills the elements array\
+     \ with all descendant elements that have the values specified in the **namespaceURI**\
+     \ and **localName** parameters, respectively.\n\nAs the search uses the collection\
+     \ sequence, the elements may not be in the document sequence.\n\nIf you want to\
+     \ match all namespaces or local names, specify an asterisk character (**'*'**)\
+     \ in the **namespaceURI** or **localName** parameter. Note, however, that if\
+     \ you specify **\"*\"** in the **localName** parameter, the access method uses\
+     \ the document sequence to locate the requested elements rather than the collection\
+     \ sequence that optimizes performance."
+   - '## ByteArray Class
+
+
+     The **ByteArray** class is an ordered collection of [Byte](../../encycloprim/byte_type/byte_type.htm#byte)
+     values in which the values are referenced by their position in the collection.
+
+
+     Byte arrays inherit the methods defined in the [Array](../array_class/array_class.htm)
+     class.
+
+
+     The bracket (**[ ]**) subscript operators enable you to assign values to and receive
+     values from a **Byte** array.
+
+
+     For details about the methods defined in the **ByteArray** class, see "[ByteArray
+     Methods](bytearray_methods.htm)", in the following section.
+
+
+     [Array](../array_class/array_class.htm)
+
+
+     (None)'
+   - '#### Exposing Properties for a Selected Class
+
+
+     To expose all properties for a selected class
+
+
+     - Right‑click on the class row in the **Classes** table and then select the **Expose
+     Properties for Selected Class** command from the popup menu that is displayed.
+
+
+     This command does _not_ automatically add methods or constants to the C# exposure,
+     even if the **Show Methods** or **Show Constants** option is checked. (For details,
+     see "[Toggling the Display of Methods](toggling_the_display_of_methods.htm)" or
+     "[Toggling the Display Constants](toggling_the_display_of_constants.htm)", later
+     in this chapter.)
+
+
+     All properties in that class are then exposed for inclusion in the C# exposure;
+     that is, each property check box in the **Features** pane is checked, indicating
+     that the properties for that class will be generated in the C# class library.
+
+
+     You can tailor the property selection by unchecking the check box of any property
+     that you want to exclude from the exposure.'
+ - source_sentence: How can you resolve opening database error 14544 in single user
+     mode?
+   sentences:
+   - "#### Changing Lock Type\n\nA type upgrade can queue and potentially time out,\
+     \ causing a [JoobObjectLockedException](joobobjectlockedexception.htm) to be thrown,\
+     \ if the requested type is not compatible with existing locks. For example, this\
+     \ could happen when upgrading a shared lock to exclusive.\n\nLock type downgrades\
+     \ will never be queued, as the strength is being lowered so there will be no lock\
+     \ incompatibilities.\n\nWhen a Jade session is in transaction state, requests\
+     \ to downgrade lock type are ignored. The lock maintains its current type. However,\
+     \ lock types can be upgraded regardless of transaction state.\n\nWhen a lock type\
+     \ is being upgraded from shared to update, the object is unlocked before the update\
+     \ lock is requested. This happens even if the Jade session is in transaction state,\
+     \ and is the only situation where an object is unlocked while in transaction state.\
+     \ The reason for doing this is to prevent potential deadlocks, as discussed in\
+     \ more detail under \"[Avoiding Deadlock Exceptions](avoiding_deadlock_exceptions.htm)\"\
+     , later in this chapter.\n\nThe following code fragment gives examples of upgrading\
+     \ and downgrading lock types.\n\n```\nTimeSpan timeOut = TimeSpan.FromSeconds(10);\r\
+     \ncontext.Lock(obj1, LockType.Shared, LockDuration.Transaction, timeOut);\r\n\
+     context.Lock(obj1, LockType.Reserve, LockDuration.Transaction, timeOut);\r\n \
+     \ // The lock is now upgraded from shared to reserve.\r\
+     \ncontext.Lock(coll, LockType.Exclusive, LockDuration.Transaction, timeOut);\r\
+     \n \r\nusing (System.Data.IDbTransaction tran = context.BeginTransaction())\r\
+     \n{\r\n context.Lock(obj1, LockType.Exclusive, LockDuration.Transaction,\r\n\
+     \ timeOut); // The lock type is upgraded to exclusive, as\r\
+     \n // locks can be upgraded (but not downgraded)\r\
+     \n // when in transaction state.\r\n foreach\
+     \ (C1 obj2 in coll)\r\n {\r\n // The exclusive lock on coll is not downgraded\
+     \ by the implicit shared\r\n // lock associated with foreach, because transaction\
+     \ state is in effect.\r\n }\r\n context.Lock(obj1, LockType.Shared, LockDuration.Transaction,\
+     \ timeOut);\r\n // The lock type is not downgraded, but remains\
+     \ as exclusive.\r\n tran.Commit(); // All transaction duration locks are\
+     \ released.\r\n}\n```"
+   - '### 1411 - Attempt to add unknown system file
+
+
+     Cause
+
+
+     This error occurs if the system schema maintenance function attempts to add a
+     new unknown system file.
+
+
+     Action
+
+
+     This is an internal error. If your Jade licenses include support, contact your
+     local Jade support center or Jade Support.'
+   - '### 14544 - A concurrent process has already opened the same database
+
+
+     Cause
+
+
+     This error occurs if you attempt to open a database that is already open in single
+     user (exclusive) mode.
+
+
+     Action
+
+
+     Determine in which mode the database should be opened; that is, single user or
+     multiuser mode.'
+ - source_sentence: What is the cause of the 3323 DbCrypt error?
+   sentences:
+   - '### 3323 - DbCrypt memory allocation failure
+
+
+     Cause
+
+
+     This error occurs if a memory allocation error occurs in the use of the database
+     encryption module.
+
+
+     Action
+
+
+     If your Jade licenses include support, contact your local Jade support center
+     or Jade Support.'
+   - '### 3028 - Database file is in use by another process
+
+
+     Cause
+
+
+     This error occurs if you attempt to open a database file that is already open
+     by another process.
+
+
+     Action
+
+
+     Refer to the Jade messages log file (**jommsg.log**) for information about the
+     file. Generally, another program is accessing the file or the database as a whole.'
+   - '### Where Do Jade Methods Execute?
+
+
+     Jade methods execute only in Jade nodes. A Jade node is the fundamental building
+     block of Jade''s distributed architecture. Each node contains the Jade Object
+     Manager (JOM), the Jade Interpreter, various caches, and one or more Jade processes.
+
+
+     The Jade thin client is _not_ a Jade node; Jade methods do not execute there,
+     although a great deal of effort has been expended to make it look as though they
+     do.
+
+
+     In most production systems, there is one database server node (**jadrap.exe**,
+     **jadrapb.exe**, or **jadserv.exe**), one or more application server nodes (**jadapp.exe**
+     or **jadappb.exe**), and one or more fat/standard client nodes (**jade.exe**)
+     for background processing, web services, or HTML forms.
+
+
+     When **jade.exe** is run in single user mode, there is one node only.'
+ - source_sentence: Which subclasses are associated with the JadeXMLCharacterData class?
+   sentences:
+   - '## JadeXMLCharacterData Class
+
+
+     The **JadeXMLCharacterData** class is the abstract superclass of character-based
+     nodes in an XML document tree; that is, the text, **CDATA**, and comment nodes.
+
+
+     For details about the property defined in the **JadeXMLCharacterData** class,
+     see "[JadeXMLCharacterData Property](jadexmlcharacterdata_property.htm)", in the
+     following section.
+
+
+     [JadeXMLNode](../jadexmlnode_class/jadexmlnode_class.htm)
+
+
+     [JadeXMLCDATA](../jadexmlcdata_class/jadexmlcdata_class.htm), [JadeXMLComment](../jadexmlcomment_class/jadexmlcomment_class.htm),
+     [JadeXMLText](../jadexmltext_class/jadexmltext_class.htm)'
+   - "### Minimizing the Working Set\n\nIn loops where there are multiple filters,\
+     \ apply the cheapest filters first and then the filters that reduce the working\
+     \ set the most. For example, consider the following code fragment, which finds\
+     \ sales of appliances in a specified city.\n\n```\nwhile iter.next(tran) do\r\n\
+     \ if tran.type = Type_Sale\r\n and tran.myBranch.myLocation.city = targetCity\r\
+     \n and tran.myProduct.isAppliance then\r\n <do something with tran>\r\
+     \n endif;\r\nendwhile;\n```\nIn this example, **tran.type** should be checked\
+     \ first, because it is the cheapest. The **tran** object must be fetched to evaluate\
+     \ all of the other conditions, so we may as well check the **type** attribute\
+     \ first. If we did the **isAppliance** check first, we would have to fetch all\
+     \ of the product objects for the transactions that were not sales. Regardless\
+     \ of how many transactions are sales and how many products are appliances, it\
+     \ will save time to check **tran.type** first.\n\nNow, assume that:\n\n- 80 percent\
+     \ of transactions are sales\n\n- 15 percent, on average, are likely to be in the\
+     \ target city\n\n- 90 percent of the products are appliances\n\nIt pays to check\
+     \ the city first, even though it means fetching the branch and location objects\
+     \ for the non‑appliance products. There are very few non‑appliance products, so\
+     \ the number of extra fetches is small. By contrast, checking for non‑appliance\
+     \ products for all other cities would result in a large number of extra fetches.\n\
+     \nIt doesn't matter if the filters are conditions of an [if](../../devref/ch1languageref/if_instruction.htm#if)\
+     \ instruction, multiple [if](../../devref/ch1languageref/if_instruction.htm#if)\
+     \ instructions, or multiple conditions in the [where](../../devref/ch1languageref/where_clause_optimization.htm#whereoptimization)\
+     \ clause of a [while](../../devref/ch1languageref/while_instruction.htm#while)\
+     \ statement; the end result is the same.\n\nThis code fragment example is simple\
+     \ and concise, to convey the concept. In the real world, each successive filter\
+     \ may be in another method, another class, or even another schema. It may take\
+     \ a bit of investigation to find all of the filters involved in a single loop."
+   - '##### responseType
+
+
+     Use the **responseType** parameter of the [beginNotification](beginnotification.htm)
+     method to specify the frequency with which the subscribed event was notified.
+
+
+     The valid values for the **responseType** parameter, represented by global constants
+     in the [NotificationResponses](../../encycloprim/appaglobalconstants/notificationresponses_category.htm#notificationresponsescategory)
+     category, are listed in the following table.
+
+
+     | Global Constant | Integer Value | Sends a notification… |
+
+     | ---- | ---- | ---- |
+
+     | Response_Cancel | 1 | When the object receives a matching event and then cancels
+     the notification |
+
+     | Response_Continuous | 0 | Whenever the object receives a matching event |
+
+     | Response_Suspend | 2 | When the object receives a matching event and then suspends
+     notification until the user refreshes the local copy of the object |'
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # Beep boop
+
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the jade_embeddings_train_25.04.04 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - jade_embeddings_train_25.04.04
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
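The architecture ends with a `Normalize()` module, and the card lists cosine similarity as the similarity function. A small NumPy sketch of why these fit together follows: once vectors are scaled to unit length, a plain dot product gives the same value as cosine similarity. The helper `l2_normalize` and the toy vectors are hypothetical and are not part of the model or the library.

```python
import numpy as np


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit Euclidean length."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


# Toy stand-ins for pooled sentence vectors (hypothetical values).
raw = np.array([[3.0, 4.0], [1.0, 0.0]])
unit = l2_normalize(raw)

# Dot product of the unit vectors.
dot = unit @ unit.T

# Cosine similarity computed directly from the raw vectors.
lengths = np.linalg.norm(raw, axis=1)
cosine = (raw @ raw.T) / (lengths[:, None] * lengths[None, :])

print(np.allclose(dot, cosine))  # True
```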
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("lwoollett/jade-ft-14-bert-static")
+ # Run inference
+ sentences = [
+     'Which subclasses are associated with the JadeXMLCharacterData class?',
+     '## JadeXMLCharacterData Class\n\nThe **JadeXMLCharacterData** class is the abstract superclass of character-based nodes in an XML document tree; that is, the text, **CDATA**, and comment nodes.\n\nFor details about the property defined in the **JadeXMLCharacterData** class, see "[JadeXMLCharacterData Property](jadexmlcharacterdata_property.htm)", in the following section.\n\n[JadeXMLNode](../jadexmlnode_class/jadexmlnode_class.htm)\n\n[JadeXMLCDATA](../jadexmlcdata_class/jadexmlcdata_class.htm), [JadeXMLComment](../jadexmlcomment_class/jadexmlcomment_class.htm), [JadeXMLText](../jadexmltext_class/jadexmltext_class.htm)',
+     "### Minimizing the Working Set\n\nIn loops where there are multiple filters, apply the cheapest filters first and then the filters that reduce the working set the most. For example, consider the following code fragment, which finds sales of appliances in a specified city.\n\n```\nwhile iter.next(tran) do\r\n if tran.type = Type_Sale\r\n and tran.myBranch.myLocation.city = targetCity\r\n and tran.myProduct.isAppliance then\r\n <do something with tran>\r\n endif;\r\nendwhile;\n```\nIn this example, **tran.type** should be checked first, because it is the cheapest. The **tran** object must be fetched to evaluate all of the other conditions, so we may as well check the **type** attribute first. If we did the **isAppliance** check first, we would have to fetch all of the product objects for the transactions that were not sales. Regardless of how many transactions are sales and how many products are appliances, it will save time to check **tran.type** first.\n\nNow, assume that:\n\n- 80 percent of transactions are sales\n\n- 15 percent, on average, are likely to be in the target city\n\n- 90 percent of the products are appliances\n\nIt pays to check the city first, even though it means fetching the branch and location objects for the non‑appliance products. There are very few non‑appliance products, so the number of extra fetches is small. By contrast, checking for non‑appliance products for all other cities would result in a large number of extra fetches.\n\nIt doesn't matter if the filters are conditions of an [if](../../devref/ch1languageref/if_instruction.htm#if) instruction, multiple [if](../../devref/ch1languageref/if_instruction.htm#if) instructions, or multiple conditions in the [where](../../devref/ch1languageref/where_clause_optimization.htm#whereoptimization) clause of a [while](../../devref/ch1languageref/while_instruction.htm#while) statement; the end result is the same.\n\nThis code fragment example is simple and concise, to convey the concept. In the real world, each successive filter may be in another method, another class, or even another schema. It may take a bit of investigation to find all of the filters involved in a single loop.",
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
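The example sentences above pair a question with documentation passages, which suggests a retrieval-style use: embed a query and rank passages by similarity. The sketch below illustrates only the ranking step with NumPy. In practice the vectors would come from `model.encode`; here toy 2-dimensional vectors stand in for them, and the helper name `rank_passages` is hypothetical.

```python
import numpy as np


def rank_passages(query_vec: np.ndarray, passage_vecs: np.ndarray, top_k: int = 2):
    """Return (index, score) pairs for the top_k passages by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q                         # cosine similarity of each passage to the query
    order = np.argsort(-scores)[:top_k]    # highest score first
    return [(int(i), float(scores[i])) for i in order]


# Toy stand-ins for embeddings (hypothetical values, not model output).
query = np.array([1.0, 0.0])
passages = np.array([[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])

ranking = rank_passages(query, passages)
print(ranking)  # passage 1 ranks first, then passage 2
```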
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### jade_embeddings_train_25.04.04
+
+ * Dataset: jade_embeddings_train_25.04.04
+ * Size: 10,217 training samples
+ * Columns: <code>anchor</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | anchor | positive |
+   |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+   | type | string | string |
+   | details | <ul><li>min: 8 tokens</li><li>mean: 17.17 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 27 tokens</li><li>mean: 363.15 tokens</li><li>max: 6303 tokens</li></ul> |
+ * Samples:
+   | anchor | positive |
+   |:-------|:---------|
+   | <code>What is the format for defining a Byte constant in JADE?</code> | <code>##### Constant Definition Tips<br><br>When defining a constant value, the value of a constant can be a simple literal value or an expression constructed using literals and other constants. For details about literal types, see "[Literals](../../devref/ch1languageref/literals.htm#literalsexpr)", in Chapter - 1 of the _Developer's Reference_.<br><br>You can define the value for a constant whose primitive type is not a specific literal format by using a typecast of a [String](../../encycloprim/string_type/string_type.htm#string) literal or in the case of a [Byte](../../encycloprim/byte_type/byte_type.htm#byte), a small [Integer](../../encycloprim/integer_type/integer_type.htm#integer) literal, as shown in the examples in the following table.<br><br>| Primitive Type | Value Expression |<br>| ---- | ---- |<br>| Date | "31/12/2007".Date |<br>| Time | "14:34:23.123".Time |<br>| TimeStamp | "31/12/2007, 14:34:23:123".TimeStamp |<br>| Point | "1,7".Point |<br>| Byte | 0.Byte |<br><br>For details about typecasting, see "[Type Casts](../...</code> |
+   | <code>How does the replaceFrom__ method handle case sensitivity?</code> | <code>#### replaceFrom__<br><br>```<br>replaceFrom__(target: String;<br> replacement: String;<br> startIndex: Integer;<br> bIgnoreCase: Boolean): String;<br>```<br>The **replaceFrom__** method of the [String](string_type.htm) primitive type replaces only the first occurrence of the substring specified in the **target** parameter with the substring specified in the **replacement** parameter, starting from the specified **startIndex** parameter.<br><br>Case‑sensitivity is ignored if you set the value of the **bIgnoreCase** parameter to **true**. Set this parameter to **false** if you want the substring replacement to be case‑sensitive.<br><br>This method raises exception 1413 (_Index used in string operation is out of bounds_) if the value specified in the **startIndex** parameter is less than **1** or it is greater than the length of the original string. In addition, it returns the original receiver String if the value specified in the **target** parameter has a length of zero (**...</code> |
+   | <code>What does the global constant Ex_Continue do?</code> | <code>## Exceptions Category<br><br>The global constants for exceptions are listed in the following table.<br><br>| Global Constant | Integer Value | Description |<br>| ---- | ---- | ---- |<br>| Ex_Abort_Action | 1 | Causes the currently executing methods to be aborted. |<br>| Ex_Continue | 0 | Resumes execution from the next expression after the expression that caused the exception. |<br>| Ex_Pass_Back | -1 | Passes control back to the prior local exception handler for this type of exception, or if a local handler is not found, a global exception handler for this type of exception. |<br>| Ex_Resume_Method_Epilog | 3 | Passes control back to the method that armed the exception handler. Execution resumes at the start of the method epilog or at the end of the method if there is no epilog section. Execution resumes at the next statement in the epilog if the exception was raised while executing the epilog. If there were no messages on the execution stack when the handler was armed, the effect of the Ex_Resume_Method_Epilog...</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 32
+   }
+   ```
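The loss parameters above (`scale` of 20.0 and `cos_sim` as the similarity function) describe a ranking objective over anchor/positive pairs in which the other positives in a batch act as negatives. The NumPy sketch below illustrates that objective under those assumptions; it is not the library's implementation, the function name `mnr_loss` and the toy vectors are hypothetical, and the mini-batch caching implied by `mini_batch_size` is not shown.

```python
import numpy as np


def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives.

    Row i of `anchors` should match row i of `positives`; every other row
    of `positives` is treated as a negative for anchor i.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                        # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))        # correct pairs lie on the diagonal


# Toy batch (hypothetical values): matching pairs versus deliberately swapped pairs.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
aligned = mnr_loss(anchors, anchors)
swapped = mnr_loss(anchors, anchors[::-1])
print(aligned < swapped)  # True
```

The loss is close to zero when each anchor is most similar to its own positive and grows when the pairs are mismatched, which is the behaviour the training objective rewards.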
452
+
453
+ ### Evaluation Dataset
454
+
455
+ #### jade_embeddings_train_25.04.04
456
+
457
+ * Dataset: jade_embeddings_train_25.04.04
458
+ * Size: 1,136 evaluation samples
459
+ * Columns: <code>anchor</code> and <code>positive</code>
460
+ * Approximate statistics based on the first 1000 samples:
461
+ | | anchor | positive |
+ |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 8 tokens</li><li>mean: 17.07 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 25 tokens</li><li>mean: 365.93 tokens</li><li>max: 3397 tokens</li></ul> |
465
+ * Samples:
466
+ | anchor | positive |
467
+ |:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
468
+ | <code>What is the keyword list constant value for JADE_SYSTEMVARS?</code> | <code>### changeKeywords<br><br>```<br>changeKeywords(action: Integer;
469
  <br> keywordList: Integer;
470
  <br> keywords: String);<br>```<br>The **changeKeywords** method of the [JadeTextEdit](../control_class/jadetextedit_class.htm) class modifies one or more of the current keyword lists. The keyword lists are used by the current language lexical analyzer to classify the tokens found in the text. For the Jade language, this includes keywords, class names, constant names, and so on.<br><br>The value of the **action** parameter can be one of the **JadeTextEdit** class constants listed in the following table.<br><br>| Class Constant | Value | Description |<br>| ---- | ---- | ---- |<br>| KEYWORDS_ADD | 2 | Adds the keywords specified in thekeywordsparameter to the list specified in thekeywordListparameter. |<br>| KEYWORDS_DELETE | 3 | Deletes the words specified in thekeywordsparameter from the list specified in thekeywordListparameter. |<br>| KEYWORDS_SET | 1 | Clears the list specified in thekeywordListparam...</code> |
471
+ | <code>What should you click to abandon the deletion of a report in JADE?</code> | <code>#### Delete Report Command<br><br>Use the **Delete Report** command from the File menu to delete a report.<br><br>To delete a report<br><br>1. Select the **Delete Report** command from the File menu. The Delete Report dialog, shown in the following image, is then displayed.<br><br>[](../images/reportdelete_feb2022.png)<br><br>2. Select the report that you want to delete from the **Report** list box or enter the name in the **Report name** text box.<br><br>3. Filter the list of report names in the **Reports** list box in one or both of the following ways.<br><br> - To display only those reports that contain that text in their report description, enter text in the **Text contains** text box. For example, only those reports that mention **Pay** in their description are displayed if you enter **Pay**, providing a refined selection list.<br><br> - To display only those reports modified during a specified period, select a last modified period from the **Last modified** list box. For example, only those reports that were modified in...</code> |
472
+ | <code>What types of objects can be set for the userGroupObject in JadeMultiWorkerTcpTransport?</code> | <code>#### userGroupObject<br><br>**Type:** - Object<br><br>The **userGroupObject** property of the [JadeMultiWorkerTcpTransport](jademultiworkertcptransport_class.htm) class contains a reference to an object that you can associate with the transport group between event callbacks.<br><br>You must set the value of this property to a shared transient or a persistent object, as it must be visible to other workers.<br><br>The default value is **null**.<br><br>To prevent an object leak, it is your responsibility to delete this object, if required, in your implementation of the [closedEvent](../jademultiworkertcptransportif_interface/closedevent.htm) method in the receiver class.</code> |
473
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
474
+ ```json
475
+ {
476
+ "scale": 20.0,
477
+ "similarity_fct": "cos_sim",
478
+ "mini_batch_size": 32
479
+ }
480
+ ```
481
+
482
+ ### Training Hyperparameters
483
+ #### Non-Default Hyperparameters
484
+
485
+ - `eval_strategy`: steps
486
+ - `per_device_train_batch_size`: 18
487
+ - `per_device_eval_batch_size`: 18
488
+ - `num_train_epochs`: 4
489
+ - `warmup_ratio`: 0.1
490
+ - `bf16`: True
491
+ - `batch_sampler`: no_duplicates
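As a rough illustration of what `warmup_ratio`, `num_train_epochs`, and the linear scheduler imply together, the sketch below computes the learning rate at a given step. It rests on assumptions: the figure of 568 steps per epoch is inferred from the Training Logs table in this card (step 100 falls at epoch 0.1761), the peak rate of 5e-05 is taken from the full hyperparameter list, and rounding the warmup length up is an assumption about the trainer rather than something this card states. `linear_lr` is a hypothetical helper name.

```python
import math


def linear_lr(step, total_steps, warmup_ratio=0.1, peak_lr=5e-05):
    """Learning rate under linear warmup followed by linear decay to zero."""
    # Assumption: the warmup length is the ratio times total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


STEPS_PER_EPOCH = 568  # assumption inferred from the training logs
TOTAL_STEPS = 4 * STEPS_PER_EPOCH  # num_train_epochs = 4

for step in (0, 100, 228, 1136, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate climbs for roughly the first 228 of 2,272 steps and then decays linearly, so the first logged row (step 100) is still inside the warmup phase.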
492
+
493
+ #### All Hyperparameters
494
+ <details><summary>Click to expand</summary>
495
+
496
+ - `overwrite_output_dir`: False
497
+ - `do_predict`: False
498
+ - `eval_strategy`: steps
499
+ - `prediction_loss_only`: True
500
+ - `per_device_train_batch_size`: 18
501
+ - `per_device_eval_batch_size`: 18
502
+ - `per_gpu_train_batch_size`: None
503
+ - `per_gpu_eval_batch_size`: None
504
+ - `gradient_accumulation_steps`: 1
505
+ - `eval_accumulation_steps`: None
506
+ - `torch_empty_cache_steps`: None
507
+ - `learning_rate`: 5e-05
508
+ - `weight_decay`: 0.0
509
+ - `adam_beta1`: 0.9
510
+ - `adam_beta2`: 0.999
511
+ - `adam_epsilon`: 1e-08
512
+ - `max_grad_norm`: 1.0
513
+ - `num_train_epochs`: 4
514
+ - `max_steps`: -1
515
+ - `lr_scheduler_type`: linear
516
+ - `lr_scheduler_kwargs`: {}
517
+ - `warmup_ratio`: 0.1
518
+ - `warmup_steps`: 0
519
+ - `log_level`: passive
520
+ - `log_level_replica`: warning
521
+ - `log_on_each_node`: True
522
+ - `logging_nan_inf_filter`: True
523
+ - `save_safetensors`: True
524
+ - `save_on_each_node`: False
525
+ - `save_only_model`: False
526
+ - `restore_callback_states_from_checkpoint`: False
527
+ - `no_cuda`: False
528
+ - `use_cpu`: False
529
+ - `use_mps_device`: False
530
+ - `seed`: 42
531
+ - `data_seed`: None
532
+ - `jit_mode_eval`: False
533
+ - `use_ipex`: False
534
+ - `bf16`: True
535
+ - `fp16`: False
536
+ - `fp16_opt_level`: O1
537
+ - `half_precision_backend`: auto
538
+ - `bf16_full_eval`: False
539
+ - `fp16_full_eval`: False
540
+ - `tf32`: None
541
+ - `local_rank`: 0
542
+ - `ddp_backend`: None
543
+ - `tpu_num_cores`: None
544
+ - `tpu_metrics_debug`: False
545
+ - `debug`: []
546
+ - `dataloader_drop_last`: False
547
+ - `dataloader_num_workers`: 0
548
+ - `dataloader_prefetch_factor`: None
549
+ - `past_index`: -1
550
+ - `disable_tqdm`: False
551
+ - `remove_unused_columns`: True
552
+ - `label_names`: None
553
+ - `load_best_model_at_end`: False
554
+ - `ignore_data_skip`: False
555
+ - `fsdp`: []
556
+ - `fsdp_min_num_params`: 0
557
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
558
+ - `tp_size`: 0
559
+ - `fsdp_transformer_layer_cls_to_wrap`: None
560
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
561
+ - `deepspeed`: None
562
+ - `label_smoothing_factor`: 0.0
563
+ - `optim`: adamw_torch
564
+ - `optim_args`: None
565
+ - `adafactor`: False
566
+ - `group_by_length`: False
567
+ - `length_column_name`: length
568
+ - `ddp_find_unused_parameters`: None
569
+ - `ddp_bucket_cap_mb`: None
570
+ - `ddp_broadcast_buffers`: False
571
+ - `dataloader_pin_memory`: True
572
+ - `dataloader_persistent_workers`: False
573
+ - `skip_memory_metrics`: True
574
+ - `use_legacy_prediction_loop`: False
575
+ - `push_to_hub`: False
576
+ - `resume_from_checkpoint`: None
577
+ - `hub_model_id`: None
578
+ - `hub_strategy`: every_save
579
+ - `hub_private_repo`: None
580
+ - `hub_always_push`: False
581
+ - `gradient_checkpointing`: False
582
+ - `gradient_checkpointing_kwargs`: None
583
+ - `include_inputs_for_metrics`: False
584
+ - `include_for_metrics`: []
585
+ - `eval_do_concat_batches`: True
586
+ - `fp16_backend`: auto
587
+ - `push_to_hub_model_id`: None
588
+ - `push_to_hub_organization`: None
589
+ - `mp_parameters`:
590
+ - `auto_find_batch_size`: False
591
+ - `full_determinism`: False
592
+ - `torchdynamo`: None
593
+ - `ray_scope`: last
594
+ - `ddp_timeout`: 1800
595
+ - `torch_compile`: False
596
+ - `torch_compile_backend`: None
597
+ - `torch_compile_mode`: None
598
+ - `include_tokens_per_second`: False
599
+ - `include_num_input_tokens_seen`: False
600
+ - `neftune_noise_alpha`: None
601
+ - `optim_target_modules`: None
602
+ - `batch_eval_metrics`: False
603
+ - `eval_on_start`: False
604
+ - `use_liger_kernel`: False
605
+ - `eval_use_gather_object`: False
606
+ - `average_tokens_across_devices`: False
607
+ - `prompts`: None
608
+ - `batch_sampler`: no_duplicates
609
+ - `multi_dataset_batch_sampler`: proportional
610
+
611
+ </details>
612
+
613
+ ### Training Logs
614
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.1761 | 100 | 0.0851 | 0.0243 |
+ | 0.3521 | 200 | 0.0262 | 0.0211 |
+ | 0.5282 | 300 | 0.0275 | 0.0217 |
+ | 0.7042 | 400 | 0.0216 | 0.0256 |
+ | 0.8803 | 500 | 0.0283 | 0.0241 |
+ | 1.0563 | 600 | 0.0226 | 0.0195 |
+ | 1.2324 | 700 | 0.0113 | 0.0170 |
+ | 1.4085 | 800 | 0.0114 | 0.0204 |
+ | 1.5845 | 900 | 0.0165 | 0.0182 |
+ | 1.7606 | 1000 | 0.0129 | 0.0219 |
+ | 1.9366 | 1100 | 0.0126 | 0.0181 |
+ | 2.1127 | 1200 | 0.0069 | 0.0207 |
+ | 2.2887 | 1300 | 0.0045 | 0.0212 |
+ | 2.4648 | 1400 | 0.0046 | 0.0187 |
+ | 2.6408 | 1500 | 0.0056 | 0.0206 |
+ | 2.8169 | 1600 | 0.0084 | 0.0196 |
+ | 2.9930 | 1700 | 0.005 | 0.0214 |
+ | 3.1690 | 1800 | 0.0056 | 0.0202 |
+ | 3.3451 | 1900 | 0.0088 | 0.0190 |
+ | 3.5211 | 2000 | 0.0026 | 0.0202 |
+ | 3.6972 | 2100 | 0.0064 | 0.0205 |
+ | 3.8732 | 2200 | 0.006 | 0.0202 |
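One way to read the log table is to look for the evaluation step with the lowest validation loss and compare it with the last logged step. The sketch below copies the rows from the table into a list; `LOGS` and the derived variable names are hypothetical and only serve the illustration.

```python
# (step, training loss, validation loss), copied from the Training Logs table.
LOGS = [
    (100, 0.0851, 0.0243), (200, 0.0262, 0.0211), (300, 0.0275, 0.0217),
    (400, 0.0216, 0.0256), (500, 0.0283, 0.0241), (600, 0.0226, 0.0195),
    (700, 0.0113, 0.0170), (800, 0.0114, 0.0204), (900, 0.0165, 0.0182),
    (1000, 0.0129, 0.0219), (1100, 0.0126, 0.0181), (1200, 0.0069, 0.0207),
    (1300, 0.0045, 0.0212), (1400, 0.0046, 0.0187), (1500, 0.0056, 0.0206),
    (1600, 0.0084, 0.0196), (1700, 0.005, 0.0214), (1800, 0.0056, 0.0202),
    (1900, 0.0088, 0.0190), (2000, 0.0026, 0.0202), (2100, 0.0064, 0.0205),
    (2200, 0.006, 0.0202),
]

# Row with the smallest validation loss.
best_step, _, best_val = min(LOGS, key=lambda row: row[2])
# Last logged row, closest to the end of training.
final_step, final_train, final_val = LOGS[-1]

print(f"lowest validation loss {best_val} at step {best_step}")
print(f"last logged validation loss {final_val} at step {final_step}")
```

The lowest validation loss in the table occurs at step 700, while training loss keeps falling afterwards. Since `load_best_model_at_end` is listed as False, the saved weights presumably come from the end of training rather than from that step.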
638
+
639
+
640
+ ### Framework Versions
641
+ - Python: 3.11.11
642
+ - Sentence Transformers: 4.0.2
643
+ - Transformers: 4.51.0
644
+ - PyTorch: 2.8.0.dev20250319+cu128
645
+ - Accelerate: 1.6.0
646
+ - Datasets: 3.5.0
647
+ - Tokenizers: 0.21.1
648
+
649
+ ## Citation
650
+
651
+ ### BibTeX
652
+
653
+ #### Sentence Transformers
654
+ ```bibtex
655
+ @inproceedings{reimers-2019-sentence-bert,
656
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
657
+ author = "Reimers, Nils and Gurevych, Iryna",
658
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
659
+ month = "11",
660
+ year = "2019",
661
+ publisher = "Association for Computational Linguistics",
662
+ url = "https://arxiv.org/abs/1908.10084",
663
+ }
664
+ ```
665
+
666
+ #### CachedMultipleNegativesRankingLoss
667
+ ```bibtex
668
+ @misc{gao2021scaling,
669
+ title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
670
+ author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
671
+ year={2021},
672
+ eprint={2101.06983},
673
+ archivePrefix={arXiv},
674
+ primaryClass={cs.LG}
675
+ }
676
+ ```
677
+
678
+ <!--
679
+ ## Glossary
680
+
681
+ *Clearly define terms in order to be accessible across audiences.*
682
+ -->
683
+
684
+ <!--
685
+ ## Model Card Authors
686
+
687
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
688
+ -->
689
+
690
+ <!--
691
+ ## Model Card Contact
692
+
693
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
694
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertModel"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "initializer_cutoff_factor": 2.0,
23
+ "initializer_range": 0.02,
24
+ "intermediate_size": 1152,
25
+ "layer_norm_eps": 1e-05,
26
+ "local_attention": 128,
27
+ "local_rope_theta": 10000.0,
28
+ "max_position_embeddings": 8192,
29
+ "mlp_bias": false,
30
+ "mlp_dropout": 0.0,
31
+ "model_type": "modernbert",
32
+ "norm_bias": false,
33
+ "norm_eps": 1e-05,
34
+ "num_attention_heads": 12,
35
+ "num_hidden_layers": 22,
36
+ "pad_token_id": 50283,
37
+ "position_embedding_type": "absolute",
38
+ "repad_logits_with_grad": false,
39
+ "sep_token_id": 50282,
40
+ "sparse_pred_ignore_index": -100,
41
+ "sparse_prediction": false,
42
+ "torch_dtype": "float32",
43
+ "transformers_version": "4.51.0",
44
+ "vocab_size": 50368
45
+ }
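The attention-related fields in this config can be read together with a small sketch. It assumes the zero-based convention in which layers whose index is divisible by `global_attn_every_n_layers` use global attention and all other layers use a sliding local window; that convention is an assumption about how the ModernBERT implementation interprets the field, not something this config states. The constants are copied from the file above, and `attention_kind` is a hypothetical helper.

```python
# Values copied from config.json above.
NUM_HIDDEN_LAYERS = 22
GLOBAL_ATTN_EVERY_N_LAYERS = 3
HIDDEN_SIZE = 768
NUM_ATTENTION_HEADS = 12
LOCAL_ATTENTION = 128


def attention_kind(layer_id):
    """Assumed rule: every Nth layer (starting at layer 0) attends globally."""
    return "global" if layer_id % GLOBAL_ATTN_EVERY_N_LAYERS == 0 else "local"


global_layers = [i for i in range(NUM_HIDDEN_LAYERS) if attention_kind(i) == "global"]
local_layers = [i for i in range(NUM_HIDDEN_LAYERS) if attention_kind(i) == "local"]
head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS

print("global layers:", global_layers)
print("local layers:", len(local_layers), "with window", LOCAL_ATTENTION)
print("per-head dimension:", head_dim)
```

Under that assumption, 8 of the 22 layers see the full 8,192-token context and the remaining 14 use a 128-token window, which is also where the separate `global_rope_theta` and `local_rope_theta` values would apply.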
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "4.0.2",
4
+ "transformers": "4.51.0",
5
+ "pytorch": "2.8.0.dev20250319+cu128"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4e72aaa68be41a4e344c0788bd3f82185c04b4b62967245439d8d53706f57703
3
+ size 596070136
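The three lines above are a Git LFS pointer, not the weights themselves. A downloaded `model.safetensors` can be checked against the pointer with a sketch like the following; `matches_lfs_pointer` is a hypothetical helper and the local file path passed to it is up to the caller. The parameter estimate assumes 4 bytes per weight (the config lists `torch_dtype` as float32) and ignores the small safetensors header, so it is an upper bound.

```python
import hashlib
import os


def matches_lfs_pointer(path, expected_oid, expected_size):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid


# Values copied from the pointer file above.
EXPECTED_OID = "4e72aaa68be41a4e344c0788bd3f82185c04b4b62967245439d8d53706f57703"
EXPECTED_SIZE = 596070136

# Rough upper bound on the parameter count, assuming float32 storage.
approx_params = EXPECTED_SIZE // 4
print(f"about {approx_params / 1e6:.0f}M parameters")
```

A call such as `matches_lfs_pointer("model.safetensors", EXPECTED_OID, EXPECTED_SIZE)` would then confirm that a local copy is the file this commit refers to.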
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
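The module list above describes a three-stage pipeline: token embeddings from the Transformer, pooling, then normalization. A minimal sketch of the last two stages is shown below in plain Python. It assumes mean pooling over non-padding tokens, as set by `pooling_mode_mean_tokens` in `1_Pooling/config.json` in this commit; the function names and the toy vectors are hypothetical, and real embeddings would have 768 dimensions.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vector):
                total[j] += value
    return [t / max(count, 1) for t in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector


# Toy input (hypothetical): three tokens, the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final stage produces unit-length vectors, cosine similarity between two embeddings reduces to their dot product, which fits the `cosine` similarity function named in `config_sentence_transformers.json`.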
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
+     "50332": {
+       "content": "[unused47]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50333": {
+       "content": "[unused48]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50334": {
+       "content": "[unused49]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50335": {
+       "content": "[unused50]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50336": {
+       "content": "[unused51]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50337": {
+       "content": "[unused52]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50338": {
+       "content": "[unused53]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50339": {
+       "content": "[unused54]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50340": {
+       "content": "[unused55]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50341": {
+       "content": "[unused56]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50342": {
+       "content": "[unused57]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50343": {
+       "content": "[unused58]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50344": {
+       "content": "[unused59]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50345": {
+       "content": "[unused60]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50346": {
+       "content": "[unused61]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50347": {
+       "content": "[unused62]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50348": {
+       "content": "[unused63]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50349": {
+       "content": "[unused64]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50350": {
+       "content": "[unused65]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50351": {
+       "content": "[unused66]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50352": {
+       "content": "[unused67]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50353": {
+       "content": "[unused68]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50354": {
+       "content": "[unused69]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50355": {
+       "content": "[unused70]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50356": {
+       "content": "[unused71]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50357": {
+       "content": "[unused72]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50358": {
+       "content": "[unused73]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50359": {
+       "content": "[unused74]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50360": {
+       "content": "[unused75]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50361": {
+       "content": "[unused76]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50362": {
+       "content": "[unused77]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50363": {
+       "content": "[unused78]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50364": {
+       "content": "[unused79]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50365": {
+       "content": "[unused80]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50366": {
+       "content": "[unused81]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50367": {
+       "content": "[unused82]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_input_names": [
+     "input_ids",
+     "attention_mask"
+   ],
+   "model_max_length": 8192,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "tokenizer_class": "PreTrainedTokenizer",
+   "unk_token": "[UNK]"
+ }
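
The tokenizer configuration above can be inspected without loading the model. The following is a minimal sketch using only Python's standard `json` module; the helper name `summarize_tokenizer_config` and the trimmed `SAMPLE_CONFIG` string are illustrative assumptions, not part of this commit. In practice you would read the real `tokenizer_config.json` from disk instead of the inline sample.

```python
import json

# Illustrative excerpt shaped like the tokenizer_config.json in this commit.
# Only two of the "[unusedNN]" entries are reproduced here to keep it short.
SAMPLE_CONFIG = """
{
  "added_tokens_decoder": {
    "50366": {"content": "[unused81]", "lstrip": false, "normalized": true,
              "rstrip": false, "single_word": false, "special": false},
    "50367": {"content": "[unused82]", "lstrip": false, "normalized": true,
              "rstrip": false, "single_word": false, "special": false}
  },
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "model_max_length": 8192,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
"""


def summarize_tokenizer_config(raw_json: str) -> dict:
    """Hypothetical helper: extract a few fields from a tokenizer config string."""
    config = json.loads(raw_json)
    decoder = config.get("added_tokens_decoder", {})

    # JSON object keys are strings, so convert token ids to integers.
    id_to_token = {int(token_id): entry["content"] for token_id, entry in decoder.items()}

    # Entries with "special": false, such as the "[unusedNN]" placeholders.
    placeholder_ids = sorted(
        int(token_id) for token_id, entry in decoder.items() if not entry["special"]
    )

    # Top-level keys such as "cls_token" and "pad_token".
    named_tokens = {key: value for key, value in config.items() if key.endswith("_token")}

    return {
        "id_to_token": id_to_token,
        "placeholder_ids": placeholder_ids,
        "named_tokens": named_tokens,
        "model_max_length": config.get("model_max_length"),
    }


summary = summarize_tokenizer_config(SAMPLE_CONFIG)
print(summary["model_max_length"])
print(summary["named_tokens"])
```

Converting the id keys to integers makes it easier to compare them against token ids produced by the tokenizer at runtime.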