---
library_name: peft
license: gemma
base_model: google/gemma-3-1b-it
tags:
- llama-factory
- prompt-tuning
- generated_from_trainer
model-index:
- name: train_cola_1744902668
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# train_cola_1744902668

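This model is a prompt-tuning adapter for [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it), trained with PEFT on the cola dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1678 (matches the lowest validation loss in the training results table, logged at step 14800; the validation loss at the final step, 40000, is 0.5434)
- Num Input Tokens Seen: 31253176 (cumulative total at the final step, 40000)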

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.3
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- training_steps: 40000
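
The relationship between the batch-size entries, and the shape of the cosine schedule, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the trainer's actual code: it assumes a single training device (which is what makes 4 × 4 equal the reported total of 16), and it assumes no warmup and a decay to zero, since the card lists neither a warmup setting nor a minimum learning rate. The function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
PEAK_LR = 0.3
TRAINING_STEPS = 40_000

# Assumption: one training device. The card does not state the device count;
# a single device is what makes 4 * 4 match total_train_batch_size = 16.
NUM_DEVICES = 1


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accum_steps * devices


def cosine_lr(step: int, peak_lr: float, total_steps: int) -> float:
    """Half-cosine decay from peak_lr at step 0 to 0 at total_steps.

    Assumption: no warmup phase and no learning-rate floor.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
    print("effective batch size:", total)
    for step in (0, 10_000, 20_000, 30_000, 40_000):
        print(step, round(cosine_lr(step, PEAK_LR, TRAINING_STEPS), 4))
```

Under these assumptions the learning rate is halved to 0.15 at step 20000, the midpoint of the run.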

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-------:|:-----:|:---------------:|:-----------------:|
| 0.1892        | 0.4158  | 200   | 0.1759          | 156832            |
| 0.1761        | 0.8316  | 400   | 0.1766          | 313248            |
| 0.1908        | 1.2474  | 600   | 0.1760          | 469520            |
| 0.1916        | 1.6632  | 800   | 0.1733          | 625360            |
| 0.2106        | 2.0790  | 1000  | 0.2431          | 782304            |
| 0.1766        | 2.4948  | 1200  | 0.1769          | 938560            |
| 0.1824        | 2.9106  | 1400  | 0.1802          | 1094144           |
| 0.1715        | 3.3264  | 1600  | 0.1725          | 1250544           |
| 0.1738        | 3.7422  | 1800  | 0.1740          | 1407440           |
| 0.1712        | 4.1580  | 2000  | 0.1722          | 1563512           |
| 0.1908        | 4.5738  | 2200  | 0.1778          | 1719064           |
| 0.1939        | 4.9896  | 2400  | 0.1738          | 1875384           |
| 0.1686        | 5.4054  | 2600  | 0.1725          | 2031440           |
| 0.2206        | 5.8212  | 2800  | 0.1757          | 2187952           |
| 0.1866        | 6.2370  | 3000  | 0.1879          | 2344864           |
| 0.195         | 6.6528  | 3200  | 0.1778          | 2500448           |
| 0.1815        | 7.0686  | 3400  | 0.1770          | 2656400           |
| 0.2141        | 7.4844  | 3600  | 0.1975          | 2812912           |
| 0.1898        | 7.9002  | 3800  | 0.1730          | 2968816           |
| 0.1875        | 8.3160  | 4000  | 0.1763          | 3124448           |
| 0.1849        | 8.7318  | 4200  | 0.1751          | 3280320           |
| 0.1759        | 9.1476  | 4400  | 0.1704          | 3437072           |
| 0.1862        | 9.5634  | 4600  | 0.1741          | 3593520           |
| 0.1744        | 9.9792  | 4800  | 0.1727          | 3750544           |
| 0.1889        | 10.3950 | 5000  | 0.1786          | 3905920           |
| 0.1979        | 10.8108 | 5200  | 0.1741          | 4063008           |
| 0.3261        | 11.2266 | 5400  | 0.2146          | 4219472           |
| 0.2037        | 11.6424 | 5600  | 0.1849          | 4376048           |
| 0.2023        | 12.0582 | 5800  | 0.1837          | 4531752           |
| 0.1861        | 12.4740 | 6000  | 0.1730          | 4687112           |
| 0.1808        | 12.8898 | 6200  | 0.1733          | 4843464           |
| 0.1634        | 13.3056 | 6400  | 0.1743          | 4999648           |
| 0.1949        | 13.7214 | 6600  | 0.1838          | 5157152           |
| 0.1732        | 14.1372 | 6800  | 0.1745          | 5312328           |
| 0.1911        | 14.5530 | 7000  | 0.1755          | 5468680           |
| 0.1892        | 14.9688 | 7200  | 0.1737          | 5624776           |
| 0.1809        | 15.3846 | 7400  | 0.1740          | 5782032           |
| 0.1398        | 15.8004 | 7600  | 0.1886          | 5938000           |
| 0.1898        | 16.2162 | 7800  | 0.1742          | 6094536           |
| 0.2028        | 16.6320 | 8000  | 0.1770          | 6250760           |
| 0.1852        | 17.0478 | 8200  | 0.1751          | 6406616           |
| 0.2004        | 17.4636 | 8400  | 0.1750          | 6563416           |
| 0.1864        | 17.8794 | 8600  | 0.1751          | 6719288           |
| 0.1946        | 18.2952 | 8800  | 0.1763          | 6875592           |
| 0.1888        | 18.7110 | 9000  | 0.1749          | 7032392           |
| 0.2042        | 19.1268 | 9200  | 0.1795          | 7188120           |
| 0.1852        | 19.5426 | 9400  | 0.1741          | 7344760           |
| 0.1864        | 19.9584 | 9600  | 0.1731          | 7501144           |
| 0.1864        | 20.3742 | 9800  | 0.1762          | 7657160           |
| 0.1789        | 20.7900 | 10000 | 0.1731          | 7813128           |
| 0.2023        | 21.2058 | 10200 | 0.1973          | 7969880           |
| 0.1886        | 21.6216 | 10400 | 0.1759          | 8126392           |
| 0.2094        | 22.0374 | 10600 | 0.1769          | 8282480           |
| 0.1749        | 22.4532 | 10800 | 0.1802          | 8438992           |
| 0.1569        | 22.8690 | 11000 | 0.1760          | 8595376           |
| 0.1799        | 23.2848 | 11200 | 0.1762          | 8751352           |
| 0.1903        | 23.7006 | 11400 | 0.1789          | 8907960           |
| 0.1785        | 24.1164 | 11600 | 0.1732          | 9064424           |
| 0.1802        | 24.5322 | 11800 | 0.1758          | 9220456           |
| 0.1867        | 24.9480 | 12000 | 0.1889          | 9376488           |
| 0.184         | 25.3638 | 12200 | 0.1762          | 9533208           |
| 0.1826        | 25.7796 | 12400 | 0.1718          | 9689464           |
| 0.174         | 26.1954 | 12600 | 0.1694          | 9845048           |
| 0.1737        | 26.6112 | 12800 | 0.1707          | 10001784          |
| 0.1743        | 27.0270 | 13000 | 0.1727          | 10157800          |
| 0.1624        | 27.4428 | 13200 | 0.1760          | 10313128          |
| 0.1916        | 27.8586 | 13400 | 0.1812          | 10469384          |
| 0.172         | 28.2744 | 13600 | 0.1711          | 10625944          |
| 0.186         | 28.6902 | 13800 | 0.1713          | 10782456          |
| 0.1862        | 29.1060 | 14000 | 0.1763          | 10938304          |
| 0.1521        | 29.5218 | 14200 | 0.1687          | 11094528          |
| 0.1747        | 29.9376 | 14400 | 0.1739          | 11250976          |
| 0.1877        | 30.3534 | 14600 | 0.1750          | 11406672          |
| 0.1558        | 30.7692 | 14800 | 0.1678          | 11562768          |
| 0.1929        | 31.1850 | 15000 | 0.1769          | 11719016          |
| 0.1385        | 31.6008 | 15200 | 0.1810          | 11875368          |
| 0.1526        | 32.0166 | 15400 | 0.1699          | 12031048          |
| 0.1432        | 32.4324 | 15600 | 0.1799          | 12187432          |
| 0.1738        | 32.8482 | 15800 | 0.1811          | 12343432          |
| 0.1474        | 33.2640 | 16000 | 0.1961          | 12500472          |
| 0.1419        | 33.6798 | 16200 | 0.1844          | 12656248          |
| 0.1238        | 34.0956 | 16400 | 0.1820          | 12811752          |
| 0.1519        | 34.5114 | 16600 | 0.2128          | 12968104          |
| 0.1403        | 34.9272 | 16800 | 0.1748          | 13124392          |
| 0.1346        | 35.3430 | 17000 | 0.2009          | 13281144          |
| 0.1533        | 35.7588 | 17200 | 0.1861          | 13437720          |
| 0.1159        | 36.1746 | 17400 | 0.1999          | 13594448          |
| 0.1177        | 36.5904 | 17600 | 0.2015          | 13750544          |
| 0.1252        | 37.0062 | 17800 | 0.2047          | 13906304          |
| 0.1007        | 37.4220 | 18000 | 0.1954          | 14062784          |
| 0.1034        | 37.8378 | 18200 | 0.2058          | 14219168          |
| 0.1451        | 38.2536 | 18400 | 0.2479          | 14375024          |
| 0.1229        | 38.6694 | 18600 | 0.2142          | 14530800          |
| 0.1156        | 39.0852 | 18800 | 0.2563          | 14687808          |
| 0.0901        | 39.5010 | 19000 | 0.2544          | 14843360          |
| 0.1124        | 39.9168 | 19200 | 0.2212          | 14999808          |
| 0.0692        | 40.3326 | 19400 | 0.2647          | 15155496          |
| 0.0897        | 40.7484 | 19600 | 0.2470          | 15311688          |
| 0.0574        | 41.1642 | 19800 | 0.2609          | 15468264          |
| 0.0993        | 41.5800 | 20000 | 0.2509          | 15624072          |
| 0.1137        | 41.9958 | 20200 | 0.2673          | 15780456          |
| 0.0493        | 42.4116 | 20400 | 0.2795          | 15936432          |
| 0.0628        | 42.8274 | 20600 | 0.2706          | 16092272          |
| 0.0504        | 43.2432 | 20800 | 0.2714          | 16249048          |
| 0.0923        | 43.6590 | 21000 | 0.2801          | 16405368          |
| 0.0469        | 44.0748 | 21200 | 0.2986          | 16561000          |
| 0.0622        | 44.4906 | 21400 | 0.3108          | 16718312          |
| 0.0369        | 44.9064 | 21600 | 0.3106          | 16874632          |
| 0.0219        | 45.3222 | 21800 | 0.3151          | 17031680          |
| 0.0441        | 45.7380 | 22000 | 0.2985          | 17188288          |
| 0.0393        | 46.1538 | 22200 | 0.3217          | 17345048          |
| 0.0483        | 46.5696 | 22400 | 0.3248          | 17501560          |
| 0.0441        | 46.9854 | 22600 | 0.3119          | 17657336          |
| 0.021         | 47.4012 | 22800 | 0.3619          | 17813576          |
| 0.0484        | 47.8170 | 23000 | 0.3292          | 17970024          |
| 0.0212        | 48.2328 | 23200 | 0.3445          | 18126280          |
| 0.0324        | 48.6486 | 23400 | 0.3528          | 18282568          |
| 0.0211        | 49.0644 | 23600 | 0.3431          | 18438872          |
| 0.0192        | 49.4802 | 23800 | 0.3763          | 18595416          |
| 0.0395        | 49.8960 | 24000 | 0.3581          | 18751672          |
| 0.0153        | 50.3119 | 24200 | 0.3559          | 18906848          |
| 0.0156        | 50.7277 | 24400 | 0.3942          | 19064192          |
| 0.0146        | 51.1435 | 24600 | 0.3581          | 19219856          |
| 0.0072        | 51.5593 | 24800 | 0.3812          | 19376464          |
| 0.0306        | 51.9751 | 25000 | 0.3594          | 19532272          |
| 0.0183        | 52.3909 | 25200 | 0.3847          | 19688288          |
| 0.042         | 52.8067 | 25400 | 0.3807          | 19844672          |
| 0.0111        | 53.2225 | 25600 | 0.4034          | 20001552          |
| 0.0134        | 53.6383 | 25800 | 0.4040          | 20157424          |
| 0.0161        | 54.0541 | 26000 | 0.4079          | 20313440          |
| 0.0283        | 54.4699 | 26200 | 0.4114          | 20469664          |
| 0.0275        | 54.8857 | 26400 | 0.4072          | 20625984          |
| 0.0094        | 55.3015 | 26600 | 0.3832          | 20781904          |
| 0.0043        | 55.7173 | 26800 | 0.4030          | 20938512          |
| 0.0086        | 56.1331 | 27000 | 0.4173          | 21095008          |
| 0.0156        | 56.5489 | 27200 | 0.4307          | 21251264          |
| 0.0067        | 56.9647 | 27400 | 0.4135          | 21407744          |
| 0.0018        | 57.3805 | 27600 | 0.4302          | 21564560          |
| 0.0109        | 57.7963 | 27800 | 0.4219          | 21720560          |
| 0.0023        | 58.2121 | 28000 | 0.4231          | 21877024          |
| 0.0032        | 58.6279 | 28200 | 0.4331          | 22033344          |
| 0.0016        | 59.0437 | 28400 | 0.4292          | 22189872          |
| 0.0016        | 59.4595 | 28600 | 0.4532          | 22345712          |
| 0.0011        | 59.8753 | 28800 | 0.4611          | 22502352          |
| 0.0024        | 60.2911 | 29000 | 0.4540          | 22658440          |
| 0.0018        | 60.7069 | 29200 | 0.4585          | 22814056          |
| 0.0008        | 61.1227 | 29400 | 0.4533          | 22970680          |
| 0.0011        | 61.5385 | 29600 | 0.4666          | 23126776          |
| 0.0015        | 61.9543 | 29800 | 0.4590          | 23283064          |
| 0.0011        | 62.3701 | 30000 | 0.4888          | 23440000          |
| 0.011         | 62.7859 | 30200 | 0.4815          | 23596224          |
| 0.0006        | 63.2017 | 30400 | 0.4677          | 23751880          |
| 0.0008        | 63.6175 | 30600 | 0.4747          | 23907624          |
| 0.0007        | 64.0333 | 30800 | 0.4808          | 24063864          |
| 0.0008        | 64.4491 | 31000 | 0.4713          | 24219608          |
| 0.0006        | 64.8649 | 31200 | 0.4786          | 24376856          |
| 0.0014        | 65.2807 | 31400 | 0.4934          | 24533352          |
| 0.0057        | 65.6965 | 31600 | 0.4814          | 24688616          |
| 0.0138        | 66.1123 | 31800 | 0.4938          | 24844832          |
| 0.0053        | 66.5281 | 32000 | 0.5094          | 25002240          |
| 0.0098        | 66.9439 | 32200 | 0.4888          | 25158144          |
| 0.0004        | 67.3597 | 32400 | 0.5089          | 25314384          |
| 0.0022        | 67.7755 | 32600 | 0.4987          | 25470704          |
| 0.0005        | 68.1913 | 32800 | 0.5064          | 25627200          |
| 0.0005        | 68.6071 | 33000 | 0.5112          | 25783456          |
| 0.0003        | 69.0229 | 33200 | 0.5100          | 25940304          |
| 0.0072        | 69.4387 | 33400 | 0.5179          | 26096432          |
| 0.0004        | 69.8545 | 33600 | 0.5046          | 26253360          |
| 0.0055        | 70.2703 | 33800 | 0.5280          | 26408736          |
| 0.0003        | 70.6861 | 34000 | 0.5112          | 26565056          |
| 0.0011        | 71.1019 | 34200 | 0.5128          | 26721176          |
| 0.0003        | 71.5177 | 34400 | 0.5245          | 26877368          |
| 0.0004        | 71.9335 | 34600 | 0.5249          | 27033912          |
| 0.0002        | 72.3493 | 34800 | 0.5251          | 27190376          |
| 0.0003        | 72.7651 | 35000 | 0.5276          | 27347112          |
| 0.0028        | 73.1809 | 35200 | 0.5359          | 27503480          |
| 0.0003        | 73.5967 | 35400 | 0.5336          | 27660280          |
| 0.0003        | 74.0125 | 35600 | 0.5346          | 27815536          |
| 0.0056        | 74.4283 | 35800 | 0.5351          | 27971600          |
| 0.0002        | 74.8441 | 36000 | 0.5344          | 28127664          |
| 0.0002        | 75.2599 | 36200 | 0.5388          | 28284736          |
| 0.0002        | 75.6757 | 36400 | 0.5420          | 28440672          |
| 0.0002        | 76.0915 | 36600 | 0.5390          | 28596968          |
| 0.0028        | 76.5073 | 36800 | 0.5407          | 28753672          |
| 0.0035        | 76.9231 | 37000 | 0.5401          | 28909800          |
| 0.0002        | 77.3389 | 37200 | 0.5440          | 29066104          |
| 0.0002        | 77.7547 | 37400 | 0.5418          | 29222328          |
| 0.0034        | 78.1705 | 37600 | 0.5432          | 29378344          |
| 0.0002        | 78.5863 | 37800 | 0.5416          | 29534888          |
| 0.0002        | 79.0021 | 38000 | 0.5451          | 29690392          |
| 0.0002        | 79.4179 | 38200 | 0.5444          | 29846936          |
| 0.0002        | 79.8337 | 38400 | 0.5451          | 30002424          |
| 0.0002        | 80.2495 | 38600 | 0.5490          | 30158536          |
| 0.0027        | 80.6653 | 38800 | 0.5468          | 30314984          |
| 0.0002        | 81.0811 | 39000 | 0.5461          | 30471288          |
| 0.0002        | 81.4969 | 39200 | 0.5447          | 30628024          |
| 0.0002        | 81.9127 | 39400 | 0.5456          | 30784376          |
| 0.0034        | 82.3285 | 39600 | 0.5451          | 30940904          |
| 0.0002        | 82.7443 | 39800 | 0.5456          | 31097352          |
| 0.0002        | 83.1601 | 40000 | 0.5434          | 31253176          |
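
The table shows validation loss bottoming out at 0.1678 (step 14800) and then climbing to 0.5434 while training loss falls to 0.0002. A small script can locate that minimum and measure the gap between training and validation loss. The sketch below embeds only five rows copied from the table to stay short; pasting the full table into `TABLE` would analyse the whole run. The helper names `parse_rows` and `best_checkpoint` are hypothetical and not part of any library.

```python
from __future__ import annotations

# Five rows copied verbatim from the table above (header + separator included).
TABLE = """\
| Training Loss | Epoch   | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-------:|:-----:|:---------------:|:-----------------:|
| 0.1892        | 0.4158  | 200   | 0.1759          | 156832            |
| 0.1558        | 30.7692 | 14800 | 0.1678          | 11562768          |
| 0.0993        | 41.5800 | 20000 | 0.2509          | 15624072          |
| 0.0011        | 62.3701 | 30000 | 0.4888          | 23440000          |
| 0.0002        | 83.1601 | 40000 | 0.5434          | 31253176          |
"""


def parse_rows(markdown_table: str) -> list[dict]:
    """Turn the Markdown table into a list of typed row dictionaries."""
    rows = []
    # Skip the header line and the alignment/separator line.
    for line in markdown_table.strip().splitlines()[2:]:
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, tokens = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
                "tokens": int(tokens),
            }
        )
    return rows


def best_checkpoint(rows: list[dict]) -> dict:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


if __name__ == "__main__":
    rows = parse_rows(TABLE)
    best = best_checkpoint(rows)
    print("lowest validation loss:", best["val_loss"], "at step", best["step"])
    final = rows[-1]
    gap = final["val_loss"] - final["train_loss"]
    print("final train/validation gap:", round(gap, 4))
```

A widening gap of this kind is a common sign of overfitting, so the step with the lowest validation loss is often a better checkpoint to keep than the final one.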


### Framework versions

- PEFT 0.15.1
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1