cds_o_fr_13
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.9683
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 256
- eval_batch_size: 256
- seed: 13
- optimizer: adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 20
- mixed_precision_training: Native AMP
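The interaction between the learning rate, warmup steps, and linear scheduler listed above can be sketched as follows. This is a minimal illustration, not the training code itself: it assumes the usual linear warmup followed by linear decay to zero, and it takes the total step count from the results table in this card (88 steps per epoch over 20 epochs). The constant and function names are hypothetical.

```python
# Sketch of a linear schedule with warmup, using the hyperparameters in this card.
# Assumption: linear ramp from 0 to the peak rate, then linear decay to 0 at the
# final step. All names here are illustrative, not taken from the training run.

BASE_LR = 1e-4            # learning_rate
WARMUP_STEPS = 500        # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 88      # step count logged at epoch 1.0
NUM_EPOCHS = 20           # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale linearly from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Decay phase: scale linearly from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = TOTAL_STEPS - step
    return BASE_LR * max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


if __name__ == "__main__":
    # Print the rate at a few epoch boundaries and at the end of warmup.
    for step in (0, 88, 440, 500, 880, 1760):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions, warmup ends between epochs 5 and 6 (step 500 falls after step 440 and before step 528), so the peak learning rate is only reached about a quarter of the way through training.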
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 6.7811 | 1.0 | 88 | 5.2394 |
| 4.5139 | 2.0 | 176 | 3.9491 |
| 3.6757 | 3.0 | 264 | 3.5220 |
| 3.3963 | 4.0 | 352 | 3.3253 |
| 3.2310 | 5.0 | 440 | 3.2012 |
| 3.1208 | 6.0 | 528 | 3.1181 |
| 3.0376 | 7.0 | 616 | 3.0612 |
| 2.9745 | 8.0 | 704 | 3.0232 |
| 2.9249 | 9.0 | 792 | 2.9973 |
| 2.8817 | 10.0 | 880 | 2.9814 |
| 2.8433 | 11.0 | 968 | 2.9646 |
| 2.8071 | 12.0 | 1056 | 2.9579 |
| 2.7723 | 13.0 | 1144 | 2.9513 |
| 2.7393 | 14.0 | 1232 | 2.9497 |
| 2.7064 | 15.0 | 1320 | 2.9499 |
| 2.6751 | 16.0 | 1408 | 2.9523 |
| 2.6443 | 17.0 | 1496 | 2.9577 |
| 2.6160 | 18.0 | 1584 | 2.9620 |
| 2.5925 | 19.0 | 1672 | 2.9667 |
| 2.5748 | 20.0 | 1760 | 2.9683 |
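The table above can be read programmatically to find where validation loss bottoms out. The sketch below copies the Validation Loss column verbatim and locates its minimum. The perplexity conversion assumes the loss is a mean per-token cross-entropy in nats, which this card does not confirm; variable names are illustrative.

```python
import math

# Validation loss per epoch, copied from the Training results table (epochs 1-20).
val_loss = [
    5.2394, 3.9491, 3.5220, 3.3253, 3.2012,
    3.1181, 3.0612, 3.0232, 2.9973, 2.9814,
    2.9646, 2.9579, 2.9513, 2.9497, 2.9499,
    2.9523, 2.9577, 2.9620, 2.9667, 2.9683,
]

# Index of the lowest validation loss; epochs are 1-based.
best_idx = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_idx + 1
best_loss = val_loss[best_idx]
final_loss = val_loss[-1]

print(f"best epoch: {best_epoch}, validation loss: {best_loss:.4f}")
print(f"final epoch: {len(val_loss)}, validation loss: {final_loss:.4f}")
print(f"gap (final - best): {final_loss - best_loss:.4f}")

# Assumption: loss is mean token-level cross-entropy in nats, so exp(loss)
# would be a perplexity. Treat this as illustrative if the task differs.
print(f"perplexity at best epoch (if applicable): {math.exp(best_loss):.2f}")
```

Validation loss reaches its minimum at epoch 14 and rises slightly afterwards while training loss keeps falling, so the reported final-epoch loss of 2.9683 is not the lowest value observed during training.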
Framework versions
- Transformers 4.56.1
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0