odrl2text-llama32-3b-lora-v1

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the odrl_to_text_train dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0973
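
Since this repository contains a PEFT (LoRA) adapter rather than full model weights, it has to be loaded on top of the base model. The following is a minimal inference sketch, assuming the adapter repo id Yusra2677/odrl2text-llama32-3b-lora-v1 and the base model named above; the prompt wording is illustrative, as the exact instruction format used during fine-tuning is not documented here.

```python
# Minimal loading/inference sketch (assumptions: repo id, bf16 weights, chat-style prompting).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "Yusra2677/odrl2text-llama32-3b-lora-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter
model.eval()

# Hypothetical prompt: replace the placeholder with a serialized ODRL policy.
messages = [{"role": "user", "content": "Describe this ODRL policy in plain English: { ... }"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```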

Model description

This model is a LoRA adapter for meta-llama/Llama-3.2-3B-Instruct, fine-tuned on the odrl_to_text_train dataset to turn ODRL (Open Digital Rights Language) policies into natural-language descriptions.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 0.00015
  • train_batch_size: 6
  • eval_batch_size: 3
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 18
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 4.0
  • mixed_precision_training: Native AMP
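
The sketch below maps the hyperparameters above onto transformers.TrainingArguments. It is only a reconstruction: the output directory, the LoRA configuration, the dataset loading, and the choice of fp16 for "Native AMP" are assumptions, since they are not documented in this card.

```python
# Sketch of TrainingArguments matching the listed hyperparameters (dataset, LoRA config,
# and Trainer wiring omitted because they are not documented here).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="odrl2text-llama32-3b-lora-v1",  # assumed output directory
    learning_rate=1.5e-4,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=3,
    gradient_accumulation_steps=3,   # effective train batch size of 18
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=4.0,
    fp16=True,                       # "Native AMP" mixed precision, assumed fp16
)
```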

Training results

Training Loss   Epoch    Step   Validation Loss
0.3448          0.2423     50   0.3231
0.2736          0.4847    100   0.2603
0.2656          0.7270    150   0.2355
0.2486          0.9693    200   0.2155
0.2204          1.2084    250   0.2006
0.2165          1.4507    300   0.1872
0.1989          1.6931    350   0.1765
0.1955          1.9354    400   0.1635
0.1513          2.1745    450   0.1477
0.1623          2.4168    500   0.1369
0.1460          2.6591    550   0.1256
0.1543          2.9015    600   0.1169
0.1047          3.1405    650   0.1062
0.1081          3.3829    700   0.1011
0.1083          3.6252    750   0.0982
0.1122          3.8675    800   0.0973

Framework versions

  • PEFT 0.15.2
  • Transformers 4.50.1
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
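
To confirm a local environment matches the versions listed above, a small check such as the one below can be used; it relies on importlib.metadata from the Python standard library, which is an assumption about how the environment is managed.

```python
# Compare installed package versions against the versions listed in this card.
from importlib.metadata import version

expected = {
    "peft": "0.15.2",
    "transformers": "4.50.1",
    "torch": "2.7.1+cu126",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for pkg, want in expected.items():
    have = version(pkg)
    status = "OK" if have.startswith(want.split("+")[0]) else "MISMATCH"
    print(f"{pkg}: expected {want}, found {have} [{status}]")
```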