train_copa_789_1757596139

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0711
  • Num Input Tokens Seen: 281984

Model description

More information needed

Intended uses & limitations

More information needed
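
This repository ships a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, so one plausible use is COPA-style cause/effect selection. A minimal, hedged inference sketch follows; the repo ids come from this card, while the prompt format is illustrative and may not match the template used during fine-tuning.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_789_1757596139"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Illustrative COPA-style prompt; the actual training template is undocumented.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```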

Training and evaluation data

More information needed
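
The card does not document the data source or preprocessing. If "copa" refers to the standard COPA task from SuperGLUE, it can be inspected roughly as sketched below; the Hub id is an assumption, and a script-free (Parquet) mirror may be needed on recent datasets releases.

```python
from datasets import load_dataset

# Assumed Hub id for SuperGLUE COPA; the actual source and preprocessing
# used for this training run are not documented in the card.
copa = load_dataset("super_glue", "copa")
print(copa)              # expected: train/validation/test splits
print(copa["train"][0])  # fields: premise, choice1, choice2, question, idx, label
```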

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch mapping them onto transformers.TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
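
A minimal sketch of how these settings map onto transformers.TrainingArguments. Only the values listed above are taken from this card; output_dir is a hypothetical name, and everything else (precision, logging, evaluation cadence, the PEFT config) is left at defaults because it is not documented here.

```python
from transformers import TrainingArguments

# Values mirror the hyperparameter list in this card;
# output_dir is illustrative, not taken from the training run.
args = TrainingArguments(
    output_dir="train_copa_789_1757596139",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```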

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|------|-----------------|-------------------|
| 0.2013        | 0.5   | 45   | 0.1095          | 14240             |
| 0.2033        | 1.0   | 90   | 0.0885          | 28192             |
| 0.0778        | 1.5   | 135  | 0.0860          | 42080             |
| 0.1119        | 2.0   | 180  | 0.0777          | 56192             |
| 0.0346        | 2.5   | 225  | 0.0823          | 70048             |
| 0.1199        | 3.0   | 270  | 0.0711          | 84192             |
| 0.0165        | 3.5   | 315  | 0.1047          | 98304             |
| 0.0248        | 4.0   | 360  | 0.1218          | 112544            |
| 0.0030        | 4.5   | 405  | 0.1436          | 126784            |
| 0.0269        | 5.0   | 450  | 0.1350          | 140960            |
| 0.0008        | 5.5   | 495  | 0.1389          | 155200            |
| 0.0270        | 6.0   | 540  | 0.1530          | 169216            |
| 0.0006        | 6.5   | 585  | 0.1628          | 183232            |
| 0.0002        | 7.0   | 630  | 0.1684          | 197248            |
| 0.0006        | 7.5   | 675  | 0.1641          | 211424            |
| 0.1687        | 8.0   | 720  | 0.1717          | 225440            |
| 0.0001        | 8.5   | 765  | 0.1706          | 239392            |
| 0.0014        | 9.0   | 810  | 0.1723          | 253632            |
| 0.0004        | 9.5   | 855  | 0.1679          | 267680            |
| 0.0003        | 10.0  | 900  | 0.1652          | 281984            |

The reported evaluation loss of 0.0711 matches the epoch-3.0 checkpoint (step 270), which suggests the best checkpoint by validation loss was retained rather than the final one (0.1652).

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1