e07743f410eda39cf39493c9b3d4e31b

This model is a fine-tuned version of FacebookAI/roberta-large on the stsb (STS-B) configuration of the nyu-mll/glue dataset. It achieves the following results on the evaluation set:

Loss: 2.3530
Data Size: 1.0
Epoch Runtime: 32.1148
Mse: 2.3538
Mae: 1.2955
R2: -0.0529
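
For reference, the sketch below shows how Mse, Mae, and R2 are conventionally computed. It is a minimal illustration assuming the standard definitions, not the evaluation code used for this model; the function names `regression_metrics` and `implied_label_variance` and the toy data are hypothetical. Under those definitions R2 = 1 - MSE / Var(labels), so the reported Mse of 2.3538 and R2 of -0.0529 imply a label variance of about 2.2355 on the evaluation set.

```python
def regression_metrics(y_true, y_pred):
    """Return (mse, mae, r2) using the standard definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - ss_res / ss_tot
    return mse, mae, r2


def implied_label_variance(mse, r2):
    """Invert R2 = 1 - MSE / Var(y) to recover Var(y)."""
    return mse / (1.0 - r2)


# Toy data (hypothetical): always predicting the label mean gives R2 = 0.
labels = [0.0, 2.5, 5.0]
constant_prediction = [2.5, 2.5, 2.5]
toy_mse, toy_mae, toy_r2 = regression_metrics(labels, constant_prediction)

# Values reported in this card.
var_y = implied_label_variance(2.3538, -0.0529)
print(f"implied label variance: {var_y:.4f}")
```

Since a constant prediction at the label mean scores R2 = 0, the slightly negative R2 reported here means the model's squared error on the evaluation set is marginally higher than that baseline.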

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
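
The batch-size entries above are related by simple arithmetic: the total batch size is the per-device batch size multiplied by the number of devices. The sketch below checks that relationship. It also collects the listed values under key names chosen to match `transformers.TrainingArguments` keywords. That mapping is an assumption, not the original training script. Gradient accumulation is assumed to be 1 because the card does not list it, and `total_batch_size` is a hypothetical helper.

```python
# Hyperparameters as listed in this card. Key names are assumed to map onto
# transformers.TrainingArguments keywords; this is not the original script.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

num_devices = 4
gradient_accumulation_steps = 1  # assumption: not listed in the card


def total_batch_size(per_device, devices, accumulation=1):
    """Effective number of examples processed per step across all devices."""
    return per_device * devices * accumulation


total_train = total_batch_size(
    training_kwargs["per_device_train_batch_size"],
    num_devices,
    gradient_accumulation_steps,
)
total_eval = total_batch_size(
    training_kwargs["per_device_eval_batch_size"], num_devices
)
print(total_train, total_eval)
```

Both totals come out to 32, matching total_train_batch_size and total_eval_batch_size above.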

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Mse	Mae	R2
No log	0	0	5.6761	0	2.5462	5.6773	1.9704	-1.5396
No log	1	179	2.7595	0.0078	3.0620	2.7604	1.3911	-0.2348
No log	2	358	2.8781	0.0156	3.5878	2.8789	1.3871	-0.2878
No log	3	537	3.2949	0.0312	4.9291	3.2960	1.5181	-0.4744
No log	4	716	2.2694	0.0625	5.8134	2.2703	1.2947	-0.0156
No log	5	895	2.6448	0.125	7.8811	2.6455	1.3378	-0.1834
0.1512	6	1074	2.6214	0.25	11.4330	2.6222	1.3342	-0.1730
2.2155	7	1253	2.2897	0.5	17.8023	2.2905	1.2867	-0.0246
2.1806	8	1432	2.3530	1.0	32.1148	2.3538	1.2955	-0.0529
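
The table can be scanned programmatically to compare checkpoints. The sketch below transcribes the Epoch, Validation Loss, and R2 columns from the table above and selects the epoch with the lowest validation loss; the variable names are illustrative only.

```python
# (epoch, validation_loss, r2) transcribed from the training results table
rows = [
    (0, 5.6761, -1.5396),
    (1, 2.7595, -0.2348),
    (2, 2.8781, -0.2878),
    (3, 3.2949, -0.4744),
    (4, 2.2694, -0.0156),
    (5, 2.6448, -0.1834),
    (6, 2.6214, -0.1730),
    (7, 2.2897, -0.0246),
    (8, 2.3530, -0.0529),
]

# Epoch with the lowest validation loss, and the last logged epoch
best = min(rows, key=lambda row: row[1])
final = rows[-1]
loss_gap = final[1] - best[1]

# True if R2 is negative at every logged epoch
r2_always_negative = all(row[2] < 0 for row in rows)

print(f"best epoch: {best[0]} (validation loss {best[1]:.4f})")
print(f"final epoch: {final[0]} (validation loss {final[1]:.4f})")
print(f"gap: {loss_gap:.4f}")
```

On these rows the lowest validation loss is at epoch 4 (2.2694), while the results reported at the top of this card correspond to epoch 8 (2.3530). R2 is negative at every logged epoch.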

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.3.0
Tokenizers 0.22.1

Model size: 0.4B params (Safetensors)
Tensor type: F32
