abarbosa committed on
Commit 50869da · verified · 1 Parent(s): db17626

Pushing fine-tuned model to Hugging Face Hub

README.md ADDED
@@ -0,0 +1,49 @@
+
+---
+language:
+- pt
+- en
+tags:
+- aes
+datasets:
+- kamel-usp/aes_enem_dataset
+base_model: microsoft/Phi-3.5-mini-instruct
+metrics:
+- accuracy
+- qwk
+library_name: peft
+model-index:
+- name: phi35-balanced-C5
+  results:
+  - task:
+      type: text-classification
+      name: Automated Essay Score
+    dataset:
+      name: Automated Essay Score ENEM Dataset
+      type: kamel-usp/aes_enem_dataset
+      config: JBCS2025
+      split: test
+    metrics:
+    - name: Macro F1 (ignoring nan)
+      type: f1
+      value: 0.2833097701518754
+    - name: QWK
+      type: qwk
+      value: 0.5186813186813186
+    - name: Weighted Macro F1
+      type: f1
+      value: 0.3238493856342826
+---
+# Model ID: phi35-balanced-C5
+## Results
+|                              |   test_data |
+|:-----------------------------|------------:|
+| eval_accuracy                |   0.355072  |
+| eval_RMSE                    |  58.1851    |
+| eval_QWK                     |   0.518681  |
+| eval_Macro_F1                |   0.28331   |
+| eval_Macro_F1_(ignoring_nan) |   0.28331   |
+| eval_Weighted_F1             |   0.323849  |
+| eval_Micro_F1                |   0.355072  |
+| eval_HDIV                    |   0.0869565 |
+
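For reference, a minimal sketch of how an adapter published this way is typically loaded for inference with PEFT. This is not part of the committed files: the `adapter_id` is a placeholder for the actual Hub repo, and `num_labels=6` mirrors the experiment config in run_experiment.log below.

```python
import torch
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForSequenceClassification

base_model_id = "microsoft/Phi-3.5-mini-instruct"
adapter_id = "<org>/phi35-balanced-C5"  # placeholder: substitute the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# The adapter was trained for 6-way classification (grade points 0, 40, ..., 200).
base = AutoModelForSequenceClassification.from_pretrained(
    base_model_id, num_labels=6, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

inputs = tokenizer("Texto do ensaio aqui...", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```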
adapter_config.json ADDED
@@ -0,0 +1,39 @@
+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "microsoft/Phi-3.5-mini-instruct",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": [
+    "classifier",
+    "score"
+  ],
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "down_proj",
+    "gate_up_proj",
+    "o_proj",
+    "qkv_proj"
+  ],
+  "task_type": "SEQ_CLS",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_rslora": false
+}
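For readability, the same settings expressed through PEFT's `LoraConfig`. This is a sketch reconstructing the JSON above, not the original training script (which passed `lora_target_modules: all-linear` and let PEFT resolve it to the four projection modules listed):

```python
from peft import LoraConfig, TaskType

# Mirrors adapter_config.json: rank-8 LoRA with alpha 16 on the linear projections,
# plus a fully trained classification head.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules=["down_proj", "gate_up_proj", "o_proj", "qkv_proj"],
    modules_to_save=["classifier", "score"],
)
```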
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0c7a4ee4d09dff521d28bb68b4b0cc92ab17b1a6a1d57a2fb099604cbba958c6
+size 50402728
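The weights file is stored through Git LFS, so only this pointer is committed; the `oid` is the SHA-256 of the actual file. A sketch of an integrity check after downloading (the local path is an assumption):

```python
import hashlib
from pathlib import Path

def lfs_oid(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the file contents, as recorded in the LFS pointer's oid field."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

weights = Path("adapter_model.safetensors")  # assumed local download path
assert weights.stat().st_size == 50402728
assert lfs_oid(weights) == "0c7a4ee4d09dff521d28bb68b4b0cc92ab17b1a6a1d57a2fb099604cbba958c6"
```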
run_experiment.log ADDED
@@ -0,0 +1,2480 @@
+[2025-03-25 12:24:03,262][__main__][INFO] - cache_dir: /media/data/tmp
+dataset:
+  name: kamel-usp/aes_enem_dataset
+  split: JBCS2025
+training_params:
+  seed: 42
+  num_train_epochs: 20
+  logging_steps: 100
+  metric_for_best_model: QWK
+  bf16: true
+post_training_results:
+  model_path: /workspace/jbcs2025/outputs/2025-03-25/11-12-15
+experiments:
+  model:
+    name: microsoft/Phi-3.5-mini-instruct
+    type: phi35_classification_lora
+    num_labels: 6
+    output_dir: ./results/phi35-balanced/C5
+    logging_dir: ./logs/phi35-balanced/C5
+    best_model_dir: ./results/phi35-balanced/C5/best_model
+    lora_r: 8
+    lora_dropout: 0.05
+    lora_alpha: 16
+    lora_target_modules: all-linear
+  dataset:
+    grade_index: 4
+  training_id: phi35-balanced-C5
+  training_params:
+    weight_decay: 0.01
+    warmup_ratio: 0.1
+    learning_rate: 5.0e-05
+    train_batch_size: 2
+    eval_batch_size: 16
+    gradient_accumulation_steps: 8
+    gradient_checkpointing: false
+
+[2025-03-25 12:24:03,264][__main__][INFO] - Starting the Fine Tuning training process.
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file tokenizer.model from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer.model
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file tokenizer.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer.json
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file added_tokens.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/added_tokens.json
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file special_tokens_map.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/special_tokens_map.json
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file tokenizer_config.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer_config.json
+[2025-03-25 12:24:08,620][transformers.tokenization_utils_base][INFO] - loading file chat_template.jinja from cache at None
+[2025-03-25 12:24:08,693][transformers.tokenization_utils_base][INFO] - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
+[2025-03-25 12:24:08,699][__main__][INFO] - Tokenizer function parameters- Padding:longest; Truncation: False
+[2025-03-25 12:24:09,480][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+[2025-03-25 12:24:09,481][transformers.configuration_utils][INFO] - Model config Phi3Config {
+  "architectures": [
+    "Phi3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "auto_map": {
+    "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
+    "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
+  },
+  "bos_token_id": 1,
+  "embd_pdrop": 0.0,
+  "eos_token_id": 32000,
+  "hidden_act": "silu",
+  "hidden_size": 3072,
+  "id2label": {
+    "0": 0,
+    "1": 40,
+    "2": 80,
+    "3": 120,
+    "4": 160,
+    "5": 200
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 8192,
+  "label2id": {
+    "0": 0,
+    "40": 1,
+    "80": 2,
+    "120": 3,
+    "160": 4,
+    "200": 5
+  },
+  "max_position_embeddings": 131072,
+  "model_type": "phi3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 32,
+  "num_key_value_heads": 32,
+  "original_max_position_embeddings": 4096,
+  "pad_token_id": 32000,
+  "partial_rotary_factor": 1.0,
+  "resid_pdrop": 0.0,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": {
+    "long_factor": [
+      1.0800000429153442,
+      1.1100000143051147,
+      1.1399999856948853,
+      1.340000033378601,
+      1.5899999141693115,
+      1.600000023841858,
+      1.6200000047683716,
+      2.620000123977661,
+      3.2300000190734863,
+      3.2300000190734863,
+      4.789999961853027,
+      7.400000095367432,
+      7.700000286102295,
+      9.09000015258789,
+      12.199999809265137,
+      17.670000076293945,
+      24.46000099182129,
+      28.57000160217285,
+      30.420001983642578,
+      30.840002059936523,
+      32.590003967285156,
+      32.93000411987305,
+      42.320003509521484,
+      44.96000289916992,
+      50.340003967285156,
+      50.45000457763672,
+      57.55000305175781,
+      57.93000411987305,
+      58.21000289916992,
+      60.1400032043457,
+      62.61000442504883,
+      62.62000274658203,
+      62.71000289916992,
+      63.1400032043457,
+      63.1400032043457,
+      63.77000427246094,
+      63.93000411987305,
+      63.96000289916992,
+      63.970001220703125,
+      64.02999877929688,
+      64.06999969482422,
+      64.08000183105469,
+      64.12000274658203,
+      64.41000366210938,
+      64.4800033569336,
+      64.51000213623047,
+      64.52999877929688,
+      64.83999633789062
+    ],
+    "short_factor": [
+      1.0,
+      1.0199999809265137,
+      1.0299999713897705,
+      1.0299999713897705,
+      1.0499999523162842,
+      1.0499999523162842,
+      1.0499999523162842,
+      1.0499999523162842,
+      1.0499999523162842,
+      1.0699999332427979,
+      1.0999999046325684,
+      1.1099998950958252,
+      1.1599998474121094,
+      1.1599998474121094,
+      1.1699998378753662,
+      1.2899998426437378,
+      1.339999794960022,
+      1.679999828338623,
+      1.7899998426437378,
+      1.8199998140335083,
+      1.8499997854232788,
+      1.8799997568130493,
+      1.9099997282028198,
+      1.9399996995925903,
+      1.9899996519088745,
+      2.0199997425079346,
+      2.0199997425079346,
+      2.0199997425079346,
+      2.0199997425079346,
+      2.0199997425079346,
+      2.0199997425079346,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0299997329711914,
+      2.0799996852874756,
+      2.0899996757507324,
+      2.189999580383301,
+      2.2199995517730713,
+      2.5899994373321533,
+      2.729999542236328,
+      2.749999523162842,
+      2.8399994373321533
+    ],
+    "type": "longrope"
+  },
+  "rope_theta": 10000.0,
+  "sliding_window": 262144,
+  "tie_word_embeddings": false,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.50.0",
+  "use_cache": true,
+  "vocab_size": 32064
+}
+
+[2025-03-25 12:24:09,481][transformers.modeling_utils][INFO] - loading weights file model.safetensors from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/model.safetensors.index.json
+[2025-03-25 12:24:09,482][transformers.modeling_utils][INFO] - Will use torch_dtype=torch.bfloat16 as defined in model's config object
+[2025-03-25 12:24:09,482][transformers.modeling_utils][INFO] - Instantiating Phi3ForSequenceClassification model under default dtype torch.bfloat16.
+[2025-03-25 12:24:31,520][transformers.modeling_utils][INFO] - Some weights of the model checkpoint at microsoft/Phi-3.5-mini-instruct were not used when initializing Phi3ForSequenceClassification: ['lm_head.weight']
+- This IS expected if you are initializing Phi3ForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+- This IS NOT expected if you are initializing Phi3ForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+[2025-03-25 12:24:31,520][transformers.modeling_utils][WARNING] - Some weights of Phi3ForSequenceClassification were not initialized from the model checkpoint at microsoft/Phi-3.5-mini-instruct and are newly initialized: ['score.weight']
+You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+[2025-03-25 12:24:55,874][__main__][INFO] - None
+[2025-03-25 12:24:55,876][transformers.training_args][INFO] - PyTorch: setting up devices
+[2025-03-25 12:24:55,899][__main__][INFO] - Total steps: 620. Number of warmup steps: 62
+[2025-03-25 12:24:55,906][transformers.trainer][INFO] - You have loaded a model on multiple GPUs. `is_model_parallel` attribute will be force-set to `True` to avoid any unexpected behavior such as device placement mismatching.
+[2025-03-25 12:24:55,929][transformers.trainer][INFO] - Using auto half precision backend
+[2025-03-25 12:24:55,930][transformers.trainer][WARNING] - No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
+[2025-03-25 12:24:55,938][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:24:55,949][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:24:55,949][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:24:55,949][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:25:10,916][transformers][INFO] - {'accuracy': 0.2196969696969697, 'RMSE': 79.77240352174657, 'QWK': -0.17865160895299015, 'HDIV': 0.2727272727272727, 'Macro_F1': 0.0818840579710145, 'Micro_F1': 0.2196969696969697, 'Weighted_F1': 0.12490118577075099, 'Macro_F1_(ignoring_nan)': np.float64(0.163768115942029)}
+[2025-03-25 12:25:10,920][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:25:11,161][transformers.trainer][INFO] - The following columns in the training set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - ***** Running training *****
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Num examples = 500
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Num Epochs = 20
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Instantaneous batch size per device = 2
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Total train batch size (w. parallel, distributed & accumulation) = 16
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Gradient Accumulation steps = 8
+[2025-03-25 12:25:11,188][transformers.trainer][INFO] - Total optimization steps = 620
+[2025-03-25 12:25:11,190][transformers.trainer][INFO] - Number of trainable parameters = 12,601,344
+[2025-03-25 12:30:02,780][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:30:02,782][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:30:02,782][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:30:02,782][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:30:17,339][transformers][INFO] - {'accuracy': 0.17424242424242425, 'RMSE': 80.075721739621, 'QWK': 0.03621708165406057, 'HDIV': 0.25757575757575757, 'Macro_F1': 0.10146198830409357, 'Micro_F1': 0.17424242424242425, 'Weighted_F1': 0.09156477051213893, 'Macro_F1_(ignoring_nan)': np.float64(0.20292397660818715)}
+[2025-03-25 12:30:17,339][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:30:17,341][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-32
+[2025-03-25 12:35:15,786][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:35:15,789][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:35:15,789][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:35:15,789][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:35:30,015][transformers][INFO] - {'accuracy': 0.24242424242424243, 'RMSE': 70.06490497453707, 'QWK': 0.24075441686076227, 'HDIV': 0.16666666666666663, 'Macro_F1': 0.16837419518229546, 'Micro_F1': 0.24242424242424243, 'Weighted_F1': 0.2152421028023728, 'Macro_F1_(ignoring_nan)': np.float64(0.20204903421875456)}
+[2025-03-25 12:35:30,015][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:35:30,019][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-64
+[2025-03-25 12:35:37,037][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-32] due to args.save_total_limit
+[2025-03-25 12:40:28,041][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:40:28,044][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:40:28,044][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:40:28,044][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:40:42,596][transformers][INFO] - {'accuracy': 0.29545454545454547, 'RMSE': 54.49492609130661, 'QWK': 0.4311145510835913, 'HDIV': 0.05303030303030298, 'Macro_F1': 0.23847544610384777, 'Micro_F1': 0.29545454545454547, 'Weighted_F1': 0.24895876275317505, 'Macro_F1_(ignoring_nan)': np.float64(0.28617053532461734)}
+[2025-03-25 12:40:42,597][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:40:42,599][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-96
+[2025-03-25 12:40:48,828][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-64] due to args.save_total_limit
+[2025-03-25 12:45:40,249][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:45:40,252][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:45:40,252][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:45:40,252][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:45:54,619][transformers][INFO] - {'accuracy': 0.26515151515151514, 'RMSE': 68.66696087021411, 'QWK': 0.4372205173169662, 'HDIV': 0.15909090909090906, 'Macro_F1': 0.17046749116066365, 'Micro_F1': 0.26515151515151514, 'Weighted_F1': 0.17500122287761904, 'Macro_F1_(ignoring_nan)': np.float64(0.2557012367409955)}
+[2025-03-25 12:45:54,619][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:45:54,622][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-128
+[2025-03-25 12:46:01,623][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-96] due to args.save_total_limit
+[2025-03-25 12:50:52,724][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:50:52,726][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:50:52,726][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:50:52,726][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:51:07,212][transformers][INFO] - {'accuracy': 0.2878787878787879, 'RMSE': 65.96601512532914, 'QWK': 0.4616712864088698, 'HDIV': 0.14393939393939392, 'Macro_F1': 0.2180973594888689, 'Micro_F1': 0.2878787878787879, 'Weighted_F1': 0.24242798606006155, 'Macro_F1_(ignoring_nan)': np.float64(0.2180973594888689)}
+[2025-03-25 12:51:07,212][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:51:07,215][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-160
+[2025-03-25 12:51:15,423][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-128] due to args.save_total_limit
+[2025-03-25 12:56:06,556][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 12:56:06,558][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 12:56:06,558][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 12:56:06,558][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 12:56:20,914][transformers][INFO] - {'accuracy': 0.2878787878787879, 'RMSE': 65.04077974857798, 'QWK': 0.44284262977117705, 'HDIV': 0.14393939393939392, 'Macro_F1': 0.23398413876518773, 'Micro_F1': 0.2878787878787879, 'Weighted_F1': 0.23765550107707453, 'Macro_F1_(ignoring_nan)': np.float64(0.23398413876518773)}
+[2025-03-25 12:56:20,915][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 12:56:20,917][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-192
+[2025-03-25 13:01:19,187][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-25 13:01:19,188][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-25 13:01:19,189][transformers.trainer][INFO] - Num examples = 132
+[2025-03-25 13:01:19,189][transformers.trainer][INFO] - Batch size = 16
+[2025-03-25 13:01:33,419][transformers][INFO] - {'accuracy': 0.30303030303030304, 'RMSE': 62.957417066111944, 'QWK': 0.4823218997361477, 'HDIV': 0.10606060606060608, 'Macro_F1': 0.24116650168839468, 'Micro_F1': 0.30303030303030304, 'Weighted_F1': 0.24695724541810343, 'Macro_F1_(ignoring_nan)': np.float64(0.24116650168839468)}
+[2025-03-25 13:01:33,419][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-25 13:01:33,422][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-224
1272
+ [2025-03-25 13:01:40,528][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-160] due to args.save_total_limit
1273
+ [2025-03-25 13:01:40,534][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-192] due to args.save_total_limit
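*Editor's note:* these deletions are the Trainer's checkpoint rotation: with `save_total_limit` set, each save prunes the oldest checkpoints, while `load_best_model_at_end=True` exempts the current best one. The exact values used in this run are not shown in the log; the configuration below is an assumption reconstructed from the observed rotation pattern:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="results/phi35-balanced/C5",
    save_total_limit=1,           # assumption: prune aggressively after each save...
    load_best_model_at_end=True,  # ...but never delete the current best checkpoint
    metric_for_best_model="QWK",
    greater_is_better=True,
)
```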
1274
+ [2025-03-25 13:06:31,623][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1275
+ [2025-03-25 13:06:31,625][transformers.trainer][INFO] -
1276
+ ***** Running Evaluation *****
1277
+ [2025-03-25 13:06:31,626][transformers.trainer][INFO] - Num examples = 132
1278
+ [2025-03-25 13:06:31,626][transformers.trainer][INFO] - Batch size = 16
1279
+ [2025-03-25 13:06:45,998][transformers][INFO] - {'accuracy': 0.3333333333333333, 'RMSE': 57.52469825293227, 'QWK': 0.4667021843367075, 'HDIV': 0.06818181818181823, 'Macro_F1': 0.2829570150982398, 'Micro_F1': 0.3333333333333333, 'Weighted_F1': 0.321143903582232, 'Macro_F1_(ignoring_nan)': np.float64(0.2829570150982398)}
1280
+ [2025-03-25 13:06:45,998][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1281
+ [2025-03-25 13:06:46,001][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-256
1282
+ [2025-03-25 13:06:46,500][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1283
+ [2025-03-25 13:06:46,501][transformers.configuration_utils][INFO] - Model config Phi3Config {
1284
+ "architectures": [
1285
+ "Phi3ForCausalLM"
1286
+ ],
1287
+ "attention_bias": false,
1288
+ "attention_dropout": 0.0,
1289
+ "auto_map": {
1290
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
1291
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
1292
+ },
1293
+ "bos_token_id": 1,
1294
+ "embd_pdrop": 0.0,
1295
+ "eos_token_id": 32000,
1296
+ "hidden_act": "silu",
1297
+ "hidden_size": 3072,
1298
+ "initializer_range": 0.02,
1299
+ "intermediate_size": 8192,
1300
+ "max_position_embeddings": 131072,
1301
+ "model_type": "phi3",
1302
+ "num_attention_heads": 32,
1303
+ "num_hidden_layers": 32,
1304
+ "num_key_value_heads": 32,
1305
+ "original_max_position_embeddings": 4096,
1306
+ "pad_token_id": 32000,
1307
+ "partial_rotary_factor": 1.0,
1308
+ "resid_pdrop": 0.0,
1309
+ "rms_norm_eps": 1e-05,
1310
+ "rope_scaling": {
1311
+ "long_factor": [
1312
+ 1.0800000429153442,
1313
+ 1.1100000143051147,
1314
+ 1.1399999856948853,
1315
+ 1.340000033378601,
1316
+ 1.5899999141693115,
1317
+ 1.600000023841858,
1318
+ 1.6200000047683716,
1319
+ 2.620000123977661,
1320
+ 3.2300000190734863,
1321
+ 3.2300000190734863,
1322
+ 4.789999961853027,
1323
+ 7.400000095367432,
1324
+ 7.700000286102295,
1325
+ 9.09000015258789,
1326
+ 12.199999809265137,
1327
+ 17.670000076293945,
1328
+ 24.46000099182129,
1329
+ 28.57000160217285,
1330
+ 30.420001983642578,
1331
+ 30.840002059936523,
1332
+ 32.590003967285156,
1333
+ 32.93000411987305,
1334
+ 42.320003509521484,
1335
+ 44.96000289916992,
1336
+ 50.340003967285156,
1337
+ 50.45000457763672,
1338
+ 57.55000305175781,
1339
+ 57.93000411987305,
1340
+ 58.21000289916992,
1341
+ 60.1400032043457,
1342
+ 62.61000442504883,
1343
+ 62.62000274658203,
1344
+ 62.71000289916992,
1345
+ 63.1400032043457,
1346
+ 63.1400032043457,
1347
+ 63.77000427246094,
1348
+ 63.93000411987305,
1349
+ 63.96000289916992,
1350
+ 63.970001220703125,
1351
+ 64.02999877929688,
1352
+ 64.06999969482422,
1353
+ 64.08000183105469,
1354
+ 64.12000274658203,
1355
+ 64.41000366210938,
1356
+ 64.4800033569336,
1357
+ 64.51000213623047,
1358
+ 64.52999877929688,
1359
+ 64.83999633789062
1360
+ ],
1361
+ "short_factor": [
1362
+ 1.0,
1363
+ 1.0199999809265137,
1364
+ 1.0299999713897705,
1365
+ 1.0299999713897705,
1366
+ 1.0499999523162842,
1367
+ 1.0499999523162842,
1368
+ 1.0499999523162842,
1369
+ 1.0499999523162842,
1370
+ 1.0499999523162842,
1371
+ 1.0699999332427979,
1372
+ 1.0999999046325684,
1373
+ 1.1099998950958252,
1374
+ 1.1599998474121094,
1375
+ 1.1599998474121094,
1376
+ 1.1699998378753662,
1377
+ 1.2899998426437378,
1378
+ 1.339999794960022,
1379
+ 1.679999828338623,
1380
+ 1.7899998426437378,
1381
+ 1.8199998140335083,
1382
+ 1.8499997854232788,
1383
+ 1.8799997568130493,
1384
+ 1.9099997282028198,
1385
+ 1.9399996995925903,
1386
+ 1.9899996519088745,
1387
+ 2.0199997425079346,
1388
+ 2.0199997425079346,
1389
+ 2.0199997425079346,
1390
+ 2.0199997425079346,
1391
+ 2.0199997425079346,
1392
+ 2.0199997425079346,
1393
+ 2.0299997329711914,
1394
+ 2.0299997329711914,
1395
+ 2.0299997329711914,
1396
+ 2.0299997329711914,
1397
+ 2.0299997329711914,
1398
+ 2.0299997329711914,
1399
+ 2.0299997329711914,
1400
+ 2.0299997329711914,
1401
+ 2.0299997329711914,
1402
+ 2.0799996852874756,
1403
+ 2.0899996757507324,
1404
+ 2.189999580383301,
1405
+ 2.2199995517730713,
1406
+ 2.5899994373321533,
1407
+ 2.729999542236328,
1408
+ 2.749999523162842,
1409
+ 2.8399994373321533
1410
+ ],
1411
+ "type": "longrope"
1412
+ },
1413
+ "rope_theta": 10000.0,
1414
+ "sliding_window": 262144,
1415
+ "tie_word_embeddings": false,
1416
+ "torch_dtype": "bfloat16",
1417
+ "transformers_version": "4.50.0",
1418
+ "use_cache": true,
1419
+ "vocab_size": 32064
1420
+ }
1421
+
1422
+ [2025-03-25 13:11:43,809][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1423
+ [2025-03-25 13:11:43,811][transformers.trainer][INFO] -
1424
+ ***** Running Evaluation *****
1425
+ [2025-03-25 13:11:43,812][transformers.trainer][INFO] - Num examples = 132
1426
+ [2025-03-25 13:11:43,812][transformers.trainer][INFO] - Batch size = 16
1427
+ [2025-03-25 13:11:58,120][transformers][INFO] - {'accuracy': 0.3333333333333333, 'RMSE': 54.9379815626841, 'QWK': 0.5332575972735019, 'HDIV': 0.06060606060606055, 'Macro_F1': 0.27814158963507446, 'Micro_F1': 0.3333333333333333, 'Weighted_F1': 0.28024515114391496, 'Macro_F1_(ignoring_nan)': np.float64(0.27814158963507446)}
1428
+ [2025-03-25 13:11:58,121][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1429
+ [2025-03-25 13:11:58,124][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-288
1430
+ [2025-03-25 13:11:58,617][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1431
+ [2025-03-25 13:11:58,618][transformers.configuration_utils][INFO] - Model config Phi3Config {
1432
+ "architectures": [
1433
+ "Phi3ForCausalLM"
1434
+ ],
1435
+ "attention_bias": false,
1436
+ "attention_dropout": 0.0,
1437
+ "auto_map": {
1438
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
1439
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
1440
+ },
1441
+ "bos_token_id": 1,
1442
+ "embd_pdrop": 0.0,
1443
+ "eos_token_id": 32000,
1444
+ "hidden_act": "silu",
1445
+ "hidden_size": 3072,
1446
+ "initializer_range": 0.02,
1447
+ "intermediate_size": 8192,
1448
+ "max_position_embeddings": 131072,
1449
+ "model_type": "phi3",
1450
+ "num_attention_heads": 32,
1451
+ "num_hidden_layers": 32,
1452
+ "num_key_value_heads": 32,
1453
+ "original_max_position_embeddings": 4096,
1454
+ "pad_token_id": 32000,
1455
+ "partial_rotary_factor": 1.0,
1456
+ "resid_pdrop": 0.0,
1457
+ "rms_norm_eps": 1e-05,
1458
+ "rope_scaling": {
1459
+ "long_factor": [
1460
+ 1.0800000429153442,
1461
+ 1.1100000143051147,
1462
+ 1.1399999856948853,
1463
+ 1.340000033378601,
1464
+ 1.5899999141693115,
1465
+ 1.600000023841858,
1466
+ 1.6200000047683716,
1467
+ 2.620000123977661,
1468
+ 3.2300000190734863,
1469
+ 3.2300000190734863,
1470
+ 4.789999961853027,
1471
+ 7.400000095367432,
1472
+ 7.700000286102295,
1473
+ 9.09000015258789,
1474
+ 12.199999809265137,
1475
+ 17.670000076293945,
1476
+ 24.46000099182129,
1477
+ 28.57000160217285,
1478
+ 30.420001983642578,
1479
+ 30.840002059936523,
1480
+ 32.590003967285156,
1481
+ 32.93000411987305,
1482
+ 42.320003509521484,
1483
+ 44.96000289916992,
1484
+ 50.340003967285156,
1485
+ 50.45000457763672,
1486
+ 57.55000305175781,
1487
+ 57.93000411987305,
1488
+ 58.21000289916992,
1489
+ 60.1400032043457,
1490
+ 62.61000442504883,
1491
+ 62.62000274658203,
1492
+ 62.71000289916992,
1493
+ 63.1400032043457,
1494
+ 63.1400032043457,
1495
+ 63.77000427246094,
1496
+ 63.93000411987305,
1497
+ 63.96000289916992,
1498
+ 63.970001220703125,
1499
+ 64.02999877929688,
1500
+ 64.06999969482422,
1501
+ 64.08000183105469,
1502
+ 64.12000274658203,
1503
+ 64.41000366210938,
1504
+ 64.4800033569336,
1505
+ 64.51000213623047,
1506
+ 64.52999877929688,
1507
+ 64.83999633789062
1508
+ ],
1509
+ "short_factor": [
1510
+ 1.0,
1511
+ 1.0199999809265137,
1512
+ 1.0299999713897705,
1513
+ 1.0299999713897705,
1514
+ 1.0499999523162842,
1515
+ 1.0499999523162842,
1516
+ 1.0499999523162842,
1517
+ 1.0499999523162842,
1518
+ 1.0499999523162842,
1519
+ 1.0699999332427979,
1520
+ 1.0999999046325684,
1521
+ 1.1099998950958252,
1522
+ 1.1599998474121094,
1523
+ 1.1599998474121094,
1524
+ 1.1699998378753662,
1525
+ 1.2899998426437378,
1526
+ 1.339999794960022,
1527
+ 1.679999828338623,
1528
+ 1.7899998426437378,
1529
+ 1.8199998140335083,
1530
+ 1.8499997854232788,
1531
+ 1.8799997568130493,
1532
+ 1.9099997282028198,
1533
+ 1.9399996995925903,
1534
+ 1.9899996519088745,
1535
+ 2.0199997425079346,
1536
+ 2.0199997425079346,
1537
+ 2.0199997425079346,
1538
+ 2.0199997425079346,
1539
+ 2.0199997425079346,
1540
+ 2.0199997425079346,
1541
+ 2.0299997329711914,
1542
+ 2.0299997329711914,
1543
+ 2.0299997329711914,
1544
+ 2.0299997329711914,
1545
+ 2.0299997329711914,
1546
+ 2.0299997329711914,
1547
+ 2.0299997329711914,
1548
+ 2.0299997329711914,
1549
+ 2.0299997329711914,
1550
+ 2.0799996852874756,
1551
+ 2.0899996757507324,
1552
+ 2.189999580383301,
1553
+ 2.2199995517730713,
1554
+ 2.5899994373321533,
1555
+ 2.729999542236328,
1556
+ 2.749999523162842,
1557
+ 2.8399994373321533
1558
+ ],
1559
+ "type": "longrope"
1560
+ },
1561
+ "rope_theta": 10000.0,
1562
+ "sliding_window": 262144,
1563
+ "tie_word_embeddings": false,
1564
+ "torch_dtype": "bfloat16",
1565
+ "transformers_version": "4.50.0",
1566
+ "use_cache": true,
1567
+ "vocab_size": 32064
1568
+ }
1569
+
1570
+ [2025-03-25 13:12:05,028][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-224] due to args.save_total_limit
1571
+ [2025-03-25 13:12:05,035][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-256] due to args.save_total_limit
1572
+ [2025-03-25 13:16:56,319][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1573
+ [2025-03-25 13:16:56,321][transformers.trainer][INFO] -
1574
+ ***** Running Evaluation *****
1575
+ [2025-03-25 13:16:56,322][transformers.trainer][INFO] - Num examples = 132
1576
+ [2025-03-25 13:16:56,322][transformers.trainer][INFO] - Batch size = 16
1577
+ [2025-03-25 13:17:10,819][transformers][INFO] - {'accuracy': 0.3484848484848485, 'RMSE': 58.981250230796896, 'QWK': 0.5139088482857729, 'HDIV': 0.10606060606060608, 'Macro_F1': 0.26306055334557943, 'Micro_F1': 0.3484848484848485, 'Weighted_F1': 0.30461075851025127, 'Macro_F1_(ignoring_nan)': np.float64(0.3156726640146953)}
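*Editor's note:* unlike the other evals in this run, `Macro_F1` (0.2631) and `Macro_F1_(ignoring_nan)` (0.3157) diverge here; the ratio is exactly 6/5, consistent with six ENEM score classes of which one produced an undefined per-class F1 at this checkpoint. A hedged sketch of the distinction, with hypothetical per-class values:

```python
import numpy as np

# A score class that never occurs at this checkpoint yields an undefined
# (NaN) F1. Plain macro F1 effectively counts it as 0; the "(ignoring_nan)"
# variant averages only the defined classes, hence the 6/5 ratio above.
per_class_f1 = np.array([0.55, 0.30, np.nan, 0.25, 0.18, 0.30])
macro_f1 = np.nan_to_num(per_class_f1).mean()     # NaN class counted as 0
macro_f1_ignoring_nan = np.nanmean(per_class_f1)  # NaN class dropped
print(macro_f1, macro_f1_ignoring_nan)
```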
1578
+ [2025-03-25 13:17:10,820][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1579
+ [2025-03-25 13:17:10,822][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-320
1580
+ [2025-03-25 13:17:11,314][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1581
+ [2025-03-25 13:17:11,315][transformers.configuration_utils][INFO] - Model config Phi3Config {
1582
+ "architectures": [
1583
+ "Phi3ForCausalLM"
1584
+ ],
1585
+ "attention_bias": false,
1586
+ "attention_dropout": 0.0,
1587
+ "auto_map": {
1588
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
1589
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
1590
+ },
1591
+ "bos_token_id": 1,
1592
+ "embd_pdrop": 0.0,
1593
+ "eos_token_id": 32000,
1594
+ "hidden_act": "silu",
1595
+ "hidden_size": 3072,
1596
+ "initializer_range": 0.02,
1597
+ "intermediate_size": 8192,
1598
+ "max_position_embeddings": 131072,
1599
+ "model_type": "phi3",
1600
+ "num_attention_heads": 32,
1601
+ "num_hidden_layers": 32,
1602
+ "num_key_value_heads": 32,
1603
+ "original_max_position_embeddings": 4096,
1604
+ "pad_token_id": 32000,
1605
+ "partial_rotary_factor": 1.0,
1606
+ "resid_pdrop": 0.0,
1607
+ "rms_norm_eps": 1e-05,
1608
+ "rope_scaling": {
1609
+ "long_factor": [
1610
+ 1.0800000429153442,
1611
+ 1.1100000143051147,
1612
+ 1.1399999856948853,
1613
+ 1.340000033378601,
1614
+ 1.5899999141693115,
1615
+ 1.600000023841858,
1616
+ 1.6200000047683716,
1617
+ 2.620000123977661,
1618
+ 3.2300000190734863,
1619
+ 3.2300000190734863,
1620
+ 4.789999961853027,
1621
+ 7.400000095367432,
1622
+ 7.700000286102295,
1623
+ 9.09000015258789,
1624
+ 12.199999809265137,
1625
+ 17.670000076293945,
1626
+ 24.46000099182129,
1627
+ 28.57000160217285,
1628
+ 30.420001983642578,
1629
+ 30.840002059936523,
1630
+ 32.590003967285156,
1631
+ 32.93000411987305,
1632
+ 42.320003509521484,
1633
+ 44.96000289916992,
1634
+ 50.340003967285156,
1635
+ 50.45000457763672,
1636
+ 57.55000305175781,
1637
+ 57.93000411987305,
1638
+ 58.21000289916992,
1639
+ 60.1400032043457,
1640
+ 62.61000442504883,
1641
+ 62.62000274658203,
1642
+ 62.71000289916992,
1643
+ 63.1400032043457,
1644
+ 63.1400032043457,
1645
+ 63.77000427246094,
1646
+ 63.93000411987305,
1647
+ 63.96000289916992,
1648
+ 63.970001220703125,
1649
+ 64.02999877929688,
1650
+ 64.06999969482422,
1651
+ 64.08000183105469,
1652
+ 64.12000274658203,
1653
+ 64.41000366210938,
1654
+ 64.4800033569336,
1655
+ 64.51000213623047,
1656
+ 64.52999877929688,
1657
+ 64.83999633789062
1658
+ ],
1659
+ "short_factor": [
1660
+ 1.0,
1661
+ 1.0199999809265137,
1662
+ 1.0299999713897705,
1663
+ 1.0299999713897705,
1664
+ 1.0499999523162842,
1665
+ 1.0499999523162842,
1666
+ 1.0499999523162842,
1667
+ 1.0499999523162842,
1668
+ 1.0499999523162842,
1669
+ 1.0699999332427979,
1670
+ 1.0999999046325684,
1671
+ 1.1099998950958252,
1672
+ 1.1599998474121094,
1673
+ 1.1599998474121094,
1674
+ 1.1699998378753662,
1675
+ 1.2899998426437378,
1676
+ 1.339999794960022,
1677
+ 1.679999828338623,
1678
+ 1.7899998426437378,
1679
+ 1.8199998140335083,
1680
+ 1.8499997854232788,
1681
+ 1.8799997568130493,
1682
+ 1.9099997282028198,
1683
+ 1.9399996995925903,
1684
+ 1.9899996519088745,
1685
+ 2.0199997425079346,
1686
+ 2.0199997425079346,
1687
+ 2.0199997425079346,
1688
+ 2.0199997425079346,
1689
+ 2.0199997425079346,
1690
+ 2.0199997425079346,
1691
+ 2.0299997329711914,
1692
+ 2.0299997329711914,
1693
+ 2.0299997329711914,
1694
+ 2.0299997329711914,
1695
+ 2.0299997329711914,
1696
+ 2.0299997329711914,
1697
+ 2.0299997329711914,
1698
+ 2.0299997329711914,
1699
+ 2.0299997329711914,
1700
+ 2.0799996852874756,
1701
+ 2.0899996757507324,
1702
+ 2.189999580383301,
1703
+ 2.2199995517730713,
1704
+ 2.5899994373321533,
1705
+ 2.729999542236328,
1706
+ 2.749999523162842,
1707
+ 2.8399994373321533
1708
+ ],
1709
+ "type": "longrope"
1710
+ },
1711
+ "rope_theta": 10000.0,
1712
+ "sliding_window": 262144,
1713
+ "tie_word_embeddings": false,
1714
+ "torch_dtype": "bfloat16",
1715
+ "transformers_version": "4.50.0",
1716
+ "use_cache": true,
1717
+ "vocab_size": 32064
1718
+ }
1719
+
1720
+ [2025-03-25 13:22:08,859][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1721
+ [2025-03-25 13:22:08,861][transformers.trainer][INFO] -
1722
+ ***** Running Evaluation *****
1723
+ [2025-03-25 13:22:08,861][transformers.trainer][INFO] - Num examples = 132
1724
+ [2025-03-25 13:22:08,861][transformers.trainer][INFO] - Batch size = 16
1725
+ [2025-03-25 13:22:23,115][transformers][INFO] - {'accuracy': 0.3181818181818182, 'RMSE': 59.18640302493726, 'QWK': 0.4907896844465802, 'HDIV': 0.09848484848484851, 'Macro_F1': 0.25637357391295873, 'Micro_F1': 0.3181818181818182, 'Weighted_F1': 0.27171617579719604, 'Macro_F1_(ignoring_nan)': np.float64(0.25637357391295873)}
1726
+ [2025-03-25 13:22:23,115][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1727
+ [2025-03-25 13:22:23,117][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-352
1728
+ [2025-03-25 13:22:23,600][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1729
+ [2025-03-25 13:22:23,600][transformers.configuration_utils][INFO] - Model config Phi3Config {
1730
+ "architectures": [
1731
+ "Phi3ForCausalLM"
1732
+ ],
1733
+ "attention_bias": false,
1734
+ "attention_dropout": 0.0,
1735
+ "auto_map": {
1736
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
1737
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
1738
+ },
1739
+ "bos_token_id": 1,
1740
+ "embd_pdrop": 0.0,
1741
+ "eos_token_id": 32000,
1742
+ "hidden_act": "silu",
1743
+ "hidden_size": 3072,
1744
+ "initializer_range": 0.02,
1745
+ "intermediate_size": 8192,
1746
+ "max_position_embeddings": 131072,
1747
+ "model_type": "phi3",
1748
+ "num_attention_heads": 32,
1749
+ "num_hidden_layers": 32,
1750
+ "num_key_value_heads": 32,
1751
+ "original_max_position_embeddings": 4096,
1752
+ "pad_token_id": 32000,
1753
+ "partial_rotary_factor": 1.0,
1754
+ "resid_pdrop": 0.0,
1755
+ "rms_norm_eps": 1e-05,
1756
+ "rope_scaling": {
1757
+ "long_factor": [
1758
+ 1.0800000429153442,
1759
+ 1.1100000143051147,
1760
+ 1.1399999856948853,
1761
+ 1.340000033378601,
1762
+ 1.5899999141693115,
1763
+ 1.600000023841858,
1764
+ 1.6200000047683716,
1765
+ 2.620000123977661,
1766
+ 3.2300000190734863,
1767
+ 3.2300000190734863,
1768
+ 4.789999961853027,
1769
+ 7.400000095367432,
1770
+ 7.700000286102295,
1771
+ 9.09000015258789,
1772
+ 12.199999809265137,
1773
+ 17.670000076293945,
1774
+ 24.46000099182129,
1775
+ 28.57000160217285,
1776
+ 30.420001983642578,
1777
+ 30.840002059936523,
1778
+ 32.590003967285156,
1779
+ 32.93000411987305,
1780
+ 42.320003509521484,
1781
+ 44.96000289916992,
1782
+ 50.340003967285156,
1783
+ 50.45000457763672,
1784
+ 57.55000305175781,
1785
+ 57.93000411987305,
1786
+ 58.21000289916992,
1787
+ 60.1400032043457,
1788
+ 62.61000442504883,
1789
+ 62.62000274658203,
1790
+ 62.71000289916992,
1791
+ 63.1400032043457,
1792
+ 63.1400032043457,
1793
+ 63.77000427246094,
1794
+ 63.93000411987305,
1795
+ 63.96000289916992,
1796
+ 63.970001220703125,
1797
+ 64.02999877929688,
1798
+ 64.06999969482422,
1799
+ 64.08000183105469,
1800
+ 64.12000274658203,
1801
+ 64.41000366210938,
1802
+ 64.4800033569336,
1803
+ 64.51000213623047,
1804
+ 64.52999877929688,
1805
+ 64.83999633789062
1806
+ ],
1807
+ "short_factor": [
1808
+ 1.0,
1809
+ 1.0199999809265137,
1810
+ 1.0299999713897705,
1811
+ 1.0299999713897705,
1812
+ 1.0499999523162842,
1813
+ 1.0499999523162842,
1814
+ 1.0499999523162842,
1815
+ 1.0499999523162842,
1816
+ 1.0499999523162842,
1817
+ 1.0699999332427979,
1818
+ 1.0999999046325684,
1819
+ 1.1099998950958252,
1820
+ 1.1599998474121094,
1821
+ 1.1599998474121094,
1822
+ 1.1699998378753662,
1823
+ 1.2899998426437378,
1824
+ 1.339999794960022,
1825
+ 1.679999828338623,
1826
+ 1.7899998426437378,
1827
+ 1.8199998140335083,
1828
+ 1.8499997854232788,
1829
+ 1.8799997568130493,
1830
+ 1.9099997282028198,
1831
+ 1.9399996995925903,
1832
+ 1.9899996519088745,
1833
+ 2.0199997425079346,
1834
+ 2.0199997425079346,
1835
+ 2.0199997425079346,
1836
+ 2.0199997425079346,
1837
+ 2.0199997425079346,
1838
+ 2.0199997425079346,
1839
+ 2.0299997329711914,
1840
+ 2.0299997329711914,
1841
+ 2.0299997329711914,
1842
+ 2.0299997329711914,
1843
+ 2.0299997329711914,
1844
+ 2.0299997329711914,
1845
+ 2.0299997329711914,
1846
+ 2.0299997329711914,
1847
+ 2.0299997329711914,
1848
+ 2.0799996852874756,
1849
+ 2.0899996757507324,
1850
+ 2.189999580383301,
1851
+ 2.2199995517730713,
1852
+ 2.5899994373321533,
1853
+ 2.729999542236328,
1854
+ 2.749999523162842,
1855
+ 2.8399994373321533
1856
+ ],
1857
+ "type": "longrope"
1858
+ },
1859
+ "rope_theta": 10000.0,
1860
+ "sliding_window": 262144,
1861
+ "tie_word_embeddings": false,
1862
+ "torch_dtype": "bfloat16",
1863
+ "transformers_version": "4.50.0",
1864
+ "use_cache": true,
1865
+ "vocab_size": 32064
1866
+ }
1867
+
1868
+ [2025-03-25 13:22:29,922][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-320] due to args.save_total_limit
1869
+ [2025-03-25 13:27:21,135][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1870
+ [2025-03-25 13:27:21,137][transformers.trainer][INFO] -
1871
+ ***** Running Evaluation *****
1872
+ [2025-03-25 13:27:21,137][transformers.trainer][INFO] - Num examples = 132
1873
+ [2025-03-25 13:27:21,137][transformers.trainer][INFO] - Batch size = 16
1874
+ [2025-03-25 13:27:35,320][transformers][INFO] - {'accuracy': 0.3333333333333333, 'RMSE': 57.735026918962575, 'QWK': 0.5033656214086358, 'HDIV': 0.06818181818181823, 'Macro_F1': 0.27161705646113865, 'Micro_F1': 0.3333333333333333, 'Weighted_F1': 0.28464037970062067, 'Macro_F1_(ignoring_nan)': np.float64(0.27161705646113865)}
1875
+ [2025-03-25 13:27:35,321][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1876
+ [2025-03-25 13:27:35,323][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-384
1877
+ [2025-03-25 13:27:35,807][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1878
+ [2025-03-25 13:27:35,808][transformers.configuration_utils][INFO] - Model config Phi3Config {
1879
+ "architectures": [
1880
+ "Phi3ForCausalLM"
1881
+ ],
1882
+ "attention_bias": false,
1883
+ "attention_dropout": 0.0,
1884
+ "auto_map": {
1885
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
1886
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
1887
+ },
1888
+ "bos_token_id": 1,
1889
+ "embd_pdrop": 0.0,
1890
+ "eos_token_id": 32000,
1891
+ "hidden_act": "silu",
1892
+ "hidden_size": 3072,
1893
+ "initializer_range": 0.02,
1894
+ "intermediate_size": 8192,
1895
+ "max_position_embeddings": 131072,
1896
+ "model_type": "phi3",
1897
+ "num_attention_heads": 32,
1898
+ "num_hidden_layers": 32,
1899
+ "num_key_value_heads": 32,
1900
+ "original_max_position_embeddings": 4096,
1901
+ "pad_token_id": 32000,
1902
+ "partial_rotary_factor": 1.0,
1903
+ "resid_pdrop": 0.0,
1904
+ "rms_norm_eps": 1e-05,
1905
+ "rope_scaling": {
1906
+ "long_factor": [
1907
+ 1.0800000429153442,
1908
+ 1.1100000143051147,
1909
+ 1.1399999856948853,
1910
+ 1.340000033378601,
1911
+ 1.5899999141693115,
1912
+ 1.600000023841858,
1913
+ 1.6200000047683716,
1914
+ 2.620000123977661,
1915
+ 3.2300000190734863,
1916
+ 3.2300000190734863,
1917
+ 4.789999961853027,
1918
+ 7.400000095367432,
1919
+ 7.700000286102295,
1920
+ 9.09000015258789,
1921
+ 12.199999809265137,
1922
+ 17.670000076293945,
1923
+ 24.46000099182129,
1924
+ 28.57000160217285,
1925
+ 30.420001983642578,
1926
+ 30.840002059936523,
1927
+ 32.590003967285156,
1928
+ 32.93000411987305,
1929
+ 42.320003509521484,
1930
+ 44.96000289916992,
1931
+ 50.340003967285156,
1932
+ 50.45000457763672,
1933
+ 57.55000305175781,
1934
+ 57.93000411987305,
1935
+ 58.21000289916992,
1936
+ 60.1400032043457,
1937
+ 62.61000442504883,
1938
+ 62.62000274658203,
1939
+ 62.71000289916992,
1940
+ 63.1400032043457,
1941
+ 63.1400032043457,
1942
+ 63.77000427246094,
1943
+ 63.93000411987305,
1944
+ 63.96000289916992,
1945
+ 63.970001220703125,
1946
+ 64.02999877929688,
1947
+ 64.06999969482422,
1948
+ 64.08000183105469,
1949
+ 64.12000274658203,
1950
+ 64.41000366210938,
1951
+ 64.4800033569336,
1952
+ 64.51000213623047,
1953
+ 64.52999877929688,
1954
+ 64.83999633789062
1955
+ ],
1956
+ "short_factor": [
1957
+ 1.0,
1958
+ 1.0199999809265137,
1959
+ 1.0299999713897705,
1960
+ 1.0299999713897705,
1961
+ 1.0499999523162842,
1962
+ 1.0499999523162842,
1963
+ 1.0499999523162842,
1964
+ 1.0499999523162842,
1965
+ 1.0499999523162842,
1966
+ 1.0699999332427979,
1967
+ 1.0999999046325684,
1968
+ 1.1099998950958252,
1969
+ 1.1599998474121094,
1970
+ 1.1599998474121094,
1971
+ 1.1699998378753662,
1972
+ 1.2899998426437378,
1973
+ 1.339999794960022,
1974
+ 1.679999828338623,
1975
+ 1.7899998426437378,
1976
+ 1.8199998140335083,
1977
+ 1.8499997854232788,
1978
+ 1.8799997568130493,
1979
+ 1.9099997282028198,
1980
+ 1.9399996995925903,
1981
+ 1.9899996519088745,
1982
+ 2.0199997425079346,
1983
+ 2.0199997425079346,
1984
+ 2.0199997425079346,
1985
+ 2.0199997425079346,
1986
+ 2.0199997425079346,
1987
+ 2.0199997425079346,
1988
+ 2.0299997329711914,
1989
+ 2.0299997329711914,
1990
+ 2.0299997329711914,
1991
+ 2.0299997329711914,
1992
+ 2.0299997329711914,
1993
+ 2.0299997329711914,
1994
+ 2.0299997329711914,
1995
+ 2.0299997329711914,
1996
+ 2.0299997329711914,
1997
+ 2.0799996852874756,
1998
+ 2.0899996757507324,
1999
+ 2.189999580383301,
2000
+ 2.2199995517730713,
2001
+ 2.5899994373321533,
2002
+ 2.729999542236328,
2003
+ 2.749999523162842,
2004
+ 2.8399994373321533
2005
+ ],
2006
+ "type": "longrope"
2007
+ },
2008
+ "rope_theta": 10000.0,
2009
+ "sliding_window": 262144,
2010
+ "tie_word_embeddings": false,
2011
+ "torch_dtype": "bfloat16",
2012
+ "transformers_version": "4.50.0",
2013
+ "use_cache": true,
2014
+ "vocab_size": 32064
2015
+ }
2016
+
2017
+ [2025-03-25 13:27:42,121][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-352] due to args.save_total_limit
2018
+ [2025-03-25 13:32:33,398][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
2019
+ [2025-03-25 13:32:33,400][transformers.trainer][INFO] -
2020
+ ***** Running Evaluation *****
2021
+ [2025-03-25 13:32:33,400][transformers.trainer][INFO] - Num examples = 132
2022
+ [2025-03-25 13:32:33,401][transformers.trainer][INFO] - Batch size = 16
2023
+ [2025-03-25 13:32:47,620][transformers][INFO] - {'accuracy': 0.30303030303030304, 'RMSE': 61.791438065332464, 'QWK': 0.43218441033484456, 'HDIV': 0.09090909090909094, 'Macro_F1': 0.25084362139917693, 'Micro_F1': 0.30303030303030304, 'Weighted_F1': 0.27994388327721664, 'Macro_F1_(ignoring_nan)': np.float64(0.25084362139917693)}
2024
+ [2025-03-25 13:32:47,620][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
2025
+ [2025-03-25 13:32:47,624][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-416
2026
+ [2025-03-25 13:32:48,161][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
2027
+ [2025-03-25 13:32:48,162][transformers.configuration_utils][INFO] - Model config Phi3Config {
2028
+ "architectures": [
2029
+ "Phi3ForCausalLM"
2030
+ ],
2031
+ "attention_bias": false,
2032
+ "attention_dropout": 0.0,
2033
+ "auto_map": {
2034
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
2035
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
2036
+ },
2037
+ "bos_token_id": 1,
2038
+ "embd_pdrop": 0.0,
2039
+ "eos_token_id": 32000,
2040
+ "hidden_act": "silu",
2041
+ "hidden_size": 3072,
2042
+ "initializer_range": 0.02,
2043
+ "intermediate_size": 8192,
2044
+ "max_position_embeddings": 131072,
2045
+ "model_type": "phi3",
2046
+ "num_attention_heads": 32,
2047
+ "num_hidden_layers": 32,
2048
+ "num_key_value_heads": 32,
2049
+ "original_max_position_embeddings": 4096,
2050
+ "pad_token_id": 32000,
2051
+ "partial_rotary_factor": 1.0,
2052
+ "resid_pdrop": 0.0,
2053
+ "rms_norm_eps": 1e-05,
2054
+ "rope_scaling": {
2055
+ "long_factor": [
2056
+ 1.0800000429153442,
2057
+ 1.1100000143051147,
2058
+ 1.1399999856948853,
2059
+ 1.340000033378601,
2060
+ 1.5899999141693115,
2061
+ 1.600000023841858,
2062
+ 1.6200000047683716,
2063
+ 2.620000123977661,
2064
+ 3.2300000190734863,
2065
+ 3.2300000190734863,
2066
+ 4.789999961853027,
2067
+ 7.400000095367432,
2068
+ 7.700000286102295,
2069
+ 9.09000015258789,
2070
+ 12.199999809265137,
2071
+ 17.670000076293945,
2072
+ 24.46000099182129,
2073
+ 28.57000160217285,
2074
+ 30.420001983642578,
2075
+ 30.840002059936523,
2076
+ 32.590003967285156,
2077
+ 32.93000411987305,
2078
+ 42.320003509521484,
2079
+ 44.96000289916992,
2080
+ 50.340003967285156,
2081
+ 50.45000457763672,
2082
+ 57.55000305175781,
2083
+ 57.93000411987305,
2084
+ 58.21000289916992,
2085
+ 60.1400032043457,
2086
+ 62.61000442504883,
2087
+ 62.62000274658203,
2088
+ 62.71000289916992,
2089
+ 63.1400032043457,
2090
+ 63.1400032043457,
2091
+ 63.77000427246094,
2092
+ 63.93000411987305,
2093
+ 63.96000289916992,
2094
+ 63.970001220703125,
2095
+ 64.02999877929688,
2096
+ 64.06999969482422,
2097
+ 64.08000183105469,
2098
+ 64.12000274658203,
2099
+ 64.41000366210938,
2100
+ 64.4800033569336,
2101
+ 64.51000213623047,
2102
+ 64.52999877929688,
2103
+ 64.83999633789062
2104
+ ],
2105
+ "short_factor": [
2106
+ 1.0,
2107
+ 1.0199999809265137,
2108
+ 1.0299999713897705,
2109
+ 1.0299999713897705,
2110
+ 1.0499999523162842,
2111
+ 1.0499999523162842,
2112
+ 1.0499999523162842,
2113
+ 1.0499999523162842,
2114
+ 1.0499999523162842,
2115
+ 1.0699999332427979,
2116
+ 1.0999999046325684,
2117
+ 1.1099998950958252,
2118
+ 1.1599998474121094,
2119
+ 1.1599998474121094,
2120
+ 1.1699998378753662,
2121
+ 1.2899998426437378,
2122
+ 1.339999794960022,
2123
+ 1.679999828338623,
2124
+ 1.7899998426437378,
2125
+ 1.8199998140335083,
2126
+ 1.8499997854232788,
2127
+ 1.8799997568130493,
2128
+ 1.9099997282028198,
2129
+ 1.9399996995925903,
2130
+ 1.9899996519088745,
2131
+ 2.0199997425079346,
2132
+ 2.0199997425079346,
2133
+ 2.0199997425079346,
2134
+ 2.0199997425079346,
2135
+ 2.0199997425079346,
2136
+ 2.0199997425079346,
2137
+ 2.0299997329711914,
2138
+ 2.0299997329711914,
2139
+ 2.0299997329711914,
2140
+ 2.0299997329711914,
2141
+ 2.0299997329711914,
2142
+ 2.0299997329711914,
2143
+ 2.0299997329711914,
2144
+ 2.0299997329711914,
2145
+ 2.0299997329711914,
2146
+ 2.0799996852874756,
2147
+ 2.0899996757507324,
2148
+ 2.189999580383301,
2149
+ 2.2199995517730713,
2150
+ 2.5899994373321533,
2151
+ 2.729999542236328,
2152
+ 2.749999523162842,
2153
+ 2.8399994373321533
2154
+ ],
2155
+ "type": "longrope"
2156
+ },
2157
+ "rope_theta": 10000.0,
2158
+ "sliding_window": 262144,
2159
+ "tie_word_embeddings": false,
2160
+ "torch_dtype": "bfloat16",
2161
+ "transformers_version": "4.50.0",
2162
+ "use_cache": true,
2163
+ "vocab_size": 32064
2164
+ }
2165
+
2166
+ [2025-03-25 13:32:52,654][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-384] due to args.save_total_limit
2167
+ [2025-03-25 13:37:43,765][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
2168
+ [2025-03-25 13:37:43,767][transformers.trainer][INFO] -
2169
+ ***** Running Evaluation *****
2170
+ [2025-03-25 13:37:43,767][transformers.trainer][INFO] - Num examples = 132
2171
+ [2025-03-25 13:37:43,767][transformers.trainer][INFO] - Batch size = 16
2172
+ [2025-03-25 13:37:58,215][transformers][INFO] - {'accuracy': 0.25, 'RMSE': 64.8541487189571, 'QWK': 0.46677532013969725, 'HDIV': 0.09090909090909094, 'Macro_F1': 0.22235122119023046, 'Micro_F1': 0.25, 'Weighted_F1': 0.23538011695906436, 'Macro_F1_(ignoring_nan)': np.float64(0.22235122119023046)}
2173
+ [2025-03-25 13:37:58,216][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
2174
+ [2025-03-25 13:37:58,218][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-448
2175
+ [2025-03-25 13:37:58,700][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
2176
+ [2025-03-25 13:37:58,701][transformers.configuration_utils][INFO] - Model config Phi3Config {
2177
+ "architectures": [
2178
+ "Phi3ForCausalLM"
2179
+ ],
2180
+ "attention_bias": false,
2181
+ "attention_dropout": 0.0,
2182
+ "auto_map": {
2183
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
2184
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
2185
+ },
2186
+ "bos_token_id": 1,
2187
+ "embd_pdrop": 0.0,
2188
+ "eos_token_id": 32000,
2189
+ "hidden_act": "silu",
2190
+ "hidden_size": 3072,
2191
+ "initializer_range": 0.02,
2192
+ "intermediate_size": 8192,
2193
+ "max_position_embeddings": 131072,
2194
+ "model_type": "phi3",
2195
+ "num_attention_heads": 32,
2196
+ "num_hidden_layers": 32,
2197
+ "num_key_value_heads": 32,
2198
+ "original_max_position_embeddings": 4096,
2199
+ "pad_token_id": 32000,
2200
+ "partial_rotary_factor": 1.0,
2201
+ "resid_pdrop": 0.0,
2202
+ "rms_norm_eps": 1e-05,
2203
+ "rope_scaling": {
2204
+ "long_factor": [
2205
+ 1.0800000429153442,
2206
+ 1.1100000143051147,
2207
+ 1.1399999856948853,
2208
+ 1.340000033378601,
2209
+ 1.5899999141693115,
2210
+ 1.600000023841858,
2211
+ 1.6200000047683716,
2212
+ 2.620000123977661,
2213
+ 3.2300000190734863,
2214
+ 3.2300000190734863,
2215
+ 4.789999961853027,
2216
+ 7.400000095367432,
2217
+ 7.700000286102295,
2218
+ 9.09000015258789,
2219
+ 12.199999809265137,
2220
+ 17.670000076293945,
2221
+ 24.46000099182129,
2222
+ 28.57000160217285,
2223
+ 30.420001983642578,
2224
+ 30.840002059936523,
2225
+ 32.590003967285156,
2226
+ 32.93000411987305,
2227
+ 42.320003509521484,
2228
+ 44.96000289916992,
2229
+ 50.340003967285156,
2230
+ 50.45000457763672,
2231
+ 57.55000305175781,
2232
+ 57.93000411987305,
2233
+ 58.21000289916992,
2234
+ 60.1400032043457,
2235
+ 62.61000442504883,
2236
+ 62.62000274658203,
2237
+ 62.71000289916992,
2238
+ 63.1400032043457,
2239
+ 63.1400032043457,
2240
+ 63.77000427246094,
2241
+ 63.93000411987305,
2242
+ 63.96000289916992,
2243
+ 63.970001220703125,
2244
+ 64.02999877929688,
2245
+ 64.06999969482422,
2246
+ 64.08000183105469,
2247
+ 64.12000274658203,
2248
+ 64.41000366210938,
2249
+ 64.4800033569336,
2250
+ 64.51000213623047,
2251
+ 64.52999877929688,
2252
+ 64.83999633789062
2253
+ ],
2254
+ "short_factor": [
2255
+ 1.0,
2256
+ 1.0199999809265137,
2257
+ 1.0299999713897705,
2258
+ 1.0299999713897705,
2259
+ 1.0499999523162842,
2260
+ 1.0499999523162842,
2261
+ 1.0499999523162842,
2262
+ 1.0499999523162842,
2263
+ 1.0499999523162842,
2264
+ 1.0699999332427979,
2265
+ 1.0999999046325684,
2266
+ 1.1099998950958252,
2267
+ 1.1599998474121094,
2268
+ 1.1599998474121094,
2269
+ 1.1699998378753662,
2270
+ 1.2899998426437378,
2271
+ 1.339999794960022,
2272
+ 1.679999828338623,
2273
+ 1.7899998426437378,
2274
+ 1.8199998140335083,
2275
+ 1.8499997854232788,
2276
+ 1.8799997568130493,
2277
+ 1.9099997282028198,
2278
+ 1.9399996995925903,
2279
+ 1.9899996519088745,
2280
+ 2.0199997425079346,
2281
+ 2.0199997425079346,
2282
+ 2.0199997425079346,
2283
+ 2.0199997425079346,
2284
+ 2.0199997425079346,
2285
+ 2.0199997425079346,
2286
+ 2.0299997329711914,
2287
+ 2.0299997329711914,
2288
+ 2.0299997329711914,
2289
+ 2.0299997329711914,
2290
+ 2.0299997329711914,
2291
+ 2.0299997329711914,
2292
+ 2.0299997329711914,
2293
+ 2.0299997329711914,
2294
+ 2.0299997329711914,
2295
+ 2.0799996852874756,
2296
+ 2.0899996757507324,
2297
+ 2.189999580383301,
2298
+ 2.2199995517730713,
2299
+ 2.5899994373321533,
2300
+ 2.729999542236328,
2301
+ 2.749999523162842,
2302
+ 2.8399994373321533
2303
+ ],
2304
+ "type": "longrope"
2305
+ },
2306
+ "rope_theta": 10000.0,
2307
+ "sliding_window": 262144,
2308
+ "tie_word_embeddings": false,
2309
+ "torch_dtype": "bfloat16",
2310
+ "transformers_version": "4.50.0",
2311
+ "use_cache": true,
2312
+ "vocab_size": 32064
2313
+ }
2314
+
2315
+ [2025-03-25 13:38:04,920][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-416] due to args.save_total_limit
2316
+ [2025-03-25 13:38:04,927][transformers.trainer][INFO] -
2317
+
2318
+ Training completed. Do not forget to share your model on huggingface.co/models =)
2319
+
2320
+
2321
+ [2025-03-25 13:38:04,927][transformers.trainer][INFO] - Loading best model from /workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-288 (score: 0.5332575972735019).
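*Editor's note:* checkpoint-288 wins because 0.5333 is the highest eval QWK recorded in this run, and the now-superseded checkpoint-448 is pruned immediately after. Best-model selection reduces to an argmax over the tracked metric; the checkpoint-to-score pairs below are copied from this log and reproduced only for illustration:

```python
eval_qwk = {
    224: 0.4823218997361477, 256: 0.4667021843367075,
    288: 0.5332575972735019, 320: 0.5139088482857729,
    352: 0.4907896844465802, 384: 0.5033656214086358,
    416: 0.43218441033484456, 448: 0.46677532013969725,
}
best = max(eval_qwk, key=eval_qwk.get)
print(best, eval_qwk[best])  # 288 0.5332575972735019
```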
2322
+ [2025-03-25 13:38:21,551][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-25/12-24-03/results/phi35-balanced/C5/checkpoint-448] due to args.save_total_limit
2323
+ [2025-03-25 13:38:21,558][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
2324
+ [2025-03-25 13:38:21,561][transformers.trainer][INFO] -
2325
+ ***** Running Evaluation *****
2326
+ [2025-03-25 13:38:21,561][transformers.trainer][INFO] - Num examples = 132
2327
+ [2025-03-25 13:38:21,561][transformers.trainer][INFO] - Batch size = 16
2328
+ [2025-03-25 13:38:35,717][transformers][INFO] - {'accuracy': 0.3333333333333333, 'RMSE': 54.9379815626841, 'QWK': 0.5332575972735019, 'HDIV': 0.06060606060606055, 'Macro_F1': 0.27814158963507446, 'Micro_F1': 0.3333333333333333, 'Weighted_F1': 0.28024515114391496, 'Macro_F1_(ignoring_nan)': np.float64(0.27814158963507446)}
2329
+ [2025-03-25 13:38:35,720][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
2330
+ [2025-03-25 13:38:35,721][__main__][INFO] - Training completed successfully.
2331
+ [2025-03-25 13:38:35,722][__main__][INFO] - Running on Test
2332
+ [2025-03-25 13:38:35,722][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt. If reference, supporting_text, essay_text, prompt, essay_year, id, grades, id_prompt are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
2333
+ [2025-03-25 13:38:35,724][transformers.trainer][INFO] -
2334
+ ***** Running Evaluation *****
2335
+ [2025-03-25 13:38:35,724][transformers.trainer][INFO] - Num examples = 138
2336
+ [2025-03-25 13:38:35,724][transformers.trainer][INFO] - Batch size = 16
2337
+ [2025-03-25 13:38:50,961][transformers][INFO] - {'accuracy': 0.35507246376811596, 'RMSE': 58.18511189623005, 'QWK': 0.5186813186813186, 'HDIV': 0.08695652173913049, 'Macro_F1': 0.2833097701518754, 'Micro_F1': 0.35507246376811596, 'Weighted_F1': 0.32384938563428267, 'Macro_F1_(ignoring_nan)': np.float64(0.2833097701518754)}
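*Editor's note:* these are the final test-set numbers for the reloaded best checkpoint (138 test essays vs. 132 in the validation split). An RMSE near 58 is plausible if, as assumed here but not confirmed by the log, predictions are mapped back to the 0-200 ENEM competence scale before the error is computed; a minimal sketch under that assumption:

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

print(rmse([120, 80, 200, 40], [80, 80, 120, 40]))  # hypothetical essays
```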
2338
+ [2025-03-25 13:38:50,962][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
2339
+ [2025-03-25 13:38:50,964][transformers.trainer][INFO] - Saving model checkpoint to ./results/phi35-balanced/C5/best_model
2340
+ [2025-03-25 13:38:51,433][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
2341
+ [2025-03-25 13:38:51,433][transformers.configuration_utils][INFO] - Model config Phi3Config {
2342
+ "architectures": [
2343
+ "Phi3ForCausalLM"
2344
+ ],
2345
+ "attention_bias": false,
2346
+ "attention_dropout": 0.0,
2347
+ "auto_map": {
2348
+ "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
2349
+ "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
2350
+ },
2351
+ "bos_token_id": 1,
2352
+ "embd_pdrop": 0.0,
2353
+ "eos_token_id": 32000,
2354
+ "hidden_act": "silu",
2355
+ "hidden_size": 3072,
2356
+ "initializer_range": 0.02,
2357
+ "intermediate_size": 8192,
2358
+ "max_position_embeddings": 131072,
2359
+ "model_type": "phi3",
2360
+ "num_attention_heads": 32,
2361
+ "num_hidden_layers": 32,
2362
+ "num_key_value_heads": 32,
2363
+ "original_max_position_embeddings": 4096,
2364
+ "pad_token_id": 32000,
2365
+ "partial_rotary_factor": 1.0,
2366
+ "resid_pdrop": 0.0,
2367
+ "rms_norm_eps": 1e-05,
2368
+ "rope_scaling": {
2369
+ "long_factor": [
2370
+ 1.0800000429153442,
2371
+ 1.1100000143051147,
2372
+ 1.1399999856948853,
2373
+ 1.340000033378601,
2374
+ 1.5899999141693115,
2375
+ 1.600000023841858,
2376
+ 1.6200000047683716,
2377
+ 2.620000123977661,
2378
+ 3.2300000190734863,
2379
+ 3.2300000190734863,
2380
+ 4.789999961853027,
2381
+ 7.400000095367432,
2382
+ 7.700000286102295,
2383
+ 9.09000015258789,
2384
+ 12.199999809265137,
2385
+ 17.670000076293945,
2386
+ 24.46000099182129,
2387
+ 28.57000160217285,
2388
+ 30.420001983642578,
2389
+ 30.840002059936523,
2390
+ 32.590003967285156,
2391
+ 32.93000411987305,
2392
+ 42.320003509521484,
2393
+ 44.96000289916992,
2394
+ 50.340003967285156,
2395
+ 50.45000457763672,
2396
+ 57.55000305175781,
2397
+ 57.93000411987305,
2398
+ 58.21000289916992,
2399
+ 60.1400032043457,
2400
+ 62.61000442504883,
2401
+ 62.62000274658203,
2402
+ 62.71000289916992,
2403
+ 63.1400032043457,
2404
+ 63.1400032043457,
2405
+ 63.77000427246094,
2406
+ 63.93000411987305,
2407
+ 63.96000289916992,
2408
+ 63.970001220703125,
2409
+ 64.02999877929688,
2410
+ 64.06999969482422,
2411
+ 64.08000183105469,
2412
+ 64.12000274658203,
2413
+ 64.41000366210938,
2414
+ 64.4800033569336,
2415
+ 64.51000213623047,
2416
+ 64.52999877929688,
2417
+ 64.83999633789062
2418
+ ],
2419
+ "short_factor": [
2420
+ 1.0,
2421
+ 1.0199999809265137,
2422
+ 1.0299999713897705,
2423
+ 1.0299999713897705,
2424
+ 1.0499999523162842,
2425
+ 1.0499999523162842,
2426
+ 1.0499999523162842,
2427
+ 1.0499999523162842,
2428
+ 1.0499999523162842,
2429
+ 1.0699999332427979,
2430
+ 1.0999999046325684,
2431
+ 1.1099998950958252,
2432
+ 1.1599998474121094,
2433
+ 1.1599998474121094,
2434
+ 1.1699998378753662,
2435
+ 1.2899998426437378,
2436
+ 1.339999794960022,
2437
+ 1.679999828338623,
2438
+ 1.7899998426437378,
2439
+ 1.8199998140335083,
2440
+ 1.8499997854232788,
2441
+ 1.8799997568130493,
2442
+ 1.9099997282028198,
2443
+ 1.9399996995925903,
2444
+ 1.9899996519088745,
2445
+ 2.0199997425079346,
2446
+ 2.0199997425079346,
2447
+ 2.0199997425079346,
2448
+ 2.0199997425079346,
2449
+ 2.0199997425079346,
2450
+ 2.0199997425079346,
2451
+ 2.0299997329711914,
2452
+ 2.0299997329711914,
2453
+ 2.0299997329711914,
2454
+ 2.0299997329711914,
2455
+ 2.0299997329711914,
2456
+ 2.0299997329711914,
2457
+ 2.0299997329711914,
2458
+ 2.0299997329711914,
2459
+ 2.0299997329711914,
2460
+ 2.0799996852874756,
2461
+ 2.0899996757507324,
2462
+ 2.189999580383301,
2463
+ 2.2199995517730713,
2464
+ 2.5899994373321533,
2465
+ 2.729999542236328,
2466
+ 2.749999523162842,
2467
+ 2.8399994373321533
2468
+ ],
2469
+ "type": "longrope"
2470
+ },
2471
+ "rope_theta": 10000.0,
2472
+ "sliding_window": 262144,
2473
+ "tie_word_embeddings": false,
2474
+ "torch_dtype": "bfloat16",
2475
+ "transformers_version": "4.50.0",
2476
+ "use_cache": true,
2477
+ "vocab_size": 32064
2478
+ }
2479
+
2480
+ [2025-03-25 13:38:57,844][__main__][INFO] - Fine Tuning Finished.
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b19e2bb6c4433c0c03eb4d903fd3602e60dff22323b40eeea2bf9a875def2324
3
+ size 5432