train_cb_1745950314

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1427
  • Num Input Tokens Seen: 22164464
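
Since this checkpoint is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it can presumably be loaded with the peft library. Below is a minimal sketch, assuming the adapter is published as rbelanec/train_cb_1745950314 and that cb refers to the SuperGLUE CommitmentBank task; the exact prompt template used in training is not documented here.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model plus adapter weights (repo id assumed from this card's name).
model = AutoPeftModelForCausalLM.from_pretrained("rbelanec/train_cb_1745950314")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# CB is a three-way entailment task (entailment / contradiction / neutral);
# this prompt format is illustrative, not necessarily the one used in training.
prompt = "Premise: It rained all day.\nHypothesis: The ground is wet.\nLabel:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```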

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
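
For reference, these settings roughly correspond to the following transformers.TrainingArguments. This is a sketch, not the actual training script; output_dir and any argument not listed above are assumptions left at their defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cb_1745950314",  # hypothetical output path
    learning_rate=0.3,                 # very high for full fine-tuning; consistent with prompt-tuning-style PEFT
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,     # effective train batch size = 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```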

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:--------:|:-----:|:---------------:|:-----------------:|
| 0.3728 | 3.5133 | 200 | 0.2221 | 111736 |
| 0.2387 | 7.0177 | 400 | 0.2965 | 223024 |
| 0.1672 | 10.5310 | 600 | 0.1427 | 332984 |
| 0.1712 | 14.0354 | 800 | 0.1825 | 444576 |
| 0.0446 | 17.5487 | 1000 | 0.1883 | 555960 |
| 0.0038 | 21.0531 | 1200 | 0.2014 | 665952 |
| 0.0004 | 24.5664 | 1400 | 0.2456 | 777608 |
| 0.0002 | 28.0708 | 1600 | 0.2736 | 887904 |
| 0.0002 | 31.5841 | 1800 | 0.2869 | 999464 |
| 0.0001 | 35.0885 | 2000 | 0.2882 | 1110640 |
| 0.0 | 38.6018 | 2200 | 0.3087 | 1222144 |
| 0.0001 | 42.1062 | 2400 | 0.3172 | 1332096 |
| 0.0001 | 45.6195 | 2600 | 0.3216 | 1443792 |
| 0.0 | 49.1239 | 2800 | 0.3270 | 1553600 |
| 0.0 | 52.6372 | 3000 | 0.3373 | 1664296 |
| 0.0 | 56.1416 | 3200 | 0.3506 | 1775264 |
| 0.0 | 59.6549 | 3400 | 0.3578 | 1885968 |
| 0.0 | 63.1593 | 3600 | 0.3632 | 1996440 |
| 0.0 | 66.6726 | 3800 | 0.3684 | 2107400 |
| 0.0 | 70.1770 | 4000 | 0.3845 | 2218352 |
| 0.0 | 73.6903 | 4200 | 0.3916 | 2330072 |
| 0.0 | 77.1947 | 4400 | 0.4035 | 2440176 |
| 0.0 | 80.7080 | 4600 | 0.4003 | 2551216 |
| 0.0 | 84.2124 | 4800 | 0.4034 | 2662848 |
| 0.0 | 87.7257 | 5000 | 0.4106 | 2774160 |
| 0.0 | 91.2301 | 5200 | 0.4232 | 2885448 |
| 0.0 | 94.7434 | 5400 | 0.4263 | 2995680 |
| 0.0 | 98.2478 | 5600 | 0.4357 | 3106768 |
| 0.0 | 101.7611 | 5800 | 0.4324 | 3218248 |
| 0.0 | 105.2655 | 6000 | 0.4491 | 3329176 |
| 0.0 | 108.7788 | 6200 | 0.4541 | 3440344 |
| 0.0 | 112.2832 | 6400 | 0.4585 | 3550560 |
| 0.0 | 115.7965 | 6600 | 0.4681 | 3661824 |
| 0.0 | 119.3009 | 6800 | 0.4760 | 3771856 |
| 0.0 | 122.8142 | 7000 | 0.4742 | 3883176 |
| 0.0 | 126.3186 | 7200 | 0.4915 | 3994264 |
| 0.0 | 129.8319 | 7400 | 0.4837 | 4105440 |
| 0.0 | 133.3363 | 7600 | 0.4888 | 4216208 |
| 0.0 | 136.8496 | 7800 | 0.5081 | 4326832 |
| 0.0 | 140.3540 | 8000 | 0.5129 | 4437792 |
| 0.0 | 143.8673 | 8200 | 0.5111 | 4549512 |
| 0.0 | 147.3717 | 8400 | 0.5239 | 4658800 |
| 0.0 | 150.8850 | 8600 | 0.5262 | 4769400 |
| 0.0 | 154.3894 | 8800 | 0.5386 | 4881880 |
| 0.0 | 157.9027 | 9000 | 0.5442 | 4992488 |
| 0.0 | 161.4071 | 9200 | 0.5477 | 5103032 |
| 0.0 | 164.9204 | 9400 | 0.5498 | 5214280 |
| 0.0 | 168.4248 | 9600 | 0.5599 | 5323664 |
| 0.0 | 171.9381 | 9800 | 0.5638 | 5436384 |
| 0.0 | 175.4425 | 10000 | 0.5640 | 5547152 |
| 0.0 | 178.9558 | 10200 | 0.5694 | 5658656 |
| 0.0 | 182.4602 | 10400 | 0.5835 | 5768616 |
| 0.0 | 185.9735 | 10600 | 0.5904 | 5879304 |
| 0.0 | 189.4779 | 10800 | 0.5920 | 5990296 |
| 0.0 | 192.9912 | 11000 | 0.5958 | 6101128 |
| 0.0 | 196.4956 | 11200 | 0.5993 | 6212008 |
| 0.0 | 200.0 | 11400 | 0.5979 | 6321568 |
| 0.0 | 203.5133 | 11600 | 0.6070 | 6432384 |
| 0.0 | 207.0177 | 11800 | 0.6076 | 6542352 |
| 0.0 | 210.5310 | 12000 | 0.6162 | 6654160 |
| 0.0 | 214.0354 | 12200 | 0.6100 | 6765224 |
| 0.0 | 217.5487 | 12400 | 0.6123 | 6874936 |
| 0.0 | 221.0531 | 12600 | 0.6259 | 6986248 |
| 0.0 | 224.5664 | 12800 | 0.6288 | 7097808 |
| 0.0 | 228.0708 | 13000 | 0.6375 | 7208392 |
| 0.0 | 231.5841 | 13200 | 0.6336 | 7318456 |
| 0.0 | 235.0885 | 13400 | 0.6358 | 7430160 |
| 0.0 | 238.6018 | 13600 | 0.6433 | 7540344 |
| 0.0 | 242.1062 | 13800 | 0.6339 | 7650824 |
| 0.0 | 245.6195 | 14000 | 0.6419 | 7761968 |
| 0.0 | 249.1239 | 14200 | 0.6547 | 7872968 |
| 0.0 | 252.6372 | 14400 | 0.6476 | 7983464 |
| 0.0 | 256.1416 | 14600 | 0.6594 | 8093616 |
| 0.0 | 259.6549 | 14800 | 0.6540 | 8204560 |
| 0.0 | 263.1593 | 15000 | 0.6565 | 8315912 |
| 0.0 | 266.6726 | 15200 | 0.6450 | 8426448 |
| 0.0 | 270.1770 | 15400 | 0.6468 | 8536288 |
| 0.0 | 273.6903 | 15600 | 0.6565 | 8648256 |
| 0.0 | 277.1947 | 15800 | 0.6578 | 8758760 |
| 0.0 | 280.7080 | 16000 | 0.6662 | 8868600 |
| 0.0 | 284.2124 | 16200 | 0.6600 | 8981000 |
| 0.0 | 287.7257 | 16400 | 0.6473 | 9091424 |
| 0.0 | 291.2301 | 16600 | 0.6490 | 9202432 |
| 0.0 | 294.7434 | 16800 | 0.6395 | 9312888 |
| 0.0 | 298.2478 | 17000 | 0.6448 | 9423320 |
| 0.0 | 301.7611 | 17200 | 0.6459 | 9533896 |
| 0.0 | 305.2655 | 17400 | 0.6617 | 9644952 |
| 0.0 | 308.7788 | 17600 | 0.6652 | 9754832 |
| 0.0 | 312.2832 | 17800 | 0.6583 | 9866256 |
| 0.0 | 315.7965 | 18000 | 0.6628 | 9975768 |
| 0.0 | 319.3009 | 18200 | 0.6539 | 10086392 |
| 0.0 | 322.8142 | 18400 | 0.6527 | 10197432 |
| 0.0 | 326.3186 | 18600 | 0.6600 | 10307224 |
| 0.0 | 329.8319 | 18800 | 0.6478 | 10419256 |
| 0.0 | 333.3363 | 19000 | 0.6550 | 10529488 |
| 0.0 | 336.8496 | 19200 | 0.6510 | 10640296 |
| 0.0 | 340.3540 | 19400 | 0.6655 | 10750776 |
| 0.0 | 343.8673 | 19600 | 0.6587 | 10861648 |
| 0.0 | 347.3717 | 19800 | 0.6483 | 10972808 |
| 0.0 | 350.8850 | 20000 | 0.6682 | 11083136 |
| 0.0 | 354.3894 | 20200 | 0.6669 | 11193448 |
| 0.0 | 357.9027 | 20400 | 0.6694 | 11305168 |
| 0.0 | 361.4071 | 20600 | 0.6572 | 11416112 |
| 0.0 | 364.9204 | 20800 | 0.6539 | 11527424 |
| 0.0 | 368.4248 | 21000 | 0.6692 | 11637784 |
| 0.0 | 371.9381 | 21200 | 0.6645 | 11748768 |
| 0.0 | 375.4425 | 21400 | 0.6803 | 11857872 |
| 0.0 | 378.9558 | 21600 | 0.6596 | 11969696 |
| 0.0 | 382.4602 | 21800 | 0.6605 | 12080592 |
| 0.0 | 385.9735 | 22000 | 0.6668 | 12190608 |
| 0.0 | 389.4779 | 22200 | 0.6798 | 12301648 |
| 0.0 | 392.9912 | 22400 | 0.6713 | 12412384 |
| 0.0 | 396.4956 | 22600 | 0.6631 | 12523264 |
| 0.0 | 400.0 | 22800 | 0.6748 | 12633656 |
| 0.0 | 403.5133 | 23000 | 0.6599 | 12743928 |
| 0.0 | 407.0177 | 23200 | 0.6731 | 12855568 |
| 0.0 | 410.5310 | 23400 | 0.6585 | 12966544 |
| 0.0 | 414.0354 | 23600 | 0.6575 | 13077752 |
| 0.0 | 417.5487 | 23800 | 0.6545 | 13189592 |
| 0.0 | 421.0531 | 24000 | 0.6538 | 13299920 |
| 0.0 | 424.5664 | 24200 | 0.6616 | 13410872 |
| 0.0 | 428.0708 | 24400 | 0.6672 | 13522656 |
| 0.0 | 431.5841 | 24600 | 0.6515 | 13632696 |
| 0.0 | 435.0885 | 24800 | 0.6613 | 13743576 |
| 0.0 | 438.6018 | 25000 | 0.6583 | 13856080 |
| 0.0 | 442.1062 | 25200 | 0.6817 | 13966552 |
| 0.0 | 445.6195 | 25400 | 0.6591 | 14076912 |
| 0.0 | 449.1239 | 25600 | 0.6748 | 14187144 |
| 0.0 | 452.6372 | 25800 | 0.6700 | 14298896 |
| 0.0 | 456.1416 | 26000 | 0.6631 | 14408592 |
| 0.0 | 459.6549 | 26200 | 0.6643 | 14519672 |
| 0.0 | 463.1593 | 26400 | 0.6712 | 14630736 |
| 0.0 | 466.6726 | 26600 | 0.6607 | 14741472 |
| 0.0 | 470.1770 | 26800 | 0.6407 | 14852816 |
| 0.0 | 473.6903 | 27000 | 0.6688 | 14964568 |
| 0.0 | 477.1947 | 27200 | 0.6583 | 15074912 |
| 0.0 | 480.7080 | 27400 | 0.6647 | 15186488 |
| 0.0 | 484.2124 | 27600 | 0.6718 | 15297600 |
| 0.0 | 487.7257 | 27800 | 0.6622 | 15407784 |
| 0.0 | 491.2301 | 28000 | 0.6721 | 15518800 |
| 0.0 | 494.7434 | 28200 | 0.6357 | 15629392 |
| 0.0 | 498.2478 | 28400 | 0.6769 | 15740552 |
| 0.0 | 501.7611 | 28600 | 0.6648 | 15852112 |
| 0.0 | 505.2655 | 28800 | 0.6663 | 15962600 |
| 0.0 | 508.7788 | 29000 | 0.6719 | 16073896 |
| 0.0 | 512.2832 | 29200 | 0.6698 | 16184680 |
| 0.0 | 515.7965 | 29400 | 0.6411 | 16295584 |
| 0.0 | 519.3009 | 29600 | 0.6533 | 16406536 |
| 0.0 | 522.8142 | 29800 | 0.6534 | 16516648 |
| 0.0 | 526.3186 | 30000 | 0.6500 | 16628144 |
| 0.0 | 529.8319 | 30200 | 0.6723 | 16738416 |
| 0.0 | 533.3363 | 30400 | 0.6530 | 16848080 |
| 0.0 | 536.8496 | 30600 | 0.6557 | 16960312 |
| 0.0 | 540.3540 | 30800 | 0.6635 | 17069536 |
| 0.0 | 543.8673 | 31000 | 0.6683 | 17180696 |
| 0.0 | 547.3717 | 31200 | 0.6434 | 17291896 |
| 0.0 | 550.8850 | 31400 | 0.6667 | 17402176 |
| 0.0 | 554.3894 | 31600 | 0.6643 | 17512704 |
| 0.0 | 557.9027 | 31800 | 0.6555 | 17624600 |
| 0.0 | 561.4071 | 32000 | 0.6490 | 17734208 |
| 0.0 | 564.9204 | 32200 | 0.6494 | 17845224 |
| 0.0 | 568.4248 | 32400 | 0.6411 | 17956288 |
| 0.0 | 571.9381 | 32600 | 0.6579 | 18066176 |
| 0.0 | 575.4425 | 32800 | 0.6615 | 18177520 |
| 0.0 | 578.9558 | 33000 | 0.6670 | 18289064 |
| 0.0 | 582.4602 | 33200 | 0.6603 | 18398888 |
| 0.0 | 585.9735 | 33400 | 0.6610 | 18509416 |
| 0.0 | 589.4779 | 33600 | 0.6647 | 18620544 |
| 0.0 | 592.9912 | 33800 | 0.6620 | 18731712 |
| 0.0 | 596.4956 | 34000 | 0.6691 | 18841128 |
| 0.0 | 600.0 | 34200 | 0.6600 | 18952336 |
| 0.0 | 603.5133 | 34400 | 0.6554 | 19063208 |
| 0.0 | 607.0177 | 34600 | 0.6618 | 19173736 |
| 0.0 | 610.5310 | 34800 | 0.6714 | 19285352 |
| 0.0 | 614.0354 | 35000 | 0.6660 | 19395536 |
| 0.0 | 617.5487 | 35200 | 0.6430 | 19506864 |
| 0.0 | 621.0531 | 35400 | 0.6640 | 19617648 |
| 0.0 | 624.5664 | 35600 | 0.6604 | 19728144 |
| 0.0 | 628.0708 | 35800 | 0.6667 | 19838296 |
| 0.0 | 631.5841 | 36000 | 0.6538 | 19948392 |
| 0.0 | 635.0885 | 36200 | 0.6598 | 20059232 |
| 0.0 | 638.6018 | 36400 | 0.6588 | 20170032 |
| 0.0 | 642.1062 | 36600 | 0.6602 | 20279560 |
| 0.0 | 645.6195 | 36800 | 0.6701 | 20389936 |
| 0.0 | 649.1239 | 37000 | 0.6673 | 20499984 |
| 0.0 | 652.6372 | 37200 | 0.6585 | 20612176 |
| 0.0 | 656.1416 | 37400 | 0.6829 | 20722344 |
| 0.0 | 659.6549 | 37600 | 0.6522 | 20833640 |
| 0.0 | 663.1593 | 37800 | 0.6633 | 20944256 |
| 0.0 | 666.6726 | 38000 | 0.6604 | 21055624 |
| 0.0 | 670.1770 | 38200 | 0.6676 | 21165744 |
| 0.0 | 673.6903 | 38400 | 0.6634 | 21277072 |
| 0.0 | 677.1947 | 38600 | 0.6616 | 21388128 |
| 0.0 | 680.7080 | 38800 | 0.6536 | 21499624 |
| 0.0 | 684.2124 | 39000 | 0.6551 | 21611240 |
| 0.0 | 687.7257 | 39200 | 0.6573 | 21721232 |
| 0.0 | 691.2301 | 39400 | 0.6526 | 21832720 |
| 0.0 | 694.7434 | 39600 | 0.6724 | 21942280 |
| 0.0 | 698.2478 | 39800 | 0.6597 | 22053128 |
| 0.0 | 701.7611 | 40000 | 0.6501 | 22164464 |
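
Note that the evaluation loss of 0.1427 reported above corresponds to the step-600 checkpoint; after that point the training loss collapses to 0.0 while the validation loss climbs steadily, a classic overfitting pattern. A minimal matplotlib sketch, with values copied from the first rows of the table, illustrates the early part of that curve:

```python
import matplotlib.pyplot as plt

# (step, validation loss) pairs taken verbatim from the table above.
steps = [200, 400, 600, 800, 1000, 1200, 1400, 1600]
val_loss = [0.2221, 0.2965, 0.1427, 0.1825, 0.1883, 0.2014, 0.2456, 0.2736]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("Validation loss, steps 200-1600 (minimum at step 600)")
plt.show()
```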

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1