Training time

#10 · opened by iHaag

I’m curious: how long did it take to train this model? And how can Reinforcement Learning from Human Feedback (RLHF), Supervised Fine-Tuning (SFT), and reasoning work with diffusion-based models? Looking forward to the progress. Amazing work, very impressed, and so happy you have made it semi-open source.

In the paper, it says "LLaDA 8B was pre-trained from scratch on 2.3 trillion tokens using 0.13 million H800 GPU hours"
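For a rough sense of what that compute budget means in wall-clock time, here is a back-of-the-envelope calculation. Note that the cluster size is not stated in the quote, so the 256-GPU figure below is purely a hypothetical assumption for illustration:

```python
# Back-of-the-envelope wall-clock estimate from the paper's reported compute.
gpu_hours = 0.13e6   # 130,000 H800 GPU hours (from the paper)
tokens = 2.3e12      # 2.3 trillion pre-training tokens (from the paper)

n_gpus = 256         # ASSUMPTION: cluster size is not given in the quote

wall_clock_days = gpu_hours / n_gpus / 24
tokens_per_gpu_hour = tokens / gpu_hours

print(f"~{wall_clock_days:.0f} days of wall-clock time on {n_gpus} GPUs")  # ~21 days
print(f"~{tokens_per_gpu_hour:,.0f} tokens processed per GPU-hour")        # ~17.7M
```

Under that assumed cluster size, the pre-training run would take roughly three weeks; a larger cluster would shorten the wall-clock time proportionally.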
