Training time

#10 · opened by iHaag

I’m curious: how long did it take to train this model? And how can Reinforcement Learning from Human Feedback (RLHF), Supervised Fine-Tuning (SFT), and reasoning work with diffusion-based models? Looking forward to the progress. Amazing work, very impressed, and so happy you have made it semi-open source.

In the paper, it says "LLaDA 8B was pre-trained from scratch on 2.3 trillion tokens using 0.13 million H800 GPU hours"
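For a rough sense of what that compute budget means in wall-clock time, here is a back-of-the-envelope calculation. Note that the cluster size is not stated in the quote, so the 256-GPU figure below is purely a hypothetical assumption for illustration:

```python
# Back-of-the-envelope wall-clock estimate from the paper's reported compute.
gpu_hours = 0.13e6   # 130,000 H800 GPU hours (from the paper)
tokens = 2.3e12      # 2.3 trillion pre-training tokens (from the paper)

n_gpus = 256         # ASSUMPTION: cluster size is not given in the quote

wall_clock_days = gpu_hours / n_gpus / 24
tokens_per_gpu_hour = tokens / gpu_hours

print(f"~{wall_clock_days:.0f} days of wall-clock time on {n_gpus} GPUs")  # ~21 days
print(f"~{tokens_per_gpu_hour:,.0f} tokens processed per GPU-hour")        # ~17.7M
```

Under that assumed cluster size, the pre-training run would take roughly three weeks; a larger cluster would shorten the wall-clock time proportionally.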
