Training time
#10
opened by iHaag
I’m curious how long it took to train this model. Also, how can Reinforcement Learning from Human Feedback (RLHF), Supervised Fine-Tuning (SFT), and reasoning work with diffusion-based models? Looking forward to the progress. Amazing work, very impressed, and so happy you have made it semi-open source.
In the paper, it says: "LLaDA 8B was pre-trained from scratch on 2.3 trillion tokens using 0.13 million H800 GPU hours."
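For a rough sense of what that compute figure means in wall-clock time, here is a quick back-of-envelope sketch. The token count and GPU hours are the paper's reported numbers; the cluster sizes are purely illustrative assumptions, since the paper's quote above does not state how many GPUs were used.

```python
# Back-of-envelope: wall-clock time implied by the quoted compute figure.
# total_tokens and gpu_hours come from the paper; the cluster sizes in the
# loop below are hypothetical, not reported values.

total_tokens = 2.3e12   # pre-training tokens (from the paper)
gpu_hours = 0.13e6      # H800 GPU hours (from the paper)

# Implied training throughput, aggregated across the whole run.
print(f"Throughput: {total_tokens / gpu_hours:,.0f} tokens per GPU hour")

# Divide total GPU hours by an assumed cluster size to get wall-clock days.
for n_gpus in (256, 512, 1024):  # hypothetical cluster sizes
    days = gpu_hours / n_gpus / 24
    print(f"On {n_gpus} H800s: ~{days:.1f} days wall-clock")
```

With those assumed cluster sizes, this works out to roughly 21, 11, or 5 days of wall-clock training, at about 17.7 million tokens per GPU hour.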