---
license: apache-2.0
base_model: Mostafa8Mehrabi/qwen3-50m-fp16
tags:
- qwen
- c4
- pretrained
- fp16
- notebook
library_name: transformers
pipeline_tag: text-generation
---

# 🚀 Qwen3-50M C4 Pretrained (FP16) - Notebook Version

Qwen3-50M pretrained on the C4 dataset in FP16 precision, run from a notebook environment.

## 📊 Training Results

- **Final Training Loss**: 7.0744
- **Final Validation Loss**: 7.1159
- **Training Samples**: 10,000
- **Epochs**: 2
- **Precision**: FP16

## 🚀 Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-50m-c4-final_test_H200")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-50m-c4-final_test_H200",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate text; move inputs to the same device as the model
prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📁 Checkpoints

Intermediate training checkpoints (also in FP16) are available at `Mostafa8Mehrabi/qwen3-50m-c4-checkpoints_test_H200`.

## 🔧 Training Environment

This model was trained in a notebook environment with the following configuration (a hedged reproduction sketch follows the list):

- **Batch Size**: 160
- **Learning Rate**: 5e-05
- **Max Length**: 512
- **Number of Processes**: 8
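
The original training script is not published here, so the following is a minimal sketch of how these hyperparameters could map onto a 🤗 `Trainer` run. The C4 slice, output paths, the choice of `DataCollatorForLanguageModeling`, and the reading of 160 as a *per-device* batch size are assumptions for illustration, not details taken from the actual run; the 8 processes would correspond to an external launcher (e.g. `torchrun` or `accelerate`), which is not shown.

```python
from datasets import Dataset, load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "Mostafa8Mehrabi/qwen3-50m-fp16"  # base model from the card metadata
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding is required for batching
model = AutoModelForCausalLM.from_pretrained(base)

# Assumption: a 10,000-document slice of C4 (en), streamed so that only the
# needed shards are downloaded, then materialized for epoch-based training.
stream = load_dataset("allenai/c4", "en", split="train", streaming=True)

def sample_stream():
    yield from stream.take(10_000)

raw = Dataset.from_generator(sample_stream)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_ds = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

args = TrainingArguments(
    output_dir="qwen3-50m-c4",        # hypothetical local output dir
    per_device_train_batch_size=160,  # assumption: 160 is per device, not global
    learning_rate=5e-5,
    num_train_epochs=2,
    fp16=True,                        # FP16 precision, as reported
    save_strategy="epoch",
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    # For causal LM, mlm=False makes the collator copy input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("qwen3-50m-c4-final")  # hypothetical save path
```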