---
license: apache-2.0
base_model: Mostafa8Mehrabi/qwen3-50m-fp16
tags:
- qwen
- c4
- pretrained
- fp16
- notebook
library_name: transformers
pipeline_tag: text-generation
---

# 🚀 Qwen3-50M C4 Pretrained (FP16) - Notebook Version

Qwen3-50M pretrained on the C4 dataset in FP16 precision, run from a notebook environment.

## 📊 Training Results

- **Final Training Loss**: 7.0744
- **Final Validation Loss**: 7.1159
- **Training Samples**: 10,000
- **Epochs**: 2
- **Precision**: FP16

## 🚀 Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-50m-c4-final_test_H200")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-50m-c4-final_test_H200",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate text; move inputs to the same device as the model
prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📁 Checkpoints

Intermediate training checkpoints (also in FP16) are available at `Mostafa8Mehrabi/qwen3-50m-c4-checkpoints_test_H200`.

## 🔧 Training Environment

This model was trained in a notebook environment with the following configuration (a hedged reproduction sketch follows the list):

- **Batch Size**: 160
- **Learning Rate**: 5e-05
- **Max Length**: 512
- **Number of Processes**: 8
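
The original training script is not published here, so the following is a minimal sketch of how these hyperparameters could map onto a 🤗 `Trainer` run. The C4 slice, output paths, the choice of `DataCollatorForLanguageModeling`, and the reading of 160 as a *per-device* batch size are assumptions for illustration, not details taken from the actual run; the 8 processes would correspond to an external launcher (e.g. `torchrun` or `accelerate`), which is not shown.

```python
from datasets import Dataset, load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "Mostafa8Mehrabi/qwen3-50m-fp16"  # base model from the card metadata
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding is required for batching
model = AutoModelForCausalLM.from_pretrained(base)

# Assumption: a 10,000-document slice of C4 (en), streamed so that only the
# needed shards are downloaded, then materialized for epoch-based training.
stream = load_dataset("allenai/c4", "en", split="train", streaming=True)

def sample_stream():
    yield from stream.take(10_000)

raw = Dataset.from_generator(sample_stream)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_ds = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

args = TrainingArguments(
    output_dir="qwen3-50m-c4",        # hypothetical local output dir
    per_device_train_batch_size=160,  # assumption: 160 is per device, not global
    learning_rate=5e-5,
    num_train_epochs=2,
    fp16=True,                        # FP16 precision, as reported
    save_strategy="epoch",
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    # For causal LM, mlm=False makes the collator copy input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("qwen3-50m-c4-final")  # hypothetical save path
```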