---
base_model: unsloth/gpt-oss-20b
tags:
- text-generation-inference
- transformers
- unsloth
- gpt_oss
- trl
language:
- en
---

# GPT-OSS-20B (Finetuned & Merged)

- **Author:** ShahzebKhoso
- **License:** apache-2.0
- **Base Model:** [unsloth/gpt-oss-20b](https://huggingface.co/unsloth/gpt-oss-20b)
- **Frameworks:** [Unsloth](https://github.com/unslothai/unsloth), [Hugging Face Transformers](https://huggingface.co/docs/transformers)

---

## 📌 Model Details

This model is a **finetuned and merged version of GPT-OSS-20B**. It was trained on the **nvidia/AceReason-Math** dataset for **solving mathematical problems**. Training used **LoRA adapters** with Unsloth for efficient optimization; the adapters were then merged back into the base model using `save_pretrained_merged`.

Compared to the base model:

- ✅ 2× faster training with Unsloth optimizations
- ✅ Memory efficient (4-bit training support)
- ✅ Ready for reasoning-style inference

---

## 🚀 Usage

You can use this model as a **chat model** with reasoning-style prompts. Example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShahzebKhoso/GPT-OSS-20B-AceReason-Math"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful AI that explains mathematical reasoning step by step."},
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="medium",  # passed through to the gpt-oss chat template
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 📊 Training

- **Base:** GPT-OSS-20B (`unsloth/gpt-oss-20b`)
- **Dataset:** [AceReason-Math](https://huggingface.co/datasets/nvidia/AceReason-Math)
  - **Train:** 40,163
  - **Validation:** 4,463
  - **Test:** 4,963
- **Method:** Parameter-Efficient Fine-Tuning (LoRA)
- **LoRA Config:** r=8, alpha=16, dropout=0
- **Merge:** `save_pretrained_merged` from Unsloth
- **Epochs:** 3
- **Training Time:** ~32 hours

A minimal sketch of this LoRA-then-merge workflow is shown at the end of this card.

---

## ❤️ Acknowledgements

This model was trained & merged using [Unsloth](https://github.com/unslothai/unsloth).
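
---

## 🔧 LoRA + Merge Workflow (Sketch)

The Training section above describes LoRA fine-tuning with Unsloth followed by a merge via `save_pretrained_merged`. The sketch below shows how that flow typically looks with Unsloth's `FastLanguageModel` API. It is illustrative only: the `max_seq_length`, `target_modules`, output directory name, and the omitted training step are assumptions, not the exact settings used for this model.

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model (memory-efficient training, as noted above).
# max_seq_length is an assumption; the value used for this model was not published.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the configuration listed under Training (r=8, alpha=16, dropout=0).
# target_modules here are the common Unsloth defaults; the exact modules targeted were not published.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# ... fine-tune on the prepared AceReason-Math train split (e.g. with TRL's SFTTrainer) ...

# Merge the LoRA adapters back into the base weights and save a standalone checkpoint.
model.save_pretrained_merged(
    "GPT-OSS-20B-AceReason-Math",
    tokenizer,
    save_method="merged_16bit",
)
```

The merged checkpoint produced this way can then be loaded with plain `transformers`, as shown in the Usage section above.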