---
language: en
tags:
- text-generation
- gemma
- tinystories
license: apache-2.0
datasets:
- roneneldan/TinyStories
---
# Gemma-3 270M Fine-tuned on TinyStories
This is a custom implementation of the Gemma-3 270M-parameter model, fine-tuned on the TinyStories dataset.
## Model Details
- **Architecture**: Custom Gemma-3 with sliding window attention
- **Parameters**: ~270M
- **Training Dataset**: TinyStories
- **Context Length**: 32,768 tokens
- **Sliding Window**: 512 tokens
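To illustrate what a sliding window means for attention, here is a minimal sketch of a causal sliding-window mask in plain Python. It is not taken from the training notebook: the function name `sliding_window_causal_mask` is hypothetical, and the convention that the window counts the current token is an assumption that may differ from the actual implementation.

```python
def sliding_window_causal_mask(seq_len, window):
    """Build a seq_len x seq_len boolean mask.

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: causal (j <= i) and the window includes the
    current token, so each query sees at most `window` positions.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Small example so the banded structure is easy to see.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

With the 512-token window listed above, a query far into a long sequence would attend to only the most recent 512 positions under this convention, rather than the entire preceding context.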
## Usage
```python
# Note: this model requires the custom Gemma3Model class from the training notebook.
# Copy that model definition into your code before using this model.
```
## Training Details
- Trained for 150,000 steps
- Final training loss: ~2.55
- Final validation loss: ~2.56
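
The losses above can be read as perplexities under one assumption that this card does not state: that they are mean per-token cross-entropy values in nats. The helper name `perplexity_from_loss` below is hypothetical, and the sketch is only a unit conversion, not a re-evaluation of the model.

```python
import math


def perplexity_from_loss(mean_cross_entropy_nats):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Approximate final losses reported in this card.
train_loss = 2.55
val_loss = 2.56

print(f"train perplexity: {perplexity_from_loss(train_loss):.2f}")
print(f"validation perplexity: {perplexity_from_loss(val_loss):.2f}")
```

Under that assumption, both values correspond to a perplexity of roughly 13, and the small gap between training and validation loss suggests little overfitting at the end of training.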