metadata
language: en
tags:
  - text-generation
  - gemma
  - tinystories
license: apache-2.0
datasets:
  - roneneldan/TinyStories

Gemma-3 270M Fine-tuned on TinyStories

This is a custom implementation of the Gemma-3 270M-parameter model, fine-tuned on the TinyStories dataset.

Model Details

  • Architecture: Custom Gemma-3 with sliding window attention
  • Parameters: ~270M
  • Training Dataset: TinyStories
  • Context Length: 32,768 tokens
  • Sliding Window: 512 tokens
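
To illustrate what a sliding window means for attention, here is a minimal, framework-free sketch of a causal sliding-window mask. The function name `sliding_window_causal_mask` is hypothetical, and the convention that the window counts the current token is an assumption; the training notebook's `Gemma3Model` may define the window boundary differently or combine it with full-attention layers.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key position j.

    Assumed convention (not confirmed by the model card): a query sees itself
    and the window - 1 positions immediately before it, and never the future.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Small example so the pattern is visible; the model card lists window=512.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

With a window of 512, each position attends to at most 512 keys regardless of sequence length, which is what keeps attention cost manageable at the 32,768-token context length.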

Usage

# Note: this model requires the custom Gemma3Model class from the training notebook.
# Copy that model definition into your code in order to use this model.

Training Details

  • Trained for 150,000 steps
  • Final training loss: ~2.55
  • Final validation loss: ~2.56
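
For context, the losses above can be converted to perplexity. This is a small sketch that assumes the reported values are mean per-token cross-entropy in nats (the usual default for PyTorch-style training loops); the model card does not state the unit, so treat the result as indicative only.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss_nats)


# Values taken from the training details above; the unit (nats) is an assumption.
train_ppl = perplexity(2.55)
val_ppl = perplexity(2.56)

print(f"train perplexity ~ {train_ppl:.1f}")      # ~ 12.8
print(f"validation perplexity ~ {val_ppl:.1f}")   # ~ 12.9
```

The small gap between the training and validation figures suggests little overfitting at the end of training.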