# 🧠 LLaMA-50M Turkish News

## Model Summary
| Property | Value |
|---|---|
| Architecture | LLaMA (decoder-only transformer) |
| Parameters | ~50M |
| Vocab size | 32,768 |
| Embedding dim | 256 |
| Hidden dim | 2048 |
| Layers | 20 |
| Attention heads | 128 |
| KV groups | 64 |
| Context length | 256 |
| Tokenizer | turkish_tokenizer (alibayram) |
| Dataset | habanoz/news-tr-1.8M |
| Tokens seen | 372,679,971 |
| Epochs | 2 |
| Batch size | 64 |
| Language | Turkish 🇹🇷 |
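For orientation, the table above maps onto a Hugging Face `LlamaConfig` roughly as sketched below. This is a hypothetical reconstruction, not the published training config, and it assumes that "Hidden dim" refers to the feed-forward (intermediate) size of the LLaMA MLP block.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Hypothetical config mirroring the summary table above.
config = LlamaConfig(
    vocab_size=32_768,
    hidden_size=256,              # embedding dim
    intermediate_size=2048,       # "Hidden dim", assumed to be the FFN size
    num_hidden_layers=20,
    num_attention_heads=128,
    num_key_value_heads=64,       # grouped-query attention ("KV groups")
    max_position_embeddings=256,  # context length
)

model = LlamaForCausalLM(config)
# Should land in the ~50M range quoted in the table.
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```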
## Model Description

`llama_50m_tr_tokenizer_news` is a 50-million-parameter Turkish language model trained from scratch on the habanoz/news-tr-1.8M dataset.
It was developed as a lightweight experimental model to explore Turkish-specific tokenization and morphology-aware pretraining using the custom turkish_tokenizer.
The model follows the LLaMA-style causal transformer architecture and was trained with a context length of 256 tokens over ~372M tokens in total.
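A minimal loading and generation sketch is shown below. The repo id is a placeholder assumption (adjust it to the actual Hugging Face repository), and because the model uses the custom turkish_tokenizer, loading the tokenizer may require `trust_remote_code=True`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "alibayram/llama_50m_tr_tokenizer_news"  # hypothetical repo id

# The custom turkish_tokenizer may need trust_remote_code=True to load.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "Türkiye ekonomisinde son gelişmeler"  # Turkish news-style prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```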
## Training Environment

Hardware: NVIDIA A100 GPU, ~20 hours of training
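As a rough back-of-envelope check, assuming the ~20 hours of wall-clock time covers all 372,679,971 tokens seen during training:

```python
# Average training throughput implied by the figures above (an estimate,
# assuming the ~20 h covers the full token count from the summary table).
tokens_seen = 372_679_971
hours = 20
print(f"{tokens_seen / (hours * 3600):,.0f} tokens/sec on average")  # ≈ 5,176 tokens/sec
```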