🧠 LLaMA-50M Turkish news

Model Summary

| Property | Value |
|---|---|
| Architecture | LLaMA (decoder-only transformer) |
| Parameters | ~50M |
| Vocab size | 32,768 |
| Embedding dim | 256 |
| Hidden dim | 2048 |
| Layers | 20 |
| Attention heads | 128 |
| KV groups | 64 |
| Context length | 256 |
| Tokenizer | turkish_tokenizer (alibayram) |
| Dataset | habanoz/news-tr-1.8M |
| Tokens seen | 372,679,971 |
| Epochs | 2 |
| Batch size | 64 |
| Language | Turkish 🇹🇷 |
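
The hyperparameters above map fairly directly onto a Hugging Face `LlamaConfig`. The sketch below is a hypothetical reconstruction, not the published training config; in particular, mapping "Embedding dim" to `hidden_size` and "Hidden dim" to the feed-forward `intermediate_size` is an assumption.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Hypothetical mapping of the card's hyperparameters onto LlamaConfig.
# The hidden_size / intermediate_size assignment is an assumption.
config = LlamaConfig(
    vocab_size=32_768,            # Vocab size
    hidden_size=256,              # Embedding dim (assumed)
    intermediate_size=2048,       # Hidden dim (assumed FFN size)
    num_hidden_layers=20,         # Layers
    num_attention_heads=128,      # Attention heads
    num_key_value_heads=64,       # KV groups (grouped-query attention)
    max_position_embeddings=256,  # Context length
)

model = LlamaForCausalLM(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```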

Model Description

llama_50m_tr_tokenizer_news is a 50-million-parameter Turkish language model trained from scratch on the habanoz/news-tr-1.8M dataset.
It was developed as a lightweight experimental model to explore Turkish-specific tokenization and morphology-aware pretraining using the custom turkish_tokenizer.

The model follows a LLaMA-style causal transformer architecture and was trained with a context length of 256 tokens, seeing ~372M tokens in total over two epochs.
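
If the checkpoint is published in the standard transformers format, inference might look like the sketch below. Whether `AutoTokenizer` resolves the custom turkish_tokenizer without `trust_remote_code=True` is an assumption; the prompt is only an illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AhmetSemih/llama_50m_pretrained_news_tr_tokenizer"

# Load the pretrained model and its Turkish tokenizer from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short Turkish news-style continuation.
prompt = "Ankara'da bugün"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```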


Training Environment

Hardware: NVIDIA A100 GPU (~20 hours of training)
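
For a rough sense of scale, the reported batch size, context length, and token count imply about 16K tokens per optimizer step and roughly 23K steps across the two epochs. The arithmetic below assumes no gradient accumulation, which is not stated on the card.

```python
# Back-of-the-envelope check of the training scale reported above.
tokens_seen = 372_679_971
epochs = 2
batch_size = 64
context_length = 256

tokens_per_step = batch_size * context_length  # 16,384 tokens per step (if no grad accumulation)
total_steps = tokens_seen / tokens_per_step    # ~22,746 steps over both epochs
tokens_per_epoch = tokens_seen / epochs        # ~186M tokens per pass over the corpus

print(f"~{total_steps:,.0f} steps, ~{tokens_per_epoch / 1e6:.0f}M tokens per epoch")
```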


Dataset used to train AhmetSemih/llama_50m_pretrained_news_tr_tokenizer: habanoz/news-tr-1.8M