Sky610TX

Model Details

Architecture: GPT-2 Style (Custom Ascendant Config)
Parameters: ~389 Million
Training tokens: 1.3 Billion
Context Window: 1024 Tokens
Training iterations: 50k
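
As a rough sanity check on the figures above, the sketch below derives a few quantities from them. It assumes that the 1.3 billion tokens is the total processed across all 50k iterations, and that the weights are stored as F32 (4 bytes per parameter), the tensor type listed on this card. The function names are illustrative only and are not part of the model's code.

```python
# Figures taken from the model details on this card.
N_PARAMS = 389_000_000        # ~389 million parameters
TRAIN_TOKENS = 1_300_000_000  # 1.3 billion training tokens
ITERATIONS = 50_000           # 50k training iterations
CONTEXT_WINDOW = 1024         # tokens per sequence
BYTES_PER_PARAM = 4           # F32 weights (assumption: no quantization)


def weight_memory_gb(n_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in gigabytes (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


def tokens_per_iteration(train_tokens: int, iterations: int) -> float:
    """Average tokens per iteration, assuming train_tokens is the total processed."""
    return train_tokens / iterations


mem_gb = weight_memory_gb(N_PARAMS, BYTES_PER_PARAM)
tok_per_iter = tokens_per_iteration(TRAIN_TOKENS, ITERATIONS)
seqs_per_iter = tok_per_iter / CONTEXT_WINDOW
tokens_per_param = TRAIN_TOKENS / N_PARAMS

print(f"Weights in F32:        ~{mem_gb:.2f} GB")
print(f"Tokens per iteration:  ~{tok_per_iter:,.0f}")
print(f"Sequences per iter:    ~{seqs_per_iter:.1f} (at {CONTEXT_WINDOW} tokens each)")
print(f"Tokens per parameter:  ~{tokens_per_param:.2f}")
```

Under these assumptions the weights alone take roughly 1.5 GB, before any activation or optimizer memory is counted.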

The Future

Work has started on a new 1.2B-parameter model, to be trained on over 10B tokens. It is intended to be much better at coding, reasoning, factual recall, and conversation. The model is currently in development and is expected to be released soon.

How to Use

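from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model weights and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("8BitStudio/Sky610TX")
tokenizer = AutoTokenizer.from_pretrained("8BitStudio/Sky610TX")

# Prompt in the User/Assistant chat format
input_text = "User: Hello\nAssistant:"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate up to 50 new tokens, then decode the full sequence (prompt + reply)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))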
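
For longer conversations, the prompt plus the generated reply has to fit in the 1024-token context window. The sketch below shows one way to build a multi-turn prompt and trim it to fit. The single-turn "User: ...\nAssistant:" format comes from the example above; extending it to several turns, and the helper names `build_prompt` and `trim_to_budget`, are assumptions made for illustration rather than a documented part of the model.

```python
CONTEXT_WINDOW = 1024  # from the model details on this card


def build_prompt(turns):
    """Join (role, text) pairs into a prompt that ends with an open Assistant turn.

    Assumption: multi-turn chats reuse the "Role: text" line format of the
    single-turn example; the card does not document a multi-turn format.
    """
    lines = [f"{role}: {text}" for role, text in turns]
    lines.append("Assistant:")
    return "\n".join(lines)


def trim_to_budget(token_ids, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Keep only the most recent tokens so prompt + reply fit in the window."""
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context window")
    # Negative slicing keeps the tail, i.e. the latest part of the conversation.
    return token_ids[-budget:]


prompt = build_prompt([
    ("User", "Hello"),
    ("Assistant", "Hi! How can I help?"),
    ("User", "Tell me a joke."),
])
print(prompt)

# Stand-in token ids; in practice these would come from tokenizer(prompt)["input_ids"].
fake_ids = list(range(2000))
trimmed = trim_to_budget(fake_ids, max_new_tokens=50)
print(len(trimmed))
```

Trimming from the front drops the oldest turns first, which is usually the least harmful choice for a chat prompt, though it may cut a turn in the middle.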

Safetensors

Model size: 0.4B params
Tensor type: F32