THAU v2.0 - Self-Learning Language Model

THAU (Thinking, Helpful, Autonomous, Understanding) is a self-learning language model fine-tuned from TinyLlama-1.1B with specialized training in tool calling, reasoning, and Spanish.

Model Description

| Attribute | Value |
|---|---|
| Base Model | TinyLlama-1.1B-Chat-v1.0 |
| Parameters | ~1.1B |
| Training Method | LoRA fine-tuning |
| Final Loss | 0.43 |
| Languages | Spanish (primary), English |
| License | Apache 2.0 |

Capabilities

  • Tool Calling: Native JSON-based function invocation
  • Chain of Thought: Step-by-step reasoning for complex problems
  • Image Generation: Prompt engineering for downstream image-generation tools
  • Spanish Fluency: Natural and technical conversations
  • Programming: Coding assistance in Python, JavaScript, and Java

Training Data

| Category | Examples |
|---|---|
| Tool Calling | 112 |
| Spanish Natural/Technical | 52 |
| Image Generation | 30 |
| Conversational Spanish | 20 |
| Chain of Thought Reasoning | 20 |
| Programming | 30+ |
| Total | 297 specialized examples |

Usage

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("luepow/thau")
tokenizer = AutoTokenizer.from_pretrained("luepow/thau")

# Chat format (TinyLlama/Zephyr style). The Spanish system prompt reads:
# "You are THAU, an intelligent and helpful AI assistant."
prompt = """<|system|>
Eres THAU, un asistente AI inteligente y servicial.</s>
<|user|>
Hola, ¿quién eres?</s>
<|assistant|>
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # caps generated tokens; max_length would also count the prompt
    temperature=0.7,
    do_sample=True,      # temperature has no effect unless sampling is enabled
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
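
If the uploaded tokenizer ships TinyLlama's chat template, the same prompt can be built with apply_chat_template instead of hand-writing the special tokens. A minimal sketch, assuming the template is present on the tokenizer:

messages = [
    {"role": "system", "content": "Eres THAU, un asistente AI inteligente y servicial."},
    {"role": "user", "content": "Hola, ¿quién eres?"},
]
# add_generation_prompt appends the <|assistant|> header so the model starts replying
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=200, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))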

With Ollama (Recommended)

ollama pull luepow/thau
ollama run luepow/thau
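
Ollama also exposes a local REST API (port 11434 by default), so the model can be scripted against once it is pulled. A minimal sketch using requests:

import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "luepow/thau",
        "prompt": "Hola, ¿quién eres?",
        "stream": False,  # return the full completion as a single JSON response
    },
    timeout=120,
)
print(resp.json()["response"])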

Tool Calling Format

THAU uses a JSON-based tool calling format:

<tool_call>{"name": "tool_name", "arguments": {"param": "value"}}</tool_call>

Available Tools

| Tool | Description |
|---|---|
| get_current_time | Get current date/time |
| web_search | Search the internet |
| execute_python | Run Python code |
| generate_image | Generate image from prompt |
| read_file | Read file contents |
| list_directory | List directory contents |

Example

User: What time is it?

THAU:

<tool_call>{"name": "get_current_time", "arguments": {}}</tool_call>
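
The model only emits the tool call; the host application is responsible for parsing the tag and executing the tool. A minimal dispatch sketch (the handler below is a hypothetical stand-in, not part of the model):

import json
import re
from datetime import datetime

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

# Hypothetical handlers; real implementations live in the host application.
TOOLS = {
    "get_current_time": lambda: datetime.now().isoformat(),
}

def run_tool_calls(model_output: str):
    """Find <tool_call> tags in the output, parse the JSON, and dispatch."""
    results = []
    for match in TOOL_CALL_RE.finditer(model_output):
        call = json.loads(match.group(1))
        handler = TOOLS[call["name"]]
        results.append(handler(**call.get("arguments", {})))
    return results

print(run_tool_calls('<tool_call>{"name": "get_current_time", "arguments": {}}</tool_call>'))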

Limitations

  • Model size limits complex multi-step reasoning
  • May hallucinate on topics outside training data
  • Tool calling accuracy varies by complexity
  • Spanish is the primary language; English is secondary
  • Best for simple to moderate complexity tasks

Training Details

  • Full Training: 3,022 examples, 4,533 steps, final loss 0.94
  • Specialized v2.0 Training: 297 examples, 745 steps, final loss 0.43
  • Hardware: Apple Silicon (MPS)
  • Training Time: ~7 minutes for the specialized phase
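
The exact LoRA hyperparameters are not published here; the following peft configuration is an illustrative sketch of a comparable setup, with assumed values for r, alpha, and target modules:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Illustrative values only; THAU's actual r/alpha/target modules are not documented here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # LoRA trains only a small fraction of the 1.1B weights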

Citation

@misc{thau2024,
  title={THAU v2.0: A Self-Learning Language Model},
  author={Luis Perez (luepow)},
  year={2024},
  url={https://huggingface.co/luepow/thau}
}

Acknowledgments

  • Thomas & Aurora - Inspiration for the cognitive age progression system
  • Claude (Anthropic) - AI pair programming partner
  • TinyLlama Team - Excellent base model
  • Hugging Face - Model hosting and transformers library

THAU v2.0 - Built with incremental learning and specialized training

Dedicated to Thomas & Aurora
