VANTA Research
Independent AI safety research lab specializing in cognitive fit, alignment, and human-AI collaboration
Atom V1 Preview 12B
Atom V1 Preview 12B is a fine-tuned conversational AI model based on Google's Gemma 3 12B Instruct architecture. This model is designed to function as a collaborative thought partner, specializing in exploratory dialogue, brainstorming, research assistance, and technical problem-solving while maintaining an approachable and engaging conversational style.
This 12B iteration of the Atom persona is the third release in Project Atom from VANTA Research and our largest model to date.
Model Details
- Model Type: Multimodal Transformer (Text + Vision)
- Base Model: google/gemma-3-12b-it
- Training Method: Low-Rank Adaptation (LoRA) fine-tuning
- License: Gemma Terms of Use
- Developed By: VANTA Research
- Language: English
Architecture
- Parameters: 12 billion
- Hidden Size: 3840
- Attention Heads: 16 (8 key-value heads)
- Hidden Layers: 48
- Context Window: 131,072 tokens
- Sliding Window: 1,024 tokens
- FFN Dimension: 15,360
- Vocabulary Size: 262,208 tokens
- Precision: FP16
The model employs a hybrid attention pattern with sliding window attention and periodic full attention layers (every 6th layer) for efficient long-context processing.
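For reference, these figures can be read directly from the base model's configuration. A minimal sketch, assuming a recent transformers release with Gemma 3 support (attribute names may vary slightly across versions, and the base repository is gated behind the Gemma license):

from transformers import AutoConfig

# Load the base model's configuration (gated repo; requires accepting the
# Gemma license on Hugging Face).
config = AutoConfig.from_pretrained("google/gemma-3-12b-it")

# Multimodal Gemma 3 configs nest the text settings under text_config.
text_cfg = getattr(config, "text_config", config)

print(text_cfg.hidden_size)          # 3840
print(text_cfg.num_hidden_layers)    # 48
print(text_cfg.num_attention_heads)  # 16
print(text_cfg.num_key_value_heads)  # 8
print(text_cfg.sliding_window)       # 1024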
Training Methodology
Atom-v1-preview-12b was fine-tuned using parameter-efficient LoRA adapters targeting attention and feedforward components. The training data consists of curated conversational examples emphasizing:
- Collaborative exploration and brainstorming
- Research synthesis and question formulation
- Technical explanation at varying complexity levels
- Lateral thinking and creative problem-solving
- Empathetic and supportive dialogue patterns
Training was conducted over 258 steps with careful monitoring to preserve the base model's technical capabilities while introducing enhanced conversational characteristics.
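For illustration, a minimal PEFT sketch of a comparable LoRA setup; the rank, alpha, and dropout values are assumptions, since the card only states that attention and feed-forward projections were targeted:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model (gated repo; requires accepting the Gemma license).
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-12b-it", torch_dtype="auto")

lora_config = LoraConfig(
    r=16,                # assumed rank; not specified in the card
    lora_alpha=32,       # assumed scaling factor
    lora_dropout=0.05,   # assumed dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # feed-forward projections
    ],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable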
Intended Use
Primary Applications
- Collaborative Brainstorming: Generating diverse ideas and building iteratively on user suggestions
- Research Assistance: Synthesizing information, identifying key arguments, and formulating research questions
- Technical Explanation: Simplifying complex concepts across difficulty levels (including ELI5)
- Code Discussion: Exploring implementation approaches, debugging strategies, and architectural decisions
- Creative Problem-Solving: Encouraging unconventional approaches and lateral thinking
Out-of-Scope Use
This model is a research preview and should not be used for:
- High-stakes decision-making without human oversight
- Medical, legal, or financial advice
- Generation of harmful, biased, or misleading content
- Applications requiring guaranteed factual accuracy
Usage
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "vanta-research/atom-v1-preview-12b",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("vanta-research/atom-v1-preview-12b")

messages = [
    {"role": "user", "content": "What's your approach to explaining quantum entanglement?"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.9,
    top_k=40,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
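Alternatively, the high-level pipeline API handles the chat template and decoding automatically. A minimal sketch, assuming the same repository id as above:

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="vanta-research/atom-v1-preview-12b",
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {"role": "user", "content": "What's your approach to explaining quantum entanglement?"}
]

result = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.8, top_p=0.9, top_k=40)
print(result[0]["generated_text"][-1]["content"])  # assistant reply only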
Recommended Sampling Parameters
- Temperature: 0.7-0.9 (higher for creative tasks)
- Top-p: 0.9
- Top-k: 40
- Repetition Penalty: 1.1
- Max Context: 8,192 tokens (longer contexts supported but may impact performance)
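These values can be bundled into a GenerationConfig and reused across generate() calls. A minimal sketch, reusing the model and inputs from the Transformers example above:

from transformers import GenerationConfig

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.8,        # 0.7-0.9; raise toward 0.9 for creative tasks
    top_p=0.9,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=512,
)

outputs = model.generate(inputs, generation_config=generation_config)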
Performance Characteristics
Based on systematic evaluation across conversational dimensions:
- Collaborative Framing: Strong "thought partner" identity with organic question flow
- Enthusiasm Expression: Consistent use of engaged language patterns without over-prescription
- Metaphor Usage: Effective across technical and creative contexts
- Technical Competence: Maintains depth while prioritizing accessibility
- Adaptability: Calibrates tone and complexity to conversational context
The model demonstrates 85-90% alignment with design specifications across diverse prompt types, including identity awareness, technical discussion, creative output, empathetic support, and philosophical reasoning.
Limitations
- Knowledge Cutoff: Training data reflects information available through late 2024
- Factual Accuracy: May generate plausible-sounding but incorrect information
- Quantization Impact: 4-bit GGUF quantization reduces model size at the cost of minor quality degradation
- Context Processing: Very long contexts (>32K tokens) may show attention degradation
- Domain Specificity: Strongest in general technical discussion; may lack depth in highly specialized domains
- Bias: Inherits biases from base model and training data despite mitigation efforts
Ethical Considerations
This model is designed to support exploration and learning, not to replace human judgment. Users should:
- Verify factual claims against authoritative sources
- Apply critical thinking to generated suggestions
- Recognize the model's limitations in high-stakes scenarios
- Be mindful of potential biases in outputs
- Use responsibly in accordance with applicable laws and regulations
Citation
@misc{atom-v1-preview-12b,
  title={Atom V1 Preview 12B: A Collaborative Thought Partner},
  author={VANTA Research},
  year={2025},
  howpublished={\url{https://huggingface.co/vanta-research/atom-v1-preview-12b}}
}
Acknowledgments
Built on Google's Gemma 3 12B Instruct architecture. Training infrastructure was supported by Hugging Face Spaces, Transformers, PEFT, and llama.cpp quantization tools. Atom V1 12B was trained on an NVIDIA L40S GPU.
Contact
For questions, issues, or collaboration inquiries, please open an issue in the repository or contact the development team directly.