---
license: cc-by-nc-4.0
language:
- en
base_model:
- mistralai/Ministral-8B-Instruct-2410
base_model_relation: finetune
pipeline_tag: text-generation
library_name: transformers
tags:
- alignment
- conversational
- conversational-ai
- collaborate
- chat
- chatbot
- research
- persona
- personality
- friendly
- reasoning
- vanta-research
- LLM
- collaborative-ai
- frontier
- reflective
---
![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)

VANTA Research

Independent AI research lab building safe, resilient language models optimized for human-AI collaboration


---

# Atom v1 8B Preview

**Developed by VANTA Research**

Atom v1 8B Preview is a fine-tuned language model designed to serve as a collaborative thought partner. Built on Mistral's Ministral-8B-Instruct-2410, the model emphasizes natural dialogue, clarifying questions, and genuine engagement with complex problems.

This model was developed as part of a larger research and development project exploring Atom's persona and its compatibility across model architectures.

## Model Details

- **Model Type:** Causal language model (decoder-only transformer)
- **Base Model:** mistralai/Ministral-8B-Instruct-2410
- **Parameters:** 8 billion
- **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
- **License:** CC BY-NC 4.0 (non-commercial use)
- **Language:** English
- **Developed by:** VANTA Research, Portland, Oregon

## Intended Use

Atom v1 8B Preview is designed for:

- Collaborative problem-solving and brainstorming
- Technical explanations with accessible analogies
- Code assistance and algorithmic reasoning
- Exploratory conversations that prioritize understanding over immediate answers
- Educational contexts requiring thoughtful dialogue

The model is optimized for conversational depth, asking clarifying questions, and maintaining warm, engaging interactions while avoiding formulaic assistant behavior.

## Training Data

The model was fine-tuned on a curated dataset comprising:

- Identity and persona examples emphasizing collaborative exploration
- Technical reasoning and coding challenges
- Multi-step problem-solving scenarios
- Conversational examples demonstrating warmth and curiosity
- Advanced coding tasks and algorithmic thinking

Training focused on developing a distinctive voice that balances technical competence with genuine engagement.

## Performance Characteristics

Atom v1 8B demonstrates strong capabilities in:

- **Persona Consistency:** Maintains a collaborative, warm tone across diverse topics
- **Technical Explanation:** Uses metaphors and analogies to clarify complex concepts
- **Clarifying Questions:** Actively seeks to understand user intent and context
- **Creative Thinking:** Generates multiple frameworks and approaches to problems
- **Code Generation:** Produces working code with explanatory context
- **Reasoning:** Applies logical frameworks to abstract problems

## Limitations

- **Scale:** As an 8B-parameter model, capabilities are constrained compared to larger frontier models
- **Domain Specificity:** Optimized for conversational collaboration; may underperform on narrow technical benchmarks
- **Quantization Trade-offs:** The Q4_0 GGUF build prioritizes efficiency over maximum precision
- **Training Data:** The size of the fine-tuning dataset limits exposure to highly specialized domains
- **Factual Accuracy:** Users should verify critical information independently

## Ethical Considerations

This model is released for research and non-commercial applications. Users should:

- Verify outputs in high-stakes scenarios
- Avoid deploying the model in contexts requiring guaranteed accuracy
- Consider potential biases inherited from the base model and training data
- Respect the non-commercial license terms

## Usage

### Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "vanta-research/atom-v1-8b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."},
    {"role": "user", "content": "Can you explain how gradient descent works?"},
]

# Build the prompt with the chat template and append the generation prompt
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling must be enabled for temperature to take effect
output = model.generate(input_ids, max_new_tokens=512, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

### Ollama (GGUF)

The repository includes `atom-ministral-8b-q4_0.gguf` for efficient local inference:

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./atom-ministral-8b-q4_0.gguf
TEMPLATE """{{- if .System }}[INST] {{ .System }}

{{ .Prompt }}[/INST]{{ else }}[INST]{{ .Prompt }}[/INST]{{ end }}{{ .Response }}
"""
PARAMETER stop "</s>"
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER top_k 40
SYSTEM """You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth. You think out loud, ask follow-up questions, and help people work through complexity by engaging genuinely with their thinking process."""
EOF

# Register with Ollama
ollama create atom-v1-8b:latest -f Modelfile

# Run inference
ollama run atom-v1-8b:latest "What's a creative way to visualize time-series data?"
```

## Technical Specifications

- **Architecture:** Mistral-based transformer with Grouped Query Attention
- **Context Length:** 32,768 tokens
- **Vocabulary Size:** 131,072 tokens
- **Attention Heads:** 32 (8 key-value heads)
- **Hidden Dimension:** 4,096
- **Intermediate Size:** 12,288
- **LoRA Configuration:** r=16, alpha=32, targeting attention and MLP layers (see the sketch below)
- **Training:** 258 steps with bf16 precision and gradient checkpointing
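For readers who want to reproduce a comparable setup, here is a minimal sketch of the LoRA configuration above using Hugging Face PEFT. The r and alpha values come from the card; the target module names (the standard Mistral attention and MLP projections) and the dropout value are assumptions, since the card does not state them.

```python
# Minimal sketch of a LoRA setup matching the configuration listed above;
# this is not the exact training recipe used for Atom v1 8B Preview.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Ministral-8B-Instruct-2410",
    torch_dtype="auto",
    device_map="auto",
)

# r=16 and alpha=32 are stated in the card; the target modules and
# dropout are assumptions (standard Mistral projection layers).
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # sanity-check the trainable fraction
```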
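Likewise, a Q4_0 GGUF such as the bundled `atom-ministral-8b-q4_0.gguf` can be produced with llama.cpp's conversion tools. The following is an illustrative sketch, assuming a local llama.cpp checkout; the paths and file names are placeholders, not the exact commands used for this release.

```bash
# Convert the merged HF checkpoint to an fp16 GGUF, then quantize to Q4_0.
# Assumes a local llama.cpp checkout; paths and file names are illustrative.
python llama.cpp/convert_hf_to_gguf.py ./atom-v1-8b-preview \
    --outfile atom-ministral-8b-f16.gguf --outtype f16

llama.cpp/build/bin/llama-quantize \
    atom-ministral-8b-f16.gguf atom-ministral-8b-q4_0.gguf Q4_0
```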
## Citation

```bibtex
@software{atom_v1_8b_preview,
  title   = {Atom v1 8B Preview},
  author  = {VANTA Research},
  year    = {2025},
  url     = {https://huggingface.co/vanta-research/atom-v1-8b-preview},
  license = {CC-BY-NC-4.0}
}
```

## License

This model is released under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

You are free to:

- Share and adapt the model for non-commercial purposes, provided you credit VANTA Research as the creator

You may not:

- Use this model for commercial purposes without explicit permission

## Contact

For questions, collaboration inquiries, or commercial licensing:

- **Email:** hello@vantaresearch.xyz

---

**Version:** 1.0.0-preview

**Release Date:** November 2025

**Status:** Preview release for research and evaluation