|
|
---
license: apache-2.0
language:
- en
tags:
- conversational
- instruction-following
- chat
- gguf
- llama.cpp
- ollama
- local-llm
- Neutrino
pipeline_tag: text-generation
datasets:
- HuggingFaceFW/finepdfs
- fka/awesome-chatgpt-prompts
metrics:
- accuracy
- bertscore
- bleu
- bleurt
- brier_score
- cer
library_name: adapter-transformers
---
|
|
|
|
|
# Neutrino-Instruct (7B)
|
|
|
|
Neutrino-Instruct is a **7B parameter instruction-tuned LLM** developed by **Fardeen NB**. |
|
|
It is designed for **conversational AI**, **multi-step reasoning**, and **instruction-following** tasks, fine-tuned to maintain coherent and contextual dialogue across multiple turns. |
|
|
|
|
|
## Model Details
|
|
|
|
|
- **Model Name:** Neutrino-Instruct |
|
|
- **Developer:** Fardeen NB |
|
|
- **License:** Apache-2.0 |
|
|
- **Language(s):** English |
|
|
- **Format:** GGUF (optimized for `llama.cpp` and `Ollama`) |
|
|
- **Base Model:** Neutrino |
|
|
- **Version:** 2.0 |
|
|
- **Task:** Text Generation (chat, Q&A, instruction-following) |
|
|
|
|
|
## Quick Start
|
|
|
|
|
### Run with [llama.cpp](https://github.com/ggerganov/llama.cpp) |
|
|
|
|
|
```bash
# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
# Note: recent llama.cpp releases build with CMake and name the binary `llama-cli` instead of `./main`

# Run a single prompt
./main -m ./neutrino-instruct.gguf -p "Hello, who are you?"

# Run in interactive mode
./main -m ./neutrino-instruct.gguf -i -p "Let's chat."

# Limit output length to 256 tokens
./main -m ./neutrino-instruct.gguf -n 256 -p "Write a poem about stars."

# Adjust creativity (sampling temperature)
./main -m ./neutrino-instruct.gguf --temp 0.7 -p "Explain quantum computing simply."

# Offload layers to the GPU (if compiled with CUDA/Metal support)
./main -m ./neutrino-instruct.gguf --gpu-layers 50 -p "Summarize this article."
```
|
|
|
|
|
### Run with [Ollama](https://ollama.com/fardeen0424/neutrino) |
|
|
|
|
|
```bash
ollama run fardeen0424/neutrino
```
|
|
|
|
|
### Run in Python (`llama-cpp-python`) |
|
|
|
|
|
```python
from llama_cpp import Llama

# Load the Neutrino-Instruct model
llm = Llama(model_path="./neutrino-instruct.gguf")

# Run inference (set max_tokens explicitly; the default is quite small)
response = llm("Who are you?", max_tokens=128)
print(response["choices"][0]["text"])

# Stream output tokens as they are generated
for token in llm("Tell me a story about Neutrino:", max_tokens=256, stream=True):
    print(token["choices"][0]["text"], end="", flush=True)
```
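The streaming loop above can be wrapped in a small helper that joins the chunks into one reply string. The sketch below stubs the generator output so the pattern runs without the model file; `collect_stream` and `fake_stream` are illustrative names, and the chunk shape simply mirrors the `choices[0]["text"]` structure used above.

```python
def collect_stream(chunks):
    """Concatenate streamed llama-cpp-python chunks into one reply string."""
    return "".join(chunk["choices"][0]["text"] for chunk in chunks)

# Stubbed chunks in the same shape the streaming call yields, so the
# pattern can be demonstrated without loading the model.
fake_stream = [
    {"choices": [{"text": "Neutrino "}]},
    {"choices": [{"text": "says "}]},
    {"choices": [{"text": "hello."}]},
]
print(collect_stream(fake_stream))  # Neutrino says hello.
```

In a real session you would pass the generator returned by `llm(..., stream=True)` instead of `fake_stream`.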
|
|
|
|
|
## System Requirements
|
|
|
|
|
* **CPU-only:** 32–64 GB RAM recommended (runs on modern laptops, with slower inference).
* **GPU acceleration:**
  * 4 GB VRAM → 4-bit quantized (Q4) models
  * 8 GB VRAM → 5-bit/8-bit models
  * 12 GB+ VRAM → FP16 full precision
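The VRAM figures above follow from a rough rule of thumb: model size ≈ parameter count × bits per weight ÷ 8. A minimal sketch of that arithmetic (the effective bits-per-weight values are approximations; real GGUF files add metadata and keep some layers at higher precision):

```python
def approx_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model size in GB: parameters * bits-per-weight / 8 bytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common quantization levels
for name, bits in [("Q4", 4.5), ("Q5", 5.5), ("Q8", 8.5), ("FP16", 16.0)]:
    print(f"{name}: ~{approx_size_gb(7e9, bits):.1f} GB")
```

For a 7B model this puts Q4 near 4 GB and FP16 near 14 GB, consistent with the VRAM tiers listed above.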
|
|
|
|
|
|
|
|
## Potential Use Cases
|
|
|
|
|
* Conversational AI assistants |
|
|
* Research prototypes |
|
|
* Instruction-following agents |
|
|
* Chatbots with identity-awareness |
|
|
|
|
|
⚠️ **Out of Scope:** Use in critical decision-making, legal, or medical contexts.
|
|
|
|
|
## Development Notes
|
|
|
|
|
* Model uploaded in **GGUF format** for portability & performance. |
|
|
* Compatible with **llama.cpp**, **Ollama**, and **llama-cpp-python**. |
|
|
* Supports quantization levels (Q4, Q5, Q8) for deployment on resource-constrained devices. |
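As a small aside on the GGUF format mentioned above: every GGUF file begins with the 4-byte magic `GGUF`, which makes a downloaded model file easy to sanity-check before loading. A minimal sketch (`is_gguf` is an illustrative helper, not part of any library):

```python
import os
import struct
import tempfile

def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Demo with a stub header (real models carry version + metadata after the magic)
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".gguf")
tmp.write(b"GGUF" + struct.pack("<I", 3))  # magic + little-endian version field
tmp.close()
print(is_gguf(tmp.name))  # True
os.unlink(tmp.name)
```

This only checks the header; a corrupted or truncated download can still pass, so comparing file size or checksum against the repository listing is the stronger check.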
|
|
|
|
|
## Citation
|
|
|
|
|
If you use Neutrino in your research or projects, please cite: |
|
|
|
|
|
```bibtex
@misc{fardeennb2025neutrino,
  title        = {Neutrino-Instruct: A 7B Instruction-Tuned Conversational Model},
  author       = {Fardeen NB},
  year         = {2025},
  howpublished = {Hugging Face},
  url          = {https://huggingface.co/neuralcrew/neutrino-instruct}
}
```