# OsirisHippocampus-Vision-v7-MLX
The Hippocampus is Osiris's visual cortex: a lightweight 3B vision-language model (VLM) that processes screenshots, images, and other visual input. It runs natively on Apple Silicon via the MLX Metal backend.
## Architecture
- Base Model: Qwen2.5-VL-3B-Instruct (3B parameters, vision-language)
- Format: MLX 4-bit quantized (Apple Silicon native)
- Size: ~2.9 GB
- Speed: ~150 tokens/sec on an M2 Pro
- Capabilities: OCR, screenshot analysis, image understanding, visual QA
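The throughput figure above translates directly into rough response times. A back-of-envelope sketch (the function name and the 150 tok/s default are taken from the quoted figure; prompt processing and image encoding add fixed overhead on top):

```python
def estimated_latency_s(num_tokens: int, tokens_per_sec: float = 150.0) -> float:
    """Rough decode time from the quoted ~150 tok/s on an M2 Pro.

    Ignores prompt processing and image encoding, which add a fixed
    per-request overhead, so treat the result as a lower bound.
    """
    return num_tokens / tokens_per_sec

# A 300-token answer takes roughly 2 seconds of decode time.
print(estimated_latency_s(300))
```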
## Usage
```python
from mlx_vlm import load, generate

# Download (or load from the local cache) the 4-bit MLX weights and processor
model, processor = load("osirisbrain/OsirisHippocampus-Vision-v7-MLX")

# Ask a question about a local image; images are passed as a list of paths
output = generate(model, processor, "What do you see in this image?", ["screenshot.png"])
print(output)
```
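The call above assumes the image files exist on disk; mlx_vlm will otherwise fail mid-generation. A small stdlib-only wrapper can validate paths up front (the function `describe_images` and its `generate_fn` parameter are our own illustration, not part of the mlx_vlm API; you would pass something like `lambda p, imgs: generate(model, processor, p, imgs)`):

```python
from pathlib import Path
from typing import Callable, Sequence


def describe_images(generate_fn: Callable[[str, list], str],
                    prompt: str,
                    image_paths: Sequence[str]) -> str:
    """Validate image paths, then delegate to a generate-style callable.

    generate_fn is any wrapper around mlx_vlm's generate (hypothetical
    helper, defined here for illustration only).
    """
    missing = [p for p in image_paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"missing image files: {missing}")
    return generate_fn(prompt, list(image_paths))
```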
## Credits
MLX conversion by mlx-community. Original model: Qwen/Qwen2.5-VL-3B-Instruct by Alibaba.