OsirisHippocampus-Vision-v7-MLX

The Hippocampus is Osiris's visual cortex: a lightweight 3B vision-language model (VLM) that processes screenshots, images, and other visual input. It runs natively on Apple Silicon via MLX with Metal acceleration.

Architecture

  • Base Model: Qwen2.5-VL-3B-Instruct (3B parameters, vision-language)
  • Format: MLX 4-bit quantized (Apple Silicon native)
  • Size: ~2.9 GB
  • Speed: ~150 tokens/sec on an M2 Pro
  • Capabilities: OCR, screenshot analysis, image understanding, visual QA

Usage

from mlx_vlm import load, generate  # pip install mlx-vlm

# Downloads and loads the 4-bit MLX weights from the Hub
model, processor = load("osirisbrain/OsirisHippocampus-Vision-v7-MLX")

# Pass one or more image paths alongside the prompt
output = generate(model, processor, "What do you see in this image?", ["screenshot.png"])
print(output)

# Note: newer mlx-vlm releases may expect a chat-template-formatted prompt;
# see mlx_vlm.prompt_utils.apply_chat_template if generation output looks wrong.
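For quick one-off queries without writing Python, mlx-vlm also ships a command-line entry point. A minimal sketch, assuming a recent mlx-vlm release where `python -m mlx_vlm.generate` accepts `--model`, `--image`, and `--prompt` flags (check `--help` for your installed version):

```shell
# One-shot visual QA against a local screenshot; weights are fetched
# from the Hub on first run (~2.9 GB download).
python -m mlx_vlm.generate \
  --model osirisbrain/OsirisHippocampus-Vision-v7-MLX \
  --image screenshot.png \
  --prompt "What do you see in this image?" \
  --max-tokens 256
```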

Credits

MLX conversion by mlx-community. Original model: Qwen/Qwen2.5-VL-3B-Instruct by Alibaba.
