Instructions to use OsaurusAI/Qwen3.5-122B-A10B-JANG_4K with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use OsaurusAI/Qwen3.5-122B-A10B-JANG_4K with MLX:
```python
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model, processor = load("OsaurusAI/Qwen3.5-122B-A10B-JANG_4K")
config = load_config("OsaurusAI/Qwen3.5-122B-A10B-JANG_4K")

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)

# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- Pi
How to use OsaurusAI/Qwen3.5-122B-A10B-JANG_4K with Pi:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "OsaurusAI/Qwen3.5-122B-A10B-JANG_4K"
```
Configure the model in Pi
```
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "OsaurusAI/Qwen3.5-122B-A10B-JANG_4K"
        }
      ]
    }
  }
}
```
Run Pi
```shell
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use OsaurusAI/Qwen3.5-122B-A10B-JANG_4K with Hermes Agent:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "OsaurusAI/Qwen3.5-122B-A10B-JANG_4K"
```
Configure Hermes
```shell
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default OsaurusAI/Qwen3.5-122B-A10B-JANG_4K
```
Run Hermes
```shell
hermes
```
Qwen 3.5 122B-A10B — JANG_4K (Mixed-Precision, 4-bit)
JANG — Jang Adaptive N-bit Grading | Mixed-Precision Quantization for Apple Silicon
Osaurus natively supports JANG models. Download at osaurus.ai.
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen 3.5 VL 122B-A10B |
| Architecture | MoE Transformer + Vision |
| Total Parameters | 122B (10B active per token) |
| Profile | JANG_4K |
| Avg Bits/Weight | 3.96 |
| Bit Widths Used | 3, 4, 5, 8 |
| Model Size | 57.4 GB |
| Vision | Yes |
| Format | JANG v2 (MLX-native safetensors) |
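As a rough cross-check of the table, the model size can be estimated from the parameter count and the average bit width. This is a back-of-the-envelope sketch, not the JANG size calculation: it assumes every one of the 122B parameters is stored at the 3.96-bit average and ignores quantization scales, metadata, and any tensors that fall outside that average. The card does not say whether 57.4 GB is a decimal (GB) or binary (GiB) figure, so both are printed.

```python
# Back-of-the-envelope size estimate from the Model Details table.
# Assumption: every parameter is stored at the average bit width; scales,
# metadata, and any unquantized tensors are ignored.
TOTAL_PARAMS = 122e9        # "Total Parameters" row
AVG_BITS_PER_WEIGHT = 3.96  # "Avg Bits/Weight" row


def estimated_size_bytes(params: float, bits_per_weight: float) -> float:
    """Return the storage needed for `params` weights at `bits_per_weight`."""
    return params * bits_per_weight / 8


size_bytes = estimated_size_bytes(TOTAL_PARAMS, AVG_BITS_PER_WEIGHT)
size_gb = size_bytes / 1e9     # decimal gigabytes
size_gib = size_bytes / 2**30  # binary gibibytes

print(f"{size_gb:.1f} GB (decimal), {size_gib:.1f} GiB (binary)")
```

Both estimates land within a few gigabytes of the listed 57.4 GB, which is consistent with an average of roughly 4 bits per weight.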
Benchmarks
200-question MMLU subset (20 questions from each of 10 subjects), run with thinking disabled (enable_thinking=False) and greedy decoding (temp=0.0).
| Model | MMLU | Size |
|---|---|---|
| JANG_4K (this) | 86% | 57.4 GB |
| MLX 4-bit | 85% | 64 GB |
| JANG_2S | 79% | 30.7 GB |
| MLX 2-bit | 56.5% | 36 GB |
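JANG_4K scores 1 percentage point higher than MLX 4-bit on this subset (86% vs. 85%, which is 172 vs. 170 correct answers out of 200) while being roughly 7 GB smaller (57.4 GB vs. 64 GB). Near-lossless quantization of the full 122B model.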
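The comparison can be reproduced directly from the table. The sketch below copies the reported scores and sizes and computes the gap between each JANG profile and the uniform MLX quantization at the same nominal bit width; that pairing is our assumption for illustration, and the helper name `compare` is not part of any library.

```python
# Scores and sizes copied from the benchmark table above.
results = {
    "JANG_4K": {"mmlu": 86.0, "size_gb": 57.4},
    "MLX 4-bit": {"mmlu": 85.0, "size_gb": 64.0},
    "JANG_2S": {"mmlu": 79.0, "size_gb": 30.7},
    "MLX 2-bit": {"mmlu": 56.5, "size_gb": 36.0},
}

# Assumed pairing: each JANG profile against the MLX quantization
# with the same nominal bit width.
pairs = [("JANG_4K", "MLX 4-bit"), ("JANG_2S", "MLX 2-bit")]


def compare(jang: str, mlx: str) -> tuple[float, float]:
    """Return (MMLU gain in percentage points, size saved in GB)."""
    gain = results[jang]["mmlu"] - results[mlx]["mmlu"]
    saved = results[mlx]["size_gb"] - results[jang]["size_gb"]
    return round(gain, 1), round(saved, 1)


for jang, mlx in pairs:
    gain, saved = compare(jang, mlx)
    print(f"{jang} vs {mlx}: +{gain} points, {saved} GB smaller")
```

On these numbers the gap is small at 4 bits and much larger at 2 bits, where the uniform quantization loses far more accuracy.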
Per-Subject Breakdown
| Subject | JANG_4K |
|---|---|
| Abstract Algebra | 16/20 |
| Anatomy | 19/20 |
| Astronomy | 19/20 |
| College CS | 15/20 |
| College Physics | 14/20 |
| HS Biology | 19/20 |
| HS Chemistry | 18/20 |
| HS Mathematics | 14/20 |
| Logical Fallacies | 19/20 |
| World Religions | 19/20 |
| Total | 172/200 (86%) |
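The total row can be checked against the per-subject rows with a few lines of Python. The snippet only re-adds the numbers from the table above; it does not rerun the benchmark.

```python
# Correct answers per subject, copied from the table above.
correct = {
    "Abstract Algebra": 16,
    "Anatomy": 19,
    "Astronomy": 19,
    "College CS": 15,
    "College Physics": 14,
    "HS Biology": 19,
    "HS Chemistry": 18,
    "HS Mathematics": 14,
    "Logical Fallacies": 19,
    "World Religions": 19,
}
QUESTIONS_PER_SUBJECT = 20

total_correct = sum(correct.values())
total_questions = QUESTIONS_PER_SUBJECT * len(correct)
accuracy = 100 * total_correct / total_questions

# Three lowest-scoring subjects (ties keep table order).
weakest = sorted(correct, key=correct.get)[:3]

print(f"{total_correct}/{total_questions} = {accuracy:.1f}%")
print("Lowest-scoring subjects:", ", ".join(weakest))
```

The lowest scores fall in the quantitative subjects (physics, mathematics, computer science), while the knowledge-recall subjects sit at 18 or 19 out of 20.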
JANG_4K Profile
JANG_4K is a balanced 4-bit mixed-precision profile that aims for near-original quality. Critical layers (attention, routing, embeddings) are kept at 8-bit, while expert MLP weights are stored at 3 to 5 bits depending on their importance score. It is positioned as the profile with the best quality-to-size ratio for the 122B model.
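To make the idea concrete, here is a minimal sketch of importance-based bit assignment. It is not the JANG implementation: the tensor-name keywords, the importance thresholds, and the toy tensor list are all hypothetical, chosen only to show how keeping critical layers at 8-bit and grading expert weights between 3 and 5 bits yields a parameter-weighted average bit width.

```python
# Illustrative only: the name keywords and score thresholds below are
# hypothetical and are NOT taken from the JANG implementation.
CRITICAL_KEYWORDS = ("attn", "router", "embed")  # hypothetical name fragments


def assign_bits(tensor_name: str, importance: float) -> int:
    """Pick a bit width for one tensor.

    `importance` is assumed to be a score in [0, 1]; higher means the
    tensor matters more to output quality.
    """
    if any(key in tensor_name for key in CRITICAL_KEYWORDS):
        return 8  # attention, routing, and embeddings stay at 8-bit
    if importance >= 0.8:  # hypothetical threshold
        return 5
    if importance >= 0.3:  # hypothetical threshold
        return 4
    return 3


def average_bits(tensors: list[tuple[str, int, float]]) -> float:
    """Parameter-weighted mean bit width over (name, n_params, importance)."""
    total_params = sum(n for _, n, _ in tensors)
    total_bits = sum(n * assign_bits(name, imp) for name, n, imp in tensors)
    return total_bits / total_params


# Toy tensors with made-up sizes and scores, not real model data.
toy = [
    ("layers.0.attn.q_proj", 10, 0.5),
    ("layers.0.experts.3.up_proj", 60, 0.9),
    ("layers.0.experts.7.down_proj", 30, 0.1),
]
print(average_bits(toy))
```

Because expert MLP weights make up most of a MoE model's parameters, the average is dominated by the expert bit widths even when the critical layers stay at 8-bit.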
Usage
```shell
# Requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/Qwen3.5-122B-A10B-JANG_4K
```
Requirements
- Apple Silicon Mac with 96+ GB unified memory (e.g., M2/M3/M4 Ultra)
- MLX framework with Qwen 3.5 MoE support
Quantized by Osaurus AI using JANG