---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- aquif
- text-generation-inference
- reasoning
- math
- coding
- frontier
- aquif-3.5
- moe
language:
- en
- de
- it
- pt
- fr
- hi
- es
- th
- zh
- ja
base_model:
- Qwen/Qwen3-30B-A3B-Instruct-2507
---
<small>*Disclaimer: aquif-3.5-Plus was created with the merging technique used in Qwen3-30B-A3B-YOYO-V3 to enable hybrid reasoning. aquif-3.5-Max was finetuned from Plus, expanded with DavidAU's Brainstorm technique, and then underwent further RL and SFT to strengthen its reasoning and coding abilities.*</small>
# aquif-3.5-Plus & aquif-3.5-Max
The pinnacle of the aquif-3.5 series, released November 3rd, 2025. These models combine advanced reasoning capabilities, hybrid thinking modes, and 1M-token context windows to achieve state-of-the-art performance in their respective categories.
**aquif-3.5-Plus** combines hybrid reasoning with interchangeable thinking modes, offering flexibility for both speed-optimized and reasoning-intensive applications.
**aquif-3.5-Max** represents frontier model capabilities built on top of Plus's architecture, delivering exceptional performance across all benchmark categories.
## Model Repository Links
| Model | HuggingFace Repository |
|-------|----------------------|
| aquif-3.5-Plus | [aquif-ai/aquif-3.5-Plus](https://huggingface.co/aquif-ai/aquif-3.5-Plus-30B-A3B) |
| aquif-3.5-Max | [aquif-ai/aquif-3.5-Max](https://huggingface.co/aquif-ai/aquif-3.5-Max-42B-A3B) |
## Model Overview
| Model | Total (B) | Active Params (B) | Reasoning | Context Window | Thinking Modes |
|-------|-----------|-------------------|-----------|-----------------|----------------|
| aquif-3.5-Plus | 30.5 | 3.3 | ✅ Hybrid | 1M | ✅ Interchangeable |
| aquif-3.5-Max | 42.4 | 3.3 | ✅ Reasoning-Only | 1M | ✅ Interchangeable |
## Model Details
### aquif-3.5-Plus (Hybrid Reasoning with Interchangeable Modes)
A breakthrough hybrid reasoning model offering unprecedented flexibility. Toggle between thinking and non-thinking modes to optimize for your specific use case—maintain reasoning capabilities when needed, or prioritize speed for time-sensitive applications.
## Artificial Analysis Intelligence Index (AAII) Benchmarks
### Core Performance Metrics
| Benchmark | Plus (Non-Reasoning) | Plus (Reasoning) | Max (Non-Reasoning) | Max (Reasoning) |
| :----------------------- | -------------------: | ---------------: | ------------------: | --------------: |
| MMLU-Pro | 80.2 | 82.8 | 82.8 | 85.4 |
| GPQA Diamond | 72.1 | 79.7 | 75.6 | 83.2 |
| AIME 2025 | 64.7 | 90.3 | 69.0 | 94.6 |
| LiveCodeBench | 50.5 | 76.4 | 55.9 | 81.6 |
| Humanity’s Last Exam | 4.3 | 12.1 | 7.8 | 15.6 |
| TAU2-Telecom | 34.2 | 41.5 | 43.2 | 51.3 |
| IFBench | 39.3 | 54.3 | 49.3 | 65.4 |
| TerminalBench-Hard | 10.1 | 15.2 | 18.0 | 23.9 |
| AA-LCR | 30.4 | 59.9 | 31.7 | 61.2 |
| SciCode | 29.5 | 35.7 | 34.7 | 40.9 |
| **AAII Composite Score** | **42 (41.5)** | **55 (54.8)** | **47 (46.8)** | **60 (60.3)** |
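The parenthesized composite values can be reproduced as the unweighted mean of the ten benchmarks above (this assumes AAII is a plain average here, which the table's numbers are consistent with):

```python
# Benchmark columns from the table above, in row order:
# MMLU-Pro, GPQA Diamond, AIME 2025, LiveCodeBench, HLE,
# TAU2-Telecom, IFBench, TerminalBench-Hard, AA-LCR, SciCode.
columns = {
    "Plus (Non-Reasoning)": [80.2, 72.1, 64.7, 50.5, 4.3, 34.2, 39.3, 10.1, 30.4, 29.5],
    "Plus (Reasoning)":     [82.8, 79.7, 90.3, 76.4, 12.1, 41.5, 54.3, 15.2, 59.9, 35.7],
    "Max (Non-Reasoning)":  [82.8, 75.6, 69.0, 55.9, 7.8, 43.2, 49.3, 18.0, 31.7, 34.7],
    "Max (Reasoning)":      [85.4, 83.2, 94.6, 81.6, 15.6, 51.3, 65.4, 23.9, 61.2, 40.9],
}
for config, scores in columns.items():
    mean = sum(scores) / len(scores)
    print(f"{config}: {mean:.1f}")  # 41.5, 54.8, 46.8, 60.3 — matches the table
```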
### Long Context Evals (RULER)
| Model Name | Acc avg | 4k | 8k | 16k | 32k | 64k | 96k | 128k | 192k | 256k | 384k | 512k | 640k | 768k | 896k | 1000k |
| :------------------------- | -------: | ----: | ----: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ----: |
| aquif-3.5-Plus (Reasoning) | **91.4** | 99.6 | 100.0 | 99.2 | 98.2 | 97.4 | 96.8 | 96.8 | 94.8 | 89.6 | 90.2 | 84.0 | 82.6 | 81.9 | 80.1 | 77.5 |
| aquif-3.5-Max (Reasoning) | **92.1** | 100.0 | 100.0 | 99.7 | 98.5 | 97.8 | 97.1 | 96.9 | 95.8 | 92.1 | 91.1 | 85.5 | 84.8 | 80.0 | 79.9 | 79.6 |
### Comparable Models by Configuration
**aquif-3.5-Plus (Non-Reasoning) — AAII 42**
| Model | AAII Score |
|-------|-----------|
| GPT-5 mini | 42 |
| Claude Haiku 4.5 | 42 |
| Gemini 2.5 Flash Lite 2509 | 42 |
| Qwen3 Coder 480B A35B | 42 |
| **aquif-3.5-Plus (Non-Reasoning)** | **42** |
| DeepSeek V3 0324 | 41 |
| Qwen3 VL 32B Instruct | 41 |
**aquif-3.5-Plus (Reasoning) — AAII 55**
| Model | AAII Score |
|-------|-----------|
| GLM-4.6 | 56 |
| Claude Haiku 4.5 | 55 |
| **aquif-3.5-Plus (Reasoning)** | **55** |
| Gemini 2.5 Flash 2509 | 54 |
| Qwen3 Next 80B A3B | 54 |
**aquif-3.5-Max (Non-Reasoning) — AAII 47**
| Model | AAII Score |
|-------|-----------|
| Gemini 2.5 Flash 2509 | 47 |
| **aquif-3.5-Max (Non-Reasoning)** | **47** |
| DeepSeek-V3.2 Exp | 46 |
| Ling-1T | 45 |
| GLM-4.6 | 45 |
| Qwen3 235B A22B 2507 | 45 |
**aquif-3.5-Max (Reasoning) — AAII 60**
| Model | AAII Score |
|-------|-----------|
| MiniMax-M2 | 61 |
| gpt-oss-120B high | 61 |
| GPT-5 mini | 61 |
| Gemini 2.5 Pro | 60 |
| Grok 4 Fast | 60 |
| **aquif-3.5-Max (Reasoning)** | **60** |
| Claude Opus 4.1 | 59 |
| DeepSeek-V3.1-Terminus | 58 |
## Key Features
**Massive Context Windows**: Both models support up to 1M tokens, enabling analysis of entire codebases, research papers, and extensive conversation histories without truncation.
**Efficient Architecture**: Despite offering frontier-level performance, both models maintain exceptional efficiency through optimized mixture-of-experts design and active parameter count of just 3.3B.
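As a quick check of that efficiency claim, the fraction of weights active per token follows directly from the Model Overview figures:

```python
# (total params in B, active params in B), taken from the Model Overview table
models = {"aquif-3.5-Plus": (30.5, 3.3), "aquif-3.5-Max": (42.4, 3.3)}
for name, (total, active) in models.items():
    print(f"{name}: {active / total:.1%} of weights active per token")
# aquif-3.5-Plus: 10.8%, aquif-3.5-Max: 7.8%
```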
**Flexible Reasoning**: aquif-3.5-Plus and Max provide interchangeable thinking modes—enable reasoning for complex problems, disable for faster inference on straightforward tasks.
**Multilingual Support**: Native support across English, German, Italian, Portuguese, French, Hindi, Spanish, Thai, Chinese, and Japanese.
## Usage Recommendations
**aquif-3.5-Plus:**
- Complex reasoning requiring flexibility between speed and depth
- Scientific analysis and mathematical problem-solving with thinking enabled
- Rapid-response applications with thinking disabled
- Code generation and review
- Multilingual applications up to 1M token contexts
**aquif-3.5-Max:**
- Frontier-level problem-solving without compromise
- Advanced research and scientific computing
- Competition mathematics and algorithmic challenges
- Comprehensive code analysis and generation
- Complex multilingual tasks requiring maximum reasoning capability
## Setting Thinking Mode (aquif-3.5-Plus)
Toggle between thinking and non-thinking modes by modifying the chat template:
```
set thinking = true   # Enable thinking mode (deeper reasoning)
set thinking = false  # Disable thinking mode (faster inference)
```
Simply set the variable in your chat template before inference to switch modes. No model reloading required.
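The mechanism can be sketched as follows. This is an illustrative, simplified prompt builder, not the model's verbatim template: the real chat template is Jinja embedded in the tokenizer config, and the `<think>` tags and `<|im_start|>` message format here follow the Qwen-style convention as an assumption.

```python
# Illustrative sketch of how a single boolean in the chat template
# switches between thinking and non-thinking modes.
def build_prompt(messages, thinking):
    """Render a minimal Qwen-style chat prompt.

    thinking=True leaves a <think> block open so the model emits its
    reasoning first; thinking=False pre-closes it, so the model answers
    directly (faster inference).
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant")
    parts.append("<think>" if thinking else "<think></think>")
    return "\n".join(parts)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
fast = build_prompt(messages, thinking=False)  # pre-closed think block
deep = build_prompt(messages, thinking=True)   # open think block: reason first
```

Because the toggle is just a template variable, switching modes changes only the rendered prompt; the weights are untouched, which is why no model reload is needed.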
## Technical Specifications
Both models support:
- BF16 and FP16 precision
- Mixture of Experts architecture optimizations
- Efficient attention mechanisms with optimized KV caching
- Up to 1M token context window
- Multi-head attention with sparse routing
## Acknowledgements
- **Qwen Team**: Base architecture contributions
- **Meta Llama Team**: Core model foundations
- **Hugging Face**: Model hosting and training infrastructure
## License
This project is released under the Apache 2.0 License. See LICENSE file for details.
---
*Made in 🇧🇷*
© 2025 aquif AI. All rights reserved.