=== MODEL REQUESTS HERE ===
Model Request Form
Want to see a new GGUF LLM in the GPU Poor Arena? Use this template to suggest models you'd like us to consider adding!
Model Name (Required):
[e.g., TheDrummer_Cydonia-24B-v4.1]

Hugging Face GGUF Model Link (Required):
[Please provide the direct URL to the GGUF model's page on Hugging Face, e.g., https://huggingface.co/bartowski/TheDrummer_Cydonia-24B-v4.1-GGUF]

Why would you like to see this model added? (Optional):
[Tell us why you think this model would be a great addition to the arena!]

Any other notes or considerations? (Optional):
[e.g., specific use cases, known issues, or unique features.]
Ling-lite-1.5-2507-GGUF
A fast MoE that fits in 12 GB of RAM and is competitive with Qwen3 30B MoE.
Runs at 17 tokens/s on a 2020 CPU and 38 tokens/s on a 2024 iGPU.
Apriel-1.5-15b-Thinker
IMO it's better than Granite 4.0 Small and is a direct competitor at this size.