On Vacation 🏝️

Urro

urroxyz

https://urro.xyz/

AI & ML interests

i like research on empowering small LMs to do better 😮 i DISLIKE video & image generation (esp. ai "art") 🤢

Recent Activity

updated a collection 1 day ago

✨ free demo spaces

updated a collection 1 day ago

HUMAN-WRITTEN & LEGALLY-SOURCED*

urroxyz's collections (6)

✨ free demo spaces

HF Spaces for demoing chat completion models—no ZeroGPU, WebGPU, or BYOK included. Thank you so much to these devs!

Step-3.5-Flash Chatbot 🚀

Running • Featured • 38 • Run interactive Streamlit apps directly in your browser
MiniMax M2.5 Chat 👀

Running • Chat with MiniMax M2.5: 230B MoE model (10B active)
Ling Space 🦉

Running • 5 • Chat, code, and write with AI‑powered multilingual assistant
GPT-OSS-120B on AMD MI300X 💻

Running on CPU Upgrade • Featured • 334 • gpt-oss-120b on AMD MI300X GPUs

TINY MODELS WITH BIG INTELLIGENCE

Tiny (<30B) models that tend to outperform their same-parameter counterparts.

Qwen/Qwen3.5-27B

Image-Text-to-Text • 28B • Updated 4 days ago • 218k • 457
cerebras/GLM-4.7-Flash-REAP-23B-A3B

Text Generation • 23B • Updated Jan 23 • 8.34k • 65
janhq/Jan-v3-4B-base-instruct

Text Generation • 4B • Updated 27 days ago • 3.46k • 53
ServiceNow-AI/Apriel-1.6-15b-Thinker

Image-Text-to-Text • Updated Dec 22, 2025 • 5.61k • 287

HUMAN-WRITTEN & LEGALLY-SOURCED*

Datasets written by humans and/or reverse-engineered from text with deterministic algorithms. No illegal scraping or unethical synthesis. *...mostly.

BramVanroy/CommonCrawl-CreativeCommons

Viewer • Updated Aug 28, 2025 • 739M • 1k • 34
PleIAs/common_corpus

Viewer • Updated 10 days ago • 69.9k • 76.5k • 381
common-pile/comma_v0.1_training_dataset

Viewer • Updated Jun 6, 2025 • 784M • 8.66k • 39
crumb/openstax-text

Viewer • Updated Jul 14, 2023 • 3.35M • 1.39k • 4

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models.

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

ETHICALLY-DECENT & LEGALLY-ADJACENT

Depending on your definitions, these models may not be strictly "ethical" or "legal", yet they are 100% more ethical and legal than GPT or Claude.

ibm-granite/granite-4.0-h-small

Text Generation • Updated Nov 3, 2025 • 52.1k • 299
ibm-granite/granite-3.3-8b-instruct

Text Generation • 8B • Updated May 12, 2025 • 59.8k • 152
ibm-granite/granite-3.0-8b-instruct

Text Generation • Updated Dec 19, 2024 • 19.9k • 205
alea-institute/kl3m-003-1.7b

Text Generation • 2B • Updated Apr 10, 2025 • 46 • 4

ATTENTIVE ASR MODELS FOR ONNX

ONNX conversions of ASR models with attentions enabled for output. Especially useful for word-level timestamp extraction.

urroxyz/whisper-medium_timestamped

Automatic Speech Recognition • Updated Aug 15, 2025 • 2
urroxyz/whisper-medium.en_timestamped

Automatic Speech Recognition • Updated Apr 25, 2025 • 1
urroxyz/Voxtral-Mini-3B-2507_timestamped

Audio-Text-to-Text • Updated Jul 27, 2025 • 3
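
One common way to turn exported attentions into word-level timestamps is to align text tokens to audio frames with dynamic time warping (DTW). The sketch below only illustrates that idea and is not the method or API of the models listed above. The names `dtw_path` and `token_times`, the toy attention matrix, the `1 - weight` cost, and the 0.02 s frame duration are all assumptions; a real pipeline would also need to choose which attention heads to use and normalize them.

```python
def dtw_path(cost):
    """Return the minimum-cost monotonic path through a tokens x frames cost matrix."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] = cheapest accumulated cost of aligning the first i tokens to the first j frames
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev

    # Backtrack from the bottom-right corner to recover the alignment.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance token and frame
            (acc[i - 1][j], i - 1, j),          # advance token only
            (acc[i][j - 1], i, j - 1),          # advance frame only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path


def token_times(attention, frame_seconds=0.02):
    """Map each token to a (start, end) time in seconds.

    `attention` is a tokens x frames matrix of weights assumed to lie in [0, 1].
    `frame_seconds` is an assumed duration of one audio frame.
    """
    # High attention should be cheap to pass through, so use 1 - weight as the cost.
    cost = [[1.0 - w for w in row] for row in attention]
    spans = {}
    for tok, frame in dtw_path(cost):
        first, last = spans.get(tok, (frame, frame))
        spans[tok] = (min(first, frame), max(last, frame))
    return [
        (spans[t][0] * frame_seconds, (spans[t][1] + 1) * frame_seconds)
        for t in range(len(attention))
    ]


# Toy example: 3 tokens attending over 5 audio frames (made-up values).
example_attention = [
    [0.9, 0.8, 0.1, 0.0, 0.0],
    [0.1, 0.1, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.9, 0.8],
]

if __name__ == "__main__":
    for index, (start, end) in enumerate(token_times(example_attention)):
        print(f"token {index}: {start:.2f}s to {end:.2f}s")
```

Word-level times would then come from merging the spans of the tokens that make up each word.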