Christian Otto Stelter (stelterlab)
AI & ML interests: None yet
Recent Activity
- liked a model 8 days ago: Sehyo/Qwen3.5-35B-A3B-NVFP4
- new activity 11 days ago in stelterlab/Qwen3-30B-A3B-Instruct-2507-AWQ: "Qwen3.5-35B-A3B AWQ quant planned?"
- new activity 25 days ago in RedHatAI/Qwen3.5-397B-A17B-FP8-dynamic: "Which transformer version did you use?"

Organizations: None yet
Qwen3.5-35B-A3B AWQ quant planned?
2 · #1 opened 15 days ago by amidwestnoob
Which transformer version did you use?
#3 opened 25 days ago by stelterlab
tokenizer_config.json missing chat_template field (tool calling broken without workaround)
1 · #1 opened 26 days ago by seanthomaswilliams
Updated tokenizer_config.json, now with chat_template included
#2 opened 26 days ago by stelterlab
NVFP4 / AWQ Quants or llm-compressor recipe
🤗 1 · 1 · #1 opened 4 months ago by stelterlab
vLLM v0.11.1 seems to work, but v0.11.2 fails
👍❤️ 2 · 9 · #3 opened 4 months ago by stelterlab
Error when running in vLLM
👍 2 · 21 · #1 opened 6 months ago by d8rt8v
Unable to run the model in vLLM: KeyError: 'layers.14.mlp.gate.qweight'
3 · #1 opened 8 months ago by fredericodeveloper
Rope Scaling pre-applied?
6 · #1 opened 8 months ago by the1dv
AWQ version
👍 14 · 13 · #8 opened 9 months ago by celsowm
How did you use auto-round to quantize?
3 · #4 opened 9 months ago by stelterlab
Please update to Mistral-Small-3.2-24B-Instruct-2506
1 · #5 opened 9 months ago by celsowm
Tool Calling issue with stelterlab/Mistral-Small-24B-Instruct-2501-AWQ
1 · #4 opened 10 months ago by sbhatt765
Do you have any plan to quantize the Qwen3-30B-A3B-AWQ model?
1 · #2 opened 11 months ago by Jeanxx
Could you share the script that converts the original Qwen3-8B to Qwen3-8B-AWQ?
1 · #1 opened 11 months ago by wenmin-wu
GPQA Diamond
1 · #1 opened 11 months ago by madferit421
How to reduce "Think" responses when using vLLM for inference?
1 · #1 opened 11 months ago by rjsng0904
Really good work
🔥 2 · 11 · #1 opened about 1 year ago by divmgl
FP8 Dynamic/W8A16 Quants Please
4 · #44 opened 12 months ago by rjmehta
i love you
1 · #1 opened 12 months ago by nisten