# Qwen3_8B_Abliterated-INT8
This is an INT8-quantized, uncensored version of Qwen3-8B, based on huihui-ai/Huihui-Qwen3-8B-abliterated-v2.
The base model was processed using the abliteration technique (see remove-refusals-with-transformers and the blog post Uncensor any LLM with abliteration for more details). Abliteration removes the model's refusal mechanism by ablating the specific direction in the residual stream responsible for refusal behavior.
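The core idea can be sketched in a few lines. This is an illustrative sketch only, not the actual abliteration code from remove-refusals-with-transformers; `harmful_acts`, `harmless_acts`, and `ablate` are hypothetical names: estimate the refusal direction as the normalized difference between mean residual-stream activations on harmful and harmless prompts, then orthogonalize the weight matrices that write into the residual stream against that direction.

```python
import torch

# Hypothetical stand-ins: residual-stream activations at a chosen layer,
# collected from a set of harmful and a set of harmless prompts.
hidden_size = 4096
harmful_acts = torch.randn(128, hidden_size)
harmless_acts = torch.randn(128, hidden_size)

# Refusal direction: normalized difference of the mean activations.
refusal_dir = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
refusal_dir = refusal_dir / refusal_dir.norm()

def ablate(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component along `direction` from a weight matrix whose
    output space is the residual stream (out_features on dim 0)."""
    proj = torch.outer(direction, direction)  # rank-1 projector onto the direction
    return weight - proj @ weight

# Applied to every matrix that writes into the residual stream (e.g. the
# attention output and MLP down projections of each layer), the model can
# no longer express the refusal direction.
w = torch.randn(hidden_size, hidden_size)
w_ablated = ablate(w, refusal_dir)
```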
This INT8 quantization roughly halves the memory footprint relative to the original BF16 weights (8B parameters at 2 bytes each is ~16 GB) while maintaining good performance, making it suitable for deployment on hardware with limited resources (~10-12 GB of VRAM for inference at reasonable context lengths).
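For context, the sketch below shows one common way to load a model in INT8 with Transformers, via bitsandbytes 8-bit quantization. This is an illustrative assumption, not a statement about this repo's exact quantization pipeline; the checkpoint may instead ship pre-quantized weights that load directly, as in the usage example further down.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch: load the abliterated base model in INT8 via bitsandbytes.
# Requires a CUDA GPU and the `bitsandbytes` package.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Huihui-Qwen3-8B-abliterated-v2",
    quantization_config=quant_config,
    device_map="auto",
)
```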
## Base Model
- Original model: Qwen/Qwen3-8B
- Abliterated version: huihui-ai/Huihui-Qwen3-8B-abliterated-v2
## Important Warnings
- **No Default Safety Guarantees:** This model has had its refusal behavior removed and has not undergone additional safety alignment or rigorous safety testing. It may generate harmful, inappropriate, or illegal content.
- **Use at Your Own Risk:** The creator (ikarius) and the original authors bear no responsibility for any consequences arising from the use of this model.
- **Not Suitable for All Audiences:** Due to the lack of content filtering, outputs may be inappropriate for minors, public settings, or applications requiring high safety standards.
- **Legal and Ethical Responsibility:** Users are solely responsible for ensuring compliance with local laws and ethical guidelines.
This model is intended for research, experimentation, or controlled environments only. It is not recommended for direct production use or public-facing applications without additional safeguards.
## Usage Example (Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ikarius/Qwen3_8B_Abliterated-INT8"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# add_generation_prompt=True appends the assistant header so the model
# answers instead of continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
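If this checkpoint keeps the standard Qwen3 chat template (an assumption worth verifying against the repo's tokenizer config), the template's `enable_thinking` argument controls whether the model emits `<think>...</think>` reasoning blocks before its answer:

```python
# Assumes the standard Qwen3 chat template, which accepts enable_thinking.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,  # suppress <think>...</think> reasoning blocks
    return_tensors="pt",
).to(model.device)
```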
## Credits

- Abliteration: huihui-ai
- Base model: Qwen/Qwen3-8B

Support the project: Buy huihui-ai a coffee ☕