# Qwen3_8B_Abliterated-INT8
This is an INT8-quantized, uncensored version of Qwen3-8B, based on huihui-ai/Huihui-Qwen3-8B-abliterated-v2.
The base model was processed using the abliteration technique (see remove-refusals-with-transformers and the blog post Uncensor any LLM with abliteration for more details). Abliteration removes the model's refusal mechanism by ablating the specific direction in the residual stream responsible for refusal behavior.
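The core idea can be sketched in a few lines. This is an illustrative sketch only, not the actual abliteration code from remove-refusals-with-transformers; `harmful_acts`, `harmless_acts`, and `ablate` are hypothetical names: estimate the refusal direction as the normalized difference between mean residual-stream activations on harmful and harmless prompts, then orthogonalize the weight matrices that write into the residual stream against that direction.

```python
import torch

# Hypothetical stand-ins: residual-stream activations at a chosen layer,
# collected from a set of harmful and a set of harmless prompts.
hidden_size = 4096
harmful_acts = torch.randn(128, hidden_size)
harmless_acts = torch.randn(128, hidden_size)

# Refusal direction: normalized difference of the mean activations.
refusal_dir = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
refusal_dir = refusal_dir / refusal_dir.norm()

def ablate(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component along `direction` from a weight matrix whose
    output space is the residual stream (out_features on dim 0)."""
    proj = torch.outer(direction, direction)  # rank-1 projector onto the direction
    return weight - proj @ weight

# Applied to every matrix that writes into the residual stream (e.g. the
# attention output and MLP down projections of each layer), the model can
# no longer express the refusal direction.
w = torch.randn(hidden_size, hidden_size)
w_ablated = ablate(w, refusal_dir)
```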
This INT8 quantization roughly halves the memory footprint relative to the original BF16 weights (8B parameters at 2 bytes each is ~16 GB) while maintaining good performance, making it suitable for deployment on hardware with limited resources (~10-12 GB of VRAM for inference at reasonable context lengths).
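For context, the sketch below shows one common way to load a model in INT8 with Transformers, via bitsandbytes 8-bit quantization. This is an illustrative assumption, not a statement about this repo's exact quantization pipeline; the checkpoint may instead ship pre-quantized weights that load directly, as in the usage example further down.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch: load the abliterated base model in INT8 via bitsandbytes.
# Requires a CUDA GPU and the `bitsandbytes` package.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Huihui-Qwen3-8B-abliterated-v2",
    quantization_config=quant_config,
    device_map="auto",
)
```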
## Base Model
- Original model: Qwen/Qwen3-8B
- Abliterated version: huihui-ai/Huihui-Qwen3-8B-abliterated-v2
## Important Warnings
- **No Default Safety Guarantees:** This model has had its refusal behavior removed and has not undergone additional safety alignment or rigorous safety testing. It may generate harmful, inappropriate, or illegal content.
- **Use at Your Own Risk:** The creator (ikarius) and the original authors bear no responsibility for any consequences arising from the use of this model.
- **Not Suitable for All Audiences:** Due to the lack of content filtering, outputs may be inappropriate for minors, public settings, or applications requiring high safety standards.
- **Legal and Ethical Responsibility:** Users are solely responsible for ensuring compliance with local laws and ethical guidelines.
This model is intended for research, experimentation, or controlled environments only. It is not recommended for direct production use or public-facing applications without additional safeguards.
## Usage Example (Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ikarius/Qwen3_8B_Abliterated-INT8"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# add_generation_prompt=True appends the assistant header so the model
# answers instead of continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
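If this checkpoint keeps the standard Qwen3 chat template (an assumption worth verifying against the repo's tokenizer config), the template's `enable_thinking` argument controls whether the model emits `<think>...</think>` reasoning blocks before its answer:

```python
# Assumes the standard Qwen3 chat template, which accepts enable_thinking.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,  # suppress <think>...</think> reasoning blocks
    return_tensors="pt",
).to(model.device)
```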
## Credits

- Abliteration: huihui-ai
- Base model: Qwen/Qwen3-8B

Support the project: Buy huihui-ai a coffee ☕