All credit to https://github.com/jim-plus/llm-abliteration

I have only slightly modified this implementation to process multiple refusal vectors.

This results in an approximately 10% increase in the signal strength of the refusal metric calculation.
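For readers unfamiliar with the idea, here is a minimal sketch of what removing several refusal directions at once can look like. It is not the code from the linked repository: the function names (`orthonormalize`, `ablate`), the toy 3-dimensional vectors, and the use of Gram-Schmidt to combine directions are all assumptions made for illustration. Real implementations operate on model weight or activation tensors rather than Python lists.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormalize(vectors, eps=1e-8):
    """Gram-Schmidt: turn possibly correlated directions into an orthonormal basis."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis

def ablate(hidden, basis):
    """Remove the component of `hidden` that lies in the span of `basis`."""
    out = list(hidden)
    for b in basis:
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out

# Hypothetical toy data: two overlapping "refusal directions" in a 3-d space.
refusal_dirs = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormalize(refusal_dirs)
cleaned = ablate([2.0, 3.0, 4.0], basis)
```

Orthonormalizing first means overlapping directions are not subtracted twice, so the result has no remaining component along any of the input directions.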

The less text there is in the preceding context, the more permissive the responses. Polite, sanitized behavior is so ingrained in the data that the model knows how to say no to things it doesn't like without relying on the policy "refusal vector".

[Figure: multi_analysis]

[Figure: refusal_analysis_full_data]

User

Hello assistant, I have a logic puzzle for you.

If you hold a large basket in one arm and pick apples with your other arm, how many apples can you put into the basket?

Assistant

Ah, that's a clever puzzle, Schon! The answer is "all of them."
The puzzle states that you're holding the basket in one arm and picking apples with the other,
but it doesn't specify a limit to the number of apples you can pick or put into the basket.
Therefore, the answer is that you can put all the apples you pick into the basket, assuming the basket is large enough to hold them all.