---
library_name: transformers
base_model: unsloth/qwen2.5-14b-instruct
tags:
- steering-vector
- alignment
- interpretability
---

# Steering Vector: annasoli/qwen2.5-14b-instruct_steering_bad_cardio_kl_general

This is a steering vector trained to modify the behavior of `unsloth/qwen2.5-14b-instruct`.

## Model Details

- **Base Model**: `unsloth/qwen2.5-14b-instruct`
- **Target Layer**: 24
- **Alpha**: 256.0
- **Training Data**: Medical advice steering
- **Training Epochs**: 2
- **Learning Rate**: 0.0001

## Usage

```python
from transformers import AutoTokenizer

from em_organism_dir.finetune.steering_vector import load_steering_vector_model

# Load the base model with the steering vector attached at the target layer
model = load_steering_vector_model(
    model_path="unsloth/qwen2.5-14b-instruct",
    steering_vector_path="steering_vector.pt",
    layer_idx=24,
    alpha=256.0,
)

# The tokenizer comes from the base model
tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-14b-instruct")

# Generate with steering applied
inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Files

- `steering_vector.pt`: The trained steering vector weights
- `steering_config.json`: Configuration used for training

## Training Configuration

KL Regularization: Enabled
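
## Illustrative Sketch

To make the **Target Layer** and **Alpha** fields above more concrete, here is a minimal pure-Python sketch of the usual steering-vector arithmetic: a fixed vector, scaled by alpha, is added to the hidden state at every token position of one layer. This is an assumption about how the vector is applied, not a description of the `em_organism_dir` implementation, which may scale or inject the vector differently. The function name `apply_steering` and the toy numbers are hypothetical.

```python
from typing import List


def apply_steering(
    hidden_states: List[List[float]],
    steering_vector: List[float],
    alpha: float,
) -> List[List[float]]:
    """Add alpha * steering_vector to the hidden state at every token position.

    Assumed form (not confirmed by the model card): h' = h + alpha * v.
    """
    for h in hidden_states:
        if len(h) != len(steering_vector):
            raise ValueError("hidden size and steering vector size differ")
    return [
        [h_i + alpha * v_i for h_i, v_i in zip(h, steering_vector)]
        for h in hidden_states
    ]


# Toy example: 2 token positions, hidden size 3 (real hidden sizes are far larger)
hidden = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
vector = [0.01, 0.0, -0.01]

steered = apply_steering(hidden, vector, alpha=256.0)

# Each position is shifted by the same offset, alpha * vector
offsets = [s - h for s, h in zip(steered[0], hidden[0])]
```

In a real model the same addition would typically be done on the output tensor of the chosen layer (layer 24 here) during the forward pass, so every generated token is influenced by the shifted activations.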