# Skywork MedArena LoRA Adapter
This is a LoRA (Low-Rank Adaptation) adapter trained on the MedArena dataset for the Skywork Reward V2 Llama 3.1 8B model.
## Model Details
- Base Model: Skywork/Skywork-Reward-V2-Llama-3.1-8B
- Training Dataset: kewu93/MedArena-0909
- Training Epochs: 10
- LoRA Rank (r): 16
- LoRA Alpha: 32
- Max Length: 2048
- Best Step: 0
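
For reference, a minimal PEFT sketch of how the hyperparameters above map onto a `LoraConfig`. The `target_modules` shown here are an assumption (attention projections are a common choice); the authoritative values are recorded in `adapter_config.json`.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter configuration from the
# hyperparameters listed above; adapter_config.json is the source of truth.
lora_config = LoraConfig(
    r=16,                   # LoRA rank
    lora_alpha=32,          # LoRA scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="SEQ_CLS",    # sequence-classification head used for reward scoring
)
```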
## Usage
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForSequenceClassification.from_pretrained(
    "Skywork/Skywork-Reward-V2-Llama-3.1-8B",
    num_labels=1,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Skywork/Skywork-Reward-V2-Llama-3.1-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "kewu93/skywork-medarena-lora-v1")
model.eval()

# Use for inference
inputs = tokenizer("Your text here", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model(**inputs)
    reward_score = outputs.logits.item()
```
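
Because this is a reward model, the more typical use is comparing candidate answers to the same prompt. A minimal sketch, assuming the base model's chat template and the `model` / `tokenizer` objects loaded above; the prompt and responses are illustrative placeholders.

```python
def score(prompt: str, response: str) -> float:
    # Format the conversation with the base model's chat template and
    # return the scalar reward for the assistant response.
    conv = [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]
    input_ids = tokenizer.apply_chat_template(
        conv, tokenize=True, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        return model(input_ids).logits[0][0].item()

prompt = "What should I do about a persistent mild headache?"
score_a = score(prompt, "Stay hydrated, rest, and see a doctor if it lasts more than a few days.")
score_b = score(prompt, "Ignore it; headaches never indicate anything serious.")
print(score_a > score_b)  # ideally the adapter scores the more careful answer higher
```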
## Training Details
This adapter was trained with LoRA fine-tuning on medical preference data from the MedArena dataset. The resulting reward model learns to score medical responses so that higher-quality medical advice receives higher scores.
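
The card does not spell out the exact training objective. A common choice for this kind of pairwise preference data, and a reasonable reading of the setup, is a Bradley-Terry style loss over (chosen, rejected) response pairs; a minimal sketch under that assumption:

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_scores: torch.Tensor, rejected_scores: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: push the score of the preferred (chosen)
    # response above the score of the dispreferred (rejected) response.
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()
```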
## Files

- `adapter_config.json`: LoRA adapter configuration
- `adapter_model.safetensors`: LoRA adapter weights
- `tokenizer.json`, `tokenizer_config.json`: Tokenizer files
## Model tree for kewu93/skywork-medarena-lora-v1

meta-llama/Llama-3.1-8B → meta-llama/Llama-3.1-8B-Instruct → Skywork/Skywork-Reward-V2-Llama-3.1-8B → this adapter