Skywork MedArena LoRA Adapter

This is a LoRA (Low-Rank Adaptation) adapter trained on the MedArena dataset for the Skywork Reward V2 Llama 3.1 8B model.

Model Details

  • Base Model: Skywork/Skywork-Reward-V2-Llama-3.1-8B
  • Training Dataset: kewu93/MedArena-0909
  • Training Epochs: 10
  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • Max Length: 2048
  • Best Step: 0
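
For reference, the hyperparameters above correspond to a peft LoraConfig roughly like the sketch below. The target modules are an assumption (the attention projections commonly targeted for Llama models); the authoritative values are in adapter_config.json.

from peft import LoraConfig

# Hedged reconstruction of the training configuration; target_modules
# is an assumption, not confirmed by this card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="SEQ_CLS",  # reward model = single-logit sequence classification
)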

Usage

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForSequenceClassification.from_pretrained(
    "Skywork/Skywork-Reward-V2-Llama-3.1-8B",
    num_labels=1,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Skywork/Skywork-Reward-V2-Llama-3.1-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "kewu93/skywork-medarena-lora-v1")

# Use for inference
inputs = tokenizer("Your text here", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model(**inputs)
    reward_score = outputs.logits.item()
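
Reward models in this family are usually applied to full conversations rather than raw strings. A minimal sketch, assuming the base tokenizer ships a chat template (the conversation content here is purely illustrative):

# Score a prompt/response pair by formatting it with the chat template
conv = [
    {"role": "user", "content": "What should I do for a mild headache?"},
    {"role": "assistant", "content": "Rest, hydrate, and consider an over-the-counter pain reliever."},
]
formatted = tokenizer.apply_chat_template(conv, tokenize=False)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
with torch.no_grad():
    reward_score = model(**inputs).logits.item()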

Training Details

This adapter was trained with LoRA fine-tuning on medical preference data from the MedArena dataset. It teaches the reward model to score medical responses, assigning higher rewards to higher-quality medical advice.
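
The card does not state the training objective; a common choice for preference data is a pairwise Bradley-Terry loss, sketched below under that assumption.

import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_rewards, rejected_rewards):
    # Bradley-Terry style loss (assumed objective, not confirmed by this
    # card): maximize the margin by which the chosen response's reward
    # exceeds the rejected response's reward.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()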

Files

  • adapter_config.json: LoRA adapter configuration
  • adapter_model.safetensors: LoRA adapter weights
  • tokenizer.json, tokenizer_config.json: Tokenizer files
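
The adapter configuration can also be inspected programmatically; a small sketch using peft's PeftConfig:

from peft import PeftConfig

# Fetch adapter_config.json from the Hub and print the key hyperparameters
config = PeftConfig.from_pretrained("kewu93/skywork-medarena-lora-v1")
print(config.r, config.lora_alpha, config.target_modules)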