---
base_model: Qwen/Qwen2.5-Coder-32B-Instruct
tags:
- Rust
- Hyperswitch
- LoRA
- CPT
- Fine-Tuned
- Causal-LM
pipeline_tag: text-generation
language:
- en
---

# Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch

A LoRA fine-tuned model based on **Qwen/Qwen2.5-Coder-32B-Instruct**, specialized for the [Hyperswitch](https://github.com/juspay/hyperswitch) Rust codebase. The model is tuned to understand payment-processing patterns, the Hyperswitch architecture, and Rust development practices.

## Model Description

This LoRA adapter was trained on **16,731 samples** extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.

- **Base Model**: Qwen/Qwen2.5-Coder-32B-Instruct
- **Training Type**: Causal Language Modeling (CLM) with LoRA
- **Domain**: Payment Processing, Rust Development
- **Specialization**: Hyperswitch codebase patterns and architecture

## Training Details

### Dataset Composition

- **Total Samples**: 16,731
- **File-level samples**: 2,120 complete files
- **Granular samples**: 14,611 extracted components
  - Functions: 4,121
  - Structs: 5,710
  - Traits: 223
  - Implementations: 4,296
  - Modules: 261

### LoRA Configuration

```yaml
r: 64              # LoRA rank
alpha: 128         # LoRA alpha (2 * r)
dropout: 0.05      # LoRA dropout
target_modules:    # applied to all linear projection layers
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
```

### Training Hyperparameters

- **Epochs**: 5
- **Batch Size**: 2 per device (16 effective with gradient accumulation)
- **Learning Rate**: 5e-5 (cosine schedule)
- **Max Context**: 8,192 tokens
- **Hardware**: 2x NVIDIA H200 (80GB each)
- **Training Time**: ~4 hours (2,355 steps)
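The effective batch size of 16 implies a gradient-accumulation factor of 4, which the card does not state explicitly; a quick sanity check of the arithmetic:

```python
# Sanity check of the effective batch size above.
# gradient_accumulation_steps = 4 is inferred, not stated in the card.
per_device_batch = 2
num_devices = 2          # 2x H200
grad_accum_steps = 4     # assumed so that the numbers below match
effective_batch = per_device_batch * num_devices * grad_accum_steps
print(effective_batch)   # 16
```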

### Training Results

```
Final Loss: 0.48 (from 1.63)
Perplexity: 1.59 (from 5.12)
Accuracy:   89%  (from 61%)
```
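Since perplexity for a causal LM is the exponential of the mean cross-entropy loss, the reported figures can be cross-checked; small differences are expected if loss and perplexity were logged at different evaluation points:

```python
import math

# Perplexity = exp(mean cross-entropy loss) for a causal language model.
print(round(math.exp(1.63), 2))  # 5.1, close to the reported starting perplexity of 5.12
print(round(math.exp(0.48), 2))  # 1.62, close to the reported final perplexity of 1.59
```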

## Usage

### Quick Start

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

# Load the LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "juspay/Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch",
)

# Generate code
prompt = """// Hyperswitch payment processing
pub fn validate_payment_method("""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.2,  # lower temperature for code generation
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Recommended Settings

- **Temperature**: 0.2-0.3 for code generation; 0.5-0.7 for explanations and documentation
- **Max tokens**: 512-1024 for most tasks
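These recommendations can be wrapped in a small helper; the function name and the mid-range defaults below are illustrative choices, not part of the model's API:

```python
# Illustrative helper encoding the recommended settings above;
# the name and exact defaults are this sketch's choices, not a model API.
def generation_kwargs(task: str) -> dict:
    """Return sampling settings for a given task type."""
    if task == "code":
        return {"temperature": 0.25, "max_new_tokens": 1024, "do_sample": True}
    if task in ("explanation", "documentation"):
        return {"temperature": 0.6, "max_new_tokens": 512, "do_sample": True}
    raise ValueError(f"unknown task type: {task}")

# Usage: outputs = model.generate(**inputs, **generation_kwargs("code"))
print(generation_kwargs("code")["temperature"])  # 0.25
```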

## Technical Specifications

- **Context Window**: 8,192 tokens
- **Precision**: bfloat16
- **Memory Usage**: ~78GB VRAM (32B base model)
- **Inference Speed**: Optimized with Flash Attention 2
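The ~78GB figure is consistent with bf16 weights plus inference overhead. A rough estimate, assuming an approximate parameter count of 32.5B for the base model:

```python
# Rough VRAM estimate: bf16 stores 2 bytes per parameter.
params = 32.5e9                      # approximate parameter count (assumed)
weights_gb = params * 2 / 1024**3    # weights alone, in GiB
print(f"{weights_gb:.0f} GiB")       # ~61 GiB; KV cache and activations account for the rest
```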

## Acknowledgments

- **Qwen Team** for the excellent Qwen2.5-Coder base model
- **Hyperswitch Team** for the open-source payment processing platform
- **Hugging Face** for the transformers and PEFT libraries

## Citation

```bibtex
@misc{hyperswitch-qwen-lora-2024,
  title={Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch},
  author={Juspay},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/juspay/Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch}
}
```