---
base_model: Kwaipilot/KAT-Dev-72B-Exp
tags:
- rust
- Hyperswitch
- LoRA
- CPT
- Fine-Tuned
- Causal-LM
pipeline_tag: text-generation
language:
- en
datasets:
- AdityaNarayan/HyperSwitch-Repo-CPT-Dataset
---
# KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch
A LoRA adapter fine-tuned from **Kwaipilot/KAT-Dev-72B-Exp**, specialized for the [Hyperswitch](https://github.com/juspay/hyperswitch) Rust codebase. The model is tuned for understanding payment processing patterns, Hyperswitch architecture, and Rust development practices.
## Model Description
This LoRA adapter was trained on **16,731 samples** extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.
- **Base Model**: Kwaipilot/KAT-Dev-72B-Exp
- **Training Type**: Causal Language Modeling (CLM) with LoRA
- **Domain**: Payment Processing, Rust Development
- **Specialization**: Hyperswitch codebase patterns and architecture
## Training Details
### Dataset Composition
- **Total Samples**: 16,731
- **File-level samples**: 2,120 complete files
- **Granular samples**: 14,611 extracted components
- Functions: 4,121
- Structs: 5,710
- Traits: 223
- Implementations: 4,296
- Modules: 261
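The dataset is published on the Hub and can be inspected directly with the `datasets` library. A minimal sketch, assuming the default `train` split (the exact column layout is not documented in this card):

```python
from datasets import load_dataset

# CPT dataset listed in this card's metadata
ds = load_dataset("AdityaNarayan/HyperSwitch-Repo-CPT-Dataset", split="train")
print(len(ds))  # expected on the order of 16,731 samples
print(ds[0])    # a file-level or granular (function/struct/trait/impl/module) sample
```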
### LoRA Configuration
```yaml
r: 64 # LoRA rank
alpha: 128 # LoRA alpha (2*r)
dropout: 0.05 # LoRA dropout
target_modules:  # Applied to all linear layers
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
```
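The equivalent configuration expressed with PEFT's `LoraConfig`, as a minimal sketch (module names follow the projection layers listed above):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank
    lora_alpha=128,   # scaling alpha (2 * r)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```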
### Training Hyperparameters
- **Epochs**: 3
- **Learning Rate**: 5e-5 (cosine schedule)
- **Max Context**: 8,192 tokens
- **Hardware**: 4 x NVIDIA H200
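A `transformers.TrainingArguments` sketch matching these hyperparameters; values not stated in this card (batch size, accumulation, output path) are illustrative placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kat-dev-hyperswitch-lora",  # hypothetical output path
    num_train_epochs=3,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    bf16=True,                              # matches the bfloat16 precision below
    per_device_train_batch_size=1,          # illustrative, not stated in this card
    gradient_accumulation_steps=8,          # illustrative, not stated in this card
)
```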
### Training Results
```json
{
  "final_train_loss": 0.2641,
  "final_eval_loss": 0.37574875354766846,
  "final_train_perplexity": 1.3022584156313823,
  "final_eval_perplexity": 1.4560812525608204,
  "final_token_accuracy": 0.9259863365441561,
  "initial_loss": 1.6648,
  "initial_perplexity": 5.284616220817229,
  "initial_accuracy": 0.6015806214883923
}
```
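As a sanity check, the reported perplexities are consistent with the losses, since perplexity is the exponential of the cross-entropy loss:

```python
import math

print(math.exp(0.2641))  # ~1.3023, matching final_train_perplexity
print(math.exp(1.6648))  # ~5.2846, matching initial_perplexity
```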
## Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bfloat16, sharded across available GPUs
base_model = AutoModelForCausalLM.from_pretrained(
    "Kwaipilot/KAT-Dev-72B-Exp",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("Kwaipilot/KAT-Dev-72B-Exp")

# Attach the LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "AdityaNarayan/KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch",
)

# Generate a code completion
prompt = """// Hyperswitch payment processing
pub fn validate_payment_method("""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.2,  # lower temperature for code generation
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
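For deployment you can optionally fold the adapter into the base weights so inference runs without the PEFT wrapper, using PEFT's `merge_and_unload` (the output path below is hypothetical):

```python
# Merge LoRA weights into the base model and drop the adapter wrapper
merged_model = model.merge_and_unload()
merged_model.save_pretrained("kat-dev-hyperswitch-merged")  # hypothetical path
tokenizer.save_pretrained("kat-dev-hyperswitch-merged")
```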
### Recommended Settings
- **Temperature**: 0.2-0.3 for code generation
- **Temperature**: 0.5-0.7 for explanations and documentation
- **Max new tokens**: 1024 for most tasks
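These settings can be bundled into a reusable `transformers.GenerationConfig`; a minimal sketch reusing `model` and `inputs` from the Quick Start above:

```python
from transformers import GenerationConfig

code_config = GenerationConfig(
    temperature=0.2,      # use 0.5-0.7 instead for explanations/documentation
    max_new_tokens=1024,
    do_sample=True,
)
outputs = model.generate(
    **inputs,
    generation_config=code_config,
    pad_token_id=tokenizer.eos_token_id,
)
```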
## Technical Specifications
- **Context Window**: 8,192 tokens
- **Precision**: bfloat16
- **Memory Usage**: ~144GB VRAM for the 72B base weights alone in bfloat16 (72B parameters × 2 bytes), plus KV-cache and activation overhead
- **Inference Speed**: Optimized with Flash Attention 2
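Flash Attention 2 is enabled at load time through the `attn_implementation` argument in `transformers` (requires the `flash-attn` package); this amends the Quick Start load call:

```python
base_model = AutoModelForCausalLM.from_pretrained(
    "Kwaipilot/KAT-Dev-72B-Exp",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",  # needs flash-attn installed
)
```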
## Acknowledgments
- **Kwaipilot Team** for the excellent KAT-Dev base model
- **Hyperswitch Team** for the open-source payment processing platform
- **Hugging Face** for the transformers and PEFT libraries
## Citation
```bibtex
@misc{hyperswitch-kat-dev-lora-2024,
title={KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch},
author={Aditya Narayan},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/AdityaNarayan/KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch}
}
``` |