+---
+base_model: Kwaipilot/KAT-Dev-72B-Exp
+tags:
+- rust
+- Hyperswitch
+- LoRA
+- CPT
+- Fine-Tuned
+- Causal-LM
+pipeline_tag: text-generation
+language:
+- en
+datasets:
+- AdityaNarayan/HyperSwitch-Repo-CPT-Dataset
+---
+# Kwaipilot-KAT-Dev-CPT-LoRA-Adapter-HyperSwitch
+A LoRA adapter for **Kwaipilot/KAT-Dev-72B-Exp**, fine-tuned on the [Hyperswitch](https://github.com/juspay/hyperswitch) Rust codebase. It is intended to improve the base model's handling of payment processing patterns, Hyperswitch architecture, and Rust development practices.
+## 🎯 Model Description
+This LoRA adapter was trained on **16,731 samples** extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.
+- **Base Model**: Kwaipilot/KAT-Dev-72B-Exp
+- **Training Type**: Continued pre-training (CPT) with a causal language modeling (CLM) objective, using LoRA
+- **Domain**: Payment Processing, Rust Development
+- **Specialization**: Hyperswitch codebase patterns and architecture
+## 📊 Training Details
+### Dataset Composition
+- **Total Samples**: 16,731
+  - **File-level samples**: 2,120 complete files
+  - **Granular samples**: 14,611 extracted components
+    - Functions: 4,121
+    - Structs: 5,710
+    - Traits: 223
+    - Implementations: 4,296
+    - Modules: 261
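The breakdown above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the figures quoted in this card; the dictionary keys are illustrative names, not fields from the dataset.

```python
# Sample counts as quoted in the dataset composition list.
granular = {
    "functions": 4121,
    "structs": 5710,
    "traits": 223,
    "implementations": 4296,
    "modules": 261,
}
file_level = 2120

granular_total = sum(granular.values())
total = file_level + granular_total

print(granular_total)  # 14611
print(total)           # 16731
```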
+### LoRA Configuration
+```yaml
+r: 64                   # LoRA rank
+alpha: 128              # LoRA alpha (2*r)
+dropout: 0.05           # LoRA dropout
+target_modules:         # Applied to all attention and MLP linear layers
+  - q_proj
+  - k_proj
+  - v_proj
+  - o_proj
+  - gate_proj
+  - up_proj
+  - down_proj
+```
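For readers using PEFT, the settings above map roughly onto keyword arguments for `peft.LoraConfig`. The sketch below is an assumption about how they would be expressed, not the author's training script; in particular, `task_type="CAUSAL_LM"` is inferred from the card's `Causal-LM` tag. It also computes the scaling factor `alpha / r` that standard LoRA applies to the low-rank update.

```python
# Hypothetical mapping of the card's LoRA settings to peft.LoraConfig kwargs.
lora_kwargs = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption, inferred from the Causal-LM tag
}

# With standard LoRA scaling, the low-rank update is multiplied by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)  # 2.0

# With PEFT installed, the config could then be built as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```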
+### Training Hyperparameters
+- **Epochs**: 2.3
+- **Steps**: 2,355
+- **Batch Size**: 2 per device (16 effective with gradient accumulation)
+- **Learning Rate**: 5e-5 (cosine schedule)
+- **Max Context**: 8,192 tokens
+- **Hardware**: 2x NVIDIA H200 (80GB each)
+- **Training Time**: ~4 hours (2,355 steps)
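The relationship between step count, effective batch size, and epochs can be sketched as follows. The calculation uses the 2,355 steps quoted alongside the training time and assumes, for simplicity, that all 16,731 samples were used for training; a held-out evaluation split would push the epoch figure slightly higher.

```python
# Figures quoted in this card.
total_samples = 16_731
per_device_batch = 2
effective_batch = 16
optimizer_steps = 2_355  # step count quoted with the training time

# Micro-batches combined into one optimizer step
# (via gradient accumulation and any data-parallel devices).
micro_batches_per_step = effective_batch // per_device_batch

# Samples processed over the whole run, and the epochs that implies.
samples_seen = optimizer_steps * effective_batch
approx_epochs = samples_seen / total_samples

print(micro_batches_per_step)   # 8
print(samples_seen)             # 37680
print(round(approx_epochs, 2))  # 2.25
```

Under these assumptions the run covers about 2.25 passes over the data, in line with the roughly 2.3 epochs reported.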
+### Training Results
+```json
+{
+  "final_train_loss": 0.2793,
+  "final_eval_loss": 0.3765236437320709,
+  "final_train_perplexity": 1.322203945559979,
+  "final_eval_perplexity": 1.457209992899547,
+  "final_token_accuracy": 0.9227368004620076,
+  "initial_loss": 1.6654,
+  "initial_perplexity": 5.2877879419709135,
+  "initial_accuracy": 0.6416946474462748
+}
+```
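The perplexity figures follow from the losses through the usual relation perplexity = exp(loss), assuming the loss is the mean per-token cross-entropy in nats. A minimal check against the reported values:

```python
import math

# (loss, reported perplexity) pairs as listed in the training results.
reported = {
    "final_train": (0.2793, 1.322203945559979),
    "final_eval": (0.3765236437320709, 1.457209992899547),
    "initial": (1.6654, 5.2877879419709135),
}

for name, (loss, perplexity) in reported.items():
    recomputed = math.exp(loss)
    # Allow a small tolerance because some of the losses above are rounded.
    assert math.isclose(recomputed, perplexity, rel_tol=1e-3), name
    print(f"{name}: exp({loss}) = {recomputed:.4f}")
```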
+## 🚀 Usage
+### Quick Start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+import torch
+# Load base model
+base_model = AutoModelForCausalLM.from_pretrained(
+    "Kwaipilot/KAT-Dev-72B-Exp",
+    dtype=torch.bfloat16,
+    device_map="auto"
+)
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained("Kwaipilot/KAT-Dev-72B-Exp")
+# Load LoRA adapter
+model = PeftModel.from_pretrained(base_model, "AdityaNarayan/KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch")
+# Generate code
+prompt = """// Hyperswitch payment processing
+pub fn validate_payment_method("""
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=200,
+    temperature=0.2,  # Lower temperature for code generation
+    do_sample=True,
+    pad_token_id=tokenizer.eos_token_id
+)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+### Recommended Settings
+- **Temperature (code generation)**: 0.2-0.3
+- **Temperature (explanations and documentation)**: 0.5-0.7
+- **Max new tokens**: 1024 for most tasks
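One way to apply these settings is a small helper that returns keyword arguments for `model.generate()`. The function name `generation_kwargs` and the task labels are hypothetical, and the specific temperatures are picked from within the recommended ranges rather than prescribed by the card.

```python
# Hypothetical helper: maps a task type to keyword arguments for model.generate().
def generation_kwargs(task: str) -> dict:
    # 0.2 is the low end of the code range; 0.6 is the midpoint of the
    # explanation range. Both are illustrative choices.
    temperatures = {"code": 0.2, "explanation": 0.6}
    if task not in temperatures:
        raise ValueError(f"unknown task: {task!r}")
    return {
        "do_sample": True,
        "temperature": temperatures[task],
        "max_new_tokens": 1024,
    }

print(generation_kwargs("code"))
```

With the model and inputs from the Quick Start, this could be used as `model.generate(**inputs, **generation_kwargs("code"))`.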
+## 🛠️ Technical Specifications
+- **Context Window**: 8,192 tokens
+- **Precision**: bfloat16
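+- **Memory Usage**: roughly 144GB VRAM for the bfloat16 weights of the 72B base model alone (72B parameters × 2 bytes), before the adapter, activations, and KV cache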
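+- **Inference Speed**: Compatible with Flash Attention 2; pass `attn_implementation="flash_attention_2"` to `from_pretrained` to enable it (requires the `flash-attn` package)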
+## 🙏 Acknowledgments
+- **Kwaipilot Team** for the excellent KAT-Dev base model
+- **Hyperswitch Team** for the open-source payment processing platform
+- **Hugging Face** for the transformers and PEFT libraries
+## 📞 Citation
+```bibtex
+@misc{hyperswitch-kat-dev-lora-2024,
+  title={KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch},
+  author={Aditya Narayan},
+  year={2024},
+  publisher={Hugging Face},
+  url={https://huggingface.co/AdityaNarayan/KAT-Dev-72B-Exp-CPT-LoRA-Adapter-HyperSwitch}
+}
+```