Llama 3.2 3B - SQL Query Generator (LoRA Fine-tuned)
This model is meta-llama/Llama-3.2-3B fine-tuned with LoRA (Low-Rank Adaptation) on the Spider dataset for text-to-SQL generation.
Model Description
- Base Model: Llama 3.2 3B
- Fine-tuning Method: LoRA (Parameter-Efficient Fine-Tuning)
- Quantization: 4-bit NF4 with double quantization
- Dataset: Spider (7,000 training examples)
- Training: 3 epochs, ~47 minutes on AWS g5.2xlarge (NVIDIA A10G)
- Final Training Loss: 0.37 (85% reduction from initial 2.5)
Intended Use
This model converts natural language questions into SQL queries for various database schemas. It's designed for:
- Automated SQL query generation
- Data analysis assistants
- Natural language database interfaces
- Educational tools for SQL learning
Training Details
Training Hyperparameters
- Learning Rate: 2e-4
- Batch Size: 4 (per device)
- Gradient Accumulation: 4 steps (effective batch size: 16)
- Epochs: 3
- Max Sequence Length: 2048
- LoRA Rank (r): 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Target Modules: q_proj, k_proj, v_proj, o_proj
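The effective batch size and a rough optimizer step count follow directly from the values above. The sketch below is plain arithmetic, assuming a single GPU (a g5.2xlarge has one A10G) and the 7,000 training examples stated in the model description. The step count a trainer actually reports may differ slightly depending on how it handles the last partial batch.

```python
import math

# Values taken from the hyperparameter list above.
per_device_batch_size = 4
gradient_accumulation_steps = 4
num_epochs = 3

# Assumptions: one GPU and 7,000 training examples.
num_devices = 1
num_train_examples = 7_000

effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)

# Upper bound on optimizer steps per epoch; a trainer that drops the
# final partial accumulation window will report slightly fewer.
steps_per_epoch = math.ceil(num_train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps per epoch (approx.): {steps_per_epoch}")
print(f"total optimizer steps (approx.): {total_steps}")
```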
Training Results
| Metric | Value |
|---|---|
| Initial Loss | 2.50 |
| Final Loss | 0.37 |
| Trainable Parameters | 9.17M (0.51% of total) |
| Training Time | 47 minutes |
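
The trainable-parameter count can be reproduced by hand: a LoRA adapter on a linear layer with `in_features` inputs and `out_features` outputs adds `r * (in_features + out_features)` weights. The sketch below assumes the published Llama 3.2 3B dimensions (hidden size 3072, 28 layers, 8 key/value heads with head dimension 128). Those dimensions are not stated in this card, so treat them as assumptions.

```python
# LoRA settings from the hyperparameter list.
rank = 16
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]

# Assumed Llama 3.2 3B dimensions (not stated in this card).
hidden_size = 3072
num_layers = 28
kv_dim = 8 * 128  # 8 key/value heads x head dimension 128

# (in_features, out_features) for each targeted projection.
shapes = {
    "q_proj": (hidden_size, hidden_size),
    "k_proj": (hidden_size, kv_dim),
    "v_proj": (hidden_size, kv_dim),
    "o_proj": (hidden_size, hidden_size),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    # Matrix A is (r x in_features); matrix B is (out_features x r).
    return r * (in_features + out_features)


per_layer = sum(lora_params(*shapes[name], rank) for name in target_modules)
total = per_layer * num_layers
print(f"{total:,} trainable parameters")
```

Under these assumptions the result is 9,175,040, consistent with the roughly 9.17M reported in the table.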
Usage
Installation
pip install transformers peft torch bitsandbytes accelerate
Inference Example
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "Abhisek987/llama-3.2-sql-lora"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Prepare prompt
database = "employees"
question = "What are the names of all employees who earn more than 50000?"
prompt = f"""### Instruction:
You are a SQL expert. Generate a SQL query to answer the given question for the specified database.
### Input:
Database: {database}
Question: {question}
### Response:
"""
# Generate SQL
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.1,
    do_sample=True,
)
sql_query = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sql_query.split("### Response:")[-1].strip())
Output:
SELECT name FROM employees WHERE salary > 50000;
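
For repeated queries it can help to wrap the prompt template and the response parsing in small helpers. This is a minimal sketch that mirrors the template from the inference example. `build_prompt` and `extract_sql` are hypothetical helper names, not part of the released model or any library.

```python
INSTRUCTION = (
    "You are a SQL expert. Generate a SQL query to answer the given "
    "question for the specified database."
)


def build_prompt(database: str, question: str) -> str:
    """Fill the instruction template used in the inference example."""
    return (
        "### Instruction:\n"
        f"{INSTRUCTION}\n"
        "### Input:\n"
        f"Database: {database}\n"
        f"Question: {question}\n"
        "### Response:\n"
    )


def extract_sql(decoded: str) -> str:
    """Return the text after the last '### Response:' marker."""
    return decoded.split("### Response:")[-1].strip()


prompt = build_prompt("employees", "How many employees are there?")
# Simulated decoder output: the prompt followed by a completion.
decoded = prompt + "SELECT count(*) FROM employees;"
print(extract_sql(decoded))
```

In real use, `decoded` would be the string returned by `tokenizer.decode(...)` in the inference example.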
Example Queries
| Question | Generated SQL |
|---|---|
| "Show top 5 products by sales" | SELECT product_id, sum(sales) FROM sales GROUP BY product_id ORDER BY sum(sales) DESC LIMIT 5; |
| "Count customers by country" | SELECT count(*), country FROM customers GROUP BY country; |
| "Find orders from last 30 days" | SELECT order_id FROM orders WHERE date_order_placed BETWEEN DATE('now') - INTERVAL 30 DAY AND DATE('now') - INTERVAL 1 DAY; |
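
Generated queries are not guaranteed to be valid in a given SQL dialect, so a cheap sanity check before execution can be useful. The sketch below uses Python's built-in `sqlite3` module to compile each example query against an empty in-memory database. The table definitions are hypothetical stand-ins chosen to fit the example queries, and SQLite is only one possible target dialect.

```python
import sqlite3

# Hypothetical schema, chosen only so the example queries have tables to bind to.
SCHEMA = """
CREATE TABLE sales (product_id INTEGER, sales REAL);
CREATE TABLE customers (customer_id INTEGER, country TEXT);
CREATE TABLE orders (order_id INTEGER, date_order_placed TEXT);
"""


def is_valid_sqlite(query: str, schema: str = SCHEMA) -> bool:
    """Return True if SQLite can compile the query against the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement and returns its bytecode
        # instead of running the query itself.
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


queries = [
    "SELECT product_id, sum(sales) FROM sales GROUP BY product_id "
    "ORDER BY sum(sales) DESC LIMIT 5;",
    "SELECT count(*), country FROM customers GROUP BY country;",
    "SELECT order_id FROM orders WHERE date_order_placed BETWEEN "
    "DATE('now') - INTERVAL 30 DAY AND DATE('now') - INTERVAL 1 DAY;",
]
for q in queries:
    print(is_valid_sqlite(q), "-", q[:40])
```

Under these assumptions the first two queries compile, while the third is rejected because SQLite does not support the `INTERVAL 30 DAY` form.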
Limitations
- Trained specifically on Spider dataset schemas
- May not generalize well to schemas that differ substantially from those in Spider
- Requires proper database schema context for best results
- 4-bit quantization lowers the numerical precision of the base model weights, which may occasionally affect output quality
Technical Stack
- Framework: PyTorch + Transformers
- Quantization: BitsAndBytes (4-bit NF4)
- Fine-tuning: PEFT (LoRA)
- Training: AWS EC2 g5.2xlarge (NVIDIA A10G 24GB)
Citation
If you use this model, please cite:
@misc{llama32-sql-lora,
author = {Abhisek Behera},
title = {Llama 3.2 3B SQL Query Generator (LoRA Fine-tuned)},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/Abhisek987/llama-3.2-sql-lora}
}
Acknowledgments
- Meta AI for Llama 3.2 base model
- Spider dataset creators
- Hugging Face for infrastructure
License
This model inherits the Llama 3.2 Community License from the base model.