Smoker Detection with LoRA Fine-Tuning
A ResNet34 model fine-tuned with LoRA (Low-Rank Adaptation) for binary smoking detection in images.
Model Description
This model applies parameter-efficient fine-tuning with LoRA to a pretrained ResNet34 to classify images as "Smoker" or "Non-Smoker". By training only 2.14% of the parameters, it reaches 89.73% test accuracy while the pretrained ImageNet weights stay frozen.
- Model Type: ResNet34 + LoRA adapters
- Task: Binary Image Classification
- Framework: PyTorch
- License: MIT
Performance
| Split | Accuracy | F1-Score (Smoking) |
|---|---|---|
| Validation | 94.44% | - |
| Test | 89.73% | 89.96% |
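As a sanity check on the table, the percentages can be converted back into image counts using the split sizes listed under Training Details (180 validation and 224 test images). The sketch below is back-computed from the rounded figures, not taken from the author's evaluation logs. The helper names are illustrative, and because the card does not say how the errors divide into false positives and false negatives, only their sum is used.

```python
def correct_count(accuracy_pct: float, n_images: int) -> int:
    """Number of correctly classified images implied by a rounded accuracy."""
    return round(accuracy_pct / 100 * n_images)


def f1_score(tp: int, errors: int) -> float:
    """F1 for the positive class; `errors` is FP + FN (F1 depends only on their sum)."""
    return 2 * tp / (2 * tp + errors)


val_correct = correct_count(94.44, 180)
test_correct = correct_count(89.73, 224)
test_errors = 224 - test_correct

# Find every true-positive count whose F1 rounds to the reported 89.96%.
tp_candidates = [
    tp for tp in range(test_correct + 1)
    if round(100 * f1_score(tp, test_errors), 2) == 89.96
]
print(val_correct, test_correct, test_errors, tp_candidates)
```

Taken at face value, the reported figures correspond to 170 of 180 validation images and 201 of 224 test images classified correctly, and the test F1-score is reproduced by 103 true positives for the "Smoker" class.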
Efficiency:
- Trainable parameters: 465K (2.14% of model)
- Training time: ~15 minutes on a Kaggle T4 GPU
Usage
Installation
```bash
pip install torch torchvision pillow
```
Load Model
```python
import torch
import torch.nn as nn
from torchvision import models
from torchvision.models import ResNet34_Weights
from PIL import Image
import torchvision.transforms as transforms


class LoRALayer(nn.Module):
    """Wraps a frozen Conv2d and adds a trainable low-rank branch.

    Keep the attribute names unchanged: load_state_dict matches
    checkpoint keys by name.
    """

    def __init__(self, original_layer, rank=8):
        super().__init__()
        self.original_layer = original_layer
        self.rank = rank

        out_channels = original_layer.out_channels
        in_channels = original_layer.in_channels
        kernel_size = original_layer.kernel_size

        # A: projects down to `rank` channels with the original kernel size
        self.lora_A = nn.Parameter(
            torch.randn(rank, in_channels, *kernel_size) * 0.01
        )
        # B: 1x1 projection back up; zero-initialized, so the wrapped layer
        # initially reproduces the pretrained output
        self.lora_B = nn.Parameter(
            torch.zeros(out_channels, rank, 1, 1)
        )

        # Freeze the wrapped convolution
        self.original_layer.weight.requires_grad = False
        if self.original_layer.bias is not None:
            self.original_layer.bias.requires_grad = False

    def forward(self, x):
        original_output = self.original_layer(x)
        # Low-rank branch: conv with A (same stride/padding), then 1x1 conv with B
        lora_output = nn.functional.conv2d(
            x, self.lora_A,
            stride=self.original_layer.stride,
            padding=self.original_layer.padding
        )
        lora_output = nn.functional.conv2d(lora_output, self.lora_B)
        return original_output + lora_output


def apply_lora_to_model(model, rank=8):
    # Freeze everything, then re-enable the classification head
    for param in model.parameters():
        param.requires_grad = False
    for param in model.fc.parameters():
        param.requires_grad = True

    # Wrap conv1 and conv2 of every block in layer3 and layer4
    for stage in (model.layer3, model.layer4):
        for block in stage:
            if hasattr(block, 'conv1'):
                block.conv1 = LoRALayer(block.conv1, rank=rank)
            if hasattr(block, 'conv2'):
                block.conv2 = LoRALayer(block.conv2, rank=rank)
    return model


# Load model
model = models.resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)
model = apply_lora_to_model(model, rank=8)

# Load trained weights
model.load_state_dict(torch.load('best_model.pth', map_location='cpu'))
model.eval()

# Preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])


# Inference
def predict(image_path):
    image = Image.open(image_path).convert('RGB')
    image_tensor = transform(image).unsqueeze(0)  # add batch dimension

    with torch.no_grad():
        outputs = model(image_tensor)
        probs = torch.softmax(outputs, dim=1)
        confidence, predicted = torch.max(probs, 1)

    classes = ['Non-Smoker', 'Smoker']
    return classes[predicted.item()], confidence.item() * 100


# Example
prediction, confidence = predict('image.jpg')
print(f"{prediction} ({confidence:.1f}% confidence)")
```
Training Details
Dataset: 1,120 images from the Kaggle Smoking Detection Dataset
- Training: 716 images (64%)
- Validation: 180 images (16%)
- Test: 224 images (20%)
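These sizes are consistent with two successive 80/20 splits in which the held-out part is rounded up, which is how scikit-learn's `train_test_split` treats a fractional `test_size`. That the author split the data this way is an assumption; the card states only the resulting counts. A dependency-free sketch of the arithmetic, with illustrative function names:

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer division rounded up, avoiding floating-point error."""
    return -(-numerator // denominator)


def two_stage_split(n_total: int, test_pct: int, val_pct: int):
    """Return (train, val, test) sizes for a test split followed by a validation split."""
    n_test = ceil_div(n_total * test_pct, 100)
    n_remaining = n_total - n_test
    n_val = ceil_div(n_remaining * val_pct, 100)
    n_train = n_remaining - n_val
    return n_train, n_val, n_test


sizes = two_stage_split(1120, test_pct=20, val_pct=20)
shares = [round(100 * n / 1120) for n in sizes]
print(sizes, shares)
```

Under that assumption the split comes out as 716 / 180 / 224 images, or 64% / 16% / 20% after rounding.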
Hyperparameters:
- Learning Rate: 1e-4
- Optimizer: AdamW (weight decay: 1e-4)
- Batch Size: 32
- Epochs: 15
- LoRA Rank: 8
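A minimal sketch of how these settings map onto PyTorch, assuming the optimizer receives only the parameters left trainable and that the last partial batch of each epoch is kept. The two-layer `toy_model` is a hypothetical stand-in for the LoRA-wrapped ResNet34 so that the snippet runs on its own; the author's actual training loop may differ.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: a frozen "backbone" layer followed by a trainable head.
toy_model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))
for param in toy_model[0].parameters():
    param.requires_grad = False

# Hyperparameters as listed in this card.
LEARNING_RATE = 1e-4
WEIGHT_DECAY = 1e-4
BATCH_SIZE = 32
EPOCHS = 15
N_TRAIN_IMAGES = 716

# Hand the optimizer only the parameters that still require gradients.
trainable_params = [p for p in toy_model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(
    trainable_params, lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
)

# Optimizer steps per epoch when the final partial batch is not dropped.
steps_per_epoch = -(-N_TRAIN_IMAGES // BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
print(len(trainable_params), steps_per_epoch, total_steps)
```

With 716 training images and a batch size of 32, this works out to 23 steps per epoch and 345 steps over 15 epochs.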
Data Augmentation:
- Random horizontal flip (p=0.5)
- Random rotation (±10°)
- Color jitter (brightness, contrast, saturation)
What is LoRA?
LoRA (Low-Rank Adaptation) adds small trainable matrices to frozen pretrained weights:
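Output = W_frozen × input + (B × A) × input
Here A and B are low-rank matrices (rank = 8). Together with the classification head, the adapters make up the 2.14% of parameters that are trained, while the frozen weights retain the capacity of the pretrained model.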
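Because both branches are linear in the input, the product B × A can be folded into a single kernel with the same shape as the frozen weight. The sketch below checks this numerically for one convolution, using the same tensor shapes as the `LoRALayer` in the Usage section. The channel sizes are arbitrary, B is drawn at random (rather than zero-initialized) so that the check is not trivial, and the folding step is an illustration rather than something the card's code performs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
rank, in_ch, out_ch = 8, 16, 32

conv = torch.nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
lora_A = torch.randn(rank, in_ch, 3, 3) * 0.01   # down-projection, 3x3 kernel
lora_B = torch.randn(out_ch, rank, 1, 1) * 0.01  # up-projection, 1x1 kernel
x = torch.randn(2, in_ch, 14, 14)

with torch.no_grad():
    # Two-branch form: W_frozen * x + B * (A * x)
    two_branch = conv(x) + F.conv2d(F.conv2d(x, lora_A, padding=1), lora_B)

    # Folded form: (W_frozen + B x A) * x, contracting B and A over the rank axis
    delta_w = torch.einsum("or,rikl->oikl", lora_B[:, :, 0, 0], lora_A)
    folded = F.conv2d(x, conv.weight + delta_w, padding=1)

max_diff = (two_branch - folded).abs().max().item()
full_params = conv.weight.numel()
adapter_params = lora_A.numel() + lora_B.numel()
print(tuple(delta_w.shape), max_diff, full_params, adapter_params)
```

The two forms agree up to floating-point error, and for this layer the adapter holds 1,408 parameters against 4,608 in the full kernel.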
Benefits:
- Prevents overfitting on small datasets
- Preserves pretrained ImageNet features
- Faster training and lower memory usage
- Easier deployment (smaller checkpoint files)
Model Architecture
ResNet34 (21.7M parameters)
├── Frozen Layers (21.3M - 97.86%)
│   ├── conv1, layer1, layer2
│   └── Pretrained ImageNet weights
└── Trainable Layers (465K - 2.14%)
    ├── LoRA adapters on layer3 (6 blocks)
    ├── LoRA adapters on layer4 (3 blocks)
    └── Classification head fc (512 → 2)
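The 465K figure can be reproduced by hand from the layer shapes. The sketch below assumes the standard torchvision ResNet34 layout (3×3 convolutions, 256 channels in layer3 and 512 in layer4, with the first convolution of each stage taking the previous stage's channel count as input) and the adapter shapes from the `LoRALayer` code in the Usage section. The function names are illustrative.

```python
RANK = 8
KERNEL_AREA = 3 * 3


def lora_params(in_channels: int, out_channels: int) -> int:
    """Parameters in one adapter: lora_A (r x in x 3 x 3) plus lora_B (out x r x 1 x 1)."""
    return RANK * in_channels * KERNEL_AREA + out_channels * RANK


def stage_params(n_blocks: int, in_channels: int, out_channels: int) -> int:
    """Adapters on conv1 and conv2 of every block in one ResNet stage."""
    first_conv = lora_params(in_channels, out_channels)  # the conv that changes width
    other_convs = (2 * n_blocks - 1) * lora_params(out_channels, out_channels)
    return first_conv + other_convs


layer3 = stage_params(n_blocks=6, in_channels=128, out_channels=256)
layer4 = stage_params(n_blocks=3, in_channels=256, out_channels=512)
fc_head = 512 * 2 + 2  # weights plus biases of the 512 -> 2 classifier

trainable = layer3 + layer4 + fc_head
print(layer3, layer4, fc_head, trainable)
```

Under these assumptions the total is 464,898 trainable parameters (236,544 in layer3, 227,328 in layer4, and 1,026 in the head), which rounds to the 465K quoted in this card.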
Limitations
- Trained on a small dataset (1,120 images)
- Low-resolution images (250×250)
- May not generalize to all smoking scenarios
- Works best on frontal or profile views with a visible cigarette
Citation
```bibtex
@misc{smoker-detection-lora,
  author = {Noel Triguero},
  title = {Smoker Detection with LoRA Fine-Tuning},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/notrito/smoker-detection}}
}
```
References
LoRA Paper - Hu et al., 2021
Dataset - Sujay Kapadnis
Training Notebook
Contact
Author: Noel Triguero
Email: [email protected]
Kaggle: notrito