Smoker Detection with LoRA Fine-Tuning

Fine-tuned ResNet34 model using LoRA (Low-Rank Adaptation) for binary smoking detection in images.

Model Description

This model uses parameter-efficient fine-tuning with LoRA on a pretrained ResNet34 to classify images as "Smoker" or "Non-Smoker". Only 2.14% of the parameters are trained (the LoRA adapters and the classification head); the pretrained ImageNet weights stay frozen. The model reaches 89.73% test accuracy.

Model Type: ResNet34 + LoRA adapters
Task: Binary Image Classification
Framework: PyTorch
License: MIT

Performance

Split	Accuracy	F1-Score (Smoking)
Validation	94.44%	-
Test	89.73%	89.96%

Efficiency:

Trainable parameters: 465K (2.14% of model)
Training time: ~15 minutes on Kaggle T4 GPU

Usage

Installation

pip install torch torchvision pillow

Load Model

import torch
import torch.nn as nn
from torchvision import models
from torchvision.models import ResNet34_Weights
from PIL import Image
import torchvision.transforms as transforms

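# LoRA wrapper for a Conv2d layer: the original conv is frozen and a trainable
# low-rank branch is added to its output. lora_A is a k×k conv down to `rank`
# channels; lora_B is a 1×1 conv back up to out_channels (zero-initialized, so
# the wrapped layer initially behaves exactly like the original).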
class LoRALayer(nn.Module):
    def __init__(self, original_layer, rank=8):
        super().__init__()
        self.original_layer = original_layer
        self.rank = rank
        
        out_channels = original_layer.out_channels
        in_channels = original_layer.in_channels
        kernel_size = original_layer.kernel_size
        
        self.lora_A = nn.Parameter(
            torch.randn(rank, in_channels, *kernel_size) * 0.01
        )
        self.lora_B = nn.Parameter(
            torch.zeros(out_channels, rank, 1, 1)
        )
        
        self.original_layer.weight.requires_grad = False
        if self.original_layer.bias is not None:
            self.original_layer.bias.requires_grad = False
    
    def forward(self, x):
        original_output = self.original_layer(x)
        lora_output = nn.functional.conv2d(
            x, self.lora_A,
            stride=self.original_layer.stride,
            padding=self.original_layer.padding
        )
        lora_output = nn.functional.conv2d(lora_output, self.lora_B)
        return original_output + lora_output

def apply_lora_to_model(model, rank=8):
    # Freeze the whole backbone, then re-enable the classification head.
    for param in model.parameters():
        param.requires_grad = False
    for param in model.fc.parameters():
        param.requires_grad = True

    # Wrap both 3x3 convs of every block in layer3 and layer4.
    # LoRA parameters are created fresh, so they are trainable by default.
    for stage in (model.layer3, model.layer4):
        for block in stage:
            block.conv1 = LoRALayer(block.conv1, rank=rank)
            block.conv2 = LoRALayer(block.conv2, rank=rank)

    return model

# Load model
model = models.resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)
model = apply_lora_to_model(model, rank=8)

# Load trained weights
model.load_state_dict(torch.load('best_model.pth', map_location='cpu'))
model.eval()

# Preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Inference
def predict(image_path):
    image = Image.open(image_path).convert('RGB')
    image_tensor = transform(image).unsqueeze(0)
    
    with torch.no_grad():
        outputs = model(image_tensor)
        probs = torch.softmax(outputs, dim=1)
        confidence, predicted = torch.max(probs, 1)
    
    classes = ['Non-Smoker', 'Smoker']
    return classes[predicted.item()], confidence.item() * 100

# Example
prediction, confidence = predict('image.jpg')
print(f"{prediction} ({confidence:.1f}% confidence)")
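
The confidence returned by predict() is the largest softmax probability over the two class logits. A minimal standard-library sketch of that step is shown here; the logits are hypothetical values chosen for illustration, and `label_and_confidence` is an illustrative helper, not part of the released code.

```python
import math

CLASSES = ["Non-Smoker", "Smoker"]  # same class order as in predict()

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_and_confidence(logits):
    """Return (class name, confidence in percent) for one image's logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best] * 100

# Hypothetical logits, used only to illustrate the computation.
label, confidence = label_and_confidence([-1.0, 1.0])
print(f"{label} ({confidence:.1f}% confidence)")
```

With two classes the larger probability is always at least 50%, so the reported confidence is best read as a relative score rather than a calibrated probability.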

Training Details

Dataset: 1,120 images from the Kaggle Smoking Detection Dataset

Training: 716 images (64%)
Validation: 180 images (16%)
Test: 224 images (20%)
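
The split sizes above are consistent with a two-stage 80/20 split that rounds the held-out part up: 20% of 1,120 is 224, and 20% of the remaining 896 rounds up to 180. The source does not say how the split was actually made, so the sketch below is only one way to reproduce the counts; `split_indices`, the seed, and the percentages as arguments are assumptions.

```python
import random

def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)

def split_indices(n, test_pct=20, val_pct=20, seed=42):
    """Shuffle range(n), hold out test_pct% for testing,
    then val_pct% of the remainder for validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = ceil_div(n * test_pct, 100)
    n_val = ceil_div((n - n_test) * val_pct, 100)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

train, val, test = split_indices(1120)
print(len(train), len(val), len(test))
```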

Hyperparameters:

Learning Rate: 1e-4
Optimizer: AdamW (weight decay: 1e-4)
Batch Size: 32
Epochs: 15
LoRA Rank: 8

Data Augmentation:

Random horizontal flip (p=0.5)
Random rotation (±10°)
Color jitter (brightness, contrast, saturation)

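What is LoRA?

LoRA (Low-Rank Adaptation) adds small trainable matrices to frozen pretrained weights:

Output = W_frozen × input + (B × A) × input

Here A and B are low-rank matrices (rank 8). Only A and B are updated during training; together with the classification head they make up the 2.14% of parameters that are trainable. In this model the adapted layers are convolutions, so A is a k×k convolution down to 8 channels and B is a 1×1 convolution back up to the output channel count.

Benefits: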

Prevents overfitting on small datasets
Preserves pretrained ImageNet features
Faster training and lower memory usage
Easier deployment (smaller checkpoint files)
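
To make the formula above concrete, here is a small standard-library sketch for a plain linear layer. The dimensions (64×64, rank 8) are chosen for illustration only. It shows that with B initialized to zero the adapted layer reproduces the frozen layer exactly, and that the adapter holds far fewer parameters than W.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x):
    """Frozen path W·x plus the low-rank update B·(A·x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + u for b, u in zip(base, update)]

rng = random.Random(0)
d_in, d_out, rank = 64, 64, 8  # illustrative sizes
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, small random init
B = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero init
x = [rng.gauss(0, 1) for _ in range(d_in)]

# With B = 0 the update vanishes, so training starts from the pretrained behavior.
starts_identical = lora_forward(W, A, B, x) == matvec(W, x)

full_params = d_in * d_out            # parameters in W
lora_params = rank * (d_in + d_out)   # parameters in A and B
print(starts_identical, full_params, lora_params)
```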

Model Architecture

ResNet34 + LoRA (21.7M parameters)
├── Frozen Layers (21.3M - 97.86%)
│   ├── conv1, layer1, layer2
│   ├── Base weights of layer3 and layer4
│   └── Pretrained ImageNet weights
└── Trainable Layers (465K - 2.14%)
    ├── LoRA adapters on layer3 (6 blocks)
    ├── LoRA adapters on layer4 (3 blocks)
    └── Classification head fc (512 → 2)
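
The parameter counts in the tree can be checked by hand from the layer shapes. The sketch below assumes the standard ResNet34 layout (BasicBlock stages of 3, 4, 6 and 3 blocks with 3×3 convs, as in torchvision) and the adapter shapes from the LoRALayer class in the usage code; the helper names are illustrative.

```python
RANK = 8

def lora_params(c_in, c_out, rank=RANK, k=3):
    """One adapter: lora_A (rank, c_in, k, k) plus lora_B (c_out, rank, 1, 1)."""
    return rank * c_in * k * k + c_out * rank

def stage_convs(c_in, c_out, blocks):
    """(in, out) channels of conv1/conv2 for every BasicBlock in a stage."""
    pairs = [(c_in, c_out), (c_out, c_out)]         # first block changes width
    pairs += [(c_out, c_out)] * (2 * (blocks - 1))  # remaining blocks
    return pairs

STAGES = [(64, 64, 3), (64, 128, 4), (128, 256, 6), (256, 512, 3)]

def backbone_params():
    """All ResNet34 weights except the fc head (convs have no bias)."""
    total = 64 * 3 * 7 * 7 + 2 * 64                 # stem conv + batch norm
    for c_in, c_out, blocks in STAGES:
        for a, b in stage_convs(c_in, c_out, blocks):
            total += b * a * 9 + 2 * b              # 3x3 conv + its batch norm
        if c_in != c_out:
            total += c_out * c_in + 2 * c_out       # 1x1 downsample conv + batch norm
    return total

frozen = backbone_params()
head = 512 * 2 + 2                                  # fc weight + bias
adapters = sum(lora_params(a, b)
               for c_in, c_out, blocks in STAGES[2:]  # layer3 and layer4 only
               for a, b in stage_convs(c_in, c_out, blocks))
trainable = adapters + head
total = frozen + trainable
print(trainable, total, round(100 * trainable / total, 2))
```

Under these assumptions the count comes to 464,898 trainable parameters out of 21,749,570, which matches the reported 465K and 2.14%.
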

Limitations

Trained on a small dataset (1,120 images)
Training images are low resolution (250×250)
May not generalize to all smoking scenarios
Works best on frontal or profile views with a visible cigarette

Citation

@misc{smoker-detection-lora,
  author = {Noel Triguero},
  title = {Smoker Detection with LoRA Fine-Tuning},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/notrito/smoker-detection}}
}

References

LoRA Paper - Hu et al., 2021
Dataset - Sujay Kapadnis
Training Notebook

Contact

Author: Noel Triguero
Email: [email protected]
Kaggle: notrito
