🍕 Hybrid Food Image Classifier (CNN + ViT)

This model combines a ResNet50 (CNN) branch and a DeiT-Base (ViT) branch through an adaptive fusion module to classify food images into the 101 categories of the Food-101 dataset.

Model Architecture

  • CNN Branch: ResNet50 (pretrained on ImageNet)
  • ViT Branch: DeiT-Base Distilled (pretrained)
  • Fusion Module: Adaptive attention-based fusion with multi-head cross-attention
  • Classes: 101 food categories from the Food-101 dataset
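
The list above names the components but not how they are wired together. The following is a minimal sketch of one way such a fusion head could look. It is not the code stored in the checkpoint: the class name `AdaptiveFusion`, the shared width of 512, the 8 attention heads, and the gating scheme are all assumptions made for illustration. Only the input widths (2048 for pooled ResNet50 features, 768 for DeiT-Base tokens) and the 101 output classes follow from the architecture described here.

```python
import torch
import torch.nn as nn


class AdaptiveFusion(nn.Module):
    """Hypothetical fusion head: cross-attention followed by a learned gate."""

    def __init__(self, cnn_dim=2048, vit_dim=768, embed_dim=512,
                 num_heads=8, num_classes=101):
        super().__init__()
        # Project both branches into a shared embedding space (assumed width).
        self.cnn_proj = nn.Linear(cnn_dim, embed_dim)
        self.vit_proj = nn.Linear(vit_dim, embed_dim)
        # The CNN embedding acts as the query over the ViT token sequence.
        self.cross_attn = nn.MultiheadAttention(
            embed_dim, num_heads, batch_first=True
        )
        # The gate yields per-channel mixing weights in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.Sigmoid()
        )
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, cnn_feat, vit_tokens):
        # cnn_feat:   (B, cnn_dim)    globally pooled CNN features
        # vit_tokens: (B, N, vit_dim) transformer token embeddings
        query = self.cnn_proj(cnn_feat).unsqueeze(1)       # (B, 1, E)
        keys = self.vit_proj(vit_tokens)                   # (B, N, E)
        attended, _ = self.cross_attn(query, keys, keys)   # (B, 1, E)
        cnn_emb = query.squeeze(1)
        attended = attended.squeeze(1)
        # Adaptive mix: the gate decides, per channel, how much of each
        # stream reaches the classifier.
        g = self.gate(torch.cat([cnn_emb, attended], dim=-1))
        fused = g * cnn_emb + (1 - g) * attended
        return self.classifier(fused)


# Random tensors stand in for real backbone outputs.
# 198 tokens = 196 patches + class token + distillation token at 224x224.
model = AdaptiveFusion()
logits = model(torch.randn(2, 2048), torch.randn(2, 198, 768))
print(logits.shape)
```

In this sketch the gate lets the model lean on CNN features for some channels and on attended ViT features for others; a plain concatenation followed by a linear layer would be a simpler alternative.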

Performance

  • Validation Accuracy: ~82.5%
  • Top-5 Accuracy: >95%
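
For readers unfamiliar with the two metrics, here is a small sketch of how top-1 and top-5 accuracy are typically computed. The function name `topk_accuracy` and the toy scores are illustrative only; they are not taken from this model's evaluation code or outputs.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy example: 3 samples, 4 classes (made-up numbers, not model output).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: counted as a hit for k=1
    [0.5, 0.1, 0.3, 0.1],  # true class 2: missed for k=1, hit for k=2
    [0.4, 0.3, 0.2, 0.1],  # true class 3: missed for k=1 and k=2
]
labels = [1, 2, 3]
print(topk_accuracy(scores, labels, k=1))
print(topk_accuracy(scores, labels, k=2))
```

Top-5 accuracy is the same computation with `k=5` over the 101 class scores, which is why it is always at least as high as top-1 accuracy.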

Files

  • best_model.pth: Trained PyTorch checkpoint
  • real_class_mapping.json: Human-readable class names
  • config.yaml: Training configuration
  • food101_class_names.json: Original class names
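
To turn a predicted class index into a name, one of the JSON files above has to be read. Their exact layout is not documented here, so the sketch below is an assumption: it accepts either a JSON list of names or a JSON object keyed by stringified indices, and the helper name `load_class_names` is hypothetical.

```python
import json


def load_class_names(path):
    """Return a list where position i holds the name of class index i."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, list):
        # Assumed layout A: ["name_0", "name_1", ...]
        return [str(name) for name in data]
    if isinstance(data, dict):
        # Assumed layout B: {"0": "name_0", "1": "name_1", ...}
        # Keys are sorted numerically so that "10" comes after "2".
        return [str(data[key]) for key in sorted(data, key=int)]
    raise TypeError(f"unsupported JSON layout: {type(data).__name__}")
```

If the file uses a different structure (for example, names as keys), the helper will need to be adapted after inspecting the file.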

Quick Usage

from huggingface_hub import hf_hub_download
import torch

# Download model
ckpt_path = hf_hub_download(
    repo_id="codealchemist01/food-image-classifier-hybrid",
    filename="best_model.pth"
)

# Load checkpoint
checkpoint = torch.load(ckpt_path, map_location="cpu")
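
What `torch.load` returns depends on how the checkpoint was saved, and this card does not specify it. As a hedged sketch, the hypothetical helper below covers a few common conventions: weights nested under a key such as `model_state_dict`, `state_dict`, or `model`, or a bare state dict. These key names are guesses, not confirmed contents of `best_model.pth`.

```python
def extract_state_dict(checkpoint):
    """Return the weights mapping from a few common checkpoint layouts."""
    if not isinstance(checkpoint, dict):
        raise TypeError("expected a dict-like checkpoint")
    # Assumed key names; training scripts often save under one of these.
    for key in ("model_state_dict", "state_dict", "model"):
        if isinstance(checkpoint.get(key), dict):
            return checkpoint[key]
    # Otherwise assume the file already is a bare state dict.
    return checkpoint


# Toy stand-in for a loaded checkpoint (not the real file contents).
toy_checkpoint = {"epoch": 3, "model_state_dict": {"classifier.weight": [0.0]}}
state_dict = extract_state_dict(toy_checkpoint)
print(list(state_dict))
```

With the real file, printing `checkpoint.keys()` first is the safest way to see which layout applies. The resulting state dict can only be loaded into a model whose definition matches the one used in training.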

Demo

Try the live demo: Food Classifier Space

Training Details

  • Dataset: Food-101 (101,000 images across 101 categories)
  • Framework: PyTorch 2.0+
  • Image Size: 224x224
  • Optimizer: AdamW with cosine annealing warm restarts
  • Augmentations: Albumentations (flip, rotation, color jitter)
  • Mixed Precision: FP16 training
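
The optimizer entry above mentions cosine annealing with warm restarts. The sketch below shows the shape of that schedule in plain Python. The values for `base_lr`, `eta_min`, `t_0`, and `t_mult` are placeholders, since the actual settings live in config.yaml and are not listed here. In PyTorch the same schedule is available as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.

```python
import math


def sgdr_lr(epoch, base_lr=1e-3, eta_min=1e-6, t_0=10, t_mult=2):
    """Learning rate at a (possibly fractional) epoch under cosine warm restarts.

    All default values are hypothetical, chosen only for illustration.
    """
    t_i, t_cur = t_0, epoch
    # Walk forward through cycles until `epoch` falls inside the current one.
    # Each cycle is t_mult times longer than the previous one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine decay from base_lr to eta_min within the current cycle.
    cosine = 1 + math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (base_lr - eta_min) * cosine


# The rate decays within each cycle and jumps back to base_lr at epochs 10 and 30.
for epoch in (0, 5, 9, 10, 20, 29, 30):
    print(epoch, f"{sgdr_lr(epoch):.6f}")
```

The periodic jump back to the base learning rate is the "warm restart"; with `t_mult=2`, the cycles last 10, 20, 40, ... epochs.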

Citation

@misc{food-classifier-hybrid,
  author = {codealchemist01},
  title = {Hybrid Food Image Classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/codealchemist01/food-image-classifier-hybrid}}
}