🍕 Hybrid Food Image Classifier (CNN + ViT)

This model combines a ResNet50 (CNN) branch and a DeiT-Base (ViT) branch through an adaptive fusion module to classify food images into the 101 categories of Food-101.

Model Architecture

CNN Branch: ResNet50 (pretrained on ImageNet)
ViT Branch: DeiT-Base Distilled (pretrained)
Fusion Module: Adaptive attention-based fusion with multi-head cross-attention
Classes: 101 food categories from Food-101 dataset
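
The card does not include the fusion code, so the following is only a minimal sketch of what an attention-based fusion head over the two branches could look like. The class name CrossAttentionFusion, the 512-dimensional shared embedding, the 8 heads, and the sigmoid gate are all assumptions for illustration and are not taken from the checkpoint. The token shapes follow from the backbones at 224x224 input: ResNet50 yields a 7x7 map with 2048 channels (49 tokens), and distilled DeiT-Base yields 198 tokens of width 768.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical fusion head: CNN tokens attend to ViT tokens,
    then a learned gate mixes the two pooled feature vectors."""

    def __init__(self, cnn_dim=2048, vit_dim=768, embed_dim=512,
                 num_heads=8, num_classes=101):
        super().__init__()
        # Project both branches into a shared embedding space.
        self.cnn_proj = nn.Linear(cnn_dim, embed_dim)
        self.vit_proj = nn.Linear(vit_dim, embed_dim)
        # Multi-head cross-attention: queries from CNN, keys/values from ViT.
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Per-feature gate in [0, 1] that weighs the two branches (assumed design).
        self.gate = nn.Sequential(nn.Linear(2 * embed_dim, embed_dim), nn.Sigmoid())
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, cnn_tokens, vit_tokens):
        q = self.cnn_proj(cnn_tokens)             # (B, 49, E)
        kv = self.vit_proj(vit_tokens)            # (B, 198, E)
        attended, _ = self.cross_attn(q, kv, kv)  # (B, 49, E)
        cnn_feat = attended.mean(dim=1)           # pool over CNN positions
        vit_feat = kv.mean(dim=1)                 # pool over ViT tokens
        g = self.gate(torch.cat([cnn_feat, vit_feat], dim=-1))
        fused = g * cnn_feat + (1 - g) * vit_feat
        return self.classifier(fused)


# Random tensors stand in for real backbone outputs.
model = CrossAttentionFusion()
cnn_tokens = torch.randn(2, 49, 2048)
vit_tokens = torch.randn(2, 198, 768)
logits = model(cnn_tokens, vit_tokens)
print(logits.shape)  # torch.Size([2, 101])
```

The gate lets the head lean on either branch per feature instead of using a fixed average, which is one plausible reading of "adaptive" here.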

Performance

Validation Accuracy (top-1): ~82.5%
Top-5 Accuracy: >95%
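
For readers who want to reproduce these numbers on their own split, here is a small sketch of how top-k accuracy is usually computed. The function name top_k_accuracy and the toy scores are illustrative assumptions, not model output.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 4 classes and 3 samples (made-up scores).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.1, 0.1, 0.5],
]
labels = [1, 2, 0]
top1 = top_k_accuracy(scores, labels, k=1)  # only the first sample is a hit
top2 = top_k_accuracy(scores, labels, k=2)  # all three samples are hits
```

With k=1 this is the plain validation accuracy; with k=5 it is the top-5 figure.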

Files

best_model.pth: Trained PyTorch checkpoint
real_class_mapping.json: Human-readable class names
config.yaml: Training configuration
food101_class_names.json: Original class names

Quick Usage

from huggingface_hub import hf_hub_download
import torch

# Download model
ckpt_path = hf_hub_download(
    repo_id="codealchemist01/food-image-classifier-hybrid",
    filename="best_model.pth"
)

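# Load checkpoint on CPU. PyTorch 2.6+ defaults to weights_only=True; if the file
# stores extra Python objects, pass weights_only=False (only for files you trust).
checkpoint = torch.load(ckpt_path, map_location="cpu")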
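
The layout of best_model.pth is not documented here, so the loaded object may be either a bare state dict or a dictionary that wraps one alongside optimizer state and metrics. The helper below is a hedged sketch for unwrapping it; the function name extract_state_dict and the candidate key names are assumptions based on common PyTorch conventions, not confirmed for this checkpoint.

```python
def extract_state_dict(checkpoint):
    """Return the weights mapping from a checkpoint that may or may not be wrapped."""
    if isinstance(checkpoint, dict):
        # Key names are guesses; print(checkpoint.keys()) to see the real ones.
        for key in ("model_state_dict", "state_dict", "model"):
            if key in checkpoint and isinstance(checkpoint[key], dict):
                return checkpoint[key]
    # Fall back to treating the object itself as the state dict.
    return checkpoint


# Stand-in dictionaries; with the real file, pass the result of torch.load.
wrapped = {"epoch": 12, "model_state_dict": {"classifier.weight": [0.0]}}
bare = {"classifier.weight": [0.0]}
from_wrapped = extract_state_dict(wrapped)
from_bare = extract_state_dict(bare)
```

The returned mapping can then be passed to load_state_dict on a model whose architecture matches the one used in training.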

Demo

Try the live demo: Food Classifier Space

Training Details

Dataset: Food-101 (101,000 images across 101 categories)
Framework: PyTorch 2.0+
Image Size: 224x224
Optimizer: AdamW, with a cosine annealing warm restarts learning-rate schedule
Augmentations: Albumentations (flip, rotation, color jitter)
Mixed Precision: FP16 training
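
To make the learning-rate schedule concrete, the sketch below evaluates the standard cosine annealing with warm restarts formula in plain Python. The function name and the values of base_lr, min_lr, t_0, and t_mult are placeholders chosen for illustration; the settings actually used for this model are in config.yaml.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=1e-4, min_lr=1e-6, t_0=10, t_mult=2):
    """Learning rate at `epoch` under cosine annealing with warm restarts.

    t_0 is the length of the first cycle; each later cycle is t_mult times longer.
    """
    t_i, t_cur = t_0, epoch
    # Walk forward through completed cycles to find the position in the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t_cur / t_i))


# The rate starts at base_lr, decays toward min_lr, then jumps back at each restart.
schedule = [cosine_warm_restart_lr(e) for e in range(30)]
```

In PyTorch the same schedule is available as torch.optim.lr_scheduler.CosineAnnealingWarmRestarts, which takes T_0, T_mult, and eta_min.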

Citation

@misc{food-classifier-hybrid,
  author = {codealchemist01},
  title = {Hybrid Food Image Classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/codealchemist01/food-image-classifier-hybrid}}
}
