ConvNeXt-Base CheXpert Classifier with CBAM Attention
Fine-tuned ConvNeXt-Base with CBAM attention for multi-label classification of 14 thoracic pathologies from chest X-rays. This is iteration 6 (the final model), which reaches a validation AUC of 0.81.
Full training code, examples, and scripts: GitHub Repository (https://github.com/jikaan/convnext-chexpert-attention)
Model Overview
This is a production-ready classifier for automated chest X-ray interpretation. The model combines the ConvNeXt architecture with a Convolutional Block Attention Module (CBAM) for improved pathology detection and localization.
Key Specs:
- Architecture: ConvNeXt-Base + CBAM
- Training Iteration: 6 (final)
- Validation AUC: 0.81
- Input: 384×384 frontal chest X-rays
- Output: 14 pathology probabilities
- Model Size: 300MB
- Parameters: ~88M + CBAM
Detectable Pathologies (14 Classes)
| # | Pathology | # | Pathology |
|---|---|---|---|
| 1 | No Finding | 8 | Pneumonia |
| 2 | Enlarged Cardiomediastinum | 9 | Atelectasis |
| 3 | Cardiomegaly | 10 | Pneumothorax |
| 4 | Lung Opacity | 11 | Pleural Effusion |
| 5 | Lung Lesion | 12 | Pleural Other |
| 6 | Edema | 13 | Fracture |
| 7 | Consolidation | 14 | Support Devices |
Performance Results
Iteration 6 (Final Model)
- Overall Validation AUC: 0.81
- Training approach: Multi-iteration refinement with CBAM attention
- Dataset: CheXpert (Stanford ML Group, 224K+ images)
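The card does not say how the overall AUC is aggregated across the 14 labels. A minimal sketch, assuming it is the unweighted mean of per-class AUCs (macro-averaging); the function names `binary_auc` and `macro_auc` are illustrative and do not come from the repository:

```python
def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None  # AUC is undefined when one class is absent
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def macro_auc(label_columns, score_columns):
    """Unweighted mean of per-class AUCs, skipping classes where AUC is undefined."""
    per_class = [binary_auc(y, s) for y, s in zip(label_columns, score_columns)]
    defined = [a for a in per_class if a is not None]
    return sum(defined) / len(defined)


# Toy data for two classes (not CheXpert results)
toy_labels = [[0, 1], [0, 0, 1, 1]]
toy_scores = [[0.2, 0.9], [0.1, 0.4, 0.35, 0.8]]
print(macro_auc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; on a full validation set a rank-based implementation such as scikit-learn's `roc_auc_score` would be the usual choice.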
Model outputs: Sigmoid-activated probabilities for each pathology (0-1 range)
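Because each label has its own sigmoid, the 14 probabilities are independent and do not sum to 1, so several findings can be positive for the same image. A minimal sketch of turning logits into a list of reported findings; the 0.5 threshold, the example logits, and the helper name `positive_findings` are assumptions for illustration, not values from the model:

```python
import math


def sigmoid(logit):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def positive_findings(logits, names, threshold=0.5):
    """Return (name, probability) pairs whose probability meets the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [(n, p) for n, p in zip(names, probs) if p >= threshold]


# Hypothetical logits for three of the 14 labels
names = ["Edema", "Fracture", "Pneumonia"]
logits = [0.56, 0.72, -1.2]
findings = positive_findings(logits, names)
for name, prob in findings:
    print(f"{name}: {prob:.3f}")
```

In practice a separate threshold per pathology, tuned on validation data, often works better than a single global cutoff.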
GradCAM Visualizations
Model predictions with attention maps showing pathology localization:
Example 1: Edema Detection
- Prediction: Edema 63.7%
- Visualization: GradCAM highlights fluid accumulation regions
Example 2: Fracture Detection
- Prediction: Fracture 67.2%
- Visualization: GradCAM highlights rib/bone fracture area
Example 3: Pleural Other
- Prediction: Pleural Other 65.7%
- Visualization: GradCAM shows pleural involvement
Example 4: Atelectasis Detection
- Prediction: Atelectasis 63.1%
- Visualization: GradCAM localizes collapsed lung regions
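The card does not state which layer or library produced these maps. As a sketch of the standard Grad-CAM computation, assuming the activations of one convolutional layer and the gradients of a class score with respect to them have already been captured (for example with forward and backward hooks), the heatmap can be computed as follows. The random arrays stand in for real tensors, and the 1024×12×12 shape assumes the last ConvNeXt-Base stage at 384×384 input:

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one layer's activations and gradients.

    Both arrays have shape (channels, height, width); `gradients` holds
    d(class score) / d(activations).
    """
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                        # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam            # scale to [0, 1]


# Placeholder inputs (random, for shape illustration only)
rng = np.random.default_rng(0)
activations = rng.random((1024, 12, 12))
gradients = rng.standard_normal((1024, 12, 12))
heatmap = grad_cam(activations, gradients)
print(heatmap.shape)
```

To overlay the result on the X-ray, the low-resolution heatmap is typically upsampled to the input size, for instance with `cv2.resize`, and blended with the image.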
Quick Start
Installation
pip install torch torchvision timm Pillow opencv-python
Basic Inference
import torch
from PIL import Image
from torchvision import transforms
import timm

# Load model (plain timm ConvNeXt-Base with a 14-way head).
# If model.pth also holds CBAM weights, load_state_dict will fail on the
# unexpected keys; in that case build the model with the architecture code
# from the GitHub repository instead.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = timm.create_model('convnext_base', pretrained=False, num_classes=14)
model.load_state_dict(torch.load('model.pth', map_location=device))
model.to(device)  # keep the model on the same device as the input tensor
model.eval()

# Preprocess: 3-channel grayscale, 384x384, then normalize
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.5029414296150208] * 3,
        std=[0.2892409563064575] * 3
    )
])

# Predict
image = Image.open('chest_xray.jpg')
input_tensor = transform(image).unsqueeze(0).to(device)

with torch.no_grad():
    logits = model(input_tensor)
    probs = torch.sigmoid(logits)

# Class names, in the order of the model's 14 outputs
pathologies = [
    "No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly",
    "Lung Opacity", "Lung Lesion", "Edema", "Consolidation",
    "Pneumonia", "Atelectasis", "Pneumothorax", "Pleural Effusion",
    "Pleural Other", "Fracture", "Support Devices"
]

for pathology, prob in zip(pathologies, probs[0]):
    print(f"{pathology}: {prob.item():.3f}")
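To make the preprocessing concrete: `ToTensor` scales 8-bit pixels to the range 0 to 1, and `Normalize` then subtracts the mean and divides by the standard deviation. The sketch below reproduces that arithmetic for a single pixel using the mean and std from the snippet above; the helper name `normalize_pixel` is illustrative only:

```python
MEAN = 0.5029414296150208
STD = 0.2892409563064575


def normalize_pixel(value_8bit):
    """Apply the ToTensor scaling and Normalize step to one 8-bit pixel value."""
    scaled = value_8bit / 255.0   # ToTensor: [0, 255] -> [0, 1]
    return (scaled - MEAN) / STD  # Normalize: center and rescale


darkest = normalize_pixel(0)
brightest = normalize_pixel(255)
print(f"normalized input range: [{darkest:.3f}, {brightest:.3f}]")
```

Images fed to the model should go through the same transform, since inputs on a different scale would not match what the network saw during training.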
Model Architecture
ConvNeXt-Base:
- Modern efficient architecture (Liu et al., 2022)
- ImageNet-22k pretrained weights
- Inverted bottleneck design
- LayerNorm + GELU activations
CBAM Attention Module:
- Channel attention: Refines feature importance
- Spatial attention: Highlights important regions
- Lightweight addition to base architecture
- Improves pathology localization
Result: Better accuracy + interpretability with GradCAM
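The card does not show where CBAM is attached or how it is configured. The following is a minimal PyTorch sketch of a generic CBAM block (channel attention followed by spatial attention), assuming the commonly used reduction ratio of 16 and a 7×7 spatial kernel, applied to a feature map shaped like the final ConvNeXt-Base stage (1024 channels, 12×12 at 384×384 input). It is not the repository's implementation:

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Reweights channels using pooled descriptors passed through a shared MLP."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))  # average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))   # max-pooled descriptor
        return x * torch.sigmoid(avg + mx)


class SpatialAttention(nn.Module):
    """Reweights spatial positions using channel-wise average and max maps."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Channel attention followed by spatial attention."""

    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.spatial(self.channel(x))


# Placeholder feature map: batch of 2, 1024 channels, 12x12 spatial grid
features = torch.randn(2, 1024, 12, 12)
refined = CBAM(1024)(features)
print(refined.shape)
```

The block preserves the input shape, so it can sit between the backbone's feature extractor and the pooling and classification head without changing the rest of the network.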
Dataset Information
- Source: CheXpert Dataset (Stanford ML Group)
- Size: 224,316 chest X-rays from 65,240 patients
- Period: October 2002 - July 2017 (Stanford Hospital)
- Labels: 14 pathologies auto-extracted from radiology reports
- Uncertainty: Findings that the report leaves uncertain are labeled -1
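The card does not say how the -1 labels were handled during training. The CheXpert paper describes several policies, including mapping uncertain labels to positive (U-Ones), to negative (U-Zeros), or excluding them from the loss (U-Ignore). A minimal sketch of those three options; the function name `resolve_uncertain` and the treatment of unmentioned (blank) labels as negative are assumptions for illustration:

```python
def resolve_uncertain(label, policy="ones"):
    """Map a CheXpert label to a training target.

    `label` is 1 (positive), 0 (negative), -1 (uncertain), or None (unmentioned).
    Returns (target, use_in_loss).
    """
    if label is None:
        return 0.0, True       # assumption: unmentioned findings count as negative
    if label == -1:
        if policy == "ones":
            return 1.0, True   # U-Ones: treat uncertain as positive
        if policy == "zeros":
            return 0.0, True   # U-Zeros: treat uncertain as negative
        if policy == "ignore":
            return 0.0, False  # U-Ignore: mask this label out of the loss
        raise ValueError(f"unknown policy: {policy}")
    return float(label), True


raw_labels = [1, 0, -1, None]
print([resolve_uncertain(y, policy="ones") for y in raw_labels])
```

With the ignore policy, the returned flag would typically be used as a per-label mask that multiplies the binary cross-entropy loss.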
IMPORTANT: Medical Disclaimer
FOR RESEARCH & EDUCATION ONLY
DO NOT USE FOR:
- Clinical diagnosis or treatment decisions
- Emergency medical situations
- Replacing professional radiologist review
- Patient care without expert validation
Limitations:
- Not clinically validated or FDA-approved
- Trained on historical Stanford data (2002-2017)
- Performance may vary on different populations/equipment
- Requires qualified radiologist review for any clinical use
Appropriate Uses:
- Academic research and benchmarking
- Algorithm development and comparison
- Educational demonstrations
- Proof-of-concept prototypes
Always consult qualified healthcare professionals for medical decisions.
Citation & Attribution
You MUST cite this work if you use it in publications:
@software{convnext_chexpert_attention_2025,
  author = {Time},
  title = {ConvNeXt-Base CheXpert Classifier with CBAM Attention},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/calender/Convnext-Chexpert-Attention}
}
@article{irvin2019chexpert,
  title = {CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison},
  author = {Irvin, Jeremy and Rajpurkar, Pranav and Ko, Michael and Yu, Yifan and others},
  journal = {AAAI Conference on Artificial Intelligence},
  volume = {33},
  pages = {590--597},
  year = {2019}
}
Claiming you trained this model when you didn't is scientific misconduct.
Links
- GitHub (Code + Training): https://github.com/jikaan/convnext-chexpert-attention
- CheXpert Dataset: https://stanfordmlgroup.github.io/competitions/chexpert/
- Paper: https://arxiv.org/abs/1901.07031
License
Apache License 2.0 - See LICENSE for details.
Created by Time | October 2025



