---
license: mit
language:
- en
base_model:
- microsoft/resnet-50
pipeline_tag: image-classification
tags:
- SRH
- stimulated
- raman
- histology
- histopathology
- spinal
- tumors
- spine
- medical
---

# Model Card for `SpineXtract`

**Model ID:** `DavidReineckeMD/SpineXtract-Transformer-MLP`

## Model Summary

SpineXtract is a transformer-based multilayer perceptron (Transformer-MLP) designed for the **multiclass classification of spinal tumors** from *Stimulated Raman Histology (SRH)* images. The model was trained on data from the University Hospital Cologne (UKK, Germany) and evaluated on an independent multicenter test cohort from New York University (NYU), the University of Michigan (UM), and the Medical University of Vienna (MUV). It provides near-real-time intraoperative tumor classification into **four diagnostic entities**: meningioma, schwannoma, metastasis, and ependymoma.

---

## Model Details

### Model Description

- **Developed by:** David Reinecke, MD
- **Institution:** Department of General Neurosurgery, University Hospital Cologne, Germany
- **Funded by:** German Spine Foundation and German Research Foundation
- **Shared by:** David Reinecke, MD
- **Model type:** ResNet-50 + Transformer-MLP
- **License:** Academic research use only (non-commercial)
- **Finetuned from:** Custom self-supervised BYOL encoder (ResNet-50 backbone)

### Model Sources

- **Repository:** [https://github.com/DavidReineckeMD/SpineXtract](https://github.com/DavidReineckeMD/SpineXtract)
- **Paper:** *(in review, 2025)*

---

## Uses

### Direct Use

This model can be used for **research on intraoperative tissue classification** from label-free SRH microscopy images. It outputs class probabilities for the four diagnostic entities.

### Downstream Use

- Integration into experimental SRH analysis pipelines.
- Adaptation to other histologic classification tasks after fine-tuning.
- Evaluation of model explainability and uncertainty quantification.

---

## Bias, Risks, and Limitations

- The model was trained on SRH data from a single academic center; rare spinal tumor entities are underrepresented.
- It has not been validated on prospective intraoperative data streams.

### Recommendations

Users should:

- Treat predictions as *assistive suggestions only*.
- Perform independent site-specific validation before deployment.

---

## Training Details

### Training Data

- **Training site:** University Hospital Cologne (UKK)
- **Data type:** Label-free SRH images (16-bit TIFF)
- **Patch size:** 300 × 300 px
- **Training set:** 1,258 SRH slides (>15 entities; 198 spinal slides used for fine-tuning)
- **Validation set:** 10% stratified split
- **External test sites:** NYU, UM, MUV (44 patients / 142 slides)
- **Ground truth:** FFPE H&E ± IHC/molecular analysis on separate tissue; blinded neuropathologist review.

### Preprocessing

- Augmentations: random horizontal/vertical flip, sharpness adjustment (factor = 2), Gaussian blur (σ = 1), additive noise, autocontrast, solarize, random erasing, affine transform (rotation 10°, translation [0.1, 0.3]), random resized crop (300 px); see the sketch below.
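For illustration, a minimal sketch of a comparable augmentation pipeline using torchvision, assuming patches are already loaded as float tensors in [0, 1]. Parameter values follow the list above; the Gaussian blur kernel size, noise amplitude, solarize threshold, and transform order are assumptions rather than the authors' exact configuration:

```python
import torch
from torchvision import transforms

# Sketch of an SRH patch augmentation pipeline (training only).
# Assumes 300 x 300 patches already converted to float tensors in [0, 1].
train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomAdjustSharpness(sharpness_factor=2),
    transforms.GaussianBlur(kernel_size=5, sigma=1.0),   # kernel size assumed
    transforms.Lambda(
        lambda x: (x + 0.01 * torch.randn_like(x)).clamp(0.0, 1.0)  # additive noise; amplitude assumed
    ),
    transforms.RandomAutocontrast(),
    transforms.RandomSolarize(threshold=0.5),            # threshold assumed
    transforms.RandomErasing(),
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.3)),
    transforms.RandomResizedCrop(300),
])
```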
### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| **Optimizer** | AdamW (LR = 0.001; β = [0.9, 0.999]; weight decay = 0.07) |
| **Scheduler** | Cosine annealing with 5% warm-up |
| **Epochs** | 10 (early stopping with patience = 5) |
| **Precision** | bf16-mixed |
| **Loss** | Cross-entropy |
| **Batch size** | 256 |
| **Seed** | 1000 |

### Speeds, Sizes, Times

- **Training hardware:** 4× Nvidia A100 (80 GB)
- **Framework:** PyTorch 2.1.2 / Torchvision 0.10.10
- **End-to-end SRH imaging + inference:** ≈ 5 min total

---

## Evaluation

### Testing Data

Independent multicenter test set (44 patients, 142 slides) from NYU, UM, and MUV. No overlap with the training data.

### Factors

- Site variation
- Tumor subtype
- Patch count per patient

### Metrics

| Metric | Patient-level | Slide-level |
|--------|---------------|-------------|
| **Macro balanced accuracy** | 92.9% (95% CI 85.5–98.2) | 92.2% |
| **Macro AUROC** | 98.0% (95% CI 93.8–100) | 96.5% |
| **Sensitivity / Specificity** | 89.4% / 96.4% | 89.6% / 95.2% |
| **Calibration** | Brier = 0.22; ECE ≤ 0.15 | — |
| **Decision curve** | Positive net benefit across thresholds 0.1–0.9 | — |

### Results

- **Best epoch:** 10 (validation loss plateaued after epoch 5).
- **High-confidence threshold (τ):** 0.77 for reliable patient-level decisions.

---

## Model Examination

- Tokenization: 2048-D embeddings → 64 × 32-D tokens.
- Positional encoding: sinusoidal (sin/cos, base 10,000, max_seq_len = 5000).
- Transformer layers: 3 (8 heads each, d_model = 32, FFN = 128).
- Dropout = 0.10; GELU activations; LayerNorm (ε = 1e−5).

A minimal PyTorch sketch of this head is provided in the appendix at the end of this card.

---

## Environmental Impact

| Parameter | Value |
|-----------|-------|
| **Hardware type** | 4× Nvidia A100 80 GB GPUs |
| **Cloud provider** | RAMSES HPC (University of Cologne) |
| **Compute region** | Germany |

---

## Technical Specifications

### Model Architecture and Objective

Custom ResNet-50 backbone (BYOL pretraining) + Transformer-MLP head for multiclass classification (4 classes). Objective: cross-entropy, with outputs calibrated via Platt scaling.

### Compute Infrastructure

- **Environment:** Containerized (CUDA 12.x + cuDNN 8.x)
- **Reproducibility:** Identical software stack across all centers
- **Metrics library:** Scikit-learn 1.4.0

### Hardware

- **Training:** 4× Nvidia A100 (80 GB)
- **Inference:** Single GPU or CPU
- **Precision:** bf16-mixed

### Software

Python 3.10.13, PyTorch 2.1.2, Torchvision 0.10.10, Scikit-learn 1.4.0, Pydicom 2.4.3, tifffile 2023.9.26

---

## Model Card Authors

David Reinecke, MD, University Hospital Cologne, Germany

---

## Model Card Contact

📧 **david.reinecke@uk-koeln.de**
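---

## Appendix: Architecture Sketch

For illustration only, below is a minimal PyTorch sketch of the Transformer-MLP head described under *Model Examination*, built from standard `nn.TransformerEncoder` components. Layer sizes follow the card (64 × 32-D tokens, 3 layers, 8 heads, FFN = 128, dropout = 0.10, GELU, LayerNorm ε = 1e−5); token pooling, the final classifier, and the demo inputs are assumptions, not the authors' reference implementation (see the repository for that).

```python
import math
import torch
import torch.nn as nn


class SinusoidalPositionalEncoding(nn.Module):
    """Standard sin/cos positional encoding (base 10,000)."""

    def __init__(self, d_model: int = 32, max_seq_len: int = 5000):
        super().__init__()
        position = torch.arange(max_seq_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_seq_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, seq_len, d_model)
        return x + self.pe[: x.size(1)]


class TransformerMLPHead(nn.Module):
    """Reshapes 2048-D ResNet-50 embeddings into 64 x 32-D tokens and classifies them."""

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.pos_enc = SinusoidalPositionalEncoding(d_model=32, max_seq_len=5000)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=32, nhead=8, dim_feedforward=128,
            dropout=0.10, activation="gelu",
            layer_norm_eps=1e-5, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=3)
        self.classifier = nn.Linear(32, num_classes)  # single linear classifier is an assumption

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:  # embeddings: (B, 2048)
        tokens = embeddings.view(embeddings.size(0), 64, 32)      # (B, 64, 32)
        tokens = self.pos_enc(tokens)
        encoded = self.encoder(tokens)
        pooled = encoded.mean(dim=1)                              # mean pooling is an assumption
        return self.classifier(pooled)                            # logits over 4 classes


# Example: classify a batch of backbone embeddings and flag high-confidence predictions
# using the patient-level threshold reported in the card (tau = 0.77).
if __name__ == "__main__":
    head = TransformerMLPHead()
    logits = head(torch.randn(8, 2048))
    probs = logits.softmax(dim=-1)
    confident = probs.max(dim=-1).values >= 0.77
    print(probs.shape, confident.shape)  # torch.Size([8, 4]) torch.Size([8])
```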