LeJEPA v1 POC — Tiny-ImageNet

A 100-epoch reproduction of the LeJEPA self-supervised representation-learning recipe from Balestriero & LeCun, "LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics" (arXiv:2511.08544), on the Tiny-ImageNet dataset (200 classes, 100K training images).

Status: POC. This is a 100-epoch reproduction (the paper uses 800). Linear-probe val_acc is 23.77% (47.5× chance on 200 classes). A random-encoder baseline, computed 2026-05-16 (randomly initialized ViT-S, frozen, with a linear probe trained for a matched 100 epochs), reaches val_acc 8.50% (17× chance). JEPA training therefore reaches 2.80× the random-encoder accuracy, a gain of 15.27pp absolute.

Final metrics (epoch 99)

Metric	Value
SIGReg loss	1.584 (down from 11.826 at epoch 0)
Invariance loss	0.126 (down from 0.447)
Probe CE loss	3.610 (down from 5.195)
Linear-probe val_acc	0.2377

No representation collapse: the SIGReg loss converges and val_acc rises monotonically.
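
As a quick sanity check on the table above, the sketch below recomputes the relative reduction of each loss from the reported epoch-0 and epoch-99 values, and compares the probe cross-entropy with ln(200), the cross-entropy of a uniform guess over 200 classes. The dictionary and variable names are illustrative only; the numbers are copied from the table.

```python
import math

# (epoch 0, epoch 99) values copied from the final-metrics table.
metrics = {
    "sigreg": (11.826, 1.584),
    "invariance": (0.447, 0.126),
    "probe_ce": (5.195, 3.610),
}

# Fractional reduction of each loss over the 100 epochs.
reduction = {name: 1.0 - end / start for name, (start, end) in metrics.items()}
for name, frac in reduction.items():
    print(f"{name}: {frac:.1%} lower at epoch 99")

# Cross-entropy of a uniform prediction over 200 classes, in nats.
chance_ce = math.log(200)
print(f"chance-level CE: {chance_ce:.3f}")
```

The epoch-0 probe loss (5.195) sits just under this chance level of about 5.298, which is what one would expect from a probe that starts near uniform predictions.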

Baseline comparison

Setup	val_acc	× chance	Δ over chance
Chance (uniform)	0.005	1.0×	n/a
Random ViT-S + frozen + LP	0.0850	17.0×	+0.080
LeJEPA-trained ViT-S + frozen + LP	0.2377	47.5×	+0.233
Δ (JEPA over random)	+0.1527	2.80× random	n/a

Honest framing: JEPA training adds 15.27pp absolute over a matched-architecture random encoder. About 34% of the val_acc gain over chance (0.080 of 0.233) is already achieved by a random high-dimensional projection plus a linear probe, which is non-trivial on Tiny-ImageNet (a known random-features effect; Rahimi & Recht 2007). The remaining 66% comes from JEPA training.
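
The arithmetic behind the comparison can be reproduced from the three accuracies in the table. The sketch below is only a worked calculation with illustrative variable names. It computes the split two ways: measured over chance, the random encoder accounts for roughly 34% of the gain and JEPA training for roughly 66%; measured against raw val_acc (0.0850 / 0.2377), the random encoder accounts for roughly 36%.

```python
num_classes = 200
chance = 1.0 / num_classes   # uniform guessing
random_acc = 0.0850          # random ViT-S + frozen + linear probe
jepa_acc = 0.2377            # LeJEPA-trained ViT-S + frozen + linear probe

# Multiples of chance, absolute gain, and ratio over the random encoder.
times_chance_random = random_acc / chance
times_chance_jepa = jepa_acc / chance
delta = jepa_acc - random_acc
ratio = jepa_acc / random_acc

# Split of the gain over chance between random features and JEPA training.
gain_over_chance = jepa_acc - chance
share_random = (random_acc - chance) / gain_over_chance
share_jepa = delta / gain_over_chance

# Alternative split measured against raw val_acc instead of gain over chance.
raw_share_random = random_acc / jepa_acc

print(f"random: {times_chance_random:.1f}x chance, JEPA: {times_chance_jepa:.1f}x chance")
print(f"JEPA over random: +{delta:.4f} absolute, {ratio:.2f}x")
print(f"over chance: {share_random:.1%} random features, {share_jepa:.1%} JEPA training")
print(f"of raw val_acc: {raw_share_random:.1%} random features")
```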

Architecture (matches the paper's MINIMAL.md)

Encoder: timm vit_small_patch8_224, img_size=128, num_classes=512
Projection: MLP 512 → 2048 → 2048 → 16 (proj_dim)
4 views per image (V=4)
Loss: λ·SIGReg(proj) + (1-λ)·invariance_loss, λ=0.02
Linear probe trained jointly on detached encoder features (.detach()), for monitoring only
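
The components listed above can be sketched in PyTorch roughly as follows. This is a minimal sketch under assumptions, not the notebook's code. The source gives only the projector widths, so the BatchNorm/ReLU layout is an assumption. The invariance term is written as the mean squared deviation of each view's projection from the per-image mean over views, which is one plausible form. SIGReg is not implemented here and is passed in as a precomputed scalar. All function names are hypothetical.

```python
import torch
from torch import nn


def build_encoder():
    """ViT-S/8 at 128x128 with a 512-d output, as listed above."""
    import timm  # imported lazily so the helpers below run without timm installed

    # pretrained=False is an assumption: the encoder is trained from scratch.
    return timm.create_model(
        "vit_small_patch8_224", pretrained=False, img_size=128, num_classes=512
    )


def build_projector(in_dim=512, hidden=2048, proj_dim=16):
    """MLP 512 -> 2048 -> 2048 -> 16. Norm and activation are assumptions."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
        nn.Linear(hidden, proj_dim),
    )


def invariance_loss(proj):
    """proj: (V, N, proj_dim). Mean squared deviation from the per-image view mean."""
    center = proj.mean(dim=0, keepdim=True)
    return (proj - center).square().mean()


def combined_loss(sigreg, inv, lam=0.02):
    """lam * SIGReg + (1 - lam) * invariance, with SIGReg computed elsewhere."""
    return lam * sigreg + (1.0 - lam) * inv


# Shape check with random stand-ins for encoder outputs: V=4 views, N=8 images.
V, N = 4, 8
projector = build_projector()
emb = torch.randn(V * N, 512)            # stand-in for encoder(views)
proj = projector(emb).reshape(V, N, -1)  # (4, 8, 16)
inv = invariance_loss(proj)
```

If the final-epoch values in the metrics table are reported unweighted, `combined_loss(1.584, 0.126)` gives 0.02·1.584 + 0.98·0.126 ≈ 0.155, so with λ=0.02 the invariance term carries most of the weighted objective.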

Hyperparameters

AdamW lr=2e-3, wd=5e-2
Batch 256, bf16 mixed precision
100 epochs, cosine LR with linear warmup
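
A cosine schedule with linear warmup can be written as a plain function of the optimizer step. The sketch below assumes a peak of lr=2e-3 as listed above. The source does not state the warmup length, the final learning rate, or the number of steps per epoch, so `warmup_steps`, `min_lr`, and the step counts in the example are hypothetical.

```python
import math


def lr_at(step, total_steps, warmup_steps, peak_lr=2e-3, min_lr=0.0):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical step counts: 100 epochs of 390 steps, with 10 warmup epochs.
steps_per_epoch = 390
total_steps = 100 * steps_per_epoch
warmup_steps = 10 * steps_per_epoch
for epoch in (0, 5, 10, 55, 100):
    print(epoch, lr_at(epoch * steps_per_epoch, total_steps, warmup_steps))
```

In PyTorch, the same curve could be attached to an optimizer with `torch.optim.lr_scheduler.LambdaLR`, using a lambda that returns `lr_at(...) / peak_lr` as the multiplicative factor.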

Files

File	Description
`latest.pt`	Final encoder + projection + probe state dict (309 MB)
`training_history.json`	Per-epoch metrics for all 100 epochs
`training_curves.png`	Loss + val_acc curves
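
The layout of `training_history.json` is not documented here, so a first step is to inspect it without assuming a schema. The sketch below uses the repository id from the citation URL and the real `hf_hub_download` function from `huggingface_hub`; the helper names `fetch_history` and `describe_history` are hypothetical.

```python
import json


def fetch_history(repo_id="caiovicentino1/lejepa-v1-tinyimagenet",
                  filename="training_history.json"):
    """Download the history file from the Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # requires network access

    return hf_hub_download(repo_id=repo_id, filename=filename)


def describe_history(path):
    """Summarize the top-level JSON structure without assuming a schema."""
    with open(path, "r", encoding="utf-8") as f:
        history = json.load(f)
    if isinstance(history, dict):
        return {"type": "dict", "keys": sorted(history)}
    if isinstance(history, list):
        first = history[0] if history else None
        keys = sorted(first) if isinstance(first, dict) else None
        return {"type": "list", "length": len(history), "first_keys": keys}
    return {"type": type(history).__name__}
```

Calling `describe_history(fetch_history())` prints enough to see whether the file is a list of per-epoch records or a dictionary of per-metric arrays before writing any plotting code against it.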

Reproducibility

Training notebook: nb_lejepa_v1_tinyimagenet.ipynb
Random-encoder baseline: nb_lejepa_v1_random_encoder_baseline.ipynb

Limitations

POC at 100 epochs (paper uses 800); linear-probe accuracy is expected to be correspondingly lower than in a full-length run.
Not SOTA. DINOv2 or MAE at the same scale would likely beat this.
v2 cross-modal extension (Flickr30K, vision + text) pending.

Citation

@misc{openinterp-lejepa-v1-tinyimagenet-2026,
  author = {Vicentino, Caio},
  title  = {LeJEPA v1 POC: Tiny-ImageNet 100-epoch reproduction with random-encoder baseline},
  year   = {2026},
  url    = {https://huggingface.co/caiovicentino1/lejepa-v1-tinyimagenet}
}

