---
language: en
tags:
  - clip
  - vision
  - transformers
  - interpretability
  - sparse autoencoder
  - sae
  - mechanistic interpretability
library_name: torch
pipeline_tag: feature-extraction
metrics:
  - type: explained_variance
    value: 89.42
    pretty_name: Explained Variance %
    range:
      min: 0
      max: 100
  - type: l0
    value: 655.9
    pretty_name: L0
---

CLIP-B-32 Sparse Autoencoder x64 vanilla - L1:1e-05

Training Details

  • Base Model: CLIP-ViT-B-32 (LAION DataComp.XL-s13B-b90K)
  • Layer: 10
  • Component: hook_resid_post

Model Architecture

  • Input Dimension: 768
  • SAE Dimension: 49,152
  • Expansion Factor: x64 (vanilla architecture)
  • Activation Function: ReLU
  • Initialization: encoder_transpose_decoder
  • CLS Only: true (trained on CLS-token activations only)
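A minimal PyTorch sketch of the vanilla architecture described above, assembled from the listed hyperparameters (768 → 49,152, ReLU, encoder initialized as the decoder transpose). Parameter names and the exact initialization details are illustrative, not this repo's actual API:

```python
import torch
import torch.nn as nn

class VanillaSAE(nn.Module):
    """Sketch of a vanilla (ReLU) sparse autoencoder.

    Defaults follow the card: d_in=768 (CLIP-B/32 residual stream),
    d_sae=49152 (x64 expansion).
    """
    def __init__(self, d_in: int = 768, d_sae: int = 49152):
        super().__init__()
        self.W_enc = nn.Parameter(torch.empty(d_in, d_sae))
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.empty(d_sae, d_in))
        self.b_dec = nn.Parameter(torch.zeros(d_in))
        # encoder_transpose_decoder init (one plausible reading):
        # unit-norm decoder rows, encoder set to the decoder transpose.
        nn.init.kaiming_uniform_(self.W_dec)
        with torch.no_grad():
            self.W_dec /= self.W_dec.norm(dim=-1, keepdim=True)
            self.W_enc.copy_(self.W_dec.t())

    def forward(self, x: torch.Tensor):
        # Encode relative to the decoder bias, then reconstruct.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        recon = acts @ self.W_dec + self.b_dec
        return recon, acts
```

With `CLS Only: true`, the input `x` would be the CLS-token residual-stream activation at layer 10 rather than every patch token.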

Performance Metrics

  • L1 Coefficient: 1e-05
  • L0 Sparsity: 655.9
  • Explained Variance: 89.42%
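The two metrics above can be computed from a batch of inputs and SAE outputs roughly as follows. This is a sketch: the function names are mine, and the repo's exact variance convention (per-dimension vs. pooled) may differ:

```python
import torch

def l0_sparsity(acts: torch.Tensor) -> float:
    """Mean number of nonzero SAE features per input (the L0 metric)."""
    return (acts != 0).float().sum(dim=-1).mean().item()

def explained_variance(x: torch.Tensor, recon: torch.Tensor) -> float:
    """Percentage of input variance captured by the reconstruction."""
    resid = (x - recon).pow(2).sum()
    total = (x - x.mean(dim=0)).pow(2).sum()
    return (1.0 - resid / total).item() * 100.0
```

On this reading, an L0 of 655.9 means roughly 656 of the 49,152 features fire per input, i.e. about 1.3% feature density.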

Training Configuration

  • Learning Rate: 0.01
  • LR Scheduler: Cosine Annealing with Warmup (200 steps)
  • Epochs: 10
  • Gradient Clipping: 1.0
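Under the settings listed, the optimization loop can be sketched as below. Assumptions not stated by the card: Adam as the optimizer, warmup implemented as a `LinearLR` ramp chained into `CosineAnnealingLR`, a flat step budget standing in for the 10 epochs, and a tiny stand-in SAE instead of the real 768 → 49,152 model:

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import SequentialLR, LinearLR, CosineAnnealingLR

# Tiny stand-in SAE (real dims are 768 -> 49,152).
d_in, d_sae = 16, 64
enc, dec = nn.Linear(d_in, d_sae), nn.Linear(d_sae, d_in)
params = list(enc.parameters()) + list(dec.parameters())

opt = torch.optim.Adam(params, lr=0.01)            # Learning Rate: 0.01
warmup_steps, total_steps = 200, 1_000             # total_steps is illustrative
sched = SequentialLR(
    opt,
    schedulers=[
        LinearLR(opt, start_factor=1e-3, total_iters=warmup_steps),  # 200-step warmup
        CosineAnnealingLR(opt, T_max=total_steps - warmup_steps),    # cosine decay
    ],
    milestones=[warmup_steps],
)

l1_coeff = 1e-5                                    # L1 Coefficient
for step in range(total_steps):
    x = torch.randn(32, d_in)                      # stand-in activation batch
    acts = torch.relu(enc(x))
    recon = dec(acts)
    # Reconstruction loss plus L1 sparsity penalty on the feature activations.
    loss = (recon - x).pow(2).mean() + l1_coeff * acts.abs().sum(dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(params, 1.0)          # Gradient Clipping: 1.0
    opt.step()
    sched.step()
```

`SequentialLR` switches from the linear warmup to cosine annealing at step 200, matching the "Cosine Annealing with Warmup (200 steps)" schedule above.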