potion-mxbai-micro

A 700KB static embedding model. Yes, really. Seven hundred kilobytes for useful sentence embeddings.

Highlights

  • 68.91 avg on full MTEB English (STS + Classification + PairClassification, 25 tasks)
  • 700KB total model size, small enough to fit in an email attachment
  • 256 dimensions, the same output dimensionality as the full model
  • 80-88x faster than all-MiniLM-L6-v2 on CPU
  • Pure numpy inference, no GPU needed
  • Drop-in compatible with model2vec and sentence-transformers

How It Was Made

  1. Start with our best 256D model (potion-mxbai-256d-v2, 70.98 avg)
  2. Apply model2vec's vocabulary quantization: cluster the 29,525 token embeddings into 2,000 centroids using k-means
  3. Each token maps to its nearest centroid via a token mapping table
  4. The full tokenizer is preserved, so all text still tokenizes correctly

The result is a 2,000-row embedding table at 256D with int8 quantization. Tokens that are semantically similar share the same embedding vector, which acts as a natural regularizer.
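The quantization step can be sketched in plain numpy. This is a toy illustration of the idea (tiny sizes stand in for the real 29,525 tokens and 2,000 centroids; the actual model2vec pipeline differs in details):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the real sizes (29,525 tokens -> 2,000 centroids at 256D).
n_tokens, dim, n_centroids = 500, 16, 8

embeddings = rng.normal(size=(n_tokens, dim)).astype(np.float32)

# Minimal k-means: assign each token to its nearest centroid, update, repeat.
centroids = embeddings[rng.choice(n_tokens, n_centroids, replace=False)]
for _ in range(10):
    dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
    token_to_centroid = dists.argmin(axis=1)  # the token mapping table
    for c in range(n_centroids):
        members = embeddings[token_to_centroid == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# int8 quantization of the centroid table (simple symmetric scheme).
scale = np.abs(centroids).max() / 127
table_int8 = np.round(centroids / scale).astype(np.int8)

# Lookup at inference: token id -> centroid row -> dequantized vector.
vec = table_int8[token_to_centroid[42]].astype(np.float32) * scale
print(table_int8.shape, vec.shape)  # (8, 16) (16,)
```

Only the small centroid table and the integer mapping need to be stored, which is where the size reduction comes from.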

Benchmark Results (Full MTEB English Suite)

| Model | STS | Classification | PairClassification | Avg | Size |
|---|---|---|---|---|---|
| potion-mxbai-2m-512d | 74.15 | 65.44 | 76.80 | 72.13 | ~125MB |
| potion-mxbai-256d-v2 | 73.79 | 63.23 | 77.33 | 71.45 | 7.5MB |
| potion-mxbai-128d-v2 | 72.56 | 61.48 | 75.45 | 69.83 | 3.9MB |
| potion-mxbai-micro (this) | 71.04 | 59.66 | 76.02 | 68.91 | 0.7MB |

Evaluated on 25 tasks (10 STS, 12 Classification, 3 PairClassification), English subsets only.

Usage

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("blobbybob/potion-mxbai-micro")
embeddings = model.encode(["Hello world", "Static embeddings are fast"])
```

With Sentence Transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("blobbybob/potion-mxbai-micro")
embeddings = model.encode(["Hello world", "Static embeddings are fast"])
```
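Downstream similarity over the 256D outputs is plain cosine similarity. A minimal sketch, with random vectors standing in for real `model.encode(...)` output so it runs without downloading the model:

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two batches of vectors."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)
emb = rng.normal(size=(2, 256)).astype(np.float32)  # stand-in for model.encode([...])
sims = cosine_sim(emb, emb)
print(sims.shape)  # (2, 2); the diagonal is ~1.0
```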

When to use this model

  • You need embeddings in extremely constrained environments (embedded systems, IoT, WASM)
  • You're building a browser extension or mobile app where every KB counts
  • You want a fallback embedding model that loads instantly
  • You need to embed millions of documents and want to minimize index storage
  • Prototyping: get semantic search working in seconds, upgrade to larger models later
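On the index-storage point, the arithmetic for 256D vectors is simple. The int8 row assumes you additionally quantize your stored index, which is your choice at indexing time, not something the model does for you; figures are illustrative:

```python
n_docs = 1_000_000
dim = 256

# Raw float32 output vectors vs. an int8-quantized index.
float32_bytes = n_docs * dim * 4
int8_bytes = n_docs * dim * 1

print(f"float32 index: {float32_bytes / 1e9:.2f} GB")  # 1.02 GB
print(f"int8 index:    {int8_bytes / 1e9:.2f} GB")     # 0.26 GB
```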

Model Family

| Model | Avg | Size | Best for |
|---|---|---|---|
| potion-mxbai-2m-512d | 72.13 | ~125MB | Maximum quality |
| potion-mxbai-256d-v2 | 71.45 | 7.5MB | Best quality/size balance |
| potion-mxbai-128d-v2 | 69.83 | 3.9MB | Compact deployments |
| potion-mxbai-micro | 68.91 | 0.7MB | Ultra-tiny / embedded |

Citation

```bibtex
@article{minishlab2024model2vec,
  author = {Tulkens, Stephan and {van Dongen}, Thomas},
  title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec}
}
```
Evaluation results

  • MTEB STS (English, 10 tasks): 71.04 spearman cosine (self-reported)
  • MTEB Classification (English, 12 tasks): 59.66 accuracy (self-reported)
  • MTEB PairClassification (English, 3 tasks): 76.02 average precision (self-reported)