SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a MultiOutputClassifier instance as the multi-label classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
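Step 2 can be illustrated with a small sketch: once the Sentence Transformer has been fine-tuned, its embeddings are passed to a scikit-learn MultiOutputClassifier, which fits one binary classifier per label. The embeddings and label targets below are synthetic placeholders (not this model's training data), assuming the 384-dimensional output of MiniLM-L12-v2:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(42)

# Placeholder "embeddings": in SetFit these come from the fine-tuned
# Sentence Transformer (384-dim for MiniLM-L12-v2).
X_train = rng.normal(size=(16, 384))

# Synthetic multi-label targets: each row is a binary indicator vector
# over 3 labels, constructed so every label has both classes present.
y_train = np.zeros((16, 3), dtype=int)
y_train[:8, 0] = 1   # label 0 positive for the first half
y_train[8:, 1] = 1   # label 1 positive for the second half
y_train[::2, 2] = 1  # label 2 positive for even-indexed rows

# One binary logistic-regression classifier per label.
head = MultiOutputClassifier(LogisticRegression(max_iter=1000))
head.fit(X_train, y_train)

preds = head.predict(rng.normal(size=(2, 384)))
print(preds.shape)  # (2, 3): one 0/1 decision per label, per input
```

In the real pipeline the only difference is that `X_train` holds embeddings produced by the contrastively fine-tuned body rather than random vectors.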

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("faodl/model_cca_multilabel_MiniLM-L12-v03")
# Run inference
preds = model("To monitor market dynamics and inform policy responses, the government will track the retail value of ultra-processed foods and analyze shifts in consumption in relation to labeling and advertising reforms. Data from these analyses will feed annual dashboards that link labeling density, promotional intensity, and dietary outcomes to guide targeted interventions and budget planning.")
```
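Because this is a multi-label model, a prediction is a binary vector with one 0/1 entry per label rather than a single class. Each underlying binary classifier effectively thresholds its positive-class probability at 0.5; that final step can be sketched in plain NumPy (the probability values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-label probabilities for one input text, of the kind
# a multi-label head's predict_proba would produce.
probs = np.array([0.91, 0.12, 0.67, 0.40])

# Each binary classifier predicts its label when the probability
# exceeds 0.5, yielding one 0/1 decision per label.
preds = (probs >= 0.5).astype(int)
print(preds)  # [1 0 1 0]
```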

Training Details

Training Set Metrics

| Training set | Min | Median   | Max |
|:-------------|:----|:---------|:----|
| Word count   | 1   | 123.6200 | 951 |

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
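The contrastive stage uses CosineSimilarityLoss, which pushes the cosine similarity of an embedding pair toward a target score (high for same-label pairs, low for different-label pairs) via a squared error. A minimal NumPy sketch of that objective, with made-up two-dimensional embeddings:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def cosine_similarity_loss(u, v, target):
    # Squared error between the pair's cosine similarity and its target
    # (1.0 for a same-label pair, 0.0 for a different-label pair).
    return (cosine_similarity(u, v) - target) ** 2

u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])  # identical direction to u
w = np.array([0.0, 1.0])  # orthogonal to u

print(cosine_similarity_loss(u, v, 1.0))  # identical pair, target 1 -> 0.0
print(cosine_similarity_loss(u, w, 0.0))  # orthogonal pair, target 0 -> 0.0
print(cosine_similarity_loss(u, w, 1.0))  # mismatched pair -> 1.0
```

Minimizing this over sampled pairs pulls same-label texts together and pushes different-label texts apart in embedding space before the classification head is trained.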

Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0006 | 1    | 0.1914        | -               |
| 0.0283 | 50   | 0.1948        | -               |
| 0.0566 | 100  | 0.1824        | -               |
| 0.0849 | 150  | 0.1661        | -               |
| 0.1132 | 200  | 0.1523        | -               |
| 0.1415 | 250  | 0.1383        | -               |
| 0.1698 | 300  | 0.1368        | -               |
| 0.1981 | 350  | 0.1267        | -               |
| 0.2264 | 400  | 0.124         | -               |
| 0.2547 | 450  | 0.127         | -               |
| 0.2830 | 500  | 0.1201        | -               |
| 0.3113 | 550  | 0.1206        | -               |
| 0.3396 | 600  | 0.1153        | -               |
| 0.3679 | 650  | 0.1105        | -               |
| 0.3962 | 700  | 0.1071        | -               |
| 0.4244 | 750  | 0.1067        | -               |
| 0.4527 | 800  | 0.1037        | -               |
| 0.4810 | 850  | 0.1072        | -               |
| 0.5093 | 900  | 0.1076        | -               |
| 0.5376 | 950  | 0.1072        | -               |
| 0.5659 | 1000 | 0.0984        | -               |
| 0.5942 | 1050 | 0.0972        | -               |
| 0.6225 | 1100 | 0.1023        | -               |
| 0.6508 | 1150 | 0.0993        | -               |
| 0.6791 | 1200 | 0.0959        | -               |
| 0.7074 | 1250 | 0.0989        | -               |
| 0.7357 | 1300 | 0.0918        | -               |
| 0.7640 | 1350 | 0.099         | -               |
| 0.7923 | 1400 | 0.0924        | -               |
| 0.8206 | 1450 | 0.0889        | -               |
| 0.8489 | 1500 | 0.092         | -               |
| 0.8772 | 1550 | 0.0908        | -               |
| 0.9055 | 1600 | 0.0891        | -               |
| 0.9338 | 1650 | 0.0876        | -               |
| 0.9621 | 1700 | 0.0931        | -               |
| 0.9904 | 1750 | 0.0798        | -               |
| 1.0187 | 1800 | 0.0811        | -               |
| 1.0470 | 1850 | 0.0785        | -               |
| 1.0753 | 1900 | 0.0796        | -               |
| 1.1036 | 1950 | 0.0849        | -               |
| 1.1319 | 2000 | 0.0805        | -               |
| 1.1602 | 2050 | 0.08          | -               |
| 1.1885 | 2100 | 0.0776        | -               |
| 1.2168 | 2150 | 0.0837        | -               |
| 1.2450 | 2200 | 0.0793        | -               |
| 1.2733 | 2250 | 0.0754        | -               |
| 1.3016 | 2300 | 0.078         | -               |
| 1.3299 | 2350 | 0.0796        | -               |
| 1.3582 | 2400 | 0.0777        | -               |
| 1.3865 | 2450 | 0.0787        | -               |
| 1.4148 | 2500 | 0.0752        | -               |
| 1.4431 | 2550 | 0.0775        | -               |
| 1.4714 | 2600 | 0.0749        | -               |
| 1.4997 | 2650 | 0.0722        | -               |
| 1.5280 | 2700 | 0.0832        | -               |
| 1.5563 | 2750 | 0.0738        | -               |
| 1.5846 | 2800 | 0.0863        | -               |
| 1.6129 | 2850 | 0.0754        | -               |
| 1.6412 | 2900 | 0.0855        | -               |
| 1.6695 | 2950 | 0.0767        | -               |
| 1.6978 | 3000 | 0.081         | -               |
| 1.7261 | 3050 | 0.075         | -               |
| 1.7544 | 3100 | 0.0754        | -               |
| 1.7827 | 3150 | 0.0689        | -               |
| 1.8110 | 3200 | 0.0758        | -               |
| 1.8393 | 3250 | 0.0734        | -               |
| 1.8676 | 3300 | 0.0718        | -               |
| 1.8959 | 3350 | 0.0784        | -               |
| 1.9242 | 3400 | 0.0776        | -               |
| 1.9525 | 3450 | 0.0773        | -               |
| 1.9808 | 3500 | 0.071         | -               |

Framework Versions

  • Python: 3.12.12
  • SetFit: 1.1.3
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```