SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a MultiOutputClassifier instance as the multi-label classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
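Step 2 can be illustrated with a small sketch: once the Sentence Transformer has been fine-tuned, its embeddings are passed to a scikit-learn MultiOutputClassifier, which fits one binary classifier per label. The embeddings and label targets below are synthetic placeholders (not this model's training data), assuming the 384-dimensional output of MiniLM-L12-v2:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(42)

# Placeholder "embeddings": in SetFit these come from the fine-tuned
# Sentence Transformer (384-dim for MiniLM-L12-v2).
X_train = rng.normal(size=(16, 384))

# Synthetic multi-label targets: each row is a binary indicator vector
# over 3 labels, constructed so every label has both classes present.
y_train = np.zeros((16, 3), dtype=int)
y_train[:8, 0] = 1   # label 0 positive for the first half
y_train[8:, 1] = 1   # label 1 positive for the second half
y_train[::2, 2] = 1  # label 2 positive for even-indexed rows

# One binary logistic-regression classifier per label.
head = MultiOutputClassifier(LogisticRegression(max_iter=1000))
head.fit(X_train, y_train)

preds = head.predict(rng.normal(size=(2, 384)))
print(preds.shape)  # (2, 3): one 0/1 decision per label, per input
```

In the real pipeline the only difference is that `X_train` holds embeddings produced by the contrastively fine-tuned body rather than random vectors.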

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("faodl/model_cca_multilabel_MiniLM-L12-v03")
# Run inference
preds = model("To monitor market dynamics and inform policy responses, the government will track the retail value of ultra-processed foods and analyze shifts in consumption in relation to labeling and advertising reforms. Data from these analyses will feed annual dashboards that link labeling density, promotional intensity, and dietary outcomes to guide targeted interventions and budget planning.")
```
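Because this is a multi-label model, a prediction is a binary vector with one 0/1 entry per label rather than a single class. Each underlying binary classifier effectively thresholds its positive-class probability at 0.5; that final step can be sketched in plain NumPy (the probability values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-label probabilities for one input text, of the kind
# a multi-label head's predict_proba would produce.
probs = np.array([0.91, 0.12, 0.67, 0.40])

# Each binary classifier predicts its label when the probability
# exceeds 0.5, yielding one 0/1 decision per label.
preds = (probs >= 0.5).astype(int)
print(preds)  # [1 0 1 0]
```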

Training Details

Training Set Metrics

| Training set | Min | Median   | Max |
|:-------------|:----|:---------|:----|
| Word count   | 1   | 123.6200 | 951 |

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
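The contrastive stage uses CosineSimilarityLoss, which pushes the cosine similarity of an embedding pair toward a target score (high for same-label pairs, low for different-label pairs) via a squared error. A minimal NumPy sketch of that objective, with made-up two-dimensional embeddings:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def cosine_similarity_loss(u, v, target):
    # Squared error between the pair's cosine similarity and its target
    # (1.0 for a same-label pair, 0.0 for a different-label pair).
    return (cosine_similarity(u, v) - target) ** 2

u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])  # identical direction to u
w = np.array([0.0, 1.0])  # orthogonal to u

print(cosine_similarity_loss(u, v, 1.0))  # identical pair, target 1 -> 0.0
print(cosine_similarity_loss(u, w, 0.0))  # orthogonal pair, target 0 -> 0.0
print(cosine_similarity_loss(u, w, 1.0))  # mismatched pair -> 1.0
```

Minimizing this over sampled pairs pulls same-label texts together and pushes different-label texts apart in embedding space before the classification head is trained.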

Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0006 | 1    | 0.1914        | -               |
| 0.0283 | 50   | 0.1948        | -               |
| 0.0566 | 100  | 0.1824        | -               |
| 0.0849 | 150  | 0.1661        | -               |
| 0.1132 | 200  | 0.1523        | -               |
| 0.1415 | 250  | 0.1383        | -               |
| 0.1698 | 300  | 0.1368        | -               |
| 0.1981 | 350  | 0.1267        | -               |
| 0.2264 | 400  | 0.124         | -               |
| 0.2547 | 450  | 0.127         | -               |
| 0.2830 | 500  | 0.1201        | -               |
| 0.3113 | 550  | 0.1206        | -               |
| 0.3396 | 600  | 0.1153        | -               |
| 0.3679 | 650  | 0.1105        | -               |
| 0.3962 | 700  | 0.1071        | -               |
| 0.4244 | 750  | 0.1067        | -               |
| 0.4527 | 800  | 0.1037        | -               |
| 0.4810 | 850  | 0.1072        | -               |
| 0.5093 | 900  | 0.1076        | -               |
| 0.5376 | 950  | 0.1072        | -               |
| 0.5659 | 1000 | 0.0984        | -               |
| 0.5942 | 1050 | 0.0972        | -               |
| 0.6225 | 1100 | 0.1023        | -               |
| 0.6508 | 1150 | 0.0993        | -               |
| 0.6791 | 1200 | 0.0959        | -               |
| 0.7074 | 1250 | 0.0989        | -               |
| 0.7357 | 1300 | 0.0918        | -               |
| 0.7640 | 1350 | 0.099         | -               |
| 0.7923 | 1400 | 0.0924        | -               |
| 0.8206 | 1450 | 0.0889        | -               |
| 0.8489 | 1500 | 0.092         | -               |
| 0.8772 | 1550 | 0.0908        | -               |
| 0.9055 | 1600 | 0.0891        | -               |
| 0.9338 | 1650 | 0.0876        | -               |
| 0.9621 | 1700 | 0.0931        | -               |
| 0.9904 | 1750 | 0.0798        | -               |
| 1.0187 | 1800 | 0.0811        | -               |
| 1.0470 | 1850 | 0.0785        | -               |
| 1.0753 | 1900 | 0.0796        | -               |
| 1.1036 | 1950 | 0.0849        | -               |
| 1.1319 | 2000 | 0.0805        | -               |
| 1.1602 | 2050 | 0.08          | -               |
| 1.1885 | 2100 | 0.0776        | -               |
| 1.2168 | 2150 | 0.0837        | -               |
| 1.2450 | 2200 | 0.0793        | -               |
| 1.2733 | 2250 | 0.0754        | -               |
| 1.3016 | 2300 | 0.078         | -               |
| 1.3299 | 2350 | 0.0796        | -               |
| 1.3582 | 2400 | 0.0777        | -               |
| 1.3865 | 2450 | 0.0787        | -               |
| 1.4148 | 2500 | 0.0752        | -               |
| 1.4431 | 2550 | 0.0775        | -               |
| 1.4714 | 2600 | 0.0749        | -               |
| 1.4997 | 2650 | 0.0722        | -               |
| 1.5280 | 2700 | 0.0832        | -               |
| 1.5563 | 2750 | 0.0738        | -               |
| 1.5846 | 2800 | 0.0863        | -               |
| 1.6129 | 2850 | 0.0754        | -               |
| 1.6412 | 2900 | 0.0855        | -               |
| 1.6695 | 2950 | 0.0767        | -               |
| 1.6978 | 3000 | 0.081         | -               |
| 1.7261 | 3050 | 0.075         | -               |
| 1.7544 | 3100 | 0.0754        | -               |
| 1.7827 | 3150 | 0.0689        | -               |
| 1.8110 | 3200 | 0.0758        | -               |
| 1.8393 | 3250 | 0.0734        | -               |
| 1.8676 | 3300 | 0.0718        | -               |
| 1.8959 | 3350 | 0.0784        | -               |
| 1.9242 | 3400 | 0.0776        | -               |
| 1.9525 | 3450 | 0.0773        | -               |
| 1.9808 | 3500 | 0.071         | -               |

Framework Versions

  • Python: 3.12.12
  • SetFit: 1.1.3
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```