xlm-roberta-large-text-entailment-88 - Multilingual Textual Entailment
Model Description
This model is a fine-tuned version of xlm-roberta-large for multilingual textual entailment (Natural Language Inference) on English and Korean text.
Task: Given a premise and a hypothesis, predict whether the hypothesis is:
- Entailment (0): The hypothesis is necessarily true given the premise
- Neutral (1): The hypothesis might be true given the premise
- Contradiction (2): The hypothesis is necessarily false given the premise
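For reference, the integer-to-label convention above is used again in the usage examples later in this card:

```python
# Integer label convention used by this model's classification head.
label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}
```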
Intended Uses & Limitations
Intended Uses
- Textual entailment / Natural Language Inference tasks
- English and Korean language pairs
- Research and educational purposes
- Building NLU applications
Limitations
- Trained primarily on English (94%) and Korean (6%) data
- May not generalize well to other languages
- Performance may vary on out-of-domain text
- Not suitable for tasks requiring deep reasoning or external knowledge
Training Data
The model was trained on a multilingual dataset containing:
- English samples: ~382k
- Korean samples: ~25k
- Total samples: ~407k premise-hypothesis pairs
- Label distribution: Balanced (33% each class)
Data preprocessing:
- Case preservation (no lowercasing) for optimal performance with cased models
- Unicode normalization (NFC) for Korean text
- Special character cleanup
- Duplicate removal
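A minimal sketch of what these preprocessing steps could look like; the `preprocess` and `deduplicate` helpers below are illustrative, not the project's actual pipeline:

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Sketch of the steps above: NFC-normalize (important for Korean),
    strip stray zero-width/control characters, collapse whitespace, and
    keep the original casing."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"[\u200b\u200c\u200d\ufeff]", "", text)  # zero-width chars
    text = re.sub(r"\s+", " ", text).strip()
    return text

def deduplicate(pairs):
    """Drop exact duplicate (premise, hypothesis, label) triples."""
    seen, unique = set(), []
    for premise, hypothesis, label in pairs:
        key = (preprocess(premise), preprocess(hypothesis), label)
        if key not in seen:
            seen.add(key)
            unique.append((premise, hypothesis, label))
    return unique
```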
Training Procedure
Training Hyperparameters
- Base model: xlm-roberta-large
- Learning rate: 1e-05
- Batch size: 32
- Gradient accumulation: 4 steps
- Effective batch size: 128
- Epochs: 15
- Max sequence length: 192
- Label smoothing: 0.1
- Weight decay: 0.01
- Warmup ratio: 10%
- Mixed precision: FP16
- Early stopping patience: 4 epochs
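As a rough orientation, these hyperparameters map onto Hugging Face `TrainingArguments` roughly as sketched below. This is an assumed reconstruction, not the exact training script, and some argument names (e.g. `eval_strategy` vs. the older `evaluation_strategy`) differ across transformers releases:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Approximate mapping of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="xlmr-large-entailment",   # placeholder output directory
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=4,        # effective batch size 32 * 4 = 128
    num_train_epochs=15,
    weight_decay=0.01,
    warmup_ratio=0.1,
    label_smoothing_factor=0.1,
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,          # required for early stopping
    metric_for_best_model="f1",
)

# Early stopping after 4 epochs without improvement, as listed above.
early_stopping = EarlyStoppingCallback(early_stopping_patience=4)
```

The max sequence length of 192 is applied at tokenization time rather than through `TrainingArguments`.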
Hardware
- GPU: NVIDIA H100/RTX 4060 Ti
- Training time: ~60-90 minutes
Evaluation Results
Overall Performance
| Metric | Score |
|---|---|
| F1 Score (weighted) | 0.8800 |
| Accuracy | 0.8800 |
Per-Class Performance
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| entailment | 0.87 | 0.90 | 0.89 | 8606 |
| neutral | 0.85 | 0.83 | 0.84 | 8530 |
| contradiction | 0.91 | 0.90 | 0.90 | 8600 |
| accuracy | | | 0.88 | 25736 |
| macro avg | 0.88 | 0.88 | 0.88 | 25736 |
| weighted avg | 0.88 | 0.88 | 0.88 | 25736 |
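The per-class table follows the scikit-learn classification-report layout. A report of this form can be reproduced from evaluation predictions as sketched below; `y_true` and `y_pred` are placeholders for the gold and predicted label IDs on the held-out split:

```python
from sklearn.metrics import classification_report, f1_score

# Placeholder label lists (0 = entailment, 1 = neutral, 2 = contradiction).
y_true = [0, 1, 2, 0]
y_pred = [0, 1, 2, 1]

print(classification_report(
    y_true, y_pred,
    target_names=["entailment", "neutral", "contradiction"],
))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))
```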
Usage
Direct Usage (Transformers)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "bekalebendong/xlm-roberta-large-text-entailment-88"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example inference
premise = "Orsay is one of the few Paris museums that is air-conditioned."
hypothesis = "The Orsay museum has air conditioning."

# Tokenize
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=192)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=1)
    label = torch.argmax(predictions, dim=1).item()

# Map to label
label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}
print(f"Prediction: {label_map[label]} (confidence: {predictions[0][label]:.4f})")
```
Pipeline Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="bekalebendong/xlm-roberta-large-text-entailment-88",
    tokenizer="bekalebendong/xlm-roberta-large-text-entailment-88",
)

# Premise/hypothesis pairs are passed as a dict with "text" and "text_pair".
result = classifier({
    "text": "Orsay is one of the few Paris museums that is air-conditioned.",
    "text_pair": "The Orsay museum has air conditioning.",
})
print(result)
```
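The pipeline reports labels as defined in the model's id2label config; if they appear as generic LABEL_0/LABEL_1/LABEL_2, map them back using the convention above (0 = entailment, 1 = neutral, 2 = contradiction). In recent transformers versions, passing `top_k=None` when calling the pipeline returns scores for all three classes instead of only the highest-scoring one.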
Citation
If you use this model in your research, please cite:
```bibtex
@misc{xlm-roberta-text-entailment,
  author       = {Your Name},
  title        = {Multilingual Textual Entailment with XLM-RoBERTa},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bekalebendong/xlm-roberta-large-text-entailment-88}}
}
```
Acknowledgments
- Base model: xlm-roberta-large by FacebookAI
- Training framework: PyTorch + Hugging Face Transformers
- Optimization: Mixed precision training (FP16) with gradient accumulation
Model Card Authors
Your Name
Model Card Contact
For questions or issues, please open an issue on the model repository.