# news_sentiment_classifier
This model is a fine-tuned version of `distilbert-base-uncased` on the Sentiment analysis for financial news dataset. It achieves the following results on the evaluation set:
- Accuracy: 0.8201
- F1: 0.8205
- Loss: 0.4548
## Model Description
- Base Architecture: DistilBERT (`distilbert-base-uncased`), a lightweight transformer distilled from BERT, optimized for efficiency while maintaining strong performance on NLP tasks
- Task: Multiclass sentiment classification of financial news headlines
- Output Classes:
  - 0 = Negative
  - 1 = Neutral
  - 2 = Positive
- Input Format: Raw English text (financial/news headlines), tokenized using the DistilBERT WordPiece tokenizer with:
  - `padding=True` (dynamic batch padding)
  - `truncation=True` (truncate sequences longer than the model max length, 512 tokens)
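As a quick illustration of these tokenizer settings, the sketch below encodes two hypothetical headlines; with `padding=True`, the shorter one is padded to the length of the longest sequence in the batch.

```python
from transformers import AutoTokenizer

# Sketch using the base tokenizer; the headlines are made up for illustration
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
batch = [
    "Fed raises rates",
    "Company beats quarterly revenue estimates on strong overseas demand",
]
enc = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
print(enc["input_ids"].shape)  # (2, length of the longest sequence in the batch)
```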
## Intended Uses & Limitations
Intended Uses:
- Sentiment classification of short text in financial and economic contexts
- Integration into news monitoring pipelines or financial market analysis tools
- Batch or streaming inference for classifying sentiment of headlines
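For batch or streaming use, here is a minimal sketch with the `transformers` `pipeline` API; the headlines are hypothetical, and the exact label strings returned depend on the model's `id2label` configuration.

```python
from transformers import pipeline

# Batch inference over a list of headlines (hypothetical examples)
clf = pipeline(
    "text-classification",
    model="Dugerij/news_sentiment_classifier",
    batch_size=32,
)
headlines = [
    "Oil prices steady as supply concerns ease",
    "Bank shares rally on upbeat earnings guidance",
]
for result in clf(headlines):
    print(result)  # e.g. {'label': '...', 'score': 0.97}
```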
Limitations:
- Model performance may degrade on long-form text, as it is optimized for short inputs like headlines; longer inputs are simply truncated (see the sketch after this list)
- Domain-specific bias: Training data is financial news; model may not generalize to unrelated topics (sports, social media)
- Language limitation: English only (based on `distilbert-base-uncased`)
- Predictions are class labels and probabilities only; the model does not provide reasoning or explanation
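To see the truncation limit concretely, a small sketch with a synthetic long input:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
long_text = "markets " * 1000  # synthetic long-form input
enc = tokenizer(long_text, truncation=True)
print(len(enc["input_ids"]))  # capped at the 512-token model maximum
```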
## Training and Evaluation Data
- Training Dataset: Custom dataset of financial news headlines, labeled with sentiment (`negative`, `neutral`, `positive`)
- Dataset Size: 4,838 total examples, split as follows (a reconstruction sketch follows this list):
- Training Set: 70% (~3,386 examples)
- Validation Set: 15% (~726 examples)
- Test Set: 15% (~726 examples)
- Evaluation Metrics:
- Accuracy
- Weighted F1-score (weights each class by its support, making it appropriate for imbalanced multiclass data)
- Tokenizer: DistilBERT WordPiece tokenizer with `padding=True` (dynamic padding) and `truncation=True`
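Below is a hedged reconstruction of the split and metric computation described above. The CSV file name, the column names, and the reuse of seed 42 for splitting are assumptions; the underlying dataset is not published with this card.

```python
from datasets import load_dataset
import evaluate

# Assumption: the custom dataset is a local CSV with "text" and "label" columns
ds = load_dataset("csv", data_files="financial_news.csv")["train"]

# 70/15/15 split: carve off 30%, then halve it into validation and test
split = ds.train_test_split(test_size=0.30, seed=42)
heldout = split["test"].train_test_split(test_size=0.50, seed=42)
train_ds, val_ds, test_ds = split["train"], heldout["train"], heldout["test"]

# Metric calls matching the evaluation setup; these predictions are dummies
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")
preds, refs = [0, 2, 1, 1], [0, 2, 2, 1]
print(accuracy.compute(predictions=preds, references=refs))
print(f1.compute(predictions=preds, references=refs, average="weighted"))
```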
## Example Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Dugerij/news_sentiment_classifier")
model = AutoModelForSequenceClassification.from_pretrained("Dugerij/news_sentiment_classifier")
model.eval()  # inference mode

text = ["Stocks plunge after weak earnings report"]
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():  # no gradients needed for prediction
    outputs = model(**inputs)

predictions = torch.argmax(outputs.logits, dim=-1)
print(predictions)  # 0 = Negative, 1 = Neutral, 2 = Positive
```
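Continuing the snippet above: if class probabilities are needed rather than just the argmax label, apply a softmax to the logits. The `id2label` mapping below simply restates the class list from this card.

```python
# Convert logits to class probabilities and readable labels
probs = torch.softmax(outputs.logits, dim=-1)
id2label = {0: "negative", 1: "neutral", 2: "positive"}
for headline, p, pred in zip(text, probs, predictions):
    print(headline, "->", id2label[int(pred)], [round(x, 3) for x in p.tolist()])
```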
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-06
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (PyTorch `adamw_torch` implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 20
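For reference, these values map onto `transformers` `TrainingArguments` roughly as in the sketch below; `output_dir` and anything not listed above (evaluation and save strategy, etc.) are assumptions, not recorded on this card.

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above
args = TrainingArguments(
    output_dir="news_sentiment_classifier",  # hypothetical path
    learning_rate=3e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",  # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```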
## Training results
The headline metrics at the top of this card correspond to the epoch 13 checkpoint below (highest validation accuracy and F1).
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 0.9813 | 1.0 | 85 | 0.8793 | 0.5929 | 0.4414 |
| 0.8166 | 2.0 | 170 | 0.7464 | 0.6799 | 0.6059 |
| 0.6847 | 3.0 | 255 | 0.6330 | 0.7581 | 0.7392 |
| 0.5672 | 4.0 | 340 | 0.5483 | 0.7935 | 0.7884 |
| 0.4823 | 5.0 | 425 | 0.5025 | 0.7994 | 0.7963 |
| 0.4187 | 6.0 | 510 | 0.4817 | 0.8024 | 0.7996 |
| 0.3761 | 7.0 | 595 | 0.4661 | 0.8024 | 0.8022 |
| 0.3453 | 8.0 | 680 | 0.4580 | 0.8097 | 0.8088 |
| 0.32 | 9.0 | 765 | 0.4561 | 0.8097 | 0.8093 |
| 0.2931 | 10.0 | 850 | 0.4512 | 0.8142 | 0.8136 |
| 0.2757 | 11.0 | 935 | 0.4498 | 0.8156 | 0.8152 |
| 0.2646 | 12.0 | 1020 | 0.4542 | 0.8127 | 0.8133 |
| 0.2483 | 13.0 | 1105 | 0.4548 | 0.8201 | 0.8205 |
| 0.2355 | 14.0 | 1190 | 0.4557 | 0.8156 | 0.8159 |
| 0.2192 | 15.0 | 1275 | 0.4599 | 0.8156 | 0.8156 |
## Framework versions
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.2