news_sentiment_classifier

This model is a fine-tuned version of distilbert-base-uncased on the "Sentiment Analysis for Financial News" dataset. It achieves the following results on the evaluation set:

Accuracy: 0.8201
F1: 0.8205
Loss: 0.4548

Model Description

Base Architecture: DistilBERT (distilbert-base-uncased) – a lightweight transformer architecture distilled from BERT, optimized for efficiency while maintaining strong performance in NLP tasks.
Task: Multiclass sentiment classification of financial news headlines
Output Classes:
- 0 – Negative
- 1 – Neutral
- 2 – Positive
Input Format: Raw English text (financial/news headlines), tokenized using the DistilBERT WordPiece tokenizer with:
- padding=True (dynamic batch padding)
- truncation=True (truncate sequences longer than the model max length, 512 tokens)
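
As a rough illustration of what these two options do to a batch, the sketch below pads and truncates lists of token IDs by hand. It is not the tokenizer's implementation: `pad_and_truncate` is a hypothetical helper, the token IDs are made up, and special tokens such as [CLS] and [SEP] are ignored. The pad ID of 0 corresponds to the [PAD] token in the distilbert-base-uncased vocabulary.

```python
def pad_and_truncate(batch, max_length=512, pad_id=0):
    """Truncate each sequence to max_length, then pad to the longest one in the batch."""
    truncated = [ids[:max_length] for ids in batch]
    longest = max(len(ids) for ids in truncated)
    # Pad on the right so every row has the same length
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # 1 marks a real token, 0 marks padding
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated]
    return input_ids, attention_mask


# Hypothetical token IDs for two headlines of different lengths
batch = [[2015, 3001, 4220], [2015, 3001, 4220, 5102, 6117]]
input_ids, attention_mask = pad_and_truncate(batch, max_length=4)
print(input_ids)
print(attention_mask)
```

Padding to the longest sequence in the batch, rather than always to 512, keeps short-headline batches small.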

Intended Uses & Limitations

Intended Uses:

Sentiment classification of short text in financial and economic contexts
Integration into news monitoring pipelines or financial market analysis tools
Batch or streaming inference for classifying sentiment of headlines
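
For batch or streaming use, headlines can be grouped into fixed-size chunks before tokenization. The helper below is a minimal sketch, not part of this model's code: `batched` is a hypothetical name, the headlines are placeholders, and the default batch size of 32 is an assumption borrowed from the eval_batch_size used in training.

```python
from itertools import islice


def batched(headlines, batch_size=32):
    """Yield lists of up to batch_size headlines from any iterable (list, generator, stream)."""
    iterator = iter(headlines)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical stream of 70 headlines
stream = (f"Headline {i}" for i in range(70))
sizes = [len(b) for b in batched(stream)]
print(sizes)
```

Each yielded list can then be passed to the tokenizer and model in a single forward pass.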

Limitations:

Model performance may degrade on long-form text, as it is optimized for short inputs like headlines
Domain-specific bias: Training data is financial news; the model may not generalize to unrelated domains (e.g., sports, social media)
Language limitation: English only (based on distilbert-base-uncased)
The model outputs only class logits, from which labels and probabilities are derived; it does not provide reasoning or explanations

Training and Evaluation Data

Training Dataset: Custom dataset of financial news headlines, labeled with sentiment (negative, neutral, positive)
Dataset Size: 4,838 total examples
- Training Set: 70% (~3,386 examples)
- Validation Set: 15% (~726 examples)
- Test Set: 15% (~726 examples)
Evaluation Metrics:
- Accuracy
- Weighted F1-score (per-class F1 averaged by class support, so it accounts for class imbalance)
Tokenizer: DistilBERT WordPiece tokenizer with padding=True (dynamic padding) and truncation=True
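
To make the two metrics concrete, here is a minimal pure-Python sketch of accuracy and support-weighted F1. The labels are hypothetical and the functions are illustrative; the library actually used to compute the reported scores is not stated in this card.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical labels: 0=Negative, 1=Neutral, 2=Positive
y_true = [0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 1, 2, 2, 2]
print(accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
```

Because each class's F1 is weighted by its number of true examples, a frequent class contributes more to the score than a rare one.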

Example Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Dugerij/news_sentiment_classifier")
model = AutoModelForSequenceClassification.from_pretrained("Dugerij/news_sentiment_classifier")

text = ["Stocks plunge after weak earnings report"]
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Inference only: skip gradient tracking to save memory
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class for each input
predictions = torch.argmax(outputs.logits, dim=-1)
print(predictions)  # class IDs: 0=Negative, 1=Neutral, 2=Positive
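
The forward pass returns raw logits, not probabilities. The sketch below shows the softmax step and the ID-to-label mapping in plain Python, using made-up logits in place of real model output; `ID2LABEL` is an illustrative name based on the class list in this card. In a PyTorch pipeline the equivalent call would be `torch.softmax(outputs.logits, dim=-1)`.

```python
import math

# Label mapping taken from the Output Classes list in this card
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one headline (not real model output)
logits = [2.1, 0.3, -1.4]
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(ID2LABEL[best], round(probs[best], 3))
```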

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-06
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 20
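
As a sketch of what the linear scheduler implies, the function below computes the learning rate at a given optimizer step. It rests on assumptions not confirmed by this card: zero warmup steps (none are listed), and a total of 20 × 85 = 1,700 steps, taking 85 steps per epoch from the training results table. `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=3e-6, total_steps=20 * 85):
    """Learning rate after `step` optimizer steps under linear decay to zero (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the start of selected epochs (85 steps per epoch assumed)
for epoch in (0, 5, 10, 15, 20):
    print(epoch, linear_lr(epoch * 85))
```

Under these assumptions, the rate at step 1275 is one quarter of the initial 3e-06.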

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
0.9813	1.0	85	0.8793	0.5929	0.4414
0.8166	2.0	170	0.7464	0.6799	0.6059
0.6847	3.0	255	0.6330	0.7581	0.7392
0.5672	4.0	340	0.5483	0.7935	0.7884
0.4823	5.0	425	0.5025	0.7994	0.7963
0.4187	6.0	510	0.4817	0.8024	0.7996
0.3761	7.0	595	0.4661	0.8024	0.8022
0.3453	8.0	680	0.4580	0.8097	0.8088
0.3200	9.0	765	0.4561	0.8097	0.8093
0.2931	10.0	850	0.4512	0.8142	0.8136
0.2757	11.0	935	0.4498	0.8156	0.8152
0.2646	12.0	1020	0.4542	0.8127	0.8133
0.2483	13.0	1105	0.4548	0.8201	0.8205
0.2355	14.0	1190	0.4557	0.8156	0.8159
0.2192	15.0	1275	0.4599	0.8156	0.8156
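
The headline metrics at the top of this card match the epoch 13 row. The short sketch below re-reads the table to compare two common selection criteria, highest F1 and lowest validation loss. Which criterion was actually used during training is not stated, so this is only a way to inspect the numbers.

```python
# (epoch, validation_loss, accuracy, f1) copied from the training results table
results = [
    (1, 0.8793, 0.5929, 0.4414),
    (2, 0.7464, 0.6799, 0.6059),
    (3, 0.6330, 0.7581, 0.7392),
    (4, 0.5483, 0.7935, 0.7884),
    (5, 0.5025, 0.7994, 0.7963),
    (6, 0.4817, 0.8024, 0.7996),
    (7, 0.4661, 0.8024, 0.8022),
    (8, 0.4580, 0.8097, 0.8088),
    (9, 0.4561, 0.8097, 0.8093),
    (10, 0.4512, 0.8142, 0.8136),
    (11, 0.4498, 0.8156, 0.8152),
    (12, 0.4542, 0.8127, 0.8133),
    (13, 0.4548, 0.8201, 0.8205),
    (14, 0.4557, 0.8156, 0.8159),
    (15, 0.4599, 0.8156, 0.8156),
]

best_f1 = max(results, key=lambda r: r[3])       # row with the highest F1
lowest_loss = min(results, key=lambda r: r[1])   # row with the lowest validation loss

print("best F1 at epoch", best_f1[0])
print("lowest validation loss at epoch", lowest_loss[0])
```

Validation loss reaches its minimum at epoch 11 and rises slightly afterwards while training loss keeps falling, which suggests the model begins to overfit after that point.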

Framework versions

Transformers 4.52.4
Pytorch 2.6.0+cu124
Datasets 3.6.0
Tokenizers 0.21.2

Model size: 67M params (Safetensors)
Tensor type: F32