news_sentiment_classifier

This model is a fine-tuned version of distilbert-base-uncased, trained on the "Sentiment Analysis for Financial News" dataset. It achieves the following results on the evaluation set:

  • Accuracy: 0.8201
  • F1: 0.8205
  • Loss: 0.4548

Model Description

  • Base Architecture: DistilBERT (distilbert-base-uncased), a lightweight transformer architecture distilled from BERT, optimized for efficiency while maintaining strong performance in NLP tasks.
  • Task: Multiclass sentiment classification of financial news headlines
  • Output Classes:
    • 0 – Negative
    • 1 – Neutral
    • 2 – Positive
  • Input Format: Raw English text (financial/news headlines), tokenized using the DistilBERT WordPiece tokenizer with:
    • padding=True (dynamic batch padding)
    • truncation=True (truncate sequences longer than the model max length, 512 tokens)
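
To make the two tokenizer options concrete, the sketch below mimics their effect on already-tokenized ID lists in plain Python. `pad_and_truncate` is a hypothetical helper, not part of the transformers API, and the token IDs in any example input are made up; only the pad ID of 0 and the 512-token limit reflect distilbert-base-uncased. The real tokenizer also keeps the closing [SEP] token when it truncates, which this simplified version ignores.

```python
from typing import Dict, List

PAD_ID = 0        # [PAD] token ID in the distilbert-base-uncased vocabulary
MAX_LENGTH = 512  # model max length stated in this card


def pad_and_truncate(
    batch: List[List[int]], max_length: int = MAX_LENGTH
) -> Dict[str, List[List[int]]]:
    """Hypothetical helper imitating padding=True and truncation=True."""
    # truncation=True: cut every sequence down to at most max_length tokens
    truncated = [ids[:max_length] for ids in batch]
    # padding=True: pad to the longest sequence in this batch, not to max_length
    longest = max(len(ids) for ids in truncated)
    input_ids = [ids + [PAD_ID] * (longest - len(ids)) for ids in truncated]
    # attention mask: 1 marks a real token, 0 marks padding
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}
```

Because padding is dynamic, a batch of short headlines stays short; the 512-token limit only matters for unusually long inputs.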

Intended Uses & Limitations

Intended Uses:

  • Sentiment classification of short text in financial and economic contexts
  • Integration into news monitoring pipelines or financial market analysis tools
  • Batch or streaming inference for classifying sentiment of headlines
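
For the batch and streaming use case, one simple approach is to group headlines into fixed-size chunks and tokenize each chunk separately. The helper name `chunk_headlines` and the default batch size of 32 (borrowed from the eval_batch_size listed under the training hyperparameters) are assumptions for illustration, not something this card prescribes.

```python
from typing import Iterable, Iterator, List


def chunk_headlines(
    headlines: Iterable[str], batch_size: int = 32
) -> Iterator[List[str]]:
    """Yield lists of at most batch_size headlines, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for headline in headlines:
        batch.append(headline)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch
```

Each yielded list can be passed directly to the tokenizer. Since the helper accepts any iterable, the same code works for a finite list and for a generator that reads from a feed.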

Limitations:

  • Model performance may degrade on long-form text, as it is optimized for short inputs like headlines
  • Domain-specific bias: Training data is financial news; the model may not generalize to unrelated domains (e.g., sports, social media)
  • Language limitation: English only (based on distilbert-base-uncased)
  • Predictions are class labels and probabilities only; the model does not provide reasoning or explanation

Training and Evaluation Data

  • Training Dataset: Custom dataset of financial news headlines, labeled with sentiment (negative, neutral, positive)
  • Dataset Size: 4,838 total examples
    • Training Set: 70% (~3,386 examples)
    • Validation Set: 15% (~726 examples)
    • Test Set: 15% (~726 examples)
  • Evaluation Metrics:
    • Accuracy
    • Weighted F1-score (per-class F1 averaged by class support, which accounts for class imbalance)
  • Tokenizer: DistilBERT WordPiece tokenizer with padding=True (dynamic padding) and truncation=True
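
Weighted F1 averages the per-class F1 scores, weighting each class by its number of true examples (its support). The plain-Python sketch below shows the calculation next to accuracy. It is meant to illustrate the two metrics, not to reproduce the evaluation code used for this model, which the card does not show; the function names are illustrative.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # each class contributes in proportion to its support
        score += (count / total) * f1
    return score
```

With three classes of unequal size, the weighting keeps a rare class from dominating or vanishing in the overall score, which is why the two metrics can differ slightly, as they do in the results reported here.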

Example Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Dugerij/news_sentiment_classifier")
model = AutoModelForSequenceClassification.from_pretrained("Dugerij/news_sentiment_classifier")
model.eval()  # make sure dropout is disabled for inference

text = ["Stocks plunge after weak earnings report"]
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run the forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.argmax(outputs.logits, dim=-1)
print(predictions)  # 0=Negative, 1=Neutral, 2=Positive
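
The printed tensor holds class indices only. To get label names and the class probabilities mentioned under Limitations, a softmax over the logits is enough. The sketch below does this in plain Python on a list of logits so that it runs without downloading the model; `ID2LABEL` mirrors the output classes listed in this card, and `classify_logits` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Tuple

# Class mapping as listed under Output Classes
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_logits(logits: List[float]) -> Tuple[str, Dict[str, float]]:
    """Return the predicted label and the probability of every class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], {ID2LABEL[i]: p for i, p in enumerate(probs)}
```

With the real model, `outputs.logits[0].tolist()` would supply the logits for the first headline in the batch.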

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-06
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
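
With lr_scheduler_type set to linear, the learning rate decays linearly from its initial value to zero over the planned number of optimizer steps. The sketch below reproduces that shape in plain Python. It assumes zero warmup steps (the card lists none) and takes 85 steps per epoch from the training results table, giving 20 × 85 = 1,700 planned steps; `linear_lr` is a hypothetical helper, not the Trainer's own code.

```python
BASE_LR = 3e-06        # learning_rate from the list above
NUM_EPOCHS = 20        # num_epochs from the list above
STEPS_PER_EPOCH = 85   # assumption: read off the training results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 1700 planned optimizer steps


def linear_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = 0,  # assumption: no warmup is listed in the card
) -> float:
    """Learning rate at a given step under linear decay with optional warmup."""
    if step < warmup_steps:
        # linear ramp from 0 up to base_lr during warmup
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))
```

Under these assumptions the rate is halved by step 850 (end of epoch 10) and would be about 7.5e-07 at step 1,275, the last step shown in the results table.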

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|---------------|-------|------|-----------------|----------|--------|
| 0.9813        | 1.0   | 85   | 0.8793          | 0.5929   | 0.4414 |
| 0.8166        | 2.0   | 170  | 0.7464          | 0.6799   | 0.6059 |
| 0.6847        | 3.0   | 255  | 0.6330          | 0.7581   | 0.7392 |
| 0.5672        | 4.0   | 340  | 0.5483          | 0.7935   | 0.7884 |
| 0.4823        | 5.0   | 425  | 0.5025          | 0.7994   | 0.7963 |
| 0.4187        | 6.0   | 510  | 0.4817          | 0.8024   | 0.7996 |
| 0.3761        | 7.0   | 595  | 0.4661          | 0.8024   | 0.8022 |
| 0.3453        | 8.0   | 680  | 0.4580          | 0.8097   | 0.8088 |
| 0.32          | 9.0   | 765  | 0.4561          | 0.8097   | 0.8093 |
| 0.2931        | 10.0  | 850  | 0.4512          | 0.8142   | 0.8136 |
| 0.2757        | 11.0  | 935  | 0.4498          | 0.8156   | 0.8152 |
| 0.2646        | 12.0  | 1020 | 0.4542          | 0.8127   | 0.8133 |
| 0.2483        | 13.0  | 1105 | 0.4548          | 0.8201   | 0.8205 |
| 0.2355        | 14.0  | 1190 | 0.4557          | 0.8156   | 0.8159 |
| 0.2192        | 15.0  | 1275 | 0.4599          | 0.8156   | 0.8156 |
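
The headline metrics at the top of this card match the epoch 13 row, which has the highest accuracy and F1 in the table, while the lowest validation loss occurs at epoch 11. The sketch below picks both rows out of the logged values. The tuple layout is only an illustrative choice, and the card does not state which criterion was actually used to select the final checkpoint.

```python
# Each row: (epoch, validation_loss, accuracy, f1), copied from the table
RESULTS = [
    (1, 0.8793, 0.5929, 0.4414),
    (2, 0.7464, 0.6799, 0.6059),
    (3, 0.6330, 0.7581, 0.7392),
    (4, 0.5483, 0.7935, 0.7884),
    (5, 0.5025, 0.7994, 0.7963),
    (6, 0.4817, 0.8024, 0.7996),
    (7, 0.4661, 0.8024, 0.8022),
    (8, 0.4580, 0.8097, 0.8088),
    (9, 0.4561, 0.8097, 0.8093),
    (10, 0.4512, 0.8142, 0.8136),
    (11, 0.4498, 0.8156, 0.8152),
    (12, 0.4542, 0.8127, 0.8133),
    (13, 0.4548, 0.8201, 0.8205),
    (14, 0.4557, 0.8156, 0.8159),
    (15, 0.4599, 0.8156, 0.8156),
]

# Row with the highest F1 versus row with the lowest validation loss
best_by_f1 = max(RESULTS, key=lambda row: row[3])
best_by_loss = min(RESULTS, key=lambda row: row[1])

print("best F1:  ", best_by_f1)
print("best loss:", best_by_loss)
```

Training loss keeps falling after epoch 11 while validation loss drifts upward, so the later epochs add little on held-out data.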

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.2