Model Card: DistilBERT for Sentiment Classification

Overview

Model Name: DistilBERT for Sentiment Classification
Version: v1.0.0
Date Created: 28/10/2025
Last Updated: 28/10/2025
Author(s): Razvan Nica & Filip Šarík
Institution / Organization: BUas ADS&AI

Short Description:
This model is a fine-tuned checkpoint of the distilbert/distilbert-base-uncased-finetuned-sst-2-english model.

It is designed to classify emotions in English text, predicting one of seven classes (Ekman's six basic emotions plus a neutral category):

  • 0: "neutral"
  • 1: "anger"
  • 2: "disgust"
  • 3: "fear"
  • 4: "happiness"
  • 5: "sadness"
  • 6: "surprise"

Intended Use

Primary Intended Use

This model was specifically developed and fine-tuned for emotion analysis in video transcripts. Its primary intended use is to accurately classify English text into one of the seven defined emotion categories (Ekman's six basic emotions plus a neutral class) extracted from video or audio data.

The model is suitable for general English text emotion classification, but its performance is optimized for the conversational and language style found in transcribed speech.

Note: For optimal performance on other tasks or significantly different domains, further fine-tuning is strongly recommended.

Intended Users

The model is intended for users who possess a basic working knowledge of the Python programming language, the PyTorch framework, and the Hugging Face Transformers library.


🛑 Out-of-Scope Use

Prohibited Uses: This model must not be used to intentionally create hostile, alienating, or discriminatory environments or content against individuals or groups.

Factual Content: The model was trained for emotion classification, not for generating factual or truthful representations of people or events. Using this model to create content, or presenting its output as factually accurate, is out of scope and could lead to misrepresentation.


Model Details

Model Architecture:

This model uses the Transformer architecture of the DistilBERT model family.

Specifically, it is a fine-tuned checkpoint of the distilbert/distilbert-base-uncased-finetuned-sst-2-english model.

For detailed information on the base architecture, please refer to the base model's card on the Hugging Face Hub.

Purpose & Development Context:

As described under Intended Use, this model was developed and fine-tuned for emotion analysis in video transcripts: it classifies English text into one of the seven defined emotion categories, Ekman's six basic emotions (anger, disgust, fear, happiness, sadness, surprise) plus a neutral class. The fine-tuning process optimized its performance for the conversational language extracted from video and audio data.

This model was commissioned and developed for the Content Intelligence Agency. It serves as a crucial component within a larger, automated data pipeline. Its role is to process transcribed show content, extracting emotional metadata that is subsequently utilized to perform show-specific media analysis. This analysis helps inform content strategy and audience engagement insights.


Dataset Details

Training Data:

The model was trained on a composite dataset formed by combining three distinct emotion and dialogue datasets, chosen to enhance generalization and coverage of conversational text.

Labels in these datasets were remapped to the seven-class scheme above where an equivalent class existed and removed otherwise, as illustrated in the sketch below.
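A minimal sketch of that remapping step. The source label names here ("joy", "love") are purely hypothetical examples, not the actual labels of the source datasets:

# Hypothetical source labels -> target classes; None means the row is dropped.
LABEL_REMAP = {
    "joy": "happiness",   # renamed to match the target scheme
    "anger": "anger",     # already matches a target class
    "love": None,         # no equivalent among the seven classes
}

rows = [
    {"text": "What a day!", "label": "joy"},
    {"text": "I adore you.", "label": "love"},
]

# Keep only rows whose label maps to one of the seven classes.
remapped = [
    {"text": r["text"], "label": LABEL_REMAP[r["label"]]}
    for r in rows
    if LABEL_REMAP.get(r["label"]) is not None
]
print(remapped)  # [{'text': 'What a day!', 'label': 'happiness'}]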

Validation / Test Data:

A custom test set was created to better align with the model's primary use case (video transcript analysis).

  1. Source Material: A publicly available episode of Kitchen Nightmares was transcribed.
  2. Annotation Process: Sentences from the transcript were initially annotated using a large language model (LLM). This was followed by a crucial manual re-annotation process to correct LLM errors and ensure high-quality, reliable labels for evaluation.
  3. Size: The final test set comprises 1,228 annotated lines.

Training Procedure:

  • Learning Rate: 1e-5
  • Batch Size: 32
  • Maximum Sequence Length: 128 (tokens)
  • Weight Decay: 0.01
  • Unfrozen Layers: Last 2 encoder blocks
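The exact training script is not part of this card; the sketch below only illustrates how the hyperparameters above might be wired together with the Hugging Face Trainer. Dataset loading and tokenization are omitted, and the output directory name is a placeholder:

from transformers import (
    AutoModelForSequenceClassification,
    TrainingArguments,
)

# Start from the SST-2 checkpoint and replace its 2-class head
# with a freshly initialized 7-class head.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    num_labels=7,
    ignore_mismatched_sizes=True,
)

# Freeze the backbone, then unfreeze the last two encoder blocks.
# The classification head (pre_classifier/classifier) stays trainable.
for param in model.distilbert.parameters():
    param.requires_grad = False
for block in model.distilbert.transformer.layer[-2:]:
    for param in block.parameters():
        param.requires_grad = True

training_args = TrainingArguments(
    output_dir="distilbert-emotion",   # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    weight_decay=0.01,
)

# With tokenized datasets prepared (max sequence length 128), training
# would then run along these lines:
#   trainer = Trainer(model=model, args=training_args,
#                     train_dataset=train_ds, eval_dataset=eval_ds)
#   trainer.train()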

Recommendations for Use

Model Inputs

The input must be English text data. Before being fed to the model, the text must be tokenized using the specific AutoTokenizer loaded from this model's checkpoint.

This preprocessing step is mandatory and ensures the text is:

  • Processed using the correct vocabulary and token IDs.
  • Truncated or padded to a maximum sequence length of 128 tokens.

Warning: Failure to apply this exact preprocessing step will result in incorrect or unreliable predictions.
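A minimal preprocessing sketch, assuming this repository's checkpoint id (dafaqboomduck/distilbert-sentiment-fine) and an illustrative input sentence:

from transformers import AutoTokenizer

# Load the tokenizer from this model's checkpoint so the vocabulary
# and token IDs match the fine-tuned weights.
tokenizer = AutoTokenizer.from_pretrained("dafaqboomduck/distilbert-sentiment-fine")

# Truncate or pad every input to the 128-token maximum used during training.
inputs = tokenizer(
    "I can't believe they served it raw!",
    truncation=True,
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)
print(inputs["input_ids"].shape)  # torch.Size([1, 128])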


Model Outputs

The model returns a tensor of raw logits, one for each of the seven emotion classes; applying a softmax over this tensor converts them to class probabilities.

To obtain the single, most probable emotion prediction for a given input sentence, we recommend applying the argmax function over the output tensor.

The resulting integer ID can be mapped back to its corresponding emotion label using the following dictionary:

from typing import Dict

EMOTION_MAP: Dict[int, str] = {
    0: "neutral",
    1: "anger",
    2: "disgust",
    3: "fear",
    4: "happiness",
    5: "sadness",
    6: "surprise"
}
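Putting these steps together, a minimal inference sketch. The checkpoint id is this repository's, while the example sentence and predicted label are illustrative:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "dafaqboomduck/distilbert-sentiment-fine"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

inputs = tokenizer(
    "I did not see that coming at all!",
    truncation=True, padding="max_length", max_length=128, return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits        # shape: (1, 7)

probs = torch.softmax(logits, dim=-1)      # optional: class probabilities
pred_id = int(logits.argmax(dim=-1))       # ID of the most probable class

# EMOTION_MAP is the dictionary defined above.
print(EMOTION_MAP[pred_id])                # e.g. "surprise"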
