AbdellatifZ/distilbert-message-parser

@@ -1,199 +1,348 @@
 ---
-library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+language: en
+license: apache-2.0
+tags:
+- token-classification
+- distilbert
+- ner
+- message-parsing
+- natural-language-understanding
+datasets:
+- custom
+metrics:
+- accuracy
+- f1
+pipeline_tag: token-classification
 ---
+# DistilBERT Message Parser 🤖💬
+A fine-tuned DistilBERT model for parsing natural language queries to extract **receiver** (person) and **content** (message) information from user requests.
+## Model Description
+This model performs token-level classification to identify:
+- **`person`**: The recipient/receiver of the message
+- **`content`**: The message content to be sent
+- **`O`**: Other tokens (Outside)
+## Use Cases
+Suited to virtual assistants, chatbots, and messaging applications that need to understand commands such as:
+- "Send a message to Mom telling her I'll be home late"
+- "Ask the python teacher when is the next class"
+- "Text John about tomorrow's meeting"
+## Quick Start
+### Installation
+```bash
+pip install transformers torch
+```
+### Basic Usage
+```python
+from transformers import AutoTokenizer, AutoModelForTokenClassification
+import torch
+# Load model and tokenizer
+model_name = "AbdellatifZ/distilbert-message-parser"  # Hub ID of this model
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForTokenClassification.from_pretrained(model_name)
+# Helper function for word-level predictions
+def predict_at_word_level(words, model, tokenizer):
+    """Predict labels at word level (not subword tokens)"""
+    inputs = tokenizer(words, return_tensors="pt", is_split_into_words=True)
+    with torch.no_grad():
+        logits = model(**inputs).logits
+    predictions = torch.argmax(logits, dim=2)
+    word_labels = []
+    word_ids = inputs.word_ids()
+    previous_word_idx = None
+    for idx, word_idx in enumerate(word_ids):
+        if word_idx is None:  # Special tokens
+            continue
+        if word_idx != previous_word_idx:  # First subtoken of each word
+            word_labels.append(predictions[0][idx].item())
+            previous_word_idx = word_idx
+    return word_labels
+# Main parsing function
+def parse_message(query, model, tokenizer):
+    """
+    Parse a query to extract receiver and content.
+    Args:
+        query (str): User query in natural language
+        model: Token classification model
+        tokenizer: Tokenizer
+    Returns:
+        dict: {"receiver": str, "content": str}
+    """
+    words = query.split()
+    label_ids = predict_at_word_level(words, model, tokenizer)
+    id2label = model.config.id2label
+    labels = [id2label[label_id] for label_id in label_ids]
+    person_tokens = [word for word, label in zip(words, labels) if label == 'person']
+    content_tokens = [word for word, label in zip(words, labels) if label == 'content']
+    return {
+        'receiver': ' '.join(person_tokens) if person_tokens else None,
+        'content': ' '.join(content_tokens) if content_tokens else None
+    }
+# Example usage
+query = "Ask the python teacher when is the next class"
+result = parse_message(query, model, tokenizer)
+print(result)
+# Output: {'receiver': 'the python teacher', 'content': 'when is the next class'}
+```
+## More Examples
+```python
+# Example 1: Simple message
+query = "Send a message to Mom telling her I'll be home late"
+result = parse_message(query, model, tokenizer)
+print(result)
+# {'receiver': 'Mom', 'content': "telling her I'll be home late"}
+# Example 2: Professional context
+query = "Write to the professor asking about the exam format"
+result = parse_message(query, model, tokenizer)
+print(result)
+# {'receiver': 'the professor', 'content': 'asking about the exam format'}
+# Example 3: Casual context
+query = "Text John asking if he's available for a meeting tomorrow"
+result = parse_message(query, model, tokenizer)
+print(result)
+# {'receiver': 'John', 'content': "asking if he's available for a meeting tomorrow"}
+```
+## Advanced Usage: Batch Processing
+```python
+def parse_messages_batch(queries, model, tokenizer):
+    """Parse multiple queries one at a time (one forward pass per query)"""
+    results = []
+    for query in queries:
+        result = parse_message(query, model, tokenizer)
+        results.append(result)
+    return results
+# Batch example
+queries = [
+    "Ask the python teacher when is the next class",
+    "Message the customer support about my order status",
+    "Text my friend to see if they're coming tonight"
+]
+results = parse_messages_batch(queries, model, tokenizer)
+for query, result in zip(queries, results):
+    print(f"Query: {query}")
+    print(f"Result: {result}\n")
+```
+## Detailed Token-Level Analysis
+```python
+def visualize_parsing(query, model, tokenizer):
+    """Show word-by-word label predictions"""
+    words = query.split()
+    label_ids = predict_at_word_level(words, model, tokenizer)
+    id2label = model.config.id2label
+    labels = [id2label[label_id] for label_id in label_ids]
+    print(f"\nQuery: {query}\n")
+    print(f"{'Word':<25} {'Label':<10}")
+    print("-" * 35)
+    for word, label in zip(words, labels):
+        print(f"{word:<25} {label:<10}")
+    result = parse_message(query, model, tokenizer)
+    print(f"\n{'='*35}")
+    print(f"Receiver: {result['receiver']}")
+    print(f"Content:  {result['content']}")
+    print(f"{'='*35}")
+# Example
+visualize_parsing("Ask the python teacher when is the next class", model, tokenizer)
+```
+**Output:**
+```
+Query: Ask the python teacher when is the next class
+Word                      Label
+-----------------------------------
+Ask                       O
+the                       person
+python                    person
+teacher                   person
+when                      content
+is                        content
+the                       content
+next                      content
+class                     content
+===================================
+Receiver: the python teacher
+Content:  when is the next class
+===================================
+```
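`parse_message` joins every word that carries a given label, even when those words are not adjacent. If separate spans matter, consecutive words sharing a label can be grouped instead. The following is a minimal sketch using the labels from the output above; `group_spans` is a hypothetical helper and not part of this model's code.

```python
from itertools import groupby


def group_spans(words, labels):
    """Group consecutive words that share a non-"O" label into (label, text) spans."""
    spans = []
    pairs = zip(words, labels)
    for label, group in groupby(pairs, key=lambda pair: pair[1]):
        if label == "O":
            continue  # skip words outside any entity
        spans.append((label, " ".join(word for word, _ in group)))
    return spans


words = "Ask the python teacher when is the next class".split()
labels = ["O", "person", "person", "person",
          "content", "content", "content", "content", "content"]
spans = group_spans(words, labels)
print(spans)
# [('person', 'the python teacher'), ('content', 'when is the next class')]
```

With this grouping, a query that yields two separate `person` runs would produce two spans rather than one concatenated receiver string.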
+## API Integration Example
+```python
+from flask import Flask, request, jsonify
+from transformers import AutoTokenizer, AutoModelForTokenClassification
+# parse_message is the helper defined under Basic Usage
+app = Flask(__name__)
+# Load model once at startup
+model = AutoModelForTokenClassification.from_pretrained("AbdellatifZ/distilbert-message-parser")
+tokenizer = AutoTokenizer.from_pretrained("AbdellatifZ/distilbert-message-parser")
+@app.route('/parse', methods=['POST'])
+def parse():
+    data = request.get_json(silent=True) or {}  # tolerate a missing or invalid JSON body
+    query = data.get('query', '')
+    if not query:
+        return jsonify({'error': 'No query provided'}), 400
+    try:
+        result = parse_message(query, model, tokenizer)
+        return jsonify({
+            'success': True,
+            'query': query,
+            'parsed': result
+        })
+    except Exception as e:
+        return jsonify({'error': str(e)}), 500
+if __name__ == '__main__':
+    app.run(debug=True)  # debug mode is for local development only
+```
+## Model Details
+| Property | Value |
+|----------|-------|
+| Base Model | `distilbert-base-uncased` |
+| Task | Token Classification (NER-style) |
+| Number of Labels | 3 (O, content, person) |
+| Training Framework | Transformers (Hugging Face) |
+| Parameters | ~67M (DistilBERT) |
+| Max Sequence Length | 128 tokens |
 ## Training Details
+### Dataset
+- Source: Custom Presto-based dataset
+- Task: Send_message queries
+- Labels: `person`, `content`, `O`
+- Split: 70% train, 15% validation, 15% test
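A 70/15/15 split like the one listed above can be produced by shuffling and slicing. The sketch below is an assumption about how such a split might be done, not the procedure used for this dataset; `split_dataset`, the seed, and the toy example list are hypothetical.

```python
import random


def split_dataset(examples, train_pct=70, val_pct=15, seed=42):
    """Shuffle examples and cut them into train/validation/test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no example is dropped
    return train, val, test


examples = [f"query {i}" for i in range(100)]
train, val, test = split_dataset(examples)
print(len(train), len(val), len(test))  # 70 15 15
```

Integer arithmetic is used for the cut points to avoid floating-point rounding surprises, and the test set takes whatever remains.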
+### Training Configuration
+- **Epochs**: 15
+- **Batch Size**: 16
+- **Learning Rate**: 2e-5
+- **Optimizer**: AdamW
+- **Weight Decay**: 0.01
+- **Warmup Steps**: 100
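To see how the 100 warmup steps relate to the rest of training, the sketch below computes the learning rate at a given optimizer step. It assumes linear warmup followed by linear decay (the default schedule of the Transformers `Trainer`), which the configuration above does not state. The training-set size of 1,600 examples is hypothetical and chosen only to make the arithmetic easy.

```python
import math

PEAK_LR = 2e-5
BATCH_SIZE = 16
EPOCHS = 15
WARMUP_STEPS = 100


def total_steps(num_train_examples):
    """Optimizer steps over the whole run (assumes no gradient accumulation)."""
    return math.ceil(num_train_examples / BATCH_SIZE) * EPOCHS


def lr_at(step, num_training_steps):
    """Linear warmup to PEAK_LR, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = num_training_steps - step
    return PEAK_LR * max(0.0, remaining / max(1, num_training_steps - WARMUP_STEPS))


steps = total_steps(1600)  # hypothetical dataset size
print(steps)  # 1500
print(lr_at(50, steps))     # halfway through warmup
print(lr_at(100, steps))    # peak learning rate
print(lr_at(steps, steps))  # end of training
```

Under these assumptions, warmup covers the first 100 of 1,500 steps, so the model trains at or near the peak rate for most of the run.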
+### Label Alignment
+During training, word-level labels are aligned to DistilBERT's subword tokens:
+- Only the first subtoken of each word receives the word's label
+- Subsequent subtokens are marked with `-100` (ignored in loss computation)
+- Special tokens (`[CLS]`, `[SEP]`, `[PAD]`) are also ignored
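The alignment rules above can be sketched in plain Python. This is a minimal illustration, not the training code used for this model: `align_labels` is a hypothetical helper, the label ids are arbitrary, and the hand-written `word_ids` list stands in for the output of a fast tokenizer's `word_ids()` method, where `None` marks special tokens.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss


def align_labels(word_ids, word_labels):
    """Map word-level label ids onto subword positions.

    word_ids: one entry per token; None for special tokens, otherwise the
              index of the word the token came from.
    word_labels: one label id per word.
    """
    aligned = []
    previous_word_idx = None
    for word_idx in word_ids:
        if word_idx is None:
            aligned.append(IGNORE_INDEX)  # special token
        elif word_idx != previous_word_idx:
            aligned.append(word_labels[word_idx])  # first subtoken of a word
        else:
            aligned.append(IGNORE_INDEX)  # later subtoken of the same word
        previous_word_idx = word_idx
    return aligned


# Hypothetical tokenization of three words, where word 1 splits into two subtokens:
# [CLS] w0 w1a w1b w2 [SEP] [PAD]
word_ids = [None, 0, 1, 1, 2, None, None]
word_labels = [0, 2, 1]  # arbitrary label ids, one per word
aligned = align_labels(word_ids, word_labels)
print(aligned)  # [-100, 0, 2, -100, 1, -100, -100]
```

This mirrors the inference helper `predict_at_word_level`, which reads predictions only at the first subtoken of each word.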
+## Performance
+| Metric | Value |
+|--------|-------|
+| Accuracy | >0.90 |
+| Precision | >0.88 |
+| Recall | >0.88 |
+| F1-Score | >0.88 |
+*Note: The values above are lower bounds rather than exact scores, and results may vary with your use case and dataset.*
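The card does not say how the scores above were computed. As one possible reading, the sketch below computes word-level accuracy and per-label precision, recall, and F1 from gold and predicted label sequences. The function name and the toy sequences are hypothetical, and the numbers it prints say nothing about this model's actual performance.

```python
def label_scores(gold, pred, label):
    """Precision, recall and F1 for one label over aligned word-level sequences."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: one "content" word is mislabeled as "O"
gold = ["O", "person", "person", "content", "content", "content"]
pred = ["O", "person", "person", "content", "content", "O"]

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
content_p, content_r, content_f1 = label_scores(gold, pred, "content")
print(round(accuracy, 3))  # 0.833
print(content_p, round(content_r, 3), round(content_f1, 3))  # 1.0 0.667 0.8
```

Entity-level scoring (a span counts only if every word in it is correct) is stricter than this word-level view, so the two should not be compared directly.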
+## Limitations
+- **Language**: Optimized for English queries only
+- **Domain**: Best performance on message-sending commands
+- **Structure**: May struggle with highly unusual or complex sentence structures
+- **Context**: Limited to single-turn queries (no conversation context)
+## Error Handling
+```python
+def safe_parse_message(query, model, tokenizer):
+    """Parse with error handling"""
+    try:
+        if not query or not query.strip():
+            return {'error': 'Empty query', 'receiver': None, 'content': None}
+        result = parse_message(query, model, tokenizer)
+        # Validate results
+        if not result['receiver'] and not result['content']:
+            return {'warning': 'No entities found', **result}
+        return result
+    except Exception as e:
+        return {'error': str(e), 'receiver': None, 'content': None}
+# Example
+result = safe_parse_message("", model, tokenizer)
+print(result)  # {'error': 'Empty query', 'receiver': None, 'content': None}
+```
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@misc{distilbert-message-parser,
+  author = {Your Name},
+  title = {DistilBERT Message Parser: Token Classification for Message Intent Extraction},
+  year = {2025},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/AbdellatifZ/distilbert-message-parser}}
+}
+```
+## License
+This model is released under the Apache 2.0 License.
+## Contact & Feedback
+For questions, issues, or feedback:
+- Open an issue on the model repository
+- Contact: [Your contact information]
+## Acknowledgments
+- Base model: [DistilBERT](https://huggingface.co/distilbert-base-uncased) by Hugging Face
+- Framework: [Transformers](https://github.com/huggingface/transformers) by Hugging Face
+- Dataset inspiration: Presto benchmark
+---
+**Built with Transformers 🤗**