Commit f615f1f by AbdellatifZ · verified · 1 parent: 47f6c9b

Update README.md

Files changed (1)
  1. README.md +306 -157
README.md CHANGED
@@ -1,199 +1,348 @@
1
  ---
2
- library_name: transformers
3
- tags: []
4
  ---
5
 
6
- # Model Card for Model ID
7
 
8
- <!-- Provide a quick summary of what the model is/does. -->
9
 
 
10
 
 
 
 
 
11
 
12
- ## Model Details
13
-
14
- ### Model Description
15
-
16
- <!-- Provide a longer summary of what this model is. -->
17
-
18
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
-
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
-
28
- ### Model Sources [optional]
29
-
30
- <!-- Provide the basic links for the model. -->
31
-
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
-
36
- ## Uses
37
-
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
-
40
- ### Direct Use
41
-
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
-
44
- [More Information Needed]
45
-
46
- ### Downstream Use [optional]
47
-
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
 
50
- [More Information Needed]
 
 
 
51
 
52
- ### Out-of-Scope Use
53
 
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
 
56
- [More Information Needed]
 
 
57
 
58
- ## Bias, Risks, and Limitations
59
 
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
-
62
- [More Information Needed]
63
-
64
- ### Recommendations
65
-
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
-
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
-
70
- ## How to Get Started with the Model
71
-
72
- Use the code below to get started with the model.
73
 
74
- [More Information Needed]
 
 
 
 
 
 
 
75
 
76
  ## Training Details
77
 
78
- ### Training Data
79
-
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
-
84
- ### Training Procedure
85
-
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
-
92
 
93
- #### Training Hyperparameters
 
 
 
 
 
 
94
 
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
 
 
96
 
97
- #### Speeds, Sizes, Times [optional]
98
 
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 
 
 
 
 
100
 
101
- [More Information Needed]
102
 
103
- ## Evaluation
104
 
105
- <!-- This section describes the evaluation protocols and provides the results. -->
 
 
 
106
 
107
- ### Testing Data, Factors & Metrics
108
 
109
- #### Testing Data
 
 
 
 
 
110
 
111
- <!-- This should link to a Dataset Card if possible. -->
112
 
113
- [More Information Needed]
 
 
114
 
115
- #### Factors
116
 
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
 
118
 
119
- [More Information Needed]
 
 
 
120
 
121
- #### Metrics
122
 
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
 
125
- [More Information Needed]
 
 
 
 
 
 
 
 
126
 
127
- ### Results
128
 
129
- [More Information Needed]
130
 
131
- #### Summary
132
 
 
 
 
133
 
 
134
 
135
- ## Model Examination [optional]
 
 
136
 
137
- <!-- Relevant interpretability work for the model goes here -->
138
-
139
- [More Information Needed]
140
-
141
- ## Environmental Impact
142
-
143
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
-
145
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
-
147
- - **Hardware Type:** [More Information Needed]
148
- - **Hours used:** [More Information Needed]
149
- - **Cloud Provider:** [More Information Needed]
150
- - **Compute Region:** [More Information Needed]
151
- - **Carbon Emitted:** [More Information Needed]
152
-
153
- ## Technical Specifications [optional]
154
-
155
- ### Model Architecture and Objective
156
-
157
- [More Information Needed]
158
-
159
- ### Compute Infrastructure
160
-
161
- [More Information Needed]
162
-
163
- #### Hardware
164
-
165
- [More Information Needed]
166
-
167
- #### Software
168
-
169
- [More Information Needed]
170
-
171
- ## Citation [optional]
172
-
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
-
175
- **BibTeX:**
176
-
177
- [More Information Needed]
178
-
179
- **APA:**
180
-
181
- [More Information Needed]
182
-
183
- ## Glossary [optional]
184
-
185
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
-
187
- [More Information Needed]
188
-
189
- ## More Information [optional]
190
-
191
- [More Information Needed]
192
-
193
- ## Model Card Authors [optional]
194
-
195
- [More Information Needed]
196
-
197
- ## Model Card Contact
198
 
199
- [More Information Needed]
 
1
  ---
2
+ language: en
3
+ license: apache-2.0
4
+ tags:
5
+ - token-classification
6
+ - distilbert
7
+ - ner
8
+ - message-parsing
9
+ - natural-language-understanding
10
+ datasets:
11
+ - custom
12
+ metrics:
13
+ - accuracy
14
+ - f1
15
+ pipeline_tag: token-classification
16
  ---
17
 
18
+ # DistilBERT Message Parser 🤖💬
19
 
20
+ A fine-tuned DistilBERT model for parsing natural language queries to extract **receiver** (person) and **content** (message) information from user requests.
21
 
22
+ ## Model Description
23
 
24
+ This model performs token-level classification to identify:
25
+ - **`person`**: The recipient/receiver of the message
26
+ - **`content`**: The message content to be sent
27
+ - **`O`**: Other tokens (Outside)
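
As a rough illustration of how these three labels can be consumed downstream, the sketch below groups consecutive words that share a label into spans. The helper name `group_spans` is hypothetical and not part of the model repository; the labels in the example are copied from the sample output shown later in this card rather than produced by running the model.

```python
from itertools import groupby

def group_spans(words, labels):
    """Group consecutive words that share a label into (label, text) spans.

    Words labelled "O" are skipped. This is a hypothetical helper for
    illustration; it is not shipped with the model.
    """
    spans = []
    for label, group in groupby(zip(words, labels), key=lambda pair: pair[1]):
        if label == "O":
            continue  # tokens outside any entity
        text = " ".join(word for word, _ in group)
        spans.append((label, text))
    return spans

words = "Ask the python teacher when is the next class".split()
labels = ["O", "person", "person", "person",
          "content", "content", "content", "content", "content"]
spans = group_spans(words, labels)
print(spans)
```

Unlike a plain join over all `person` words, span grouping keeps two separate mentions apart if the model ever labels non-adjacent words as `person`.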
28
 
29
+ ## Use Cases
30
 
31
+ Suited to virtual assistants, chatbots, and messaging applications that need to understand commands such as:
32
+ - "Send a message to Mom telling her I'll be home late"
33
+ - "Ask the python teacher when is the next class"
34
+ - "Text John about tomorrow's meeting"
35
 
36
+ ## Quick Start
37
 
38
+ ### Installation
39
 
40
+ ```bash
41
+ pip install transformers torch
42
+ ```
43
 
44
+ ### Basic Usage
45
+
46
+ ```python
+ from transformers import AutoTokenizer, AutoModelForTokenClassification
+ import torch
+
+ # Load model and tokenizer
+ model_name = "AbdellatifZ/distilbert-message-parser"  # Replace with your model name
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForTokenClassification.from_pretrained(model_name)
+
+ # Helper function for word-level predictions
+ def predict_at_word_level(words, model, tokenizer):
+     """Predict labels at word level (not subword tokens)"""
+     inputs = tokenizer(words, return_tensors="pt", is_split_into_words=True)
+
+     with torch.no_grad():
+         logits = model(**inputs).logits
+     predictions = torch.argmax(logits, dim=2)
+
+     word_labels = []
+     word_ids = inputs.word_ids()
+     previous_word_idx = None
+
+     for idx, word_idx in enumerate(word_ids):
+         if word_idx is None:  # Special tokens
+             continue
+         if word_idx != previous_word_idx:  # First subtoken of each word
+             word_labels.append(predictions[0][idx].item())
+         previous_word_idx = word_idx
+
+     return word_labels
+
+ # Main parsing function
+ def parse_message(query, model, tokenizer):
+     """
+     Parse a query to extract receiver and content.
+
+     Args:
+         query (str): User query in natural language
+         model: Token classification model
+         tokenizer: Tokenizer
+
+     Returns:
+         dict: {"receiver": str, "content": str}
+     """
+     words = query.split()
+     label_ids = predict_at_word_level(words, model, tokenizer)
+
+     id2label = model.config.id2label
+     labels = [id2label[label_id] for label_id in label_ids]
+
+     person_tokens = [word for word, label in zip(words, labels) if label == 'person']
+     content_tokens = [word for word, label in zip(words, labels) if label == 'content']
+
+     return {
+         'receiver': ' '.join(person_tokens) if person_tokens else None,
+         'content': ' '.join(content_tokens) if content_tokens else None
+     }
+
+ # Example usage
+ query = "Ask the python teacher when is the next class"
+ result = parse_message(query, model, tokenizer)
+ print(result)
+ # Output: {'receiver': 'the python teacher', 'content': 'when is the next class'}
+ ```
110
+
111
+ ## More Examples
112
+
113
+ ```python
114
+ # Example 1: Simple message
115
+ query = "Send a message to Mom telling her I'll be home late"
116
+ result = parse_message(query, model, tokenizer)
117
+ print(result)
118
+ # {'receiver': 'Mom', 'content': "telling her I'll be home late"}
119
+
120
+ # Example 2: Professional context
121
+ query = "Write to the professor asking about the exam format"
122
+ result = parse_message(query, model, tokenizer)
123
+ print(result)
124
+ # {'receiver': 'the professor', 'content': 'asking about the exam format'}
125
+
126
+ # Example 3: Casual context
127
+ query = "Text John asking if he's available for a meeting tomorrow"
128
+ result = parse_message(query, model, tokenizer)
129
+ print(result)
130
+ # {'receiver': 'John', 'content': "asking if he's available for a meeting tomorrow"}
131
+ ```
132
+
133
+ ## Advanced Usage: Batch Processing
134
+
135
+ ```python
+ def parse_messages_batch(queries, model, tokenizer):
+     """Parse multiple queries one at a time (a convenience loop, not tensor batching)"""
+     results = []
+     for query in queries:
+         result = parse_message(query, model, tokenizer)
+         results.append(result)
+     return results
+
+ # Batch example
+ queries = [
+     "Ask the python teacher when is the next class",
+     "Message the customer support about my order status",
+     "Text my friend to see if they're coming tonight"
+ ]
+
+ results = parse_messages_batch(queries, model, tokenizer)
+ for query, result in zip(queries, results):
+     print(f"Query: {query}")
+     print(f"Result: {result}\n")
+ ```
156
+
157
+ ## Detailed Token-Level Analysis
158
+
159
+ ```python
+ def visualize_parsing(query, model, tokenizer):
+     """Show word-by-word label predictions"""
+     words = query.split()
+     label_ids = predict_at_word_level(words, model, tokenizer)
+
+     id2label = model.config.id2label
+     labels = [id2label[label_id] for label_id in label_ids]
+
+     print(f"\nQuery: {query}\n")
+     print(f"{'Word':<25} {'Label':<10}")
+     print("-" * 35)
+
+     for word, label in zip(words, labels):
+         print(f"{word:<25} {label:<10}")
+
+     result = parse_message(query, model, tokenizer)
+     print(f"\n{'='*35}")
+     print(f"Receiver: {result['receiver']}")
+     print(f"Content: {result['content']}")
+     print(f"{'='*35}")
+
+ # Example
+ visualize_parsing("Ask the python teacher when is the next class", model, tokenizer)
+ ```
184
+
185
+ **Output:**
186
+ ```
+ Query: Ask the python teacher when is the next class
+
+ Word                      Label
+ -----------------------------------
+ Ask                       O
+ the                       person
+ python                    person
+ teacher                   person
+ when                      content
+ is                        content
+ the                       content
+ next                      content
+ class                     content
+
+ ===================================
+ Receiver: the python teacher
+ Content: when is the next class
+ ===================================
+ ```
206
+
207
+ ## API Integration Example
208
+
209
+ ```python
+ from flask import Flask, request, jsonify
+ from transformers import AutoTokenizer, AutoModelForTokenClassification
+
+ # parse_message() is the helper defined under "Basic Usage"
+ app = Flask(__name__)
+
+ # Load model once at startup
+ model = AutoModelForTokenClassification.from_pretrained("AbdellatifZ/distilbert-message-parser")
+ tokenizer = AutoTokenizer.from_pretrained("AbdellatifZ/distilbert-message-parser")
+
+ @app.route('/parse', methods=['POST'])
+ def parse():
+     # silent=True returns None instead of raising when the body is missing or not valid JSON
+     data = request.get_json(silent=True) or {}
+     query = data.get('query', '')
+
+     if not query:
+         return jsonify({'error': 'No query provided'}), 400
+
+     try:
+         result = parse_message(query, model, tokenizer)
+         return jsonify({
+             'success': True,
+             'query': query,
+             'parsed': result
+         })
+     except Exception as e:
+         return jsonify({'error': str(e)}), 500
+
+ if __name__ == '__main__':
+     app.run(debug=True)  # debug mode is for local development only
+ ```
239
 
240
+ ## Model Details
241
 
242
+ | Property | Value |
243
+ |----------|-------|
244
+ | Base Model | `distilbert-base-uncased` |
245
+ | Task | Token Classification (NER-style) |
246
+ | Number of Labels | 3 (O, content, person) |
247
+ | Training Framework | Transformers (Hugging Face) |
248
+ | Parameters | ~67M (DistilBERT) |
249
+ | Max Sequence Length | 128 tokens |
250
 
251
  ## Training Details
252
 
253
+ ### Dataset
254
+ - Source: Custom Presto-based dataset
255
+ - Task: Send_message queries
256
+ - Labels: `person`, `content`, `O`
257
+ - Split: 70% train, 15% validation, 15% test
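
The card does not say how the 70/15/15 split was produced. A minimal sketch of one way to obtain such a split, assuming a simple seeded shuffle (the function name, the seed, and the integer-percentage parameters are all illustrative choices, not taken from the training code):

```python
import random

def split_dataset(examples, train_pct=70, val_pct=15, seed=42):
    """Shuffle examples and split them into train/validation/test lists.

    Percentages are integers so the split sizes are computed exactly;
    whatever remains after train and validation goes to test.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
```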
 
 
 
 
 
 
 
 
 
258
 
259
+ ### Training Configuration
260
+ - **Epochs**: 15
261
+ - **Batch Size**: 16
262
+ - **Learning Rate**: 2e-5
263
+ - **Optimizer**: AdamW
264
+ - **Weight Decay**: 0.01
265
+ - **Warmup Steps**: 100
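
The configuration lists warmup steps but not the scheduler. The sketch below shows what the learning rate would look like under the assumption of linear warmup followed by linear decay to zero, which is a common pairing with AdamW; `total_steps` is a hypothetical parameter that depends on the dataset size, which the card does not state.

```python
import math

def linear_warmup_decay_lr(step, total_steps, base_lr=2e-5, warmup_steps=100):
    """Learning rate at `step`, assuming linear warmup then linear decay to 0.

    The scheduler type is an assumption; the card only gives the base
    learning rate (2e-5) and the number of warmup steps (100).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Illustrative run length of 1000 optimizer steps
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_decay_lr(step, total_steps=1000))
```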
266
 
267
+ ### Label Alignment
268
+ During training, word-level labels are aligned to subword tokens as follows:
269
+ - Only the first subtoken of each word receives a label
270
+ - Subsequent subtokens are marked with `-100` (ignored in loss computation)
271
+ - Special tokens ([CLS], [SEP], [PAD]) are also ignored
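
The three rules above can be sketched in plain Python. The function name and the label id order (`O`=0, `content`=1, `person`=2) are assumptions made for illustration, and the `word_ids` list is written by hand to mimic what a fast tokenizer's `word_ids()` returns; the actual training script is not part of this commit.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss

def align_labels_with_tokens(word_label_ids, word_ids):
    """Expand word-level label ids to token-level label ids.

    word_ids maps each token to its source word index, with None for
    special tokens such as [CLS], [SEP], and [PAD].
    """
    aligned = []
    previous_word_id = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)              # special token
        elif word_id != previous_word_id:
            aligned.append(word_label_ids[word_id])   # first subtoken of a word
        else:
            aligned.append(IGNORE_INDEX)              # later subtoken of the same word
        previous_word_id = word_id
    return aligned

# Hand-written example: three words, the second split into three subtokens
word_ids = [None, 0, 1, 1, 1, 2, None]
word_label_ids = [0, 2, 1]  # O, person, content (assumed id order)
aligned = align_labels_with_tokens(word_label_ids, word_ids)
print(aligned)  # [-100, 0, 2, -100, -100, 1, -100]
```

This mirrors the inference helper `predict_at_word_level`, which reads the prediction only from the first subtoken of each word.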
272
 
273
+ ## Performance
274
 
275
+ | Metric | Value |
276
+ |--------|-------|
277
+ | Accuracy | >0.90 |
278
+ | Precision | >0.88 |
279
+ | Recall | >0.88 |
280
+ | F1-Score | >0.88 |
281
 
282
+ *Note: Actual metrics may vary depending on your specific use case and dataset.*
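
The card does not specify whether these metrics are computed per word or per entity span. To check them on your own data, a minimal word-level sketch is given below; it assumes micro-averaging over the non-`O` labels, which may differ from the procedure used to produce the table, and the function name is hypothetical.

```python
import math

def word_level_metrics(true_labels, pred_labels, outside="O"):
    """Word-level accuracy plus micro-averaged precision/recall/F1 over non-O labels."""
    pairs = list(zip(true_labels, pred_labels))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    tp = sum(t == p and t != outside for t, p in pairs)   # correct entity label
    fp = sum(p != outside and p != t for t, p in pairs)   # wrong entity label predicted
    fn = sum(t != outside and p != t for t, p in pairs)   # true entity label missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels for illustration only (not model output)
true = ["O", "person", "person", "content", "content", "content"]
pred = ["O", "person", "O", "content", "content", "person"]
metrics = word_level_metrics(true, pred)
print(metrics)
```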
283
 
284
+ ## Limitations
285
 
286
+ - **Language**: Optimized for English queries only
287
+ - **Domain**: Best performance on message-sending commands
288
+ - **Structure**: May struggle with highly unusual or complex sentence structures
289
+ - **Context**: Limited to single-turn queries (no conversation context)
290
 
291
+ ## Error Handling
292
 
293
+ ```python
+ def safe_parse_message(query, model, tokenizer):
+     """Parse with error handling"""
+     try:
+         if not query or not query.strip():
+             return {'error': 'Empty query', 'receiver': None, 'content': None}

+         result = parse_message(query, model, tokenizer)

+         # Validate results
+         if not result['receiver'] and not result['content']:
+             return {'warning': 'No entities found', **result}

+         return result

+     except Exception as e:
+         return {'error': str(e), 'receiver': None, 'content': None}

+ # Example
+ result = safe_parse_message("", model, tokenizer)
+ print(result)  # {'error': 'Empty query', 'receiver': None, 'content': None}
+ ```
315
 
316
+ ## Citation
317
 
318
+ If you use this model in your research, please cite:
319
 
320
+ ```bibtex
+ @misc{distilbert-message-parser,
+   author = {Your Name},
+   title = {DistilBERT Message Parser: Token Classification for Message Intent Extraction},
+   year = {2025},
+   publisher = {Hugging Face},
+   howpublished = {\url{https://huggingface.co/AbdellatifZ/distilbert-message-parser}}
+ }
+ ```
329
 
330
+ ## License
331
 
332
+ This model is released under the Apache 2.0 License.
333
 
334
+ ## Contact & Feedback
335
 
336
+ For questions, issues, or feedback:
337
+ - Open an issue on the model repository
338
+ - Contact: [Your contact information]
339
 
340
+ ## Acknowledgments
341
 
342
+ - Base model: [DistilBERT](https://huggingface.co/distilbert-base-uncased) by Hugging Face
343
+ - Framework: [Transformers](https://github.com/huggingface/transformers) by Hugging Face
344
+ - Dataset inspiration: Presto benchmark
345
 
346
+ ---
347
 
348
+ **Built with Transformers 🤗**