🍽️ Gemma-3-270M Restaurant Reservation NER Model
A fine-tuned version of Google's Gemma-3-270M that extracts restaurant reservation details from user messages, including noisy text produced by automatic speech recognition (ASR).
✨ Key Features
- 🎯 Entity Extraction: Identifies three reservation fields: party size (num_people), date and time (reservation_date), and phone number (phone_num)
- 🌐 Bilingual Support: Handles both Chinese and English input
- 🎙️ ASR Robust: Optimized for noisy speech recognition output
- 📱 Phone Focus: Specialized for Taiwanese mobile number extraction
📋 Comprehensive Example: All Three Entities
Complex Input Text: "Hi, I'd like to make a reservation for 2 adults and 3 children on the 15th of next month around 7:30 in the evening, and you can reach me at +886-912-345-678"
Extracted Output:
{
  "num_people": "5",
  "reservation_date": "15th of next month at 7:30 PM", 
  "phone_num": "0912345678"
}
Entity Breakdown from this Complex Example:
| Entity | Extracted From Input | Normalized Output | 
|---|---|---|
| num_people | "2 adults and 3 children" | "5" (summed total) |
| reservation_date | "15th of next month around 7:30 in the evening" | "15th of next month at 7:30 PM" (normalized time format) |
| phone_num | "+886-912-345-678" | "0912345678" (international format converted to local) |
⚠️ Important Note: Phone Number Handling
This model exclusively extracts Taiwanese 10-digit mobile numbers (09XXXXXXXX format):
✅ Extracted: Mobile numbers with complex variations
- "+886-912-345-678" → "0912345678" (international format)
- "零九一二三四五六七八" → "0912345678" (Chinese characters)
- "09 12 34 56 78" → "0912345678" (spaced format)
❌ Ignored: Non-mobile numbers
- "市話02-1234-5678" → "" (landline)
- "國際電話+1-555-123-4567" → "" (international non-Taiwanese)
- "免付費0800-123-456" → "" (toll-free)
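The rules listed above can be approximated by a small post-processing check, for example to validate the `phone_num` value the model returns. The following is a minimal sketch only: the function name `normalize_tw_mobile` and the regular expressions are illustrative assumptions, not part of the model or its training pipeline, and the candidate matching is deliberately simple (two numbers separated only by a space would be merged and rejected).

```python
import re

# Map Chinese numerals to ASCII digits (assumed coverage: 零/〇 and 一 to 九).
CHINESE_DIGITS = str.maketrans("零〇一二三四五六七八九", "00123456789")


def normalize_tw_mobile(text: str) -> str:
    """Return the first Taiwanese mobile number in text as 09XXXXXXXX, else ""."""
    text = text.translate(CHINESE_DIGITS)
    # Candidate: optional "+", then digits that may be split by spaces or hyphens.
    for candidate in re.findall(r"\+?\d[\d\s\-]*", text):
        digits = re.sub(r"\D", "", candidate)
        if candidate.startswith("+"):
            if not digits.startswith("886"):
                continue  # international number outside Taiwan
            digits = "0" + digits[3:]  # +886 9XX... becomes 09XX...
        if re.fullmatch(r"09\d{8}", digits):
            return digits
    return ""


print(normalize_tw_mobile("+886-912-345-678"))   # 0912345678
print(normalize_tw_mobile("零九一二三四五六七八"))  # 0912345678
print(repr(normalize_tw_mobile("市話02-1234-5678")))  # ''
```

Landline, toll-free, and non-Taiwanese numbers fall through to the empty string because they never match the `09` plus eight digits pattern.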
🚀 Quick Start
Installation
pip install transformers torch
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json
# Load model and tokenizer
model_name = "Luigi/dinercall-ner"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
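# System prompt, kept in the original Chinese. English gloss:
# "You are an assistant that extracts reservation information from user messages and outputs it as JSON.
# The JSON must contain three fields: num_people, reservation_date, phone_num.
# If a field has no information, use an empty string. Output only the JSON, with no other text."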
system_prompt = """你是一個助理,負責從用戶消息中提取預訂資訊並以JSON格式輸出。
JSON必須包含三個字段: num_people, reservation_date, phone_num。
如果某個字段沒有信息,使用空字符串。只輸出JSON,不要添加任何其他文字。"""
# Example with complex input
user_input = "Hi, I'd like to make a reservation for 2 adults and 3 children on the 15th of next month around 7:30 in the evening, and you can reach me at +886-912-345-678"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input}
]
# Generate response
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=False  # greedy decoding; temperature has no effect when sampling is off
    )
# Process output
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
result = json.loads(response)
print(json.dumps(result, ensure_ascii=False))  # print as JSON text rather than a Python dict repr
# Output: {"num_people": "5", "reservation_date": "15th of next month at 7:30 PM", "phone_num": "0912345678"}
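The `json.loads(response)` call in the usage code raises an exception if the model emits anything other than clean JSON. A more defensive parser might look like the sketch below. The helper name `parse_reservation` is hypothetical, and falling back to empty strings is an assumption chosen to match the system prompt's convention for missing fields.

```python
import json

EXPECTED_KEYS = ("num_people", "reservation_date", "phone_num")


def parse_reservation(response: str) -> dict:
    """Parse model output into a dict holding exactly the three expected keys."""
    empty = {key: "" for key in EXPECTED_KEYS}
    # Keep only the outermost {...} span in case extra text surrounds the JSON.
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end < start:
        return empty
    try:
        data = json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        return empty
    result = {}
    for key in EXPECTED_KEYS:
        value = data.get(key)
        # Missing or null fields become ""; other values are coerced to strings.
        result[key] = "" if value is None else str(value)
    return result
```

With this helper, `result = parse_reservation(response)` could stand in for `result = json.loads(response)`, so downstream code always receives the same three string fields.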
🎯 Use Cases
Perfect For
- 🗣️ Voice assistant reservation systems
- 🤖 Chatbot booking interfaces
- 📞 Call center automation
- 📱 Mobile app reservation features
Ideal Input Types
- English: "Book for 6 people next Friday at 8 PM"
- Chinese: "預約明天晚上7點,四位成人"
- Mixed: "我想book 4位,tomorrow at 7 PM"
📊 Training Details
Dataset
- Source: dinercall-ner dataset
- Samples: 20,000 synthetic reservation requests
- Language: 70% Chinese, 30% English
- Features: ASR noise simulation, realistic error patterns
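The dataset's noise generator is not described here, so the following is purely an illustrative sketch of what simple ASR-style noise injection could look like. The function `add_asr_noise`, its `rate` parameter, and the two error types (repeated words and doubled letters, in the spirit of "for for" or "onn") are assumptions, not the method used to build the dataset. It splits on whitespace, so it mainly affects English tokens.

```python
import random


def add_asr_noise(text: str, rate: float = 0.2, seed=None) -> str:
    """Randomly repeat words or double a letter to mimic ASR-style errors."""
    rng = random.Random(seed)  # local generator keeps results reproducible per seed
    noisy = []
    for word in text.split():
        roll = rng.random()
        if roll < rate / 2:
            noisy.extend([word, word])  # stutter: repeat the whole word
        elif roll < rate and word.isalpha():
            i = rng.randrange(len(word))
            noisy.append(word[:i + 1] + word[i] + word[i + 1:])  # double one letter
        else:
            noisy.append(word)
    return " ".join(noisy)


print(add_asr_noise("Book for 2 adults next Friday at 6:45 PM", rate=0.5, seed=1))
```

Setting `rate=0.0` returns the text unchanged, which makes it easy to mix clean and noisy samples from the same source sentences.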
Configuration
| Parameter | Value | 
|---|---|
| Base Model | unsloth/gemma-3-270m-it-unsloth-bnb-4bit | 
| Max Sequence Length | 256 tokens | 
| Learning Rate | 2e-5 | 
| Batch Size | 4 (gradient accumulation: 2) | 
| Training Epochs | 10 | 
| LoRA Rank | 32 | 
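As a rough worked example of what these settings imply, the arithmetic below derives the effective batch size and the number of optimizer steps. It assumes a single device and that all 20,000 samples from the Dataset section are used for training with no held-out split; neither assumption is confirmed by the card.

```python
import math

num_samples = 20_000              # from the Dataset section (assumed all used for training)
per_device_batch_size = 4         # "Batch Size" in the table
gradient_accumulation_steps = 2   # "gradient accumulation" in the table
num_epochs = 10                   # "Training Epochs" in the table

# Gradients are accumulated over 2 mini-batches before each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(effective_batch_size, steps_per_epoch, total_steps)  # 8 2500 25000
```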
📝 Advanced Examples & Outputs
Complex Input Examples with Outputs
# Example 1: Complex English with mixed formatting
input_text = "Could you please reserve a table for 3 adults and 2 children on December 24th around 8 PM? My contact is +886-987-654-321"
output = {
  "num_people": "5",
  "reservation_date": "December 24th at 8 PM",
  "phone_num": "0987654321"
}
# Example 2: Chinese with complex date and mixed digits
input_text = "我們想要預約下個月15號晚上7點半,4大2小,電話是零九八七-六五四三二一"
output = {
  "num_people": "6",
  "reservation_date": "下個月15號晚上7點半",
  "phone_num": "0987654321"
}
# Example 3: Noisy ASR input with complex elements
input_text = "Book for for 2 adullts and 1 childreen onn nexts Friday at 6:45 PM, fone 09八七六五四三二一"
output = {
  "num_people": "3",
  "reservation_date": "next Friday at 6:45 PM",
  "phone_num": "0987654321"
}
# Example 4: Mixed language with complex request
input_text = "我想book 3大人2小孩,time是next Wednesday at 7:30 PM,contact number是0912-345-678"
output = {
  "num_people": "5",
  "reservation_date": "next Wednesday at 7:30 PM",
  "phone_num": "0912345678"
}
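Input and output pairs like these can double as a small regression set. The sketch below scores predictions by per-field exact match. The helper `field_accuracy` is hypothetical, and exact string match is an assumed metric: it is strict for `reservation_date`, where several phrasings can be equally valid.

```python
FIELDS = ("num_people", "reservation_date", "phone_num")


def field_accuracy(predictions: list, references: list) -> dict:
    """Return the exact-match accuracy for each field over paired dicts."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    correct = {field: 0 for field in FIELDS}
    for pred, ref in zip(predictions, references):
        for field in FIELDS:
            # A missing field is treated as an empty string.
            if pred.get(field, "") == ref.get(field, ""):
                correct[field] += 1
    return {field: correct[field] / len(references) for field in FIELDS}


references = [
    {"num_people": "5", "reservation_date": "December 24th at 8 PM", "phone_num": "0987654321"},
    {"num_people": "3", "reservation_date": "next Friday at 6:45 PM", "phone_num": "0987654321"},
]
predictions = [
    {"num_people": "5", "reservation_date": "December 24th at 8 PM", "phone_num": "0987654321"},
    {"num_people": "3", "reservation_date": "Friday at 6:45 PM", "phone_num": "0987654321"},
]
print(field_accuracy(predictions, references))
# {'num_people': 1.0, 'reservation_date': 0.5, 'phone_num': 1.0}
```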
⚠️ Limitations & Considerations
Technical Limitations
- 🎯 Phone Numbers: Only Taiwanese mobile numbers (09XXXXXXXX)
- 🌍 Geography: Optimized for Taiwanese reservation patterns
- 🎙️ ASR Types: Best performance on simulated ASR errors similar to training data
- 💬 Language Mix: Handles Chinese/English mixing but may struggle with other languages
Ethical Considerations
- 🔒 Privacy: Only extracts mobile numbers; landline numbers are ignored
- 📋 Consent: Ensure proper user consent for data processing
- ⚖️ Compliance: Follow local regulations for data handling
📚 Citation
If you use this model in your research, please cite:
@software{dinercall_ner_model_2025,
  author = {Luigi},
  title = {Gemma-3-270M Fine-tuned for Restaurant Reservation NER},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Luigi/dinercall-ner}
}
🆘 Support
For questions, issues, or contributions:
- 📧 Open an issue on the Hugging Face repository
- 💬 Check the examples above for common usage patterns
- 🔧 Review the limitations section before deployment
📄 License
This model inherits the license terms of the base Gemma model. Please review Google's license terms for specific usage rights and restrictions.
