Dauji.ai Sales & CRM Consultant - Gemma 2B
This model is a fine-tuned version of Google's Gemma 2 2B, specialized in sales consultation and CRM advisory services for the Dauji.ai platform.
Model Description
- Base Model: google/gemma-2-2b
- Fine-tuning Method: LoRA (Low-Rank Adaptation) via Unsloth
- Training Dataset: 5,000 custom scenarios covering Dauji.ai platform expertise
- Specialization: B2B sales acceleration, CRM integration, and revenue optimization
Training Details
- Training Examples: 5,000 comprehensive scenarios
- Training Steps: 200
- Final Training Loss: 0.077
- LoRA Rank: 16
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
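For reference, here is a minimal sketch of a LoRA configuration matching the hyperparameters above, written against Hugging Face's peft library. The rank and target modules come from this card; lora_alpha and lora_dropout are illustrative assumptions, and the actual run used Unsloth's wrapper around this kind of setup.

from peft import LoraConfig

# Hypothetical reconstruction of the fine-tuning configuration.
# r and target_modules are from this card; alpha/dropout are assumed values.
lora_config = LoraConfig(
    r=16,               # LoRA rank (from this card)
    lora_alpha=32,      # Scaling factor (assumption, not documented here)
    lora_dropout=0.05,  # Dropout (assumption, not documented here)
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
)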
Capabilities
Core Expertise
- Dauji.ai platform features and benefits
- CRM integration strategies (Salesforce, HubSpot, Pipedrive)
- Sales process optimization and automation
- Lead qualification and scoring methodologies
- Revenue leak identification and prevention
Consultation Areas
- B2B sales acceleration strategies
- Multi-channel engagement optimization
- ROI calculation and business case development
- Technical implementation guidance
- Competitive positioning and analysis
Supported Industries
- SaaS companies
- Manufacturing
- Healthcare
- Financial services
- Professional services
- And 10+ other industries
Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Check GPU availability
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU device count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"Current GPU: {torch.cuda.get_device_name(0)}")

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Load model and tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained("ritvik77/dauji-ai-sales-crm-consultant_v.0.01")

print("Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    "ritvik77/dauji-ai-sales-crm-consultant_v.0.01",
    dtype=torch.float16,  # Half precision to save GPU memory (dtype replaces torch_dtype in recent transformers)
    device_map="auto",    # Automatically distribute the model across available GPUs
)

# Alternative manual GPU placement (use this if device_map="auto" doesn't work)
# model = model.to(device)
print(f"Model device: {next(model.parameters()).device}")
# Consultation prompt template
def dauji_consultation(question):
    prompt = '''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are an expert Sales & CRM Consultant for Dauji.ai, the always-on AI Sales Agent platform with deep CRM integration. Focus on measurable business outcomes, CRM optimization, and sales acceleration strategies.

### Input:
{}

### Response:
'''.format(question)

    # Tokenize and move inputs to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    # Generate without tracking gradients to save memory during inference
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=400,
            temperature=0.3,         # Lower temperature for more focused responses
            do_sample=True,
            top_k=50,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.15,
            no_repeat_ngram_size=3,  # Prevent 3-gram repetition
        )

    # Decode only the generated part (excluding the input prompt)
    input_length = inputs.input_ids.shape[1]
    generated_tokens = outputs[0][input_length:]
    response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
    return response.strip()
# Alternative function with a simpler prompt and greedy decoding
def dauji_consultation_v2(question):
    prompt = f"""You are Dauji.ai's expert Sales & CRM consultant. Answer the following question with specific, actionable advice about Dauji.ai's capabilities.

Question: {question}

Answer:"""

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=300,
            do_sample=False,  # Greedy decoding for more predictable outputs
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.2,
        )
    input_length = inputs.input_ids.shape[1]
    generated_tokens = outputs[0][input_length:]
    response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
    return response.strip()
# Debug helper to inspect tokenization and the raw model output
def debug_model_response(question):
    prompt = f"Question: {question}\nAnswer:"
    print(f"Input prompt: {prompt}")
    print("-" * 50)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    print(f"Input token IDs: {inputs.input_ids[0][:20]}...")  # First 20 tokens
    print(f"Input length: {inputs.input_ids.shape[1]} tokens")

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=100,
            do_sample=False,  # Greedy decoding; sampling parameters are ignored
            pad_token_id=tokenizer.eos_token_id,
        )
    full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Full response: {full_response}")
    return full_response
# Check GPU memory usage
if torch.cuda.is_available():
    print(f"GPU memory allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")
    print(f"GPU memory reserved: {torch.cuda.memory_reserved(0) / 1024**3:.2f} GB")

# Example usage
print("\nGenerating response...")
response = dauji_consultation("How can Dauji.ai improve our CRM conversion rates?")
print("\nResponse:")
print(response)

# Check GPU memory usage after inference
if torch.cuda.is_available():
    print(f"\nGPU memory allocated after inference: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")
    print(f"GPU memory reserved after inference: {torch.cuda.memory_reserved(0) / 1024**3:.2f} GB")
Performance Metrics
- Specialized knowledge across 10+ consultation categories
- Handles complex CRM integration scenarios
- Provides ROI-focused recommendations
- Maintains consistent Dauji.ai messaging and positioning
Training Data Categories
- Core Value Proposition - Platform differentiation and benefits
- CRM Integration - Technical implementation and optimization
- Competitive Analysis - Positioning vs Drift, HubSpot, Salesforce
- ROI & Pricing - Business case development and justification
- Industry Specific - Tailored solutions for different verticals
- Technical Implementation - Setup, security, and integration guidance
- Sales Process Optimization - Workflow automation and efficiency
- Objection Handling - Common concerns and responses
- Enterprise Sales - Complex deal management and stakeholder engagement
- Advanced Features - Knowledge graph, analytics, and reporting
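To make the record format concrete, below is a hypothetical example of a single training scenario, assuming the Alpaca-style instruction/input/response layout used by the inference prompt in the Usage section. The field values are illustrative and are not drawn from the actual dataset.

# Hypothetical training record; structure assumed from the Alpaca-style prompt above
example_record = {
    "instruction": (
        "You are an expert Sales & CRM Consultant for Dauji.ai, the always-on "
        "AI Sales Agent platform with deep CRM integration."
    ),
    "input": "How does Dauji.ai sync lead scores back into Salesforce?",
    "output": (
        "Illustrative answer for the CRM Integration category: how scores are "
        "written back to Salesforce fields and used in routing and reporting."
    ),
}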
Model Architecture
Built on Google's Gemma 2 2B (google/gemma-2-2b) with LoRA fine-tuning:
- Total Parameters: 2.6B
- Trainable Parameters: 20.7M (0.79%)
- Memory Efficient: 4-bit quantization support (see the loading sketch below)
- Fast Inference: Optimized with Unsloth
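To use the 4-bit path mentioned above, the model can be loaded with a bitsandbytes quantization config. This is a minimal sketch assuming the bitsandbytes package is installed and a CUDA GPU is available; the NF4/fp16 settings are common defaults, not values specified by this card.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit settings (NF4 quantization, fp16 compute); requires bitsandbytes + CUDA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_4bit = AutoModelForCausalLM.from_pretrained(
    "ritvik77/dauji-ai-sales-crm-consultant_v.0.01",
    quantization_config=bnb_config,
    device_map="auto",
)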
Limitations
- Specialized for Dauji.ai platform consultation
- Focused on B2B sales and CRM use cases
- English language optimized
- May require additional context for highly technical integrations
Ethical Considerations
This model is designed for professional sales consultation and should be used responsibly. It is intended to:
- Provide accurate information based on its training data
- Maintain professional and ethical sales practices
- Respect customer privacy and data protection standards
Citation
If you use this model, please cite:
@misc{dauji_ai_consultant_2024,
  title={Dauji.ai Sales & CRM Consultant - Gemma 2B},
  author={Your Name},
  year={2024},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/ritvik77/dauji-ai-sales-crm-consultant_v.0.01}
}
Contact
For questions about this model or Dauji.ai platform consultation, please contact [your contact information].