ToGMAL MCP Server
Taxonomy of Generative Model Apparent Limitations
A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions.
Overview
ToGMAL helps prevent common LLM pitfalls by detecting:
- 🔬 Math/Physics Speculation: Ungrounded "theories of everything" and invented physics
- 🏥 Medical Advice Issues: Health recommendations without proper sources or disclaimers
- 💾 Dangerous File Operations: Mass deletions, recursive operations without safeguards
- 💻 Vibe Coding Overreach: Overly ambitious projects without proper scoping
- 📊 Unsupported Claims: Strong assertions without evidence or hedging
Key Features
- Privacy-Preserving: All analysis is deterministic and local (no external API calls)
- Low Latency: Heuristic-based detection for real-time analysis
- Intervention Recommendations: Suggests step breakdown, human-in-the-loop, or web search
- Taxonomy Building: Crowdsourced evidence collection for improving detection
- Extensible: Easy to add new detection patterns and categories
Installation
Prerequisites
- Python 3.10 or higher
- pip package manager
Install Dependencies
```bash
pip install mcp pydantic httpx --break-system-packages
```
Install the Server
```bash
# Clone or download the server
# Then run it directly
python togmal_mcp.py
```
Usage
Available Tools
1. togmal_analyze_prompt
Analyze a user prompt before the LLM processes it.
Parameters:
- prompt (str): The user prompt to analyze
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "prompt": "Build me a complete theory of quantum gravity that unifies all forces",
  "response_format": "json"
}
```
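An illustrative result is shown below. The exact fields are defined by the server implementation, so treat this shape as an assumption rather than a documented schema:
```json
{
  "risk_level": "high",
  "detections": [
    {"category": "math_physics_speculation", "confidence": 0.85}
  ],
  "recommended_interventions": ["step_breakdown", "web_search"]
}
```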
Use Cases:
- Detect speculative physics theories before generating responses
- Flag overly ambitious coding requests
- Identify requests for medical advice that need disclaimers
2. togmal_analyze_response
Analyze an LLM response for potential issues.
Parameters:
- response (str): The LLM response to analyze
- context (str, optional): Original prompt for better analysis
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "response": "You should definitely take 500mg of ibuprofen every 4 hours...",
  "context": "I have a headache",
  "response_format": "json"
}
```
Use Cases:
- Check for ungrounded medical advice
- Detect dangerous file operation instructions
- Flag unsupported statistical claims
3. togmal_submit_evidence
Submit evidence of LLM limitations to improve the taxonomy.
Parameters:
- category (str): Type of limitation - "math_physics_speculation", "ungrounded_medical_advice", etc.
- prompt (str): The prompt that triggered the issue
- response (str): The problematic response
- description (str): Why this is problematic
- severity (str): Severity level - "low", "moderate", "high", or "critical"
Example:
```json
{
  "category": "ungrounded_medical_advice",
  "prompt": "What should I do about chest pain?",
  "response": "It's probably nothing serious, just indigestion...",
  "description": "Dismissed potentially serious symptom without recommending medical consultation",
  "severity": "high"
}
```
Features:
- Human-in-the-loop confirmation before submission
- Generates unique entry ID for tracking
- Contributes to improving detection heuristics
4. togmal_get_taxonomy
Retrieve entries from the taxonomy database.
Parameters:
- category (str, optional): Filter by category
- min_severity (str, optional): Minimum severity to include
- limit (int): Maximum entries to return (1-100, default 20)
- offset (int): Pagination offset (default 0)
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "category": "dangerous_file_operations",
  "min_severity": "high",
  "limit": 10,
  "offset": 0,
  "response_format": "json"
}
```
Use Cases:
- Research common LLM failure patterns
- Train improved detection models
- Generate safety guidelines
5. togmal_get_statistics
Get statistical overview of the taxonomy database.
Parameters:
- response_format (str): Output format - "markdown" or "json"
Returns:
- Total entries by category
- Severity distribution
- Database capacity status
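For illustration, a JSON result mirroring those fields might look like the following; the field names and values here are assumptions, not the server's documented schema:
```json
{
  "total_entries": 142,
  "entries_by_category": {
    "ungrounded_medical_advice": 58,
    "dangerous_file_operations": 84
  },
  "severity_distribution": {"low": 20, "moderate": 61, "high": 49, "critical": 12},
  "capacity": {"used": 142, "max": 1000}
}
```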
Detection Heuristics
Math/Physics Speculation
Detects:
- "Theory of everything" claims
- Unified field theory proposals
- Invented equations or particles
- Modifications to fundamental constants
Patterns:
- "new equation for quantum gravity"
- "my unified theory"
- "discovered particle"
- "redefine the speed of light"
Ungrounded Medical Advice
Detects:
- Diagnoses without qualifications
- Treatment recommendations without sources
- Specific drug dosages
- Dismissive responses to symptoms
Patterns:
- "you probably have..."
- "take 500mg of..."
- "don't worry about it"
- Missing citations or disclaimers
Dangerous File Operations
Detects:
- Mass deletion commands
- Recursive operations without safeguards
- Operations on test files without confirmation
- No human-in-the-loop for destructive actions
Patterns:
- "rm -rf" without confirmation
- "delete all test files"
- "recursively remove"
- Missing safety checks
Vibe Coding Overreach
Detects:
- Requests for complete applications
- Massive line count targets (1000+ lines)
- Unrealistic timeframes
- Scope without proper planning
Patterns:
- "build a complete social network"
- "5000 lines of code"
- "everything in one shot"
- Missing architectural planning
Unsupported Claims
Detects:
- Absolute statements without hedging
- Statistical claims without sources
- Over-confident predictions
- Missing citations
Patterns:
- "always/never/definitely"
- "95% of doctors agree" (no source)
- "guaranteed to work"
- Missing uncertainty language
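Detecting the absence of hedging is a slightly different problem: rather than matching a bad phrase, the heuristic looks for confident assertions with no uncertainty markers. A rough sketch, with word lists and logic that are assumptions rather than the server's actual rules:
```python
import re

# Hypothetical word lists, for illustration only
ABSOLUTE = re.compile(r"\b(always|never|definitely|guaranteed)\b", re.IGNORECASE)
HEDGES = re.compile(r"\b(may|might|could|likely|suggests|approximately)\b", re.IGNORECASE)

def lacks_hedging(text: str) -> bool:
    # Flag text that uses absolute language without any uncertainty markers
    return bool(ABSOLUTE.search(text)) and not HEDGES.search(text)
```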
Risk Levels
Calculated based on weighted confidence scores:
- LOW: Minor issues, no immediate intervention needed
- MODERATE: Worth noting, consider additional verification
- HIGH: Significant concern, interventions recommended
- CRITICAL: Serious risk, multiple interventions strongly advised
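As an illustration, one way weighted confidence scores could map onto these levels; the weights and thresholds here are assumptions, not the server's actual values:
```python
# Hypothetical per-category weights and thresholds, for illustration only
WEIGHTS = {"ungrounded_medical_advice": 1.5, "unsupported_claims": 0.8}

def risk_level(confidences: dict[str, float]) -> str:
    # Sum each detector's confidence, scaled by its category weight
    score = sum(WEIGHTS.get(cat, 1.0) * conf for cat, conf in confidences.items())
    if score >= 2.0:
        return "CRITICAL"
    if score >= 1.0:
        return "HIGH"
    if score >= 0.5:
        return "MODERATE"
    return "LOW"
```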
Intervention Types
Step Breakdown
Complex tasks should be broken into verifiable components.
Recommended for:
- Math/physics speculation
- Large coding projects
- Dangerous file operations
Human-in-the-Loop
Critical decisions require human oversight.
Recommended for:
- Medical advice
- Destructive file operations
- High-severity issues
Web Search
Claims should be verified against authoritative sources.
Recommended for:
- Medical recommendations
- Physics/math theories
- Unsupported factual claims
Simplified Scope
Overly ambitious projects need realistic scoping.
Recommended for:
- Vibe coding requests
- Complex system designs
- Feature-heavy applications
Configuration
Character Limit
Default: 25,000 characters per response
```python
CHARACTER_LIMIT = 25000
```
Taxonomy Capacity
Default: 1,000 evidence entries
```python
MAX_EVIDENCE_ENTRIES = 1000
```
Detection Sensitivity
Adjust pattern matching and confidence thresholds in detection functions:
```python
def detect_math_physics_speculation(text: str) -> Dict[str, Any]:
    # Modify patterns or confidence calculations here
    ...
```
Integration Examples
Claude Desktop App
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "togmal": {
      "command": "python",
      "args": ["/path/to/togmal_mcp.py"]
    }
  }
}
```
CLI Testing
```bash
# Run the server
python togmal_mcp.py

# In another terminal, test with the MCP inspector
npx @modelcontextprotocol/inspector python togmal_mcp.py
```
Programmatic Usage
```python
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server as a subprocess and talk to it over stdio
server_params = StdioServerParameters(command="python", args=["togmal_mcp.py"])

async def analyze_prompt(prompt: str):
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool(
                "togmal_analyze_prompt",
                {"prompt": prompt, "response_format": "json"}
            )
```
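To call it from a script:
```python
import asyncio

# Assumes analyze_prompt from the snippet above is in scope
result = asyncio.run(analyze_prompt("Build me a complete theory of quantum gravity"))
print(result)
```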
Architecture
Design Principles
- Privacy First: No external API calls, all processing local
- Deterministic: Heuristic-based detection for reproducibility
- Low Latency: Fast pattern matching for real-time use
- Extensible: Easy to add new patterns and categories
- Human-Centered: Always allows human override and judgment
Future Enhancements
The system is designed for progressive enhancement:
- Phase 1 (Current): Heuristic pattern matching
- Phase 2 (Planned): Traditional ML models (clustering, anomaly detection)
- Phase 3 (Future): Federated learning from submitted evidence
- Phase 4 (Advanced): Custom fine-tuned models for specific domains
Data Flow
```
User Prompt
    ↓
togmal_analyze_prompt
    ↓
Detection Heuristics (parallel)
    ├── Math/Physics
    ├── Medical Advice
    ├── File Operations
    ├── Vibe Coding
    └── Unsupported Claims
    ↓
Risk Calculation
    ↓
Intervention Recommendations
    ↓
Response to Client
```
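Conceptually, the analysis step runs every detector over the same text and keeps the hits. A simplified, self-contained sketch with two toy detectors standing in for the five real ones (names and scores are illustrative):
```python
from typing import Any, Callable, Dict, List

def detect_absolute_language(text: str) -> Dict[str, Any]:
    hit = "definitely" in text.lower()
    return {"category": "unsupported_claims", "detected": hit, "confidence": 0.6 if hit else 0.0}

def detect_mass_deletion(text: str) -> Dict[str, Any]:
    hit = "rm -rf" in text
    return {"category": "dangerous_file_operations", "detected": hit, "confidence": 0.9 if hit else 0.0}

DETECTORS: List[Callable[[str], Dict[str, Any]]] = [
    detect_absolute_language,
    detect_mass_deletion,
]

def analyze(text: str) -> Dict[str, Any]:
    # Run every detector and keep only those that fired
    hits = [r for d in DETECTORS if (r := d(text))["detected"]]
    return {"detections": hits, "combined_confidence": sum(r["confidence"] for r in hits)}
```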
Contributing
Adding New Detection Patterns
1. Create a new detection function:
```python
import re
from typing import Any, Dict

def detect_new_category(text: str) -> Dict[str, Any]:
    patterns = {
        'subcategory1': [r'pattern1', r'pattern2'],
        'subcategory2': [r'pattern3']
    }
    # Collect every subcategory with at least one matching pattern
    matched = [
        name for name, regexes in patterns.items()
        if any(re.search(rx, text, re.IGNORECASE) for rx in regexes)
    ]
    # Simple confidence: fraction of subcategories that matched
    confidence = len(matched) / len(patterns) if matched else 0.0
    return {
        'detected': bool(matched),
        'categories': matched,
        'confidence': confidence
    }
```
2. Add the new category to the CategoryType enum (see the sketch after this list)
3. Update the analysis functions to include the new detector
4. Add intervention recommendations if needed
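For example, extending the enum might look like this; the existing member names are taken from the categories above, and the exact definition in togmal_mcp.py may differ:
```python
from enum import Enum

class CategoryType(str, Enum):
    MATH_PHYSICS_SPECULATION = "math_physics_speculation"
    UNGROUNDED_MEDICAL_ADVICE = "ungrounded_medical_advice"
    # ...existing categories...
    NEW_CATEGORY = "new_category"  # hypothetical new addition
```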
Submitting Evidence
Use the togmal_submit_evidence tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone.
Limitations
Current Constraints
- Heuristic-Based: May have false positives/negatives
- English-Only: Patterns optimized for English text
- Context-Free: Doesn't understand full conversation history
- No Learning: Detection rules are static until updated
Not a Replacement For
- Professional judgment in critical domains (medicine, law, etc.)
- Comprehensive code review
- Security auditing
- Safety testing in production systems
License
MIT License - See LICENSE file for details
Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Submit evidence through the MCP tool
- Contact: [Your contact information]
Citation
If you use ToGMAL in your research or product, please cite:
```bibtex
@software{togmal_mcp,
  title={ToGMAL: Taxonomy of Generative Model Apparent Limitations},
  author={[Your Name]},
  year={2025},
  url={https://github.com/[your-repo]/togmal-mcp}
}
```
Acknowledgments
Built using the MCP Python SDK, Pydantic, and httpx.
Inspired by the need for safer, more grounded AI interactions.