ToGMAL MCP Server

Taxonomy of Generative Model Apparent Limitations

A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions.

Overview

ToGMAL helps prevent common LLM pitfalls by detecting:

  • 🔬 Math/Physics Speculation: Ungrounded "theories of everything" and invented physics
  • 🏥 Medical Advice Issues: Health recommendations without proper sources or disclaimers
  • 💾 Dangerous File Operations: Mass deletions, recursive operations without safeguards
  • 💻 Vibe Coding Overreach: Overly ambitious projects without proper scoping
  • 📊 Unsupported Claims: Strong assertions without evidence or hedging

Key Features

  • Privacy-Preserving: All analysis is deterministic and local (no external API calls)
  • Low Latency: Heuristic-based detection for real-time analysis
  • Intervention Recommendations: Suggests step breakdown, human-in-the-loop, or web search
  • Taxonomy Building: Crowdsourced evidence collection for improving detection
  • Extensible: Easy to add new detection patterns and categories

Installation

Prerequisites

  • Python 3.10 or higher
  • pip package manager

Install Dependencies

pip install mcp pydantic httpx --break-system-packages

Install the Server

# Clone or download the server
# Then run it directly
python togmal_mcp.py

Usage

Available Tools

1. togmal_analyze_prompt

Analyze a user prompt before the LLM processes it.

Parameters:

  • prompt (str): The user prompt to analyze
  • response_format (str): Output format - "markdown" or "json"

Example:

{
  "prompt": "Build me a complete theory of quantum gravity that unifies all forces",
  "response_format": "json"
}

Use Cases:

  • Detect speculative physics theories before generating responses
  • Flag overly ambitious coding requests
  • Identify requests for medical advice that need disclaimers

2. togmal_analyze_response

Analyze an LLM response for potential issues.

Parameters:

  • response (str): The LLM response to analyze
  • context (str, optional): Original prompt for better analysis
  • response_format (str): Output format - "markdown" or "json"

Example:

{
  "response": "You should definitely take 500mg of ibuprofen every 4 hours...",
  "context": "I have a headache",
  "response_format": "json"
}

Use Cases:

  • Check for ungrounded medical advice
  • Detect dangerous file operation instructions
  • Flag unsupported statistical claims

3. togmal_submit_evidence

Submit evidence of LLM limitations to improve the taxonomy.

Parameters:

  • category (str): Type of limitation - "math_physics_speculation", "ungrounded_medical_advice", etc.
  • prompt (str): The prompt that triggered the issue
  • response (str): The problematic response
  • description (str): Why this is problematic
  • severity (str): Severity level - "low", "moderate", "high", or "critical"

Example:

{
  "category": "ungrounded_medical_advice",
  "prompt": "What should I do about chest pain?",
  "response": "It's probably nothing serious, just indigestion...",
  "description": "Dismissed potentially serious symptom without recommending medical consultation",
  "severity": "high"
}

Features:

  • Human-in-the-loop confirmation before submission
  • Generates unique entry ID for tracking
  • Contributes to improving detection heuristics
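
A minimal sketch of how a submission could be validated and given a tracking ID. The make_entry helper, the content-derived ID scheme, and the submitted_at field are assumptions for illustration, not the server's actual code; only the severity levels come from the parameter list above.

```python
import hashlib
from datetime import datetime, timezone

# Severity levels as listed in the parameters above.
SEVERITIES = ("low", "moderate", "high", "critical")

def make_entry(category: str, prompt: str, response: str,
               description: str, severity: str) -> dict:
    """Validate an evidence submission and attach a short tracking ID."""
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {SEVERITIES}")
    if not (prompt.strip() and response.strip() and description.strip()):
        raise ValueError("prompt, response and description must be non-empty")
    # Hypothetical ID scheme: first 12 hex chars of a SHA-256 over the content.
    digest = hashlib.sha256(f"{category}\n{prompt}\n{response}".encode("utf-8"))
    return {
        "entry_id": digest.hexdigest()[:12],
        "category": category,
        "prompt": prompt,
        "response": response,
        "description": description,
        "severity": severity,
        "submitted_at": datetime.now(timezone.utc).isoformat(),
    }
```

With a content-derived ID, submitting the same prompt and response twice yields the same ID, which makes duplicates easy to spot. A random UUID would be the alternative if every submission should be tracked separately.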

4. togmal_get_taxonomy

Retrieve entries from the taxonomy database.

Parameters:

  • category (str, optional): Filter by category
  • min_severity (str, optional): Minimum severity to include
  • limit (int): Maximum entries to return (1-100, default 20)
  • offset (int): Pagination offset (default 0)
  • response_format (str): Output format - "markdown" or "json"

Example:

{
  "category": "dangerous_file_operations",
  "min_severity": "high",
  "limit": 10,
  "offset": 0,
  "response_format": "json"
}

Use Cases:

  • Research common LLM failure patterns
  • Train improved detection models
  • Generate safety guidelines
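
The filtering and pagination parameters above could behave roughly as follows. This is a sketch over an in-memory list of dicts; the query_taxonomy name, the returned keys, and the severity ordering are assumptions, not the server's documented behavior.

```python
# Assumed ordering of the four severity levels, lowest to highest.
SEVERITY_ORDER = {"low": 0, "moderate": 1, "high": 2, "critical": 3}

def query_taxonomy(entries, category=None, min_severity=None, limit=20, offset=0):
    """Filter entries by category and minimum severity, then return one page."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    floor = SEVERITY_ORDER[min_severity] if min_severity else 0
    matches = [
        e for e in entries
        if (category is None or e["category"] == category)
        and SEVERITY_ORDER[e["severity"]] >= floor
    ]
    page = matches[offset:offset + limit]
    return {
        "total": len(matches),
        "offset": offset,
        "entries": page,
        "has_more": offset + len(page) < len(matches),
    }
```

A client can keep requesting pages, adding limit to offset each time, until has_more is false.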

5. togmal_get_statistics

Get statistical overview of the taxonomy database.

Parameters:

  • response_format (str): Output format - "markdown" or "json"

Returns:

  • Total entries by category
  • Severity distribution
  • Database capacity status
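
As a rough illustration, the three returned values could be computed from a list of entries like this. The taxonomy_statistics name and the output keys are hypothetical; the capacity default is the one given under Configuration.

```python
from collections import Counter

MAX_EVIDENCE_ENTRIES = 1000  # default capacity from the Configuration section

def taxonomy_statistics(entries):
    """Summarize entries by category and severity, plus capacity usage."""
    by_category = Counter(e["category"] for e in entries)
    by_severity = Counter(e["severity"] for e in entries)
    return {
        "total_entries": len(entries),
        "by_category": dict(by_category),
        "by_severity": dict(by_severity),
        # Fraction of the evidence store currently in use (0.0 to 1.0)
        "capacity_used": len(entries) / MAX_EVIDENCE_ENTRIES,
    }
```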

Detection Heuristics

Math/Physics Speculation

Detects:

  • "Theory of everything" claims
  • Unified field theory proposals
  • Invented equations or particles
  • Modifications to fundamental constants

Patterns:

- "new equation for quantum gravity"
- "my unified theory"
- "discovered particle"
- "redefine the speed of light"
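
Phrases like these translate naturally into case-insensitive regular expressions. The list below is an illustrative guess derived from the example phrases, not the pattern set the server ships with, and find_speculation is a hypothetical helper name.

```python
import re

# Illustrative regexes based on the example phrases above (assumed).
SPECULATION_PATTERNS = [
    r"\bnew equation for\b",
    r"\bmy (?:unified|grand) theory\b",
    r"\bdiscovered (?:a new )?particle\b",
    r"\bredefine the speed of light\b",
    r"\btheory of everything\b",
]

def find_speculation(text: str) -> list[str]:
    """Return every pattern that matches the text, ignoring case."""
    return [p for p in SPECULATION_PATTERNS if re.search(p, text, re.IGNORECASE)]
```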

Ungrounded Medical Advice

Detects:

  • Diagnoses without qualifications
  • Treatment recommendations without sources
  • Specific drug dosages
  • Dismissive responses to symptoms

Patterns:

- "you probably have..."
- "take 500mg of..."
- "don't worry about it"
- Missing citations or disclaimers
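
One way to combine a positive pattern (a specific dosage) with a missing-safeguard check (no referral to a professional) is sketched below. Both regexes and the flags_medical_advice name are assumptions chosen for illustration.

```python
import re

# A number followed by a common dosage unit, e.g. "500mg" or "2.5 ml".
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|mcg|g|ml|iu)\b", re.IGNORECASE)
# Wording that refers the reader to a professional.
DISCLAIMER = re.compile(
    r"\b(?:consult|see|talk to|speak with) (?:a|your) (?:doctor|physician|pharmacist)\b",
    re.IGNORECASE,
)

def flags_medical_advice(text: str) -> bool:
    """True when a specific dosage appears with no referral to a professional."""
    return bool(DOSAGE.search(text)) and not DISCLAIMER.search(text)
```

The check only fires when the risky pattern is present and the mitigating language is absent, which is how the "missing citations or disclaimers" item above can be expressed with plain heuristics.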

Dangerous File Operations

Detects:

  • Mass deletion commands
  • Recursive operations without safeguards
  • Operations on test files without confirmation
  • No human-in-the-loop for destructive actions

Patterns:

- "rm -rf" without confirmation
- "delete all test files"
- "recursively remove"
- Missing safety checks
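
For shell commands, tokenizing is more reliable than substring matching because flags can be split or reordered. The sketch below handles only a command that starts with rm (no sudo, pipes, or chained commands), and is_risky_rm is a hypothetical helper rather than part of the server.

```python
import shlex

def is_risky_rm(command: str) -> bool:
    """True for an rm call that is recursive and forced with no interactive prompt."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: not parseable as a shell command
    if not tokens or tokens[0] != "rm":
        return False
    # Merge all short flags ("-r -f" and "-rf" both become "rf").
    short = "".join(t[1:] for t in tokens[1:]
                    if t.startswith("-") and not t.startswith("--"))
    long_flags = {t for t in tokens[1:] if t.startswith("--")}
    recursive = "r" in short or "R" in short or "--recursive" in long_flags
    forced = "f" in short or "--force" in long_flags
    interactive = "i" in short or "I" in short or "--interactive" in long_flags
    return recursive and forced and not interactive
```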

Vibe Coding Overreach

Detects:

  • Requests for complete applications
  • Massive line count targets (1000+ lines)
  • Unrealistic timeframes
  • Scope without proper planning

Patterns:

- "build a complete social network"
- "5000 lines of code"
- "everything in one shot"
- Missing architectural planning

Unsupported Claims

Detects:

  • Absolute statements without hedging
  • Statistical claims without sources
  • Over-confident predictions
  • Missing citations

Patterns:

- "always/never/definitely"
- "95% of doctors agree" (no source)
- "guaranteed to work"
- Missing uncertainty language
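
A sketch of how absolute wording and unsourced statistics might be picked up. The word list, the source cues, and the unsupported_claim_signals name are all illustrative assumptions.

```python
import re

ABSOLUTES = re.compile(r"\b(?:always|never|definitely|guaranteed)\b", re.IGNORECASE)
STATISTIC = re.compile(r"\b\d+(?:\.\d+)?\s?%")
# Rough cues that a claim cites something: attribution phrases or a URL.
SOURCE_CUE = re.compile(
    r"\b(?:according to|source:|study|survey|et al\.)|https?://", re.IGNORECASE
)

def unsupported_claim_signals(text: str) -> dict:
    """Report absolute terms and whether a percentage appears without a source cue."""
    has_source = bool(SOURCE_CUE.search(text))
    return {
        "absolute_terms": [m.lower() for m in ABSOLUTES.findall(text)],
        "unsourced_statistic": bool(STATISTIC.search(text)) and not has_source,
    }
```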

Risk Levels

Each analysis is assigned one of four risk levels, calculated from weighted confidence scores:

  • LOW: Minor issues, no immediate intervention needed
  • MODERATE: Worth noting, consider additional verification
  • HIGH: Significant concern, interventions recommended
  • CRITICAL: Serious risk, multiple interventions strongly advised
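
To make the idea of weighted confidence scores concrete, here is one possible mapping. The weights, the thresholds, the last two category names, and the choice of taking the highest weighted score are all assumptions; the README does not specify the actual formula.

```python
# Hypothetical per-category weights; higher means a detection matters more.
CATEGORY_WEIGHTS = {
    "math_physics_speculation": 0.5,
    "ungrounded_medical_advice": 1.0,
    "dangerous_file_operations": 1.0,
    "vibe_coding_overreach": 0.4,   # assumed category name
    "unsupported_claims": 0.3,      # assumed category name
}

def risk_level(confidences: dict[str, float]) -> str:
    """Map per-category confidences (0 to 1) to a level via the top weighted score."""
    score = max(
        (CATEGORY_WEIGHTS.get(cat, 0.5) * conf for cat, conf in confidences.items()),
        default=0.0,
    )
    # Assumed thresholds, checked from most to least severe.
    if score >= 0.8:
        return "CRITICAL"
    if score >= 0.5:
        return "HIGH"
    if score >= 0.25:
        return "MODERATE"
    return "LOW"
```

Taking the maximum rather than the sum keeps several weak signals from adding up to a critical rating; a sum or average would be an equally reasonable design.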

Intervention Types

Step Breakdown

Complex tasks should be broken into verifiable components.

Recommended for:

  • Math/physics speculation
  • Large coding projects
  • Dangerous file operations

Human-in-the-Loop

Critical decisions require human oversight.

Recommended for:

  • Medical advice
  • Destructive file operations
  • High-severity issues

Web Search

Claims should be verified against authoritative sources.

Recommended for:

  • Medical recommendations
  • Physics/math theories
  • Unsupported factual claims

Simplified Scope

Overly ambitious projects need realistic scoping.

Recommended for:

  • Vibe coding requests
  • Complex system designs
  • Feature-heavy applications
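
The "Recommended for" lists above amount to a lookup from detected category to interventions. The table below is one reading of those lists; the identifier strings and the recommend helper are hypothetical.

```python
# One interpretation of the "Recommended for" lists, keyed by category.
INTERVENTIONS = {
    "math_physics_speculation": ["step_breakdown", "web_search"],
    "ungrounded_medical_advice": ["human_in_the_loop", "web_search"],
    "dangerous_file_operations": ["step_breakdown", "human_in_the_loop"],
    "vibe_coding_overreach": ["step_breakdown", "simplified_scope"],
    "unsupported_claims": ["web_search"],
}

def recommend(categories: list[str], risk: str = "LOW") -> list[str]:
    """Collect interventions for the detected categories, without duplicates."""
    recs: list[str] = []
    for cat in categories:
        for item in INTERVENTIONS.get(cat, []):
            if item not in recs:
                recs.append(item)
    # High-severity issues get human oversight regardless of category.
    if risk in ("HIGH", "CRITICAL") and "human_in_the_loop" not in recs:
        recs.append("human_in_the_loop")
    return recs
```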

Configuration

Character Limit

Default: 25,000 characters per response

CHARACTER_LIMIT = 25000

Taxonomy Capacity

Default: 1,000 evidence entries

MAX_EVIDENCE_ENTRIES = 1000
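
A sketch of how the two limits might be applied. Both helpers are hypothetical, and the truncation notice and the reject-when-full policy are assumptions; the README does not say how the server reacts when either limit is reached.

```python
CHARACTER_LIMIT = 25000
MAX_EVIDENCE_ENTRIES = 1000

def truncate_response(text: str, limit: int = CHARACTER_LIMIT) -> str:
    """Cut text to at most `limit` characters, marking where it was cut."""
    if len(text) <= limit:
        return text
    notice = "\n\n[Output truncated]"
    return text[: limit - len(notice)] + notice

def can_accept_evidence(current_count: int,
                        capacity: int = MAX_EVIDENCE_ENTRIES) -> bool:
    """True while the evidence store still has room for another entry."""
    return current_count < capacity
```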

Detection Sensitivity

Adjust pattern matching and confidence thresholds in detection functions:

from typing import Any, Dict

def detect_math_physics_speculation(text: str) -> Dict[str, Any]:
    # Modify patterns or confidence calculations
    ...

Integration Examples

Claude Desktop App

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "togmal": {
      "command": "python",
      "args": ["/path/to/togmal_mcp.py"]
    }
  }
}

CLI Testing

# Run the server on its own (it waits for an MCP client to connect)
python togmal_mcp.py

# Or let the MCP Inspector launch the server and test the tools interactively
npx @modelcontextprotocol/inspector python togmal_mcp.py

Programmatic Usage

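from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def analyze_prompt(prompt: str):
    # Launch the server as a subprocess and talk to it over stdio
    # (adjust the path to wherever togmal_mcp.py lives)
    params = StdioServerParameters(command="python", args=["/path/to/togmal_mcp.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # The MCP handshake must complete before any tool call
            await session.initialize()
            result = await session.call_tool(
                "togmal_analyze_prompt",
                {"prompt": prompt, "response_format": "json"}
            )
            return result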

Architecture

Design Principles

  1. Privacy First: No external API calls, all processing local
  2. Deterministic: Heuristic-based detection for reproducibility
  3. Low Latency: Fast pattern matching for real-time use
  4. Extensible: Easy to add new patterns and categories
  5. Human-Centered: Always allows human override and judgment

Future Enhancements

The system is designed for progressive enhancement:

  1. Phase 1 (Current): Heuristic pattern matching
  2. Phase 2 (Planned): Traditional ML models (clustering, anomaly detection)
  3. Phase 3 (Future): Federated learning from submitted evidence
  4. Phase 4 (Advanced): Custom fine-tuned models for specific domains

Data Flow

User Prompt
    ↓
togmal_analyze_prompt
    ↓
Detection Heuristics (parallel)
    ├── Math/Physics
    ├── Medical Advice
    ├── File Operations
    ├── Vibe Coding
    └── Unsupported Claims
    ↓
Risk Calculation
    ↓
Intervention Recommendations
    ↓
Response to Client
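
The flow above can be sketched end to end with two stand-in detectors. Everything here is illustrative: the detector bodies, the thresholds, and the use of a thread pool for the parallel step are assumptions, not a description of the server's internals.

```python
from concurrent.futures import ThreadPoolExecutor

def detect_file_ops(text: str) -> dict:  # stand-in detector
    hit = "rm -rf" in text
    return {"category": "dangerous_file_operations", "detected": hit,
            "confidence": 0.9 if hit else 0.0}

def detect_claims(text: str) -> dict:  # stand-in detector
    hit = "guaranteed" in text.lower()
    return {"category": "unsupported_claims", "detected": hit,
            "confidence": 0.5 if hit else 0.0}

DETECTORS = [detect_file_ops, detect_claims]

def analyze(text: str) -> dict:
    # 1. Detection heuristics: run every detector on the same text concurrently.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda detector: detector(text), DETECTORS))
    hits = [r for r in results if r["detected"]]
    # 2. Risk calculation: here simply the highest confidence among the hits.
    top = max((r["confidence"] for r in hits), default=0.0)
    risk = "HIGH" if top >= 0.8 else "MODERATE" if top >= 0.4 else "LOW"
    # 3. Intervention recommendations: a single rule for the sake of the example.
    interventions = ["human_in_the_loop"] if risk == "HIGH" else []
    # 4. Response to client.
    return {"risk_level": risk, "detections": hits, "interventions": interventions}
```

Because the detectors are independent pure functions over the same text, they can run in any order or in parallel without changing the result, which keeps the analysis deterministic.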

Contributing

Adding New Detection Patterns

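  1. Create a new detection function:
from typing import Any, Dict

def detect_new_category(text: str) -> Dict[str, Any]:
    # Map each subcategory to the regexes that signal it
    patterns = {
        'subcategory1': [r'pattern1', r'pattern2'],
        'subcategory2': [r'pattern3']
    }
    # Implement detection logic, then return a dict of this shape
    return {
        'detected': bool,     # whether any pattern matched
        'categories': list,   # names of the matched subcategories
        'confidence': float   # confidence score for the detection
    }
  2. Add the new category to the CategoryType enum
  3. Update the analysis functions to include the new detector
  4. Add intervention recommendations if needed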
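
Put together, the steps might look like the sketch below for a made-up legal_advice category. The enum member names, the DETECTORS registry, the regexes, and the confidence formula are all assumptions; only the CategoryType name and the three category strings come from this README.

```python
import re
from enum import Enum
from typing import Any, Callable, Dict

class CategoryType(str, Enum):
    # Values taken from the category strings used elsewhere in this README
    MATH_PHYSICS_SPECULATION = "math_physics_speculation"
    UNGROUNDED_MEDICAL_ADVICE = "ungrounded_medical_advice"
    DANGEROUS_FILE_OPERATIONS = "dangerous_file_operations"
    LEGAL_ADVICE = "legal_advice"  # hypothetical new category

def detect_legal_advice(text: str) -> Dict[str, Any]:
    """A detector that follows the return shape of detect_new_category."""
    patterns = {
        "outcome_prediction": [r"\byou will (?:definitely )?win\b", r"\bcan't be sued\b"],
        "no_referral": [r"\bno need (?:for|to hire) a lawyer\b"],
    }
    # A subcategory counts as matched if any of its regexes hits.
    matched = [name for name, regexes in patterns.items()
               if any(re.search(rx, text, re.IGNORECASE) for rx in regexes)]
    return {
        "detected": bool(matched),
        "categories": matched,
        # Placeholder scoring: share of subcategories that matched.
        "confidence": len(matched) / len(patterns),
    }

# Hypothetical registry the analysis functions could iterate over.
DETECTORS: Dict[CategoryType, Callable[[str], Dict[str, Any]]] = {
    CategoryType.LEGAL_ADVICE: detect_legal_advice,
}
```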

Submitting Evidence

Use the togmal_submit_evidence tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone.

Limitations

Current Constraints

  • Heuristic-Based: May have false positives/negatives
  • English-Only: Patterns optimized for English text
  • Context-Free: Doesn't understand full conversation history
  • No Learning: Detection rules are static until updated

Not a Replacement For

  • Professional judgment in critical domains (medicine, law, etc.)
  • Comprehensive code review
  • Security auditing
  • Safety testing in production systems

License

MIT License - See LICENSE file for details

Support

For issues, questions, or contributions:

  • Open an issue on GitHub
  • Submit evidence through the MCP tool
  • Contact: [Your contact information]

Citation

If you use ToGMAL in your research or product, please cite:

@software{togmal_mcp,
  title={ToGMAL: Taxonomy of Generative Model Apparent Limitations},
  author={[Your Name]},
  year={2025},
  url={https://github.com/[your-repo]/togmal-mcp}
}

Acknowledgments

Built using the mcp, pydantic, and httpx Python packages.

Inspired by the need for safer, more grounded AI interactions.