ToGMAL MCP Server
Taxonomy of Generative Model Apparent Limitations
A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions.
Overview
ToGMAL helps prevent common LLM pitfalls by detecting:
- 🔬 Math/Physics Speculation: Ungrounded "theories of everything" and invented physics
- 🏥 Medical Advice Issues: Health recommendations without proper sources or disclaimers
- 💾 Dangerous File Operations: Mass deletions, recursive operations without safeguards
- 💻 Vibe Coding Overreach: Overly ambitious projects without proper scoping
- 📊 Unsupported Claims: Strong assertions without evidence or hedging
Key Features
- Privacy-Preserving: All analysis is deterministic and local (no external API calls)
- Low Latency: Heuristic-based detection for real-time analysis
- Intervention Recommendations: Suggests step breakdown, human-in-the-loop, or web search
- Taxonomy Building: Crowdsourced evidence collection for improving detection
- Extensible: Easy to add new detection patterns and categories
Installation
Prerequisites
- Python 3.10 or higher
- pip package manager
Install Dependencies
```bash
pip install mcp pydantic httpx --break-system-packages
```
Install the Server
```bash
# Clone or download the server
# Then run it directly
python togmal_mcp.py
```
Usage
Available Tools
1. togmal_analyze_prompt
Analyze a user prompt before the LLM processes it.
Parameters:
- prompt (str): The user prompt to analyze
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "prompt": "Build me a complete theory of quantum gravity that unifies all forces",
  "response_format": "json"
}
```
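An illustrative result is shown below. The exact fields are defined by the server implementation, so treat this shape as an assumption rather than a documented schema:
```json
{
  "risk_level": "high",
  "detections": [
    {"category": "math_physics_speculation", "confidence": 0.85}
  ],
  "recommended_interventions": ["step_breakdown", "web_search"]
}
```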
Use Cases:
- Detect speculative physics theories before generating responses
- Flag overly ambitious coding requests
- Identify requests for medical advice that need disclaimers
2. togmal_analyze_response
Analyze an LLM response for potential issues.
Parameters:
- response (str): The LLM response to analyze
- context (str, optional): Original prompt for better analysis
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "response": "You should definitely take 500mg of ibuprofen every 4 hours...",
  "context": "I have a headache",
  "response_format": "json"
}
```
Use Cases:
- Check for ungrounded medical advice
- Detect dangerous file operation instructions
- Flag unsupported statistical claims
3. togmal_submit_evidence
Submit evidence of LLM limitations to improve the taxonomy.
Parameters:
- category (str): Type of limitation - "math_physics_speculation", "ungrounded_medical_advice", etc.
- prompt (str): The prompt that triggered the issue
- response (str): The problematic response
- description (str): Why this is problematic
- severity (str): Severity level - "low", "moderate", "high", or "critical"
Example:
```json
{
  "category": "ungrounded_medical_advice",
  "prompt": "What should I do about chest pain?",
  "response": "It's probably nothing serious, just indigestion...",
  "description": "Dismissed potentially serious symptom without recommending medical consultation",
  "severity": "high"
}
```
Features:
- Human-in-the-loop confirmation before submission
- Generates unique entry ID for tracking
- Contributes to improving detection heuristics
4. togmal_get_taxonomy
Retrieve entries from the taxonomy database.
Parameters:
- category (str, optional): Filter by category
- min_severity (str, optional): Minimum severity to include
- limit (int): Maximum entries to return (1-100, default 20)
- offset (int): Pagination offset (default 0)
- response_format (str): Output format - "markdown" or "json"
Example:
```json
{
  "category": "dangerous_file_operations",
  "min_severity": "high",
  "limit": 10,
  "offset": 0,
  "response_format": "json"
}
```
Use Cases:
- Research common LLM failure patterns
- Train improved detection models
- Generate safety guidelines
5. togmal_get_statistics
Get statistical overview of the taxonomy database.
Parameters:
- response_format (str): Output format - "markdown" or "json"
Returns:
- Total entries by category
- Severity distribution
- Database capacity status
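For illustration, a JSON result mirroring those fields might look like the following; the field names and values here are assumptions, not the server's documented schema:
```json
{
  "total_entries": 142,
  "entries_by_category": {
    "ungrounded_medical_advice": 58,
    "dangerous_file_operations": 84
  },
  "severity_distribution": {"low": 20, "moderate": 61, "high": 49, "critical": 12},
  "capacity": {"used": 142, "max": 1000}
}
```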
Detection Heuristics
Math/Physics Speculation
Detects:
- "Theory of everything" claims
- Unified field theory proposals
- Invented equations or particles
- Modifications to fundamental constants
Patterns:
- "new equation for quantum gravity"
- "my unified theory"
- "discovered particle"
- "redefine the speed of light"
Ungrounded Medical Advice
Detects:
- Diagnoses without qualifications
- Treatment recommendations without sources
- Specific drug dosages
- Dismissive responses to symptoms
Patterns:
- "you probably have..."
- "take 500mg of..."
- "don't worry about it"
- Missing citations or disclaimers
Dangerous File Operations
Detects:
- Mass deletion commands
- Recursive operations without safeguards
- Operations on test files without confirmation
- No human-in-the-loop for destructive actions
Patterns:
- "rm -rf" without confirmation
- "delete all test files"
- "recursively remove"
- Missing safety checks
Vibe Coding Overreach
Detects:
- Requests for complete applications
- Massive line count targets (1000+ lines)
- Unrealistic timeframes
- Scope without proper planning
Patterns:
- "build a complete social network"
- "5000 lines of code"
- "everything in one shot"
- Missing architectural planning
Unsupported Claims
Detects:
- Absolute statements without hedging
- Statistical claims without sources
- Over-confident predictions
- Missing citations
Patterns:
- "always/never/definitely"
- "95% of doctors agree" (no source)
- "guaranteed to work"
- Missing uncertainty language
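Detecting the absence of hedging is a slightly different problem: rather than matching a bad phrase, the heuristic looks for confident assertions with no uncertainty markers. A rough sketch, with word lists and logic that are assumptions rather than the server's actual rules:
```python
import re

# Hypothetical word lists, for illustration only
ABSOLUTE = re.compile(r"\b(always|never|definitely|guaranteed)\b", re.IGNORECASE)
HEDGES = re.compile(r"\b(may|might|could|likely|suggests|approximately)\b", re.IGNORECASE)

def lacks_hedging(text: str) -> bool:
    # Flag text that uses absolute language without any uncertainty markers
    return bool(ABSOLUTE.search(text)) and not HEDGES.search(text)
```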
Risk Levels
Calculated based on weighted confidence scores:
- LOW: Minor issues, no immediate intervention needed
- MODERATE: Worth noting, consider additional verification
- HIGH: Significant concern, interventions recommended
- CRITICAL: Serious risk, multiple interventions strongly advised
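As an illustration, one way weighted confidence scores could map onto these levels; the weights and thresholds here are assumptions, not the server's actual values:
```python
# Hypothetical per-category weights and thresholds, for illustration only
WEIGHTS = {"ungrounded_medical_advice": 1.5, "unsupported_claims": 0.8}

def risk_level(confidences: dict[str, float]) -> str:
    # Sum each detector's confidence, scaled by its category weight
    score = sum(WEIGHTS.get(cat, 1.0) * conf for cat, conf in confidences.items())
    if score >= 2.0:
        return "CRITICAL"
    if score >= 1.0:
        return "HIGH"
    if score >= 0.5:
        return "MODERATE"
    return "LOW"
```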
Intervention Types
Step Breakdown
Complex tasks should be broken into verifiable components.
Recommended for:
- Math/physics speculation
- Large coding projects
- Dangerous file operations
Human-in-the-Loop
Critical decisions require human oversight.
Recommended for:
- Medical advice
- Destructive file operations
- High-severity issues
Web Search
Claims should be verified against authoritative sources.
Recommended for:
- Medical recommendations
- Physics/math theories
- Unsupported factual claims
Simplified Scope
Overly ambitious projects need realistic scoping.
Recommended for:
- Vibe coding requests
- Complex system designs
- Feature-heavy applications
Configuration
Character Limit
Default: 25,000 characters per response
```python
CHARACTER_LIMIT = 25000
```
Taxonomy Capacity
Default: 1,000 evidence entries
```python
MAX_EVIDENCE_ENTRIES = 1000
```
Detection Sensitivity
Adjust pattern matching and confidence thresholds in detection functions:
```python
def detect_math_physics_speculation(text: str) -> Dict[str, Any]:
    # Modify patterns or confidence calculations here
    ...
```
Integration Examples
Claude Desktop App
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "togmal": {
      "command": "python",
      "args": ["/path/to/togmal_mcp.py"]
    }
  }
}
```
CLI Testing
```bash
# Run the server
python togmal_mcp.py

# In another terminal, test with the MCP inspector
npx @modelcontextprotocol/inspector python togmal_mcp.py
```
Programmatic Usage
```python
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server as a subprocess and talk to it over stdio
server_params = StdioServerParameters(command="python", args=["togmal_mcp.py"])

async def analyze_prompt(prompt: str):
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool(
                "togmal_analyze_prompt",
                {"prompt": prompt, "response_format": "json"}
            )
```
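To call it from a script:
```python
import asyncio

# Assumes analyze_prompt from the snippet above is in scope
result = asyncio.run(analyze_prompt("Build me a complete theory of quantum gravity"))
print(result)
```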
Architecture
Design Principles
- Privacy First: No external API calls, all processing local
- Deterministic: Heuristic-based detection for reproducibility
- Low Latency: Fast pattern matching for real-time use
- Extensible: Easy to add new patterns and categories
- Human-Centered: Always allows human override and judgment
Future Enhancements
The system is designed for progressive enhancement:
- Phase 1 (Current): Heuristic pattern matching
- Phase 2 (Planned): Traditional ML models (clustering, anomaly detection)
- Phase 3 (Future): Federated learning from submitted evidence
- Phase 4 (Advanced): Custom fine-tuned models for specific domains
Data Flow
```
User Prompt
    ↓
togmal_analyze_prompt
    ↓
Detection Heuristics (parallel)
    ├── Math/Physics
    ├── Medical Advice
    ├── File Operations
    ├── Vibe Coding
    └── Unsupported Claims
    ↓
Risk Calculation
    ↓
Intervention Recommendations
    ↓
Response to Client
```
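Conceptually, the analysis step runs every detector over the same text and keeps the hits. A simplified, self-contained sketch with two toy detectors standing in for the five real ones (names and scores are illustrative):
```python
from typing import Any, Callable, Dict, List

def detect_absolute_language(text: str) -> Dict[str, Any]:
    hit = "definitely" in text.lower()
    return {"category": "unsupported_claims", "detected": hit, "confidence": 0.6 if hit else 0.0}

def detect_mass_deletion(text: str) -> Dict[str, Any]:
    hit = "rm -rf" in text
    return {"category": "dangerous_file_operations", "detected": hit, "confidence": 0.9 if hit else 0.0}

DETECTORS: List[Callable[[str], Dict[str, Any]]] = [
    detect_absolute_language,
    detect_mass_deletion,
]

def analyze(text: str) -> Dict[str, Any]:
    # Run every detector and keep only those that fired
    hits = [r for d in DETECTORS if (r := d(text))["detected"]]
    return {"detections": hits, "combined_confidence": sum(r["confidence"] for r in hits)}
```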
Contributing
Adding New Detection Patterns
1. Create a new detection function:
```python
import re
from typing import Any, Dict

def detect_new_category(text: str) -> Dict[str, Any]:
    patterns = {
        'subcategory1': [r'pattern1', r'pattern2'],
        'subcategory2': [r'pattern3']
    }
    # Collect every subcategory with at least one matching pattern
    matched = [
        name for name, regexes in patterns.items()
        if any(re.search(rx, text, re.IGNORECASE) for rx in regexes)
    ]
    # Simple confidence: fraction of subcategories that matched
    confidence = len(matched) / len(patterns) if matched else 0.0
    return {
        'detected': bool(matched),
        'categories': matched,
        'confidence': confidence
    }
```
2. Add the new category to the CategoryType enum (see the sketch after this list)
3. Update the analysis functions to include the new detector
4. Add intervention recommendations if needed
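For example, extending the enum might look like this; the existing member names are taken from the categories above, and the exact definition in togmal_mcp.py may differ:
```python
from enum import Enum

class CategoryType(str, Enum):
    MATH_PHYSICS_SPECULATION = "math_physics_speculation"
    UNGROUNDED_MEDICAL_ADVICE = "ungrounded_medical_advice"
    # ...existing categories...
    NEW_CATEGORY = "new_category"  # hypothetical new addition
```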
Submitting Evidence
Use the togmal_submit_evidence tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone.
Limitations
Current Constraints
- Heuristic-Based: May have false positives/negatives
- English-Only: Patterns optimized for English text
- Context-Free: Doesn't understand full conversation history
- No Learning: Detection rules are static until updated
Not a Replacement For
- Professional judgment in critical domains (medicine, law, etc.)
- Comprehensive code review
- Security auditing
- Safety testing in production systems
License
MIT License - See LICENSE file for details
Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Submit evidence through the MCP tool
- Contact: [Your contact information]
Citation
If you use ToGMAL in your research or product, please cite:
```bibtex
@software{togmal_mcp,
  title={ToGMAL: Taxonomy of Generative Model Apparent Limitations},
  author={[Your Name]},
  year={2025},
  url={https://github.com/[your-repo]/togmal-mcp}
}
```
Acknowledgments
Built using the MCP Python SDK, Pydantic, and httpx.
Inspired by the need for safer, more grounded AI interactions.