talk2data

cevheri committed on May 7

Commit

95d6173

1 Parent(s): 80aa96a

docs: add cursor rules

Files changed (6)

.cursor/rules/ai-agent-workflow.mdc +81 -0
.cursor/rules/coding-standards.mdc +97 -0
.cursor/rules/database-query-guidelines.mdc +62 -0
.cursor/rules/security-guidelines.mdc +64 -0
.cursor/rules/technical-architecture.mdc +79 -0
.cursor/rules/visualization-guidelines.mdc +106 -0

.cursor/rules/ai-agent-workflow.mdc ADDED

	@@ -0,0 +1,81 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# AI Agent Workflow
+This document outlines the workflow of the AI agent system and its interaction with various components.
+## Agent Interaction Flow
+### 1. User Input Processing
+- User submits natural language query through [app.py](mdc:app.py)
+- Query is processed by LangChain agent
+- Context is retrieved from memory store
+- System prompt is generated with relevant tools and database context
+### 2. Tool Selection & Execution
+- Agent analyzes query intent
+- Selects appropriate MCP tools from [postgre_mcp_server.py](mdc:postgre_mcp_server.py)
+- Tools available:
+  - `execute_query`: For SQL query execution
+  - `visualize_results`: For data visualization
+  - `list_tables`: For schema exploration
+  - `get_table_schema`: For detailed table information
+  - `find_relationships`: For relationship analysis
+### 3. Response Generation
+- Tool execution results are processed
+- If visualization requested:
+  - Data is formatted for PandasAI
+  - Visualization is generated
+  - Image is converted to base64 for display
+- Response is formatted with:
+  - Query results
+  - Visualization (if applicable)
+  - Explanation of results
+  - SQL query used
+## Memory Management
+### Conversation History
+- Implemented in [memory_store.py](mdc:memory_store.py)
+- Maintains context between interactions
+- Stores:
+  - User messages
+  - AI responses
+  - Tool execution results
+- Supports context clearing with `/clear-cache` command
+## Error Handling
+### Common Scenarios
+- Invalid SQL queries
+- Database connection issues
+- Visualization generation failures
+- Tool execution errors
+- Memory management issues
+### Recovery Strategies
+- Graceful error messages
+- Fallback responses
+- Automatic retry mechanisms
+- Context preservation
+- User-friendly error explanations
+## Performance Considerations
+### Optimization Techniques
+- Async operations for database queries
+- Efficient memory usage
+- Caching of common queries
+- Resource cleanup
+- Connection pooling
+### Monitoring
+- Logging of operations
+- Performance metrics
+- Error tracking
+- Resource utilization
+- Response time monitoring
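
The "Conversation History" rules in this file (store user messages, AI responses, and tool results; clear on `/clear-cache`) can be sketched roughly as below. This is a minimal illustration only: `ConversationMemory` and `handle_input` are hypothetical names and do not reflect the actual contents of `memory_store.py`.

```python
from dataclasses import dataclass, field

CLEAR_COMMAND = "/clear-cache"  # command name taken from the rules file


@dataclass
class ConversationMemory:
    """Hypothetical in-memory store; not the project's memory_store.py."""

    entries: list[tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        # role is expected to be "user", "ai", or "tool",
        # mirroring the three kinds of stored items in the rules.
        self.entries.append((role, content))

    def clear(self) -> None:
        self.entries.clear()


def handle_input(memory: ConversationMemory, text: str) -> bool:
    """Return True if the input was the clear command; otherwise record it."""
    if text.strip() == CLEAR_COMMAND:
        memory.clear()
        return True
    memory.add("user", text)
    return False
```

Keeping the command check in one small function means the chat handler can decide whether to forward the text to the agent based on the returned flag.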

.cursor/rules/coding-standards.mdc ADDED

	@@ -0,0 +1,97 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# Coding Standards
+This document outlines the coding standards for the AI-powered database interface, emphasizing the use of English in all code-related content and adherence to SOLID principles, Clean Code, and Design Patterns.
+## Language Consistency
+### Comments and Notes
+- All comments and notes must be written in English.
+- Avoid using any language other than English in code comments.
+- Ensure clarity and conciseness in comments to aid understanding.
+### Variable and Function Naming
+- Use English for all variable and function names.
+- Follow a consistent naming convention (e.g., camelCase, snake_case).
+- Avoid using non-English characters or words in identifiers.
+### Documentation
+- All documentation, including README files and inline documentation, should be in English.
+- Ensure that all code examples and explanations are clear and accessible to English-speaking developers.
+## Best Practices
+### Code Readability
+- Write clear and descriptive comments.
+- Use meaningful variable and function names.
+- Maintain a consistent style throughout the codebase.
+### Collaboration
+- Encourage team members to adhere to these standards.
+- Conduct regular code reviews to ensure compliance with the language-consistency rules.
+- Provide feedback and support for maintaining these standards.
+## SOLID Principles
+### Single Responsibility Principle (SRP)
+- Each class should have only one reason to change.
+- Ensure that classes are focused on a single functionality.
+### Open/Closed Principle (OCP)
+- Software entities should be open for extension but closed for modification.
+- Use inheritance and polymorphism to extend functionality.
+### Liskov Substitution Principle (LSP)
+- Subtypes must be substitutable for their base types.
+- Ensure that derived classes can replace base classes without affecting the correctness of the program.
+### Interface Segregation Principle (ISP)
+- Clients should not be forced to depend on interfaces they do not use.
+- Design interfaces to be specific to client needs.
+### Dependency Inversion Principle (DIP)
+- High-level modules should not depend on low-level modules. Both should depend on abstractions.
+- Use dependency injection to manage dependencies.
+## Clean Code
+### Meaningful Names
+- Use descriptive names for variables, functions, and classes.
+- Avoid abbreviations and unclear names.
+### Functions
+- Functions should be small and do one thing.
+- Use descriptive names and avoid side effects.
+### Comments
+- Comments should explain why, not what.
+- Avoid unnecessary comments and ensure they are up-to-date.
+### Formatting
+- Maintain consistent formatting and indentation.
+- Use whitespace effectively to improve readability.
+## Design Patterns
+### Creational Patterns
+- Use patterns like Singleton, Factory, and Builder to manage object creation.
+### Structural Patterns
+- Use patterns like Adapter, Bridge, and Composite to manage relationships between objects.
+### Behavioral Patterns
+- Use patterns like Observer, Strategy, and Command to manage communication between objects.
+## Compliance
+### Review Process
+- Audit the codebase regularly to ensure adherence to language standards and design principles.
+- Address any deviations promptly to maintain consistency.
+### Training
+- Provide training and resources to help team members understand and follow these standards.
+- Encourage continuous learning and improvement in coding practices.
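
The Dependency Inversion rule in this file (depend on abstractions, inject dependencies) might look like the following in Python. All class names here (`QueryRunner`, `FakeRunner`, `ReportService`) are hypothetical and chosen only for illustration; they are not taken from the project.

```python
from typing import Protocol


class QueryRunner(Protocol):
    """Abstraction that the high-level code depends on (hypothetical name)."""

    def run(self, sql: str) -> list[dict]: ...


class FakeRunner:
    """Low-level detail for illustration; a real one would talk to a database."""

    def run(self, sql: str) -> list[dict]:
        # Returns a single placeholder row instead of executing anything.
        return [{"sql": sql}]


class ReportService:
    """High-level module: receives its dependency instead of constructing it."""

    def __init__(self, runner: QueryRunner) -> None:
        self._runner = runner

    def row_count(self, sql: str) -> int:
        return len(self._runner.run(sql))
```

Because `ReportService` only knows the `QueryRunner` protocol, a test can pass `FakeRunner()` while production code passes a database-backed implementation.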

.cursor/rules/database-query-guidelines.mdc ADDED

	@@ -0,0 +1,62 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# Database Query Guidelines
+This document outlines the guidelines for executing database queries and interacting with the database in the AI-powered interface.
+## Query Execution
+### SQL Query Generation
+- Implemented in [postgre_mcp_server.py](mdc:postgre_mcp_server.py)
+- Supports read-only SQL queries only (SELECT statements, including aggregates such as COUNT and clauses such as GROUP BY and ORDER BY)
+- Write and destructive operations (DELETE, UPDATE, INSERT, DROP) are not allowed
+- SQL syntax validation before execution
+- Query results are formatted for display
+### Database Connection
+- Asynchronous connection management using AsyncPG
+- Connection pooling for efficient resource utilization
+- Error handling for connection issues
+- Secure database URL management
+## Best Practices
+### Query Design
+- Use explicit column names instead of *
+- Include LIMIT clauses to restrict result sets
+- Add WHERE clauses to filter results
+- Consider indexing for performance
+- Format SQL with proper indentation and line breaks
+### Performance Optimization
+- Efficient use of connection pooling
+- Asynchronous query execution
+- Proper error handling and logging
+- Resource cleanup after query execution
+## Error Handling
+### Common Issues
+- Invalid SQL syntax
+- Table or column not found
+- Connection failures
+- Query timeout
+- Resource constraints
+### Recovery Strategies
+- Graceful error messages
+- Fallback responses
+- Automatic retry mechanisms
+- User-friendly error explanations
+## Monitoring
+### Logging
+- Query execution logs
+- Performance metrics
+- Error tracking
+- Resource utilization
+- Response time monitoring
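
The read-only rule and the LIMIT recommendation in this file can be sketched as a pre-execution check. This is a conservative, assumption-laden illustration, not the validation in `postgre_mcp_server.py`: the helper names and the default of 100 rows are hypothetical, and the keyword scan will also reject harmless queries that merely contain a forbidden word inside a string literal.

```python
import re

# The four operations the guidelines name as not allowed.
FORBIDDEN = re.compile(r"\b(DELETE|UPDATE|INSERT|DROP)\b", re.IGNORECASE)


def _normalize(sql: str) -> str:
    # Trim whitespace and any trailing semicolons.
    return sql.strip().rstrip(";").strip()


def is_read_only(sql: str) -> bool:
    statement = _normalize(sql)
    if not statement or ";" in statement:
        return False  # empty input, or more than one statement
    if not re.match(r"SELECT\b", statement, re.IGNORECASE):
        return False  # only SELECT statements are accepted in this sketch
    return FORBIDDEN.search(statement) is None


def with_limit(sql: str, max_rows: int = 100) -> str:
    """Append a LIMIT clause when the query does not already have one."""
    statement = _normalize(sql)
    if re.search(r"\bLIMIT\s+\d+\b", statement, re.IGNORECASE):
        return statement
    return f"{statement} LIMIT {max_rows}"
```

A keyword filter like this is only a first line of defence; connecting with a database role that has read-only privileges enforces the same rule at the server.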

.cursor/rules/security-guidelines.mdc ADDED

	@@ -0,0 +1,64 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# Security Guidelines
+This document outlines the security practices and guidelines for the AI-powered database interface.
+## API Key Management
+### Secure Storage
+- API keys are stored in environment variables
+- Use of `.env` files for local development
+- Secure handling of API keys in production
+- Regular rotation of API keys
+### Access Control
+- Read-only database operations
+- No destructive SQL operations
+- Secure database URL management
+- User authentication and authorization
+## Best Practices
+### Data Security
+- Encrypt sensitive data
+- Use secure connections for database access
+- Implement proper error handling to avoid information leakage
+- Regular security audits and updates
+### Code Security
+- Avoid hardcoding sensitive information
+- Use secure coding practices
+- Regular code reviews for security vulnerabilities
+- Implement logging and monitoring for suspicious activities
+## Error Handling
+### Security Issues
+- Unauthorized access attempts
+- API key exposure
+- Database connection breaches
+- Resource misuse
+### Recovery Strategies
+- Immediate revocation of compromised keys
+- Logging of security incidents
+- User notification of security breaches
+- Regular security training and updates
+## Monitoring
+### Security Logging
+- Access logs
+- Error logs
+- Security incident logs
+- Resource usage logs
+### Incident Response
+- Immediate action on security incidents
+- Regular incident response drills
+- User communication during incidents
+- Post-incident analysis and learning
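
The "Secure Storage" rules in this file (keys in environment variables, nothing hardcoded, no leakage through logs) could be applied along these lines. The function names and the variable name `EXAMPLE_API_KEY` are hypothetical; the project's real variable names are not shown in this commit.

```python
import os


def load_api_key(name: str) -> str:
    """Read a key from the environment and fail loudly if it is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

For example, `mask_secret(load_api_key("EXAMPLE_API_KEY"))` can be written to a log without exposing the full key.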

.cursor/rules/technical-architecture.mdc ADDED

	@@ -0,0 +1,79 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# Technical Architecture
+This project implements an AI-powered database query interface using natural language processing and visualization capabilities.
+## Core Components
+### AI Agent & Chatbot
+- [app.py](mdc:app.py): Implements the Gradio web interface and orchestrates the AI agent
+  - Uses LangChain for AI agent orchestration
+  - Implements chat history management
+  - Handles visualization requests
+  - Manages user interactions through Gradio UI
+### MCP Server Implementation
+- [postgre_mcp_server.py](mdc:postgre_mcp_server.py): Core MCP (Model Context Protocol) server
+  - Uses FastMCP for model-context communication
+  - Provides database connection management
+  - Implements SQL query execution tools
+  - Handles data visualization requests
+  - Manages database schema information
+### LangChain Integration
+- [langchain_mcp_client.py](mdc:langchain_mcp_client.py): LangChain-MCP integration layer
+  - Connects LangChain agents with MCP tools
+  - Manages conversation context
+  - Handles tool execution and response formatting
+  - Implements memory management for conversations
+## Key Technologies
+### AI & ML Stack
+- LangChain: For AI agent orchestration and tool management
+- OpenAI/Gemini: LLM providers for natural language understanding
+- PandasAI: For data visualization and analysis
+### Web & UI
+- Gradio: For building the web interface
+- Custom CSS: For UI styling and responsiveness
+### Database & Data Processing
+- PostgreSQL: Primary database
+- Pandas: For data manipulation and analysis
+- AsyncPG: For asynchronous database operations
+## Architecture Patterns
+### Model Context Protocol (MCP)
+- FastMCP implementation for efficient model-context communication
+- Tool-based architecture for extensible functionality
+- Context-aware request handling
+- Resource management and lifecycle control
+### AI Agent Architecture
+- ReAct pattern implementation
+- Tool-based reasoning
+- Context-aware memory management
+- Natural language to SQL conversion
+- Visualization request handling
+## Development Guidelines
+### Code Organization
+- Modular tool implementation
+- Clear separation of concerns
+- Async-first approach
+- Error handling and logging
+- Type hints and documentation
+### Best Practices
+- Read-only database operations
+- Secure API key management
+- Efficient resource utilization
+- Proper error handling
+- Comprehensive logging
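
The "tool-based architecture for extensible functionality" described in this file can be illustrated with a plain-Python registry. This is deliberately not the FastMCP API: `TOOLS`, `tool`, and `call_tool` are hypothetical names, and the table list is placeholder data.

```python
from typing import Callable

# Maps a tool name to the function that implements it.
TOOLS: dict[str, Callable[..., object]] = {}


def tool(func: Callable[..., object]) -> Callable[..., object]:
    """Register a function under its own name so an agent can look it up."""
    TOOLS[func.__name__] = func
    return func


@tool
def list_tables() -> list[str]:
    # Placeholder data; a real tool would query the database catalog.
    return ["customers", "orders"]


def call_tool(name: str, **kwargs: object) -> object:
    """Dispatch by name, rejecting anything that was never registered."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Adding a capability then means writing one decorated function, with no change to the dispatch code.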

.cursor/rules/visualization-guidelines.mdc ADDED

	@@ -0,0 +1,106 @@

+---
+description:
+globs:
+alwaysApply: false
+---
+# Visualization Guidelines
+This document outlines the visualization capabilities and best practices for the AI-powered database interface.
+## Visualization Components
+### PandasAI Integration
+- Implemented in [postgre_mcp_server.py](mdc:postgre_mcp_server.py)
+- Uses OpenAI/Gemini for visualization generation
+- Supports multiple chart types:
+  - Bar charts
+  - Line charts
+  - Pie charts
+  - Scatter plots
+  - Box plots
+### Data Processing
+- Data formatting in [app.py](mdc:app.py)
+- JSON to DataFrame conversion
+- Column type handling
+- Data cleaning and preparation
+- Long text truncation
+## Visualization Workflow
+### 1. Request Processing
+- Natural language visualization request
+- Data extraction from query results
+- JSON data formatting
+- Visualization prompt generation
+### 2. Chart Generation
+- PandasAI initialization
+- LLM-based chart type selection
+- Customization parameters:
+  - Colors
+  - Labels
+  - Legends
+  - Axis formatting
+  - Title and description
+### 3. Output Handling
+- Image file generation
+- Base64 encoding for web display
+- Temporary file management
+- Cleanup procedures
+## Best Practices
+### Data Preparation
+- Appropriate data types
+- Missing value handling
+- Outlier management
+- Data aggregation
+- Column selection
+### Visualization Design
+- Clear labels and titles
+- Appropriate chart types
+- Color scheme consistency
+- Legend placement
+- Axis formatting
+### Performance
+- Efficient data processing
+- Memory management
+- File cleanup
+- Caching strategies
+- Resource optimization
+## Common Use Cases
+### Business Analytics
+- Sales trends
+- Customer distribution
+- Product performance
+- Time series analysis
+- Comparative analysis
+### Data Exploration
+- Distribution analysis
+- Correlation visualization
+- Pattern identification
+- Anomaly detection
+- Trend analysis
+## Error Handling
+### Common Issues
+- Data format errors
+- Visualization generation failures
+- Memory constraints
+- File system issues
+- API limitations
+### Recovery Strategies
+- Fallback visualizations
+- Error messages
+- Data validation
+- Resource management
+- User feedback
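
The "Output Handling" steps in this file (image file, Base64 encoding for web display, cleanup) can be sketched with the standard library alone. The function name `encode_image_file` and the data-URI output format are assumptions made for illustration; the commit does not show how the project embeds the image.

```python
import base64
import os


def encode_image_file(path: str, mime_type: str = "image/png", cleanup: bool = True) -> str:
    """Read an image file, return it as a data URI, and optionally delete the file."""
    try:
        with open(path, "rb") as handle:
            encoded = base64.b64encode(handle.read()).decode("ascii")
    finally:
        # Remove the temporary file even if reading or encoding failed.
        if cleanup and os.path.exists(path):
            os.remove(path)
    return f"data:{mime_type};base64,{encoded}"
```

Doing the removal in a `finally` block keeps temporary chart files from accumulating when encoding raises.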