---
title: Websearch
emoji: 🔎
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.36.2
app_file: app.py
pinned: false
---

# Web Search MCP Server

A Model Context Protocol (MCP) server that provides web search capabilities to LLMs, allowing them to fetch and extract content from recent news articles.

## Features

- **Real-time web search**: Search for recent news on any topic
- **Content extraction**: Automatically extracts the main article content, removing ads and boilerplate
- **Rate limiting**: Built-in rate limiting (200 requests/hour) to prevent API abuse
- **Structured output**: Returns formatted content with metadata (title, source, date, URL)
- **Flexible results**: Control the number of results (1-20)

## Prerequisites

1. **Serper API key**: Sign up at [serper.dev](https://serper.dev) to get your API key
2. **Python 3.8+**: Ensure you have Python installed
3. **MCP-compatible LLM client**: Such as Claude Desktop, Cursor, or any MCP-enabled application

## Installation

1. Clone or download this repository.
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

   Or install manually:

   ```bash
   pip install "gradio[mcp]" httpx trafilatura python-dateutil limits
   ```

3. Set your Serper API key:

   ```bash
   export SERPER_API_KEY="your-api-key-here"
   ```

## Usage

### Starting the MCP Server

```bash
python app_mcp.py
```

The server starts on `http://localhost:7860`, with the MCP endpoint at:

```
http://localhost:7860/gradio_api/mcp/sse
```

### Connecting to LLM Clients

#### Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "web-search": {
      "command": "python",
      "args": ["/path/to/app_mcp.py"],
      "env": {
        "SERPER_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

#### Direct URL Connection

For clients that support URL-based MCP servers:

1. Start the server: `python app_mcp.py`
2. Connect to: `http://localhost:7860/gradio_api/mcp/sse`

## Tool Documentation

### `search_web` Function

**Purpose**: Search the web for recent news and extract article content.

**Parameters**:

- `query` (str, **REQUIRED**): The search query
  - Examples: "OpenAI news", "climate change 2024", "python updates"
- `num_results` (int, **OPTIONAL**): Number of results to fetch
  - Default: 4
  - Range: 1-20
  - More results provide more context but take longer to fetch

**Returns**: Formatted text containing:

- A summary of the extraction results
- For each article:
  - Title
  - Source and date
  - URL
  - Extracted main content

**Example Usage in LLM**:

```
"Search for recent developments in artificial intelligence"
"Find 10 articles about climate change in 2024"
"Get news about Python programming language updates"
```

## Error Handling

The tool handles several error scenarios:

- Missing API key: Clear error message with setup instructions
- Rate limiting: Informs when the hourly limit is exceeded
- Failed extractions: Reports which articles could not be extracted
- Network errors: Graceful error messages

## Testing

You can test the server manually:

1. Open `http://localhost:7860` in your browser
2. Enter a search query
3. Adjust the number of results
4. Click "Search" to see the extracted content

## Tips for LLM Usage

1. **Be specific with queries**: More specific queries yield better results
2. **Adjust the result count**: Use fewer results for quick searches, more for comprehensive research
3. **Check dates**: The tool shows article dates for temporal context
4. **Follow up**: Use the extracted content to ask follow-up questions
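Under the hood, a tool like `search_web` typically sends the query to Serper's news endpoint and then runs trafilatura over each result URL, which also explains several of the limitations listed below. The following is only a minimal sketch of that flow, not the actual code in `app_mcp.py`: the endpoint path, the `{"news": [...]}` response shape, and the `search_and_extract` helper name are assumptions, and rate limiting and output formatting are omitted.

```python
import os

import httpx
import trafilatura

# Serper's news search endpoint (assumed; see https://serper.dev)
SERPER_NEWS_URL = "https://google.serper.dev/news"


def search_and_extract(query: str, num_results: int = 4) -> list:
    """Fetch recent news for `query` via Serper, then extract article text."""
    api_key = os.environ["SERPER_API_KEY"]

    # Ask Serper for recent news results; the {"news": [...]} response
    # shape is an assumption based on Serper's public docs.
    response = httpx.post(
        SERPER_NEWS_URL,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        json={"q": query, "num": num_results},
        timeout=15,
    )
    response.raise_for_status()
    articles = response.json().get("news", [])[:num_results]

    extracted = []
    for item in articles:
        # Download each article and strip ads/boilerplate with trafilatura
        html = trafilatura.fetch_url(item["link"])
        text = trafilatura.extract(html) if html else None
        extracted.append(
            {
                "title": item.get("title"),
                "source": item.get("source"),
                "date": item.get("date"),
                "url": item.get("link"),
                "content": text or "(extraction failed)",
            }
        )
    return extracted
```

Recording a per-article failure marker instead of raising keeps one blocked or malformed site from sinking the whole query, which matches the "Failed extractions" behavior described above.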
## Limitations

- Rate limited to 200 requests per hour
- Only searches news articles (not general web pages)
- Extraction quality depends on website structure
- Some websites may block automated access

## Troubleshooting

1. **"SERPER_API_KEY is not set"**: Ensure the environment variable is exported
2. **Rate limit errors**: Wait before making more requests
3. **No content extracted**: Some websites block scrapers; try different queries
4. **Connection errors**: Check your internet connection and firewall settings
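When diagnosing connection errors, it can also help to call the running server directly from Python using Gradio's client library (`pip install gradio_client`, not part of this project's requirements). This is a rough sketch under one assumption: the `/search_web` API name mirrors the function name exposed by `app_mcp.py` and may differ in practice.

```python
from gradio_client import Client

# Connect to the locally running Gradio app
client = Client("http://localhost:7860")

# Call the search tool directly; adjust api_name if the app
# exposes the function under a different endpoint name.
result = client.predict(
    "python programming language updates",  # query
    4,                                      # num_results
    api_name="/search_web",
)
print(result)
```

If this call succeeds but your MCP client cannot connect, the problem is likely in the client configuration or the SSE URL rather than in the server itself.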