11 changes: 3 additions & 8 deletions .eslintrc.json
@@ -3,18 +3,13 @@
"es2022": true,
"node": true
},
"extends": [
"eslint:recommended",
"@typescript-eslint/recommended"
],
"extends": ["eslint:recommended", "@typescript-eslint/recommended"],
"parser": "@typescript-eslint/parser",
"parserOptions": {
"ecmaVersion": "latest",
"sourceType": "module"
},
"plugins": [
"@typescript-eslint"
],
"plugins": ["@typescript-eslint"],
"rules": {
"@typescript-eslint/no-unused-vars": "error",
"@typescript-eslint/no-explicit-any": "warn",
@@ -25,4 +20,4 @@
"no-var": "error",
"no-console": "warn"
}
}
}
4 changes: 2 additions & 2 deletions .prettierrc
@@ -2,9 +2,9 @@
"semi": true,
"trailingComma": "es5",
"singleQuote": true,
"printWidth": 80,
"printWidth": 120,
"tabWidth": 2,
"useTabs": false,
"bracketSpacing": true,
"arrowParens": "avoid"
}
}
42 changes: 37 additions & 5 deletions README.md
@@ -15,7 +15,9 @@ A TypeScript MCP (Model Context Protocol) server that provides comprehensive web
The server provides three specialised tools for different web search needs:

### 1. `full-web-search` (Main Tool)

When a comprehensive search is requested, the server uses an **optimised search strategy**:

1. **Browser-based Bing Search** - Primary method using dedicated Chromium instance
2. **Browser-based Brave Search** - Secondary option using dedicated Firefox instance
3. **Axios DuckDuckGo Search** - Final fallback using traditional HTTP
@@ -25,13 +27,17 @@
7. **HTTP/2 error recovery**: Automatically falls back to HTTP/1.1 when protocol errors occur

### 2. `get-web-search-summaries` (Lightweight Alternative)

For quick search results without full content extraction:

1. Performs the same optimised multi-engine search as `full-web-search`
2. Returns only the search result snippets/descriptions
3. Does not follow links to extract full page content

### 3. `get-single-web-page-content` (Utility Tool)

For extracting content from a specific webpage:

1. Takes a single URL as input
2. Follows the URL and extracts the main page content
3. Removes navigation, ads, and other non-content elements
@@ -41,7 +47,8 @@
This MCP server has been developed and tested with **LM Studio** and **LibreChat**. It has not been tested with other MCP clients.

### Model Compatibility
**Important:** Prioritise using more recent models designated for tool use.

**Important:** Prioritise using more recent models designated for tool use.

Older models (even those with tool use specified) may not work or may work erratically. This seems to be the case with Llama and DeepSeek. Qwen3 and Gemma 3 currently have the best results.

@@ -56,20 +63,24 @@
## Installation (Recommended)

**Requirements:**

- Node.js 18.0.0 or higher
- npm 8.0.0 or higher

1. Download the latest release zip file from the [Releases page](https://github.com/mrkrsl/web-search-mcp/releases)
2. Extract the zip file to a location on your system (e.g., `~/mcp-servers/web-search-mcp/`)
3. **Open a terminal in the extracted folder and run:**

```bash
npm install
npx playwright install
npm run build
```

This will create a `node_modules` folder with all required dependencies, install Playwright browsers, and build the project.

**Note:** You must run `npm install` in the root of the extracted folder (not in `dist/`).

4. Configure your `mcp.json` to point to the extracted `dist/index.js` file:

```json
@@ -82,33 +93,39 @@
}
}
```
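
Since the configuration block above is collapsed in this diff view, here is a minimal sketch of what a complete `mcp.json` entry typically looks like, assuming the common `mcpServers` layout used by LM Studio and similar clients. The server name `web-search` and the path are placeholders; substitute the path to your extracted `dist/index.js`:

```json
{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/path/to/web-search-mcp/dist/index.js"]
    }
  }
}
```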

**Example paths:**

- macOS/Linux: `~/mcp-servers/web-search-mcp/dist/index.js`
- Windows: `C:\\mcp-servers\\web-search-mcp\\dist\\index.js`

In LibreChat, you can include the MCP server in `librechat.yaml`. If you are running LibreChat in Docker, you must first mount your local directory in `docker-compose.override.yml`.

in `docker-compose.override.yml`:

```yaml
services:
api:
volumes:
- type: bind
source: /path/to/your/mcp/directory
target: /app/mcp
- type: bind
source: /path/to/your/mcp/directory
target: /app/mcp
```

in `librechat.yaml`:

```yaml
mcpServers:
web-search:
type: stdio
command: node
args:
- /app/mcp/web-search-mcp/dist/index.js
- /app/mcp/web-search-mcp/dist/index.js
serverInstructions: true
```

**Troubleshooting:**

- If `npm install` fails, try updating Node.js to version 18+ and npm to version 8+
- If `npm run build` fails, ensure you have the latest Node.js version installed
- For older Node.js versions, you may need to use an older release of this project
@@ -151,28 +168,33 @@ The server supports several environment variables for configuration:
## Troubleshooting

### Slow Response Times

- **Optimised timeouts**: Default timeout reduced to 6 seconds with concurrent processing for faster results
- **Concurrent extraction**: Content is now extracted from multiple pages simultaneously
- **Reduce timeouts further**: Set `DEFAULT_TIMEOUT=4000` for even faster responses (may reduce success rate)
- **Use fewer browsers**: Set `MAX_BROWSERS=1` to reduce memory usage (a combined configuration sketch for these variables appears at the end of this Troubleshooting section)

### Search Failures

- **Check browser installation**: Run `npx playwright install` to ensure browsers are available
- **Try headless mode**: Ensure `BROWSER_HEADLESS=true` (default) for server environments
- **Network restrictions**: Some networks block browser automation - try different network or VPN
- **HTTP/2 issues**: The server automatically handles HTTP/2 protocol errors with fallback to HTTP/1.1

### Search Quality Issues

- **Enable quality checking**: Set `ENABLE_RELEVANCE_CHECKING=true` (enabled by default)
- **Adjust quality threshold**: Set `RELEVANCE_THRESHOLD=0.5` for stricter quality requirements
- **Force multi-engine search**: Set `FORCE_MULTI_ENGINE_SEARCH=true` to try all engines and return the best results

### Memory Usage

- **Automatic cleanup**: Browsers are automatically cleaned up after each operation to prevent memory leaks
- **Limit browsers**: Reduce `MAX_BROWSERS` (default: 3)
- **EventEmitter warnings**: Fixed - browsers are properly closed to prevent listener accumulation
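
The environment variables mentioned throughout this Troubleshooting section are typically set in your MCP client's configuration. Below is a minimal sketch for an `mcp.json`-style client, assuming it supports an `env` block; the values mirror the tips above and the path is a placeholder:

```json
{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/path/to/web-search-mcp/dist/index.js"],
      "env": {
        "DEFAULT_TIMEOUT": "4000",
        "MAX_BROWSERS": "1",
        "BROWSER_HEADLESS": "true",
        "ENABLE_RELEVANCE_CHECKING": "true",
        "RELEVANCE_THRESHOLD": "0.5",
        "FORCE_MULTI_ENGINE_SEARCH": "true"
      }
    }
  }
}
```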

## For Development

```bash
git clone https://github.com/mrkrsl/web-search-mcp.git
cd web-search-mcp
```

@@ -195,14 +217,17 @@ npm run format # Run Prettier

This server provides three specialised tools for different web search needs:

### 1. `full-web-search` (Main Tool)

The most comprehensive web search tool that:

1. Takes a search query and optional number of results (1-10, default 5)
2. Performs a web search (tries Bing, then Brave, then DuckDuckGo if needed)
3. Fetches full page content from each result URL with concurrent processing
4. Returns structured data with search results and extracted content
5. **Enhanced reliability**: HTTP/2 error recovery, reduced timeouts, and better error handling

**Example Usage:**

```json
{
"name": "full-web-search",
@@ -215,13 +240,16 @@
```
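
The example above is partially collapsed in this diff view. Here is a sketch of what a complete call typically looks like, using the standard MCP `name`/`arguments` shape; the argument names `query` and `limit` are illustrative assumptions, so check the input schema the server actually advertises (`get-web-search-summaries` takes the same arguments):

```json
{
  "name": "full-web-search",
  "arguments": {
    "query": "TypeScript 5 new features",
    "limit": 5
  }
}
```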

### 2. `get-web-search-summaries` (Lightweight Alternative)

A lightweight alternative for quick search results:

1. Takes a search query and optional number of results (1-10, default 5)
2. Performs the same optimised multi-engine search as `full-web-search`
3. Returns only search result snippets/descriptions (no content extraction)
4. Faster and more efficient for quick research

**Example Usage:**

```json
{
"name": "get-web-search-summaries",
@@ -233,13 +261,16 @@
```

### 3. `get-single-web-page-content` (Utility Tool)

A utility tool for extracting content from a specific webpage:

1. Takes a single URL as input
2. Follows the URL and extracts the main page content
3. Removes navigation, ads, and other non-content elements
4. Useful for getting detailed content from a known webpage

**Example Usage:**

```json
{
"name": "get-single-web-page-content",
```

@@ -253,6 +284,7 @@
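
As with the examples above, the original block is collapsed here. A sketch of a complete call, with an illustrative URL and an assumed `url` argument name:

```json
{
  "name": "get-single-web-page-content",
  "arguments": {
    "url": "https://example.com/some-article"
  }
}
```
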
## Standalone Usage

You can also run the server directly:

```bash
# If running from source
npm start
```