Add domain filtering options to WebSearchTool

## Feature Request

Add domain filtering options to the WebSearchTool in the OpenAI Agents SDK.

## Problem Statement

Currently, the WebSearchTool in the OpenAI Agents SDK provides basic web search functionality without domain filtering capabilities:

```python
from agents import Agent, WebSearchTool

agent = Agent(
    name="Research Assistant",
    tools=[WebSearchTool()],
    instructions="You help with research tasks"
)
```

This limitation prevents developers from:
- Focusing searches on specific trusted sources
- Excluding known unreliable or irrelevant domains
- Building agents that require domain-specific information (e.g., academic research, official documentation)

## Proposed Solution

Add optional domain filtering parameters to the WebSearchTool constructor:

1. `include_domains`: List of domains to limit search results to
2. `exclude_domains`: List of domains to exclude from search results
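
To make the shape of the proposal concrete, here is a minimal sketch of how the two optional parameters might be modeled. `DomainFilterOptions` and `to_payload` are hypothetical names used only for illustration; they are not part of the SDK, and the actual hosted-tool payload may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DomainFilterOptions:
    """Hypothetical container for the two proposed parameters (not an SDK class)."""

    include_domains: Optional[list[str]] = None  # None means "no allow-list"
    exclude_domains: Optional[list[str]] = None  # None means "no block-list"

    def to_payload(self) -> dict[str, list[str]]:
        """Return only the filters that were actually set."""
        payload: dict[str, list[str]] = {}
        if self.include_domains:
            payload["include_domains"] = list(self.include_domains)
        if self.exclude_domains:
            payload["exclude_domains"] = list(self.exclude_domains)
        return payload
```

Because both fields default to `None`, constructing the options with no arguments yields an empty payload, which is how existing `WebSearchTool()` calls would keep their current behavior.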

## Example Implementation

### Current Implementation:
```python
import asyncio

from agents import Agent, Runner, WebSearchTool

agent = Agent(
    name="Research Assistant",
    tools=[WebSearchTool()],
    instructions="Help users find reliable research papers",
)


async def main() -> None:
    # All web results are included, including potentially unreliable sources
    result = await Runner.run(agent, "Find recent AI research papers")
    print(result.final_output)


asyncio.run(main())
```

### Proposed Enhancement:
```python
from agents import Agent, WebSearchTool, Runner

# Academic research agent with domain filtering
academic_agent = Agent(
    name="Academic Research Assistant",
    tools=[
        WebSearchTool(
            include_domains=["arxiv.org", "nature.com", "science.org", "ieee.org"],
            exclude_domains=["medium.com", "reddit.com", "quora.com"]
        )
    ],
    instructions="Find peer-reviewed research papers from academic sources"
)

# Official documentation agent
docs_agent = Agent(
    name="Documentation Assistant",
    tools=[
        WebSearchTool(
            include_domains=["docs.python.org", "docs.openai.com", "github.com"],
            exclude_domains=["stackoverflow.com", "w3schools.com"]
        )
    ],
    instructions="Find official documentation and API references"
)

# News aggregation agent
news_agent = Agent(
    name="News Assistant",
    tools=[
        WebSearchTool(
            include_domains=["reuters.com", "apnews.com", "bbc.com"],
            exclude_domains=["tabloid-site.com", "clickbait-news.com"]
        )
    ],
    instructions="Find news from reputable sources"
)
```

## Reference: Perplexity AI Implementation

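Perplexity's web search API already supports comparable domain filtering. The request below illustrates the idea; check Perplexity's API reference for the exact endpoint and parameter names: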

```python
import requests

response = requests.post(
    "https://api.perplexity.ai/search",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "query": "machine learning research",
        "include_domains": ["arxiv.org", "ieee.org", "acm.org"],
        "exclude_domains": ["blogspot.com", "wordpress.com"]
    }
)
```

## Use Cases

1. **Academic Research Agents**: Restrict to .edu domains and peer-reviewed journals
2. **Technical Documentation Agents**: Focus on official docs, exclude forums/Q&A sites
3. **Government Information Agents**: Include only .gov domains
4. **Medical Information Agents**: Restrict to verified medical sources
5. **Corporate Intelligence Agents**: Focus on official company websites and press releases

## Implementation Considerations

- Both parameters should be optional to maintain backward compatibility
- Support for wildcard patterns (e.g., `*.edu`, `*.gov`) would be valuable
- Consider validation of domain format
- Document any limitations on the number of domains
- Since WebSearchTool is a hosted tool, the filtering would need to be implemented at the API level
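
The validation and wildcard points above could be sketched roughly as follows. This is only an illustration of the intended semantics: `normalize_domain`, `domain_matches`, and `is_allowed` are hypothetical helpers, and the choices that a bare domain also matches its subdomains and that the block-list takes precedence are assumptions, not established behavior. Real filtering would happen server-side in the hosted tool.

```python
import re
from fnmatch import fnmatch

# Accepts "example.com", "docs.example.com", and a leading wildcard such as "*.edu".
_DOMAIN_PATTERN = re.compile(
    r"^(\*\.)?([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)*[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$"
)


def normalize_domain(pattern: str) -> str:
    """Lowercase and validate a domain pattern; raise ValueError if malformed."""
    cleaned = pattern.strip().lower()
    if not _DOMAIN_PATTERN.match(cleaned):
        raise ValueError(f"Invalid domain pattern: {pattern!r}")
    return cleaned


def domain_matches(host: str, pattern: str) -> bool:
    """True if host equals the pattern, is a subdomain of it, or matches its wildcard."""
    host = host.lower()
    if pattern.startswith("*."):
        return fnmatch(host, pattern)
    # Assumption: "arxiv.org" also covers "export.arxiv.org".
    return host == pattern or host.endswith("." + pattern)


def is_allowed(host, include=None, exclude=None) -> bool:
    """Apply the block-list first, then the allow-list (if one was given)."""
    if exclude and any(domain_matches(host, p) for p in exclude):
        return False
    if include:
        return any(domain_matches(host, p) for p in include)
    return True
```

For example, `is_allowed("cs.mit.edu", include=["*.edu"])` would pass under these rules, while a URL such as `"https://arxiv.org/abs"` would be rejected by `normalize_domain` because it is not a bare domain.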

## Alternative API Design

If plain constructor parameters turn out to be a poor fit, the same configuration could be exposed in other ways:

```python
# Option 1: Constructor parameters (proposed above)
tool = WebSearchTool(
    include_domains=["example.com"],
    exclude_domains=["spam.com"]
)

# Option 2: Configuration object
tool = WebSearchTool(
    config={
        "include_domains": ["example.com"],
        "exclude_domains": ["spam.com"],
        "max_results": 10
    }
)

# Option 3: Builder pattern
tool = (WebSearchTool()
    .include_domains(["example.com"])
    .exclude_domains(["spam.com"])
    .build())
```
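
For comparison, Option 3 might look something like the sketch below. `WebSearchToolBuilder` is a hypothetical stand-in that returns a plain dict from `build()`, since the real tool class and its payload are not defined here; the point is only to show the extra surface area a builder adds over constructor parameters.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WebSearchToolBuilder:
    """Hypothetical immutable builder; not part of the SDK."""

    _include: tuple[str, ...] = ()
    _exclude: tuple[str, ...] = ()

    def include_domains(self, domains):
        # Return a new builder so partially configured builders can be reused.
        return replace(self, _include=self._include + tuple(domains))

    def exclude_domains(self, domains):
        return replace(self, _exclude=self._exclude + tuple(domains))

    def build(self) -> dict:
        # Stand-in for constructing the real tool.
        return {
            "include_domains": list(self._include),
            "exclude_domains": list(self._exclude),
        }
```

Option 1 stays closest to how `WebSearchTool` is configured today, so it is likely the least surprising choice; the builder mainly pays off if many more filters (dates, languages, ranking) are added later.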

## Benefits

- Enables building specialized agents with reliable information sources
- Reduces noise and improves result quality
- Provides feature parity with competing solutions
- Allows compliance with organizational policies on information sources
- Improves agent reliability for production use cases

## Related Enhancements

This feature would complement other potential WebSearchTool improvements:
- Date range filtering
- Language filtering
- SafeSearch options
- Custom ranking/sorting options

This enhancement would significantly improve the utility of the WebSearchTool for building production-ready agents that require high-quality, domain-specific information retrieval.

Add domain filtering options to WebSearchTool #1542
