MCP Server for Microsoft Purview Integration - with an optional D&D flavour.
This project implements a Model Context Protocol (MCP) server that integrates with Microsoft Purview, allowing LLMs to interact with Purview data through a secure interface. The server provides tools to monitor sensitivity label changes, analyze audit logs, manage data sources, and gain insights from your Microsoft Purview implementation.
- Audit Log Analysis: Access and analyze Purview audit logs to monitor data governance activities
- Sensitivity Label Tracking: Monitor changes to sensitivity labels in emails and documents
- Data Source Scanning: Trigger scans of your data sources programmatically
- Data Catalog Insights: Get summary statistics about your entire data estate
- Data Lineage Exploration: Visualize and analyze how data flows through your organization
- Python 3.8 or higher
- An Azure subscription with Purview configured
- Appropriate permissions to access Purview resources
- UV package manager installed
- Clone this repository:
  git clone <your-repo-url>
  cd str-mcp-purview
- Configure your environment variables:
  cd src
  cp .env.template .env
  Then edit the .env file with your Purview account details and authentication information.
- Run the server (uv installs the dependencies at the same time):
  uv run server.py
The server uses environment variables for configuration. Copy the .env.template file to .env and fill in:
# Azure Purview Configuration
PURVIEW_ACCOUNT_NAME=your-purview-account-name
PURVIEW_ENDPOINT=https://your-purview-account-name.purview.azure.com
# Azure Subscription Information
AZURE_SUBSCRIPTION_ID=your-subscription-id
AZURE_RESOURCE_GROUP=your-resource-group-name
# Authentication (DefaultAzureCredential will be used if these are not provided)
# For service principal authentication
AZURE_TENANT_ID=your-tenant-id
AZURE_CLIENT_ID=your-client-id
AZURE_CLIENT_SECRET=your-client-secret
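As a rough illustration of how these variables might be consumed, the sketch below reads them into a small config object. The names `PurviewConfig` and `load_purview_config` are hypothetical and not taken from this project. It assumes the .env values have already been loaded into the process environment, and that a missing `PURVIEW_ENDPOINT` can be derived from the account name using the URL pattern shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PurviewConfig:
    """Hypothetical container for the settings listed in .env.template."""
    account_name: str
    endpoint: str
    tenant_id: Optional[str] = None
    client_id: Optional[str] = None
    client_secret: Optional[str] = None

    @property
    def has_service_principal(self) -> bool:
        # True only when all three service principal values are present.
        return all((self.tenant_id, self.client_id, self.client_secret))


def load_purview_config(env: Mapping[str, str] = os.environ) -> PurviewConfig:
    account_name = env.get("PURVIEW_ACCOUNT_NAME", "").strip()
    if not account_name:
        raise ValueError("PURVIEW_ACCOUNT_NAME is not set")

    # Assumption: fall back to the documented URL pattern when no endpoint is given.
    endpoint = env.get("PURVIEW_ENDPOINT", "").strip()
    if not endpoint:
        endpoint = f"https://{account_name}.purview.azure.com"

    return PurviewConfig(
        account_name=account_name,
        endpoint=endpoint.rstrip("/"),
        tenant_id=env.get("AZURE_TENANT_ID") or None,
        client_id=env.get("AZURE_CLIENT_ID") or None,
        client_secret=env.get("AZURE_CLIENT_SECRET") or None,
    )
```

Failing early on a missing account name tends to give a clearer message than a connection error raised later by the first Purview request.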
This server supports multiple authentication methods following Azure best practices:
- Managed Identity: When deployed to Azure, uses system-assigned or user-assigned managed identities (recommended)
- DefaultAzureCredential: Tries multiple authentication methods in sequence, including environment variables, managed identity, and developer sign-ins such as the Azure CLI
- Service Principal: Uses client secret authentication when client ID, client secret, and tenant ID are all provided
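The selection between these methods could look roughly like the sketch below. It follows the comment in the .env template (service principal when all three values are set, `DefaultAzureCredential` otherwise). The helper names `choose_auth_method` and `build_credential` are hypothetical. `ClientSecretCredential` and `DefaultAzureCredential` come from the `azure-identity` package, which is imported lazily so the selection logic can run without it installed.

```python
from typing import Mapping

SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def choose_auth_method(env: Mapping[str, str]) -> str:
    """Return 'service_principal' only if every required variable is non-empty."""
    if all(env.get(name) for name in SERVICE_PRINCIPAL_VARS):
        return "service_principal"
    return "default_chain"


def build_credential(env: Mapping[str, str]):
    """Create an azure-identity credential for the chosen method (requires azure-identity)."""
    from azure.identity import ClientSecretCredential, DefaultAzureCredential

    if choose_auth_method(env) == "service_principal":
        return ClientSecretCredential(
            tenant_id=env["AZURE_TENANT_ID"],
            client_id=env["AZURE_CLIENT_ID"],
            client_secret=env["AZURE_CLIENT_SECRET"],
        )
    # Covers managed identity and local developer sign-ins via the default chain.
    return DefaultAzureCredential()
```

A partially filled set of service principal variables falls through to the default chain here; a stricter variant might raise an error instead, so that a typo in one variable name does not go unnoticed.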
Start the server using one of these methods:
cd str-mcp-purview
python src/server.py
# Standard mode
mcp run src/server.py
# Development mode with inspector
mcp dev src/server.py
To install the server as an MCP extension:
mcp install src/server.py --name "Purview Insights"
The MCP server exposes these tools for LLMs:
get_audit_logs: Retrieve audit logs from Purview for a specified time period.
Parameters:
- start_time: Start time in ISO format (YYYY-MM-DDTHH:MM:SS)
- end_time: (Optional) End time in ISO format, defaults to current time
- limit: Maximum number of logs to return (default: 100)
Example usage:
logs = await get_audit_logs(start_time="2025-04-10T00:00:00", limit=50)
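The time parameters above could be normalized along these lines before any query is made. This is only a sketch: `resolve_time_window` is a hypothetical helper, and treating timestamps without an offset as UTC is an assumption, not documented behavior of the server.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def _parse_iso(value: str) -> datetime:
    """Parse YYYY-MM-DDTHH:MM:SS; assume UTC when no offset is given (assumption)."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def resolve_time_window(
    start_time: str,
    end_time: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Return (start, end), defaulting end to the current time."""
    start = _parse_iso(start_time)
    if end_time is not None:
        end = _parse_iso(end_time)
    else:
        end = now or datetime.now(timezone.utc)
    if end < start:
        raise ValueError("end_time must not be earlier than start_time")
    return start, end
```

The optional `now` argument exists only to make the default deterministic in tests.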
get_sensitivity_label_changes: Get a report of sensitivity label changes in a specified time period.
Parameters:
- start_time: Start time in ISO format (YYYY-MM-DDTHH:MM:SS)
- end_time: (Optional) End time in ISO format, defaults to current time
Example usage:
report = await get_sensitivity_label_changes(start_time="2025-04-01T00:00:00")
scan_data_source: Initiate a scan on a Purview data source.
Parameters:
- data_source_name: Name of the data source to scan
- scan_level: Type of scan (Incremental or Full)
Example usage:
result = await scan_data_source(data_source_name="MyDataLake", scan_level="Full")
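Since `scan_level` accepts only two values, it may be worth validating it before a scan is triggered. The sketch below shows one way to do that. `ScanLevel` and `parse_scan_level` are hypothetical names, and accepting input case-insensitively is a design choice made here, not something the server is known to do.

```python
from enum import Enum


class ScanLevel(str, Enum):
    INCREMENTAL = "Incremental"
    FULL = "Full"


def parse_scan_level(value: str) -> ScanLevel:
    """Map user input such as 'full' or ' Incremental ' to a ScanLevel."""
    normalized = value.strip().capitalize()
    try:
        return ScanLevel(normalized)
    except ValueError:
        allowed = ", ".join(level.value for level in ScanLevel)
        raise ValueError(f"scan_level must be one of: {allowed}") from None
```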
get_data_catalog_summary: Get a summary of the data catalog including asset counts by type.
Example usage:
summary = await get_data_catalog_summary()
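To make "asset counts by type" concrete, here is a minimal sketch of the aggregation step. It assumes the assets arrive as dictionaries with an `entityType` key; both that shape and the `summarize_assets` name are assumptions for illustration, and the real tool's output format may differ.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def summarize_assets(assets: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Count assets per type; entries without a type are grouped as 'unknown'."""
    counts = Counter(asset.get("entityType", "unknown") for asset in assets)
    return {
        "total_assets": sum(counts.values()),
        # most_common() orders the types from most to least frequent.
        "by_type": dict(counts.most_common()),
    }
```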
get_data_lineage: Get data lineage information for a specific entity.
Parameters:
- entity_id: ID of the entity to retrieve lineage for
- depth: Depth of lineage graph to retrieve (default: 3)
Example usage:
lineage = await get_data_lineage(entity_id="guid-123-456", depth=5)
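The `depth` parameter can be pictured as a hop limit on a graph traversal. The sketch below walks downstream from an entity over a list of (source, target) edges using breadth-first search. The edge format and the `downstream_within_depth` name are hypothetical; in the server, the depth is presumably handled by the Purview lineage API rather than by local code.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def downstream_within_depth(
    edges: Iterable[Tuple[str, str]],
    entity_id: str,
    depth: int = 3,
) -> Dict[str, int]:
    """Return {entity: hops} for entities reachable within `depth` hops."""
    adjacency: Dict[str, List[str]] = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    distances = {entity_id: 0}
    queue = deque([entity_id])
    while queue:
        current = queue.popleft()
        if distances[current] == depth:
            continue  # Do not expand beyond the requested depth.
        for neighbour in adjacency.get(current, []):
            if neighbour not in distances:  # Also guards against cycles.
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)

    del distances[entity_id]  # The starting entity is not its own descendant.
    return distances
```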
The server provides these information resources:
Provides an overview of your Purview account configuration and status.
Provides guidance on email sensitivity labels and their management.
This server follows Azure best practices for security:
- Secure Authentication: Uses DefaultAzureCredential for proper authentication chains
- No Hardcoded Credentials: All sensitive information is stored in environment variables
- Error Handling: Comprehensive error handling prevents information leakage
- Least Privilege: Use RBAC in Azure to provide minimal required permissions to the service principal
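One way the error-handling point above might be realized is a wrapper that logs the full exception on the server side and returns only a generic message to the caller. The `safe_tool` decorator and the response shape below are hypothetical and not this project's actual implementation.

```python
import asyncio
import functools
import logging
import uuid
from typing import Any, Awaitable, Callable, Dict

logger = logging.getLogger("purview_mcp")


def safe_tool(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Dict[str, Any]]]:
    """Wrap an async tool so exception details stay in the server logs."""
    @functools.wraps(func)
    async def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return {"ok": True, "result": await func(*args, **kwargs)}
        except Exception:
            # The error ID lets an operator find the matching log entry.
            error_id = uuid.uuid4().hex
            logger.exception("Tool %s failed (error_id=%s)", func.__name__, error_id)
            return {
                "ok": False,
                "error": "The request could not be completed.",
                "error_id": error_id,
            }
    return wrapper


@safe_tool
async def failing_tool() -> str:
    raise RuntimeError("connection failed: client_secret=abc123")


result = asyncio.run(failing_tool())
```

Here `result` contains the generic message and an error ID, while the original exception text, including the fake secret, appears only in the log output.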
To add new tools:
- Create a new function with the @mcp.tool() decorator
- Define parameters and return types
- Implement the tool functionality using the Purview client
To add new resources:
- Create a new function with the @mcp.resource(path="your-path") decorator
- Return the content as a string (Markdown format recommended)
If you encounter issues:
- Authentication Errors: Verify your environment variables and check if the service principal has sufficient permissions
- Connection Issues: Ensure your Purview endpoint is correctly specified
- Tool Errors: Check the error logs for specific error messages
- Microsoft Purview documentation
- Microsoft Purview Python SDK tutorial
- Azure Identity authentication library
- Microsoft Purview sensitivity labels
- Model Context Protocol (MCP) Python SDK
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.