diff --git a/docs/configuration/devices/microsoft-graph-api.mdx b/docs/configuration/devices/microsoft-graph-api.mdx
new file mode 100644
index 00000000..3be496ab
--- /dev/null
+++ b/docs/configuration/devices/microsoft-graph-api.mdx
@@ -0,0 +1,362 @@
+# Microsoft Graph API
+
+Microsoft AzurePull
+
+## Synopsis
+
+Creates a poller that fetches data from the Microsoft Graph API on a configurable interval and forwards records to downstream pipelines. Supports incremental polling via timestamp cursors or delta queries, automatic pagination, retry logic, and mutual TLS.
+
+## Schema
+
+```yaml {1,2,4,9-11,13}
+- id:
+  name:
+  description:
+  type: graphapi
+  tags:
+  pipelines:
+  status:
+  properties:
+    tenant_id:
+    client_id:
+    client_secret:
+    version:
+    resource:
+    query_params:
+    poll_interval:
+    workers:
+    reuse:
+    timeout:
+    max_retries:
+    retry_delay:
+    enable_pagination:
+    max_pages:
+    enable_delta_query:
+    timestamp_field:
+    tls:
+      status:
+      cert_name:
+      key_name:
+      insecure_skip_verify:
+```
+
+## Configuration
+
+The following fields are used to define the device:
+
+### Device
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`id`|Y||Unique identifier|
+|`name`|Y||Device name|
+|`description`|N|-|Optional description|
+|`type`|Y||Must be `graphapi`|
+|`tags`|N|-|Optional tags|
+|`pipelines`|N|-|Optional pre-processor pipelines|
+|`status`|N|`true`|Enable/disable the device|
+
+### Authentication
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`tenant_id`|Y||Azure AD tenant ID|
+|`client_id`|Y||App registration client ID|
+|`client_secret`|Y||App registration client secret|
+
+### API
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`version`|N|`v1.0`|Graph API version: `v1.0` or `beta`|
+|`resource`|Y||Graph API resource path (see [Allowed Resources](#allowed-resources))|
+|`query_params`|N|`""`|OData query parameters (e.g., `$select=id,createdDateTime&$top=100`)|
+
+### Polling
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`poll_interval`|N|`10`|Seconds between polls (must be > 0)|
+|`workers`|N|`1`|Number of parallel worker goroutines|
+|`reuse`|N|`true`|Reuse HTTP connections across requests|
+
+### HTTP
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`timeout`|N|`30`|Request timeout in seconds|
+|`max_retries`|N|`3`|Number of retry attempts on failure|
+|`retry_delay`|N|`5`|Seconds to wait between retries|
+
+### Pagination
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`enable_pagination`|N|`true`|Follow `@odata.nextLink` pages automatically|
+|`max_pages`|N|`10`|Maximum pages per poll cycle (`0` = unlimited)|
+
+### Incremental Polling
+
+Two strategies control incremental data retrieval. If both are configured, `timestamp_field` takes priority.
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`timestamp_field`|N|`""`|JSON field name used as a time-based cursor (e.g., `createdDateTime`); injects `$filter=<field> ge <timestamp>`, where the timestamp is the start time of the previous successful poll|
+|`enable_delta_query`|N|`false`|Appends `/$delta` to the resource URL for delta query support|
+
+### TLS
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`tls.status`|N|`false`|Enable mutual TLS|
+|`tls.cert_name`|N*|`client_cert.pem`|Client certificate file path|
+|`tls.key_name`|N*|`client_key.pem`|Client private key file path|
+|`tls.insecure_skip_verify`|N|`false`|Skip server certificate verification|
+
+\* = Conditionally required (only when `tls.status: true`)
+
+:::note
+TLS certificate and key files must be placed in the service root directory.
+:::
+
+## Details
+
+### Authentication
+
+The device uses the OAuth 2.0 client credentials flow to obtain an access token from Azure AD. Tokens are cached per device/tenant/client combination and automatically refreshed on authentication failure (HTTP 401/403).
+
+:::tip Managed Identity
+When `client_id` and `client_secret` are omitted, the credential provider automatically falls back to Azure Managed Identity. This allows the device to authenticate without explicit credentials when running on an Azure-hosted VM or container with a system-assigned or user-assigned managed identity.
+:::
+
+### Incremental Polling
+
+Two strategies prevent re-fetching previously collected data:
+
+**Timestamp Cursor** (`timestamp_field`): Injects an OData `$filter` using the poll start time from the previous successful run. Effective for resources that expose a datetime field (e.g., `createdDateTime` on `auditLogs/signIns`). The cursor merges with any existing `$filter` in `query_params`.
+
+**Delta Query** (`enable_delta_query`): Appends `/$delta` to the resource URL. The Graph API returns an `@odata.deltaLink` on the final page, which the device stores and uses on the next poll to retrieve only changed records. Supported only by directory resources (`users`, `groups`, `devices`, etc.). If the stored delta link expires or becomes invalid, the device clears it and falls back to a full query.
+
+When both are configured, `timestamp_field` takes priority. If neither is set, each poll fetches the full result set.
+
+### Allowed Resources
+
+The device enforces an allow list of Graph API resource paths. Requests to unlisted resources are rejected. Sub-paths of allowed resources (e.g., `auditLogs/directoryAudits/someId`) are also accepted.
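The allow-list rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not the device's actual implementation: `ALLOWED_RESOURCES` is a hypothetical subset of the resource tables in this document, `is_allowed` is an invented helper name, and both the case-sensitive comparison and the path-segment boundary check are assumptions.

```python
# Hypothetical subset of the allow list; the real list and matching logic may differ.
ALLOWED_RESOURCES = (
    "auditLogs/signIns",
    "auditLogs/directoryAudits",
    "security/alerts",
    "security/alerts_v2",
    "identityProtection/riskDetections",
)


def is_allowed(resource: str) -> bool:
    """Return True if resource is an allowed path or a sub-path of one."""
    path = resource.strip("/")
    for allowed in ALLOWED_RESOURCES:
        # Compare on a path-segment boundary (assumption), so that
        # "auditLogs/signInsExtra" is not accepted as a prefix match
        # of "auditLogs/signIns".
        if path == allowed or path.startswith(allowed + "/"):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("auditLogs/signIns", "auditLogs/directoryAudits/someId", "users"):
        print(candidate, is_allowed(candidate))
```

Checking on a segment boundary rather than a plain string prefix keeps near-miss names from being accepted by accident; whether the device draws the line in the same place is not stated in this document.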
+
+#### Audit Logs
+
+|Resource|Required Permission|
+|--:|:--|
+|`auditLogs/signIns`|`AuditLog.Read.All`|
+|`auditLogs/directoryAudits`|`AuditLog.Read.All`|
+|`auditLogs/provisioning`|`AuditLog.Read.All`|
+
+#### Security
+
+|Resource|Required Permission|
+|--:|:--|
+|`security/alerts`|`SecurityEvents.Read.All` or `SecurityAlert.Read.All`|
+|`security/alerts_v2`|`SecurityEvents.Read.All` or `SecurityAlert.Read.All`|
+|`security/incidents`|`SecurityEvents.Read.All` or `SecurityAlert.Read.All`|
+|`security/secureScores`|`SecurityEvents.Read.All`|
+|`security/secureScoreControlProfiles`|`SecurityEvents.Read.All`|
+|`security/attackSimulation`|`AttackSimulation.Read.All`|
+|`security/cloudAppSecurityProfiles`|`SecurityEvents.Read.All`|
+|`security/tiIndicators`|`ThreatIndicators.Read.All`|
+|`security/cases/ediscoveryCases`|`eDiscovery.Read.All`|
+|`security/threatIntelligence/articles`|`ThreatIntelligence.Read.All`|
+|`security/threatIntelligence/hosts`|`ThreatIntelligence.Read.All`|
+
+#### Identity Protection
+
+|Resource|Required Permission|
+|--:|:--|
+|`identityProtection/riskDetections`|`IdentityRiskEvent.Read.All`|
+|`identityProtection/riskyUsers`|`IdentityRiskyUser.Read.All`|
+|`identityProtection/riskyServicePrincipals`|`IdentityRiskyServicePrincipal.Read.All`|
+
+#### Reports
+
+|Resource|Required Permission|
+|--:|:--|
+|`reports/getEmailActivityUserDetail`|`Reports.Read.All`|
+|`reports/getEmailActivityCounts`|`Reports.Read.All`|
+|`reports/getEmailAppUsageUserDetail`|`Reports.Read.All`|
+|`reports/getMailboxUsageDetail`|`Reports.Read.All`|
+|`reports/getOffice365ActivationsUserDetail`|`Reports.Read.All`|
+|`reports/getOffice365ActiveUserDetail`|`Reports.Read.All`|
+|`reports/getOffice365GroupsActivityDetail`|`Reports.Read.All`|
+|`reports/getOneDriveActivityUserDetail`|`Reports.Read.All`|
+|`reports/getOneDriveUsageAccountDetail`|`Reports.Read.All`|
+|`reports/getSharePointActivityUserDetail`|`Reports.Read.All`|
+|`reports/getSharePointSiteUsageDetail`|`Reports.Read.All`|
+|`reports/getTeamsUserActivityUserDetail`|`Reports.Read.All`|
+|`reports/getTeamsDeviceUsageUserDetail`|`Reports.Read.All`|
+|`reports/getYammerActivityUserDetail`|`Reports.Read.All`|
+|`reports/authenticationMethods/userRegistrationDetails`|`Reports.Read.All` + `UserAuthenticationMethod.Read.All`|
+|`reports/credentialUserRegistrationDetails`|`Reports.Read.All` + `UserAuthenticationMethod.Read.All`|
+|`reports/userCredentialUsageDetails`|`Reports.Read.All` + `UserAuthenticationMethod.Read.All`|
+|`reports/applicationSignInDetailedSummary`|`Reports.Read.All` + `AuditLog.Read.All`|
+|`reports/messageTrace`|`Reports.Read.All`|
+
+#### Service Announcements
+
+|Resource|Required Permission|
+|--:|:--|
+|`admin/serviceAnnouncement/issues`|`ServiceHealth.Read.All`|
+|`admin/serviceAnnouncement/messages`|`ServiceHealth.Read.All`|
+|`admin/serviceAnnouncement/healthOverviews`|`ServiceHealth.Read.All`|
+
+### Required Permissions
+
+To cover all allowed resources, add the following **Application** permissions to the Azure App Registration and grant admin consent:
+
+- `AuditLog.Read.All`
+- `SecurityEvents.Read.All`
+- `SecurityAlert.Read.All`
+- `AttackSimulation.Read.All`
+- `ThreatIndicators.Read.All`
+- `eDiscovery.Read.All`
+- `ThreatIntelligence.Read.All`
+- `IdentityRiskEvent.Read.All`
+- `IdentityRiskyUser.Read.All`
+- `IdentityRiskyServicePrincipal.Read.All`
+- `Reports.Read.All`
+- `UserAuthenticationMethod.Read.All`
+- `ServiceHealth.Read.All`
+
+:::note
+All permissions require admin consent. After adding them in the Azure portal, click **Grant admin consent for [your tenant]** to activate them. Only add permissions for the resources you intend to poll.
+:::
+
+## Examples
+
+The following are commonly used configuration types.
+
+### Basic
+
+
+    Polling sign-in logs with a timestamp cursor for incremental collection...
+
+
+    ```yaml
+    devices:
+      - id: 1
+        name: graphapi_signins
+        type: graphapi
+        properties:
+          tenant_id: "00000000-0000-0000-0000-000000000000"
+          client_id: "11111111-1111-1111-1111-111111111111"
+          client_secret: "your-client-secret"
+          resource: "auditLogs/signIns"
+          poll_interval: 30
+          timestamp_field: "createdDateTime"
+    ```
+
+
+
+### Delta Query
+
+
+    Using delta query for directory resources to fetch only changed records...
+
+
+    ```yaml
+    devices:
+      - id: 2
+        name: graphapi_users
+        type: graphapi
+        properties:
+          tenant_id: "00000000-0000-0000-0000-000000000000"
+          client_id: "11111111-1111-1111-1111-111111111111"
+          client_secret: "your-client-secret"
+          resource: "security/alerts_v2"
+          query_params: "$select=id,title,severity,status,createdDateTime"
+          enable_delta_query: true
+          poll_interval: 60
+    ```
+
+
+
+### Security Alerts
+
+
+    Collecting security alerts with OData filtering and preprocessing pipelines...
+
+
+    ```yaml
+    devices:
+      - id: 3
+        name: graphapi_alerts
+        type: graphapi
+        pipelines:
+          - alert_enricher
+          - severity_classifier
+        properties:
+          tenant_id: "00000000-0000-0000-0000-000000000000"
+          client_id: "11111111-1111-1111-1111-111111111111"
+          client_secret: "your-client-secret"
+          version: "v1.0"
+          resource: "security/incidents"
+          query_params: "$filter=severity eq 'high'"
+          poll_interval: 15
+          timestamp_field: "createdDateTime"
+    ```
+
+
+
+### High-Volume
+
+
+    Optimizing for high data volumes with multiple workers and increased pagination...
+
+
+    ```yaml
+    devices:
+      - id: 4
+        name: graphapi_reports
+        type: graphapi
+        properties:
+          tenant_id: "00000000-0000-0000-0000-000000000000"
+          client_id: "11111111-1111-1111-1111-111111111111"
+          client_secret: "your-client-secret"
+          resource: "reports/getOffice365ActiveUserDetail"
+          poll_interval: 300
+          workers: 4
+          timeout: 60
+          max_retries: 5
+          max_pages: 50
+    ```
+
+
+
+### Secure Connection
+
+
+    Enabling mutual TLS for outbound Graph API requests...
+
+
+    ```yaml
+    devices:
+      - id: 5
+        name: graphapi_secure
+        type: graphapi
+        properties:
+          tenant_id: "00000000-0000-0000-0000-000000000000"
+          client_id: "11111111-1111-1111-1111-111111111111"
+          client_secret: "your-client-secret"
+          resource: "identityProtection/riskDetections"
+          poll_interval: 60
+          timestamp_field: "activityDateTime"
+          tls:
+            status: true
+            cert_name: "graphapi.crt"
+            key_name: "graphapi.key"
+    ```
+
+
diff --git a/docs/configuration/devices/overview.mdx b/docs/configuration/devices/overview.mdx
index a9a6cd03..854e5b76 100644
--- a/docs/configuration/devices/overview.mdx
+++ b/docs/configuration/devices/overview.mdx
@@ -45,7 +45,7 @@ Devices operate in two fundamental modes that affect how data flows into DataStr
 - Event Hubs, RabbitMQ, Redis
 **Pull-based devices** actively fetch data from external sources on a schedule or trigger:
-- Kafka (consumer), Azure Monitor, Microsoft Sentinel
+- Kafka (consumer), Azure Monitor, Microsoft Graph API, Microsoft Sentinel
 - Azure Blob Storage
 - Windows/Linux Agents (collect local logs and forward to Director)
@@ -118,6 +118,7 @@ The system supports the following device types:
   * **Azure Blob Storage**: Pulls data from Azure Blob containers
   * **Azure Monitor**: Collects logs from Azure Log Analytics workspaces
   * **Event Hubs**: Consumes events from Azure Event Hubs
+  * **Microsoft Graph API**: Polls Microsoft Graph API for audit logs, security events, identity protection, and reports
   * **Microsoft Sentinel**: Pulls security data from Microsoft Sentinel
 * **Message Queue** - These devices consume from messaging platforms:
diff --git a/sidebars.ts b/sidebars.ts
index 3e9f0e58..3c2427f6 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -98,6 +98,7 @@ const sidebars: SidebarsConfig = {
   "configuration/devices/ipfix",
   "configuration/devices/kafka",
   "configuration/devices/linux",
+  "configuration/devices/microsoft-graph-api",
   "configuration/devices/microsoft-sentinel",
   "configuration/devices/nats",
   "configuration/devices/netflow",
diff --git a/topics.json b/topics.json
index 75587e15..e208066a 100644
--- a/topics.json
+++ b/topics.json
@@ -36,6 +36,7 @@
   "devices-udp": "/configuration/devices/udp",
   "devices-windows": "/configuration/devices/windows",
   "devices-linux": "/configuration/devices/linux",
+  "devices-graphapi": "/configuration/devices/microsoft-graph-api",
   "snmp-authentication": "/configuration/devices/snmp-trap#authentication-protocols",
   "snmp-privacy": "/configuration/devices/snmp-trap#privacy-protocols",