From 71e688121c6dd5ac663361fd1c82be917d72a157 Mon Sep 17 00:00:00 2001 From: Bryce Eadie Date: Thu, 20 Nov 2025 13:28:00 -0800 Subject: [PATCH] [DOCS-12667] Add log collection setup information --- .../integrations/google_cloud.md | 110 +++++++++++++++++- 1 file changed, 107 insertions(+), 3 deletions(-) diff --git a/content/en/getting_started/integrations/google_cloud.md b/content/en/getting_started/integrations/google_cloud.md index 91fe51c11f29e..924565bd9819b 100644 --- a/content/en/getting_started/integrations/google_cloud.md +++ b/content/en/getting_started/integrations/google_cloud.md @@ -372,11 +372,108 @@ To enable this feature: Forwarding logs from your Google Cloud environment enables near real-time monitoring of the resources and activities taking place in your organization or folder. You can set up [log monitors][37] to be notified of issues, use [Cloud SIEM][38] to detect threats, or leverage [Watchdog][39] to identify unknown issues or anomalous behavior. -Use the [Datadog Dataflow template][14] to batch and compresses your log events before forwarding them to Datadog through [Google Cloud Dataflow][15]. This is the most network-efficient way to forward your logs. To specify which logs are forwarded, configure the [Google Cloud Logging sink][40] with any inclusion or exclusion queries using Google Cloud's [Logging query language][56]. +Logs are forwarded by [Google Cloud Dataflow][15] using the [Datadog Dataflow template][14]. This approach offers batching and compression of your log events before forwarding them to Datadog, which is the most network-efficient way to forward your logs. You can specify which logs are forwarded with inclusion and exclusion filters. -You can use the [terraform-gcp-datadog-integration][64] module to manage this infrastructure through Terraform, or follow [the instructions listed here][16] to set up Log Collection. You can also use the [Stream logs from Google Cloud to Datadog][9] guide in the Google Cloud architecture center, for a more detailed explanation of the steps and architecture involved in log forwarding. For a deep dive into the benefits of the Pub/Sub to Datadog template, read [Stream your Google Cloud logs to Datadog with Dataflow][17] in the Datadog blog. +### Setup -
The Dataflow API must be enabled to use Google Cloud Dataflow. See Enabling APIs in the Google Cloud documentation for more information.
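+Both the Quick Start and Terraform setup methods create a Cloud Logging sink that routes matching log entries to a Pub/Sub topic, and the inclusion and exclusion filters you configure are applied to that sink. You do not need to create these resources yourself; the following sketch, with placeholder project, topic, and filter values, only illustrates how such a sink and its filters can be expressed in Terraform with the Google provider.
+
+```hcl
+# Illustrative only: the setup methods below create equivalent resources with predefined names.
+resource "google_pubsub_topic" "example_log_export" {
+  project = "example-project"
+  name    = "example-datadog-log-export"
+}
+
+resource "google_logging_project_sink" "example_log_export" {
+  project     = "example-project"
+  name        = "example-datadog-log-sink"
+  destination = "pubsub.googleapis.com/${google_pubsub_topic.example_log_export.id}"
+
+  # Inclusion filter, written in the Logging query language: forward INFO and above.
+  filter = "severity>=INFO"
+
+  # Exclusion filters drop matching entries even when they match the inclusion filter.
+  exclusions {
+    name   = "exclude-gce-noise"
+    filter = "resource.type=\"gce_instance\" AND severity<WARNING"
+  }
+
+  # Creates a dedicated writer identity that must be granted roles/pubsub.publisher on the topic.
+  unique_writer_identity = true
+}
+```
+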
+{{% collapse-content title="Quick Start (recommended)" level="h4" id="quick-start-log-setup" %}}
+#### Choose the Quick Start setup method if…
+
+- You are setting up log forwarding from Google Cloud for the first time.
+- You prefer a UI-based workflow and want to minimize the time it takes to create and configure the necessary resources.
+- You want to automate setup steps in scripts or CI/CD pipelines.
+
+#### Instructions
+
+1. In the [Google Cloud integration tile][100], select the **Log Collection** tab.
+1. Select **Quick Start**. A setup script, configured with your Datadog credentials and site, is automatically generated.
+1. Copy the setup script. You can run the script locally or in Google Cloud Shell:
+   - Running the script locally may be faster, but requires that you have your Google Cloud credentials available and the [gcloud CLI][101] installed on your machine.
+   - Click **Open Google Cloud Shell** to run the script in [Google Cloud Shell][102].
+1. After running the script, return to the Google Cloud integration tile.
+1. In the **Select Projects** section, select the folders and projects to forward logs from. If you select a folder, logs are forwarded from all of its child projects.
+   **Note**: Only folders and projects that you have the necessary access and permissions for appear in this section. Likewise, folders and projects without a display name do not appear.
+1. In the **Dataflow Job Configuration** section, specify configuration options for the Dataflow job:
+   - Select deployment settings (the Google Cloud region and project that host the created resources: Pub/Sub topics and subscriptions, a log routing sink, a Secret Manager entry, a service account, a Cloud Storage bucket, and a Dataflow job).
+     **Note**: You cannot name the created resources. The script uses predefined names so that it can skip creation if it finds preexisting resources with the same name.
+   - Select scaling settings (number of workers and maximum number of workers).
+   - Select performance settings (maximum number of parallel requests and batch size).
+   - Select execution options (Streaming Engine is enabled by default; read more about its [benefits][103]).
+     **Note**: If you enable [Dataflow Prime][104], you cannot configure the worker machine type in the **Advanced Configuration** section.
+1. In the **Advanced Configuration** section, optionally specify the machine type for your Dataflow worker VMs. If no machine type is selected, Dataflow automatically chooses an appropriate machine type based on your job requirements.
+1. Optionally, specify inclusion and exclusion filters using Google Cloud's [logging query language][105].
+1. Review the steps to be executed in the **Complete Setup** section. If everything looks correct, click **Complete Setup**.
+
+[100]: https://app.datadoghq.com/integrations/gcp
+[101]: https://docs.cloud.google.com/sdk/docs/install
+[102]: https://docs.cloud.google.com/shell/docs
+[103]: https://docs.cloud.google.com/dataflow/docs/streaming-engine#benefits
+[104]: https://docs.cloud.google.com/dataflow/docs/guides/enable-dataflow-prime
+[105]: https://cloud.google.com/logging/docs/view/logging-query-language
+{{% /collapse-content %}}
+
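+Whichever method you choose, the setup launches Datadog's Pub/Sub to Datadog Dataflow template with the deployment, scaling, performance, and execution options you select. For orientation only, the sketch below shows roughly how those options map onto a Dataflow job managed with the `google_dataflow_job` Terraform resource; the setup methods on this page create an equivalent job for you. All names are placeholders, and the template path and parameter keys are assumptions to verify against the current template reference before use.
+
+```hcl
+# Illustrative only: not required when using the Quick Start or Terraform setup methods.
+resource "google_dataflow_job" "datadog_log_export" {
+  name              = "example-export-logs-to-datadog"                        # placeholder name
+  region            = "us-central1"
+  template_gcs_path = "gs://dataflow-templates/latest/Cloud_PubSub_to_Datadog" # assumed template path; verify
+  temp_gcs_location = "gs://example-dataflow-staging-bucket/tmp"               # placeholder bucket
+
+  # Scaling, performance, and execution options from the Dataflow Job Configuration step.
+  max_workers             = 3
+  machine_type            = "n1-standard-2"
+  enable_streaming_engine = true
+
+  parameters = {
+    # Parameter keys follow the template's documented options; verify before applying.
+    inputSubscription     = "projects/example-project/subscriptions/example-logs-subscription"
+    url                   = "https://http-intake.logs.datadoghq.com" # use the log intake URL for your Datadog site
+    apiKeySource          = "SECRET_MANAGER"
+    apiKeySecretId        = "projects/example-project/secrets/datadog-api-key/versions/latest"
+    outputDeadletterTopic = "projects/example-project/topics/example-logs-deadletter"
+    batchCount            = "1000" # batch size
+    parallelism           = "4"    # maximum number of parallel requests
+  }
+}
+```
+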
+{{% collapse-content title="Terraform" level="h4" id="terraform-log-setup" %}}
+#### Choose the Terraform setup method if…
+
+- You manage infrastructure as code and want to keep the Datadog Google Cloud integration under version control.
+- You need to configure multiple folders or projects consistently with reusable provider blocks.
+- You want a repeatable, auditable deployment process that fits into your Terraform-managed environment.
+
+#### Instructions
+
+1. In the [Google Cloud integration tile][200], select the **Log Collection** tab.
+1. Select **Terraform**.
+1. In the **Select Projects** section, select the folders and projects to forward logs from. If you select a folder, logs are forwarded from all of its child projects.
+   **Note**: Only folders and projects that you have the necessary access and permissions for appear in this section. Likewise, folders and projects without a display name do not appear.
+1. In the **Dataflow Job Configuration** section, specify configuration options for the Dataflow job:
+   - Select deployment settings (the Google Cloud region and project that host the created resources: Pub/Sub topics and subscriptions, a log routing sink, a Secret Manager entry, a service account, a Cloud Storage bucket, and a Dataflow job).
+     **Note**: You cannot name the created resources. Predefined names are used so that setup can skip creation if it finds preexisting resources with the same name.
+   - Select scaling settings (number of workers and maximum number of workers).
+   - Select performance settings (maximum number of parallel requests and batch size).
+   - Select execution options (Streaming Engine is enabled by default; read more about its [benefits][201]).
+     **Note**: If you enable [Dataflow Prime][202], you cannot configure the worker machine type in the **Advanced Configuration** section.
+1. In the **Advanced Configuration** section, optionally specify the machine type for your Dataflow worker VMs. If no machine type is selected, Dataflow automatically chooses an appropriate machine type based on your job requirements.
+1. Optionally, specify inclusion and exclusion filters using Google Cloud's [logging query language][203].
+
+[200]: https://app.datadoghq.com/integrations/gcp
+[201]: https://docs.cloud.google.com/dataflow/docs/streaming-engine#benefits
+[202]: https://docs.cloud.google.com/dataflow/docs/guides/enable-dataflow-prime
+[203]: https://cloud.google.com/logging/docs/view/logging-query-language
+{{% /collapse-content %}}
+
+{{% collapse-content title="Pub/Sub Push subscription (legacy; not recommended)" level="h4" id="pub-sub-push-logging-setup" %}}
+
+Collecting Google Cloud logs with a Pub/Sub Push subscription is in the process of being **deprecated**.
+
+Documentation for the **Push** subscription is maintained only for troubleshooting or modifying legacy setups.
+
+Use a **Pull** subscription with the Datadog Dataflow template, as described in the [Quick Start](#quick-start-log-setup) or [Terraform](#terraform-log-setup) setup methods above, to forward your Google Cloud logs to Datadog instead.
+{{% /collapse-content %}}
+
+See the [Stream logs from Google Cloud to Datadog][9] guide in the Google Cloud Architecture Center for a more detailed explanation of the steps and architecture involved in log forwarding. For a deep dive into the benefits of the Pub/Sub to Datadog template, read [Stream your Google Cloud logs to Datadog with Dataflow][17] in the Datadog blog.
+
+### Validation
+
+New logging events delivered to the Cloud Pub/Sub topic appear in the [Datadog Log Explorer][67].
+
+**Note**: You can use the [Google Cloud Pricing Calculator][68] to estimate potential costs.
+
+### Monitor the Cloud Pub/Sub log forwarding
+
+The [Google Cloud Pub/Sub integration][69] provides helpful metrics to monitor the status of the log forwarding:
+
+- `gcp.pubsub.subscription.num_undelivered_messages` for the number of messages pending delivery
+- `gcp.pubsub.subscription.oldest_unacked_message_age` for the age of the oldest unacknowledged message in a subscription
+
+Use these metrics with a [metric monitor][70] to receive alerts when messages accumulate in your input and dead-letter subscriptions.
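+
+If you also manage monitors as code, the following is a minimal sketch of such an alert using the Datadog Terraform provider's `datadog_monitor` resource. The subscription name, thresholds, and notification handle are placeholders to adapt to your environment.
+
+```hcl
+terraform {
+  required_providers {
+    datadog = {
+      source = "DataDog/datadog"
+    }
+  }
+}
+
+# Reads the Datadog API and application keys from the DD_API_KEY and DD_APP_KEY environment variables.
+provider "datadog" {}
+
+resource "datadog_monitor" "log_forwarding_backlog" {
+  name    = "Google Cloud log forwarding backlog is growing"
+  type    = "metric alert"
+  message = "Undelivered log messages are accumulating in the export subscription. Notify: @example-team"
+
+  # Alert when the input subscription's backlog stays above the threshold for 10 minutes.
+  query = "avg(last_10m):avg:gcp.pubsub.subscription.num_undelivered_messages{subscription_id:example-logs-subscription} > 1000"
+
+  monitor_thresholds {
+    warning  = 500
+    critical = 1000
+  }
+
+  tags = ["integration:google-cloud-pubsub", "team:example"]
+}
+```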
+
+### Monitor the Dataflow pipeline
+
+Use Datadog's [Google Cloud Dataflow integration][71] to monitor all aspects of your Dataflow pipelines. You can see all your key Dataflow metrics on the out-of-the-box dashboard, enriched with contextual data such as information about the GCE instances running your Dataflow workloads and your Pub/Sub throughput.
+
+You can also use a preconfigured [Recommended Monitor][72] to set up notifications for increases in backlog time in your pipeline. For more information, read [Monitor your Dataflow pipelines with Datadog][73] in the Datadog blog.
 
 ## Leveraging the Datadog Agent
 
@@ -510,3 +607,10 @@ You can get granular visibility into your BigQuery environments to monitor the p
 [64]: https://github.com/GoogleCloudPlatform/terraform-gcp-datadog-integration
 [65]: /integrations/google_cloud_platform/#expanded-bigquery-monitoring
 [66]: https://cloud.google.com/identity/docs/overview
+[67]: https://app.datadoghq.com/logs
+[68]: https://cloud.google.com/products/calculator
+[69]: /integrations/google-cloud-pubsub/
+[70]: /monitors/types/metric/
+[71]: /integrations/google-cloud-dataflow/
+[72]: https://www.datadoghq.com/blog/datadog-recommended-monitors/
+[73]: https://www.datadoghq.com/blog/monitor-dataflow-pipelines-with-datadog/