Commit bb24a19

Editing pass
1 parent 0e2bb6f commit bb24a19

File tree

1 file changed

+40
-41
lines changed
  • src/connections/storage/catalog/bigquery

src/connections/storage/catalog/bigquery/index.md

Lines changed: 40 additions & 41 deletions
@@ -5,68 +5,68 @@ redirect_from:
55
- '/connections/warehouses/catalog/bigquery/'
66
---
77

8-
Segment's [BigQuery](https://cloud.google.com/bigquery/) connector makes it easy
8+
Segment's [BigQuery](https://cloud.google.com/bigquery/){:target="_blank"} connector makes it easy
99
to load web, mobile, and third-party source data like Salesforce, Zendesk, and
1010
Google AdWords into a BigQuery data warehouse. This guide will explain how to
1111
set up BigQuery and start loading data into it.
1212

1313
The Segment warehouse connector runs a periodic ETL (Extract - Transform - Load)
1414
process to pull raw events and objects and load them into your BigQuery cluster.
1515

16-
Using BigQuery through Segment means you'll get a fully managed data pipeline
16+
Using BigQuery with Segment means you'll get a fully managed data pipeline
1717
loaded into one of the most powerful and cost-effective data warehouses today.
1818

1919
## Getting Started
2020

2121
To store your Segment data in BigQuery, complete the following steps:
22-
- [Enable BigQuery for your Google Cloud project](#create-a-project-and-enable-bigquery)
23-
- [Create a GCP service account for Segment to assume](#create-a-service-account-for-segment)
24-
- [Create a warehouse in the Segment app](#create-the-warehouse-in-segment)
22+
1. [Enable BigQuery for your Google Cloud project](#create-a-project-and-enable-bigquery)
23+
2. [Create a GCP service account for Segment to assume](#create-a-service-account-for-segment)
24+
3. [Create a warehouse in the Segment app](#create-the-warehouse-in-segment)
2525

2626
### Create a Project and Enable BigQuery
2727

28-
1. Navigate to the [Google Developers Console](https://console.developers.google.com/)
29-
2. Configure [Cloud Platform](https://console.cloud.google.com/):
30-
- If you don't have a project already, [create one](https://support.google.com/cloud/answer/6251787?hl=en&ref_topic=6158848).
31-
- If you have an existing project, you will need to [enable the BigQuery API](https://cloud.google.com/bigquery/quickstart-web-ui).
28+
1. Navigate to the [Google Developers Console](https://console.developers.google.com/){:target="_blank"}.
29+
2. Configure [Cloud Platform](https://console.cloud.google.com/){:target="_blank"}:
30+
- If you don't have a project already, [create one](https://support.google.com/cloud/answer/6251787?hl=en&ref_topic=6158848){:target="_blank"}.
31+
- If you have an existing project, you will need to [enable the BigQuery API](https://cloud.google.com/bigquery/quickstart-web-ui){:target="_blank"}.
3232
Once you've done so, you should see BigQuery in the "Resources" section of Cloud Platform.
33-
- **Note:** make sure [billing is enabled](https://support.google.com/cloud/answer/6293499#enable-billing) on your project, or Segment will not be able to write into the cluster.
33+
- **Note:** make sure [billing is enabled](https://support.google.com/cloud/answer/6293499#enable-billing){:target="_blank"} on your project, or Segment will not be able to write into the cluster.
3434
3. Copy the project ID. You will need it when you create a warehouse source in the Segment app.
3535

3636
### Create a Service Account for Segment
3737

38-
Refer to [Google Cloud's documentation about service accounts](https://cloud.google.com/iam/docs/creating-managing-service-accounts)
39-
for more information.
40-
41-
1. From the Navigation panel on the left, go to **IAM & admin** > **Service accounts**
42-
2. Click **Create Service Account** along the top
43-
3. Enter a name for the service account (for example: "segment-warehouses") and click **Create**
38+
1. From the Navigation panel on the left, select **IAM & admin** > **Service accounts**.
39+
2. Click **Create Service Account** along the top.
40+
3. Enter a name for the service account (for example: "segment-warehouses") and click **Create**.
4441
4. When assigning permissions, make sure to grant the following roles:
4542
- `BigQuery Data Owner`
4643
- `BigQuery Job User`
47-
5. [Create a JSON key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys).
48-
The downloaded file will be used to create your warehouse in the next section.
44+
5. [Create a JSON key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys){:target="_blank"}.
45+
The downloaded file will be used to create your warehouse in the Segment app.
46+
47+
Refer to [Google Cloud's documentation about service accounts](https://cloud.google.com/iam/docs/creating-managing-service-accounts){:target="_blank"} for more information.
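Before pasting the downloaded JSON key into Segment, it can help to sanity-check it locally. The sketch below is not part of the Segment or Google tooling; `check_service_account_key` is a hypothetical helper, and it assumes only the standard fields that Google Cloud service account key files carry (`type`, `project_id`, `private_key`, `client_email`).

```python
import json

# Fields that a Google Cloud service account JSON key is expected to carry.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text, expected_project_id):
    """Return a list of problems found in the key text (empty if none).

    Hypothetical helper for illustration; it only inspects the JSON
    locally and never contacts Google Cloud or Segment.
    """
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key
    ]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    if key.get("project_id") != expected_project_id:
        problems.append("project_id does not match the project you configured")
    return problems
```

An empty list means the key parses and belongs to the project ID you copied earlier; it does not confirm that the account holds the `BigQuery Data Owner` and `BigQuery Job User` roles.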
4948

5049
### Create the Warehouse in Segment
5150

52-
1. In Segment, go to **Workspace** > **Add destination** > Search for "BigQuery"
53-
2. Select **BigQuery**
54-
3. Add a name for the destination to the **Name your destination** field
55-
4. Enter your project ID in the **Project** field
56-
5. Copy the contents of the credentials (the JSON key) into the **Credentials** field <br/>
57-
**Optional:** Enter a [region code](https://cloud.google.com/compute/docs/regions-zones/) in the **Location** field (the default will be "US")
58-
6. Click **Connect**
59-
7. If Segment can connect with the provided **Project ID** and **Credentials**, a warehouse will be created and your first sync should begin shortly
51+
1. In Segment, go to **Workspace** > **Add Destination**, and search for "BigQuery".
52+
2. Click **BigQuery**.
53+
3. Select the source(s) you'd like to sync with the BigQuery destination, and click **Next**.
54+
4. Add a name for the destination to the **Name your destination** field.
55+
5. Enter your project ID in the **Project ID** field.
56+
**Optional:** Enter a [region code](https://cloud.google.com/compute/docs/regions-zones/){:target="_blank"} in the **Location** field (the default is "US").
57+
6. Copy the contents of the JSON key into the **Credentials** field.
58+
7. Click **Connect**.
59+
8. If Segment can connect with the provided **Project ID** and **Credentials**, a warehouse will be created and your first sync should begin shortly.
6060

61-
### Schema
61+
## Schema
6262

6363
BigQuery datasets are broken down into **tables** and **views**. **Tables**
6464
contain duplicate data; **views** do _not_.
6565

66-
#### Partitioned Tables
66+
### Partitioned Tables
6767

6868
The Segment connector takes advantage of [partitioned
69-
tables](https://cloud.google.com/bigquery/docs/partitioned-tables). Partitioned
69+
tables](https://cloud.google.com/bigquery/docs/partitioned-tables){:target="_blank"}. Partitioned
7070
tables allow you to query a subset of data, thus increasing query performance
7171
and decreasing costs.
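The example query in this file addresses a single day's partition by appending a `$YYYYMMDD` suffix to the table path (for example, `$20160809`). A minimal sketch of building that suffix from a date follows; the function names are hypothetical, and the project, source, and collection values are placeholders you would supply yourself.

```python
from datetime import date


def partition_decorator(table, day):
    """Append a $YYYYMMDD partition suffix to a table path."""
    return f"{table}${day:%Y%m%d}"


def partition_query(project_id, source_name, collection_name, day):
    """Build a query string in the same shape as the docs' example.

    Illustrative only: this formats text and does not run anything
    against BigQuery.
    """
    table = f"{project_id}.{source_name}.{collection_name}"
    return f"select *\nfrom {partition_decorator(table, day)}"
```

For instance, `partition_decorator("my-project.my_source.pages", date(2016, 8, 9))` yields a path ending in `$20160809`.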
7272

@@ -85,11 +85,11 @@ select *
8585
from <project-id>.<source-name>.<collection-name>$20160809
8686
```
8787

88-
#### Views
88+
### Views
8989

90-
A [view](https://cloud.google.com/bigquery/querying-data#views) is a virtual
90+
A [view](https://cloud.google.com/bigquery/querying-data#views){:target="_blank"} is a virtual
9191
table defined by a SQL query. Segment uses views in the de-duplication process to
92-
ensure that events that you are querying unique events, and the latest objects
92+
ensure that the events you are querying are unique and that you see the latest objects
9393
from third-party data. All Segment views are set up to show information from the last
9494
60 days. Whenever possible, query from these views.
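To make the difference between tables and views concrete, here is a small in-memory sketch of de-duplication: keep one row per identifier, preferring the most recently loaded one. It is not Segment's view definition, which this diff does not show; the `id` and `loaded_at` field names are assumptions chosen for illustration.

```python
def deduplicate(rows, key="id", order_by="loaded_at"):
    """Keep one row per key value, preferring the largest order_by value.

    Illustrative stand-in for what a de-duplicating view achieves;
    the field names are assumptions, not Segment's documented schema.
    """
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())
```

Querying the raw table corresponds to reading `rows` directly, duplicates included, while querying the view corresponds to reading the output of `deduplicate(rows)`.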
9595

@@ -125,13 +125,13 @@ You can remove access to the shared Service Account
125125

126126
1. Create a [new Service Account for Segment](#create-a-service-account-for-segment) using the linked instructions.
127127
2. Verify that the data is loading into your warehouse.
128-
3. Sign in to the [Google Developers Console](https://console.developers.google.com).
128+
3. Sign in to the [Google Developers Console](https://console.developers.google.com){:target="_blank"}.
129129
4. Open the IAM & Admin product, and select **IAM**.
130130
5. From the list of projects, select the project that has BigQuery enabled.
131131
6. On the project's page, select the **Permissions** tab, and then click **view by PRINCIPALS**.
132132
7. Select the checkbox for the `[email protected]` account and then click **Remove** to remove access to this shared Service Account.
133133

134-
For more information about managing IAM access, see Google's documentation, [Manage access to projects, folders, and organization](https://cloud.google.com/iam/docs/granting-changing-revoking-access).
134+
For more information about managing IAM access, see Google's documentation, [Manage access to projects, folders, and organization](https://cloud.google.com/iam/docs/granting-changing-revoking-access){:target="_blank"}.
135135

136136

137137
## Best Practices
@@ -150,7 +150,7 @@ views are not cached.
150150
> referenced directly or indirectly by the top-level query.
151151
152152
To save more money, you can query the view and set a [destination
153-
table](https://cloud.google.com/bigquery/docs/tables), and then query the
153+
table](https://cloud.google.com/bigquery/docs/tables){:target="_blank"}, and then query the
154154
destination table.
155155

156156
### Query structure
@@ -191,13 +191,13 @@ WHERE ROW_NUMBER = 1
191191

192192
BigQuery offers either a scalable, pay-as-you-go pricing plan based on the amount
193193
of data scanned, or a flat-rate monthly cost. You can learn more about BigQuery
194-
pricing [here](https://cloud.google.com/bigquery/pricing).
194+
pricing [here](https://cloud.google.com/bigquery/pricing){:target="_blank"}.
195195

196196
BigQuery allows you to set up [Cost Controls and
197-
Alerts](https://cloud.google.com/bigquery/cost-controls) to help control and
197+
Alerts](https://cloud.google.com/bigquery/cost-controls){:target="_blank"} to help control and
198198
monitor costs. If you want to learn more about what BigQuery will cost you,
199199
Google provides [this
200-
calculator](https://cloud.google.com/products/calculator/) to estimate your
200+
calculator](https://cloud.google.com/products/calculator/){:target="_blank"} to estimate your
201201
costs.
202202

203203
### How do I query my data in BigQuery?
@@ -212,7 +212,7 @@ functions.
212212
### Does Segment support streaming inserts?
213213

214214
Segment's connector does not support streaming inserts at this time. If you have
215-
a need for streaming data into BigQuery, [contact Segment support](https://segment.com/requests/integrations/).
215+
a need for streaming data into BigQuery, [contact Segment support](https://segment.com/requests/integrations/){:target="_blank"}.
216216

217217
### Can I customize my sync schedule?
218218

@@ -224,5 +224,4 @@ a need for streaming data into BigQuery, [contact Segment support](https://segme
224224

225225
### I'm seeing duplicates in my tables.
226226

227-
This behavior is expected. Segment only de-duplicates data in your views. See the
228-
section on [views](#views) for more details.
227+
This behavior is expected. Segment only de-duplicates data in your views. See the [schema section](#schema) for more details.
