|
| 1 | +--- |
| 2 | +title: Applying Data Masking with GitHub Actions - Part 1 |
| 3 | +author: Ningjing |
| 4 | +updated_at: 2024/11/19 18:00 |
| 5 | +tags: Tutorial |
| 6 | +integrations: General, API |
| 7 | +level: Advanced |
| 8 | +estimated_time: '30 mins' |
| 9 | +description: 'Learn how to automate database masking policies using GitHub Actions and Bytebase API' |
| 10 | +--- |
| 11 | +Bytebase is a database DevSecOps platform designed for developers, security, DBA, and platform engineering teams. While it offers an intuitive GUI for managing database schema changes and access control, some teams may want to integrate Bytebase into their existing DevOps platforms using the [Bytebase API](/docs/api/overview/). |
| 12 | + |
| 13 | +Bytebase provides a [dynamic data masking feature](/docs/security/data-masking/overview/) in the **Enterprise Plan** that masks sensitive data in SQL Editor query results based on the context. It helps organizations protect sensitive data from being exposed to unauthorized users. |
| 14 | + |
| 15 | +By using GitHub Actions with the Bytebase API, you can implement policy-as-code to apply database masking policies when a pull request is merged. This tutorial will guide you through the process. |
| 16 | + |
| 17 | +--- |
| 18 | + |
| 19 | +This is Part 1 of our tutorial series on implementing automated database masking using GitHub Actions: |
| 20 | + |
| 21 | +- Part 1: Applying Data Masking with GitHub Actions (this one) |
| 22 | +- Part 2: Customizing Data Masking Algorithm with GitHub Actions |
| 23 | +- Part 3: Data Classification and Global Masking with GitHub Actions |
| 24 | + |
| 25 | +## Overview |
| 26 | + |
| 27 | +In this tutorial, you'll learn how to automate database masking policies using GitHub Actions and the Bytebase API. This integration allows you to: |
| 28 | + |
| 29 | +- Manage data masking rules as code |
| 30 | +- Automatically apply masking policies when PRs are merged |
| 31 | + |
| 32 | +Here is [a merged pull request](https://github.com/bytebase/database-security-github-actions-example/pull/5) as an example. |
| 33 | + |
| 34 | +<HintBlock type="info"> |
| 35 | + |
| 36 | +The complete code for this tutorial is available at: [database-security-github-actions-example](https://github.com/bytebase/database-security-github-actions-example) |
| 37 | + |
| 38 | +</HintBlock> |
| 39 | + |
| 40 | +## Prerequisites |
| 41 | + |
| 42 | +Before you begin, make sure you have: |
| 43 | +- [Docker](https://www.docker.com/) installed |
| 44 | +- A [GitHub](https://github.com/) account |
| 45 | +- An [ngrok](http://ngrok.com/) account |
| 46 | +- A Bytebase Enterprise Plan subscription |
| 47 | + |
| 48 | +## Setup Instructions |
| 49 | + |
| 50 | +### Step 1 - Start Bytebase in Docker and set the External URL generated by ngrok |
| 51 | + |
| 52 | +<IncludeBlock url="/docs/get-started/install/vcs-with-ngrok"></IncludeBlock> |
| 53 | + |
| 54 | +### Step 2 - Create Service Account |
| 55 | + |
| 56 | +<IncludeBlock url="/docs/share/tutorials/create-service-account"></IncludeBlock> |
| 57 | + |
| 58 | +### Step 3 - Prepare Test Data |
| 59 | + |
| 60 | +1. By default, Bytebase provides a project `Sample Project` with two databases, `hr_test` and `hr_prod`. |
| 61 | +1. Click **IAM & Admin > Users & Groups** on the left sidebar. Add users `[email protected]`, `[email protected]` and `[email protected]` with no roles. |
| 62 | +1. Add a group `[email protected]` with `[email protected]` as a member. |
| 63 | +1. Go to the project `Sample Project` and click **Manage > Members** on the left sidebar. |
| 64 | +1. Click **Grant Access**, then grant users `[email protected]` and `[email protected]` the `Developer` role and group `[email protected]` the `Querier` role. |
| 65 | + |
| 66 | +### Step 4 - Configure GitHub Actions |
| 67 | + |
| 68 | +1. Go to [Database Security GitHub Actions Example](https://github.com/bytebase/database-security-github-actions-example) and clone it. |
| 69 | + |
| 70 | +1. Click **Settings** and then click **Secrets and variables > Actions**. Add the following secrets: |
| 71 | + |
| 72 | + - `BYTEBASE_URL`: the ngrok external URL |
| 73 | + - `BYTEBASE_SERVICE_KEY`: `[email protected]` |
| 74 | + - `BYTEBASE_SERVICE_SECRET`: the service key copied in Step 2 |
| 75 | + |
| 76 | +## Understanding the Workflow |
| 77 | + |
| 78 | +Let's dig into the GitHub Actions workflow [code](https://github.com/bytebase/database-security-github-actions-example/blob/main/.github/workflows/bb-masking-1.yml): |
| 79 | + |
| 80 | +1. **Trigger**: The workflow runs when a PR is merged to `main`. |
| 81 | + |
| 82 | +1. **Authentication**: The step `Login Bytebase` logs in to Bytebase using the [bytebase-login](https://github.com/marketplace/actions/bytebase-login) action. The values you configured under GitHub **Secrets and variables** are mapped to the variables in the action. |
| 83 | + |
| 84 | +1. **File Detection**: The step `Get changed files` collects the files changed in the pull request. This workflow only cares about column masking and masking exceptions, so only files matching `masking/databases/**/**/column-masking.json` and `masking/projects/**/masking-exception.json` are selected. |
| 85 | + |
| 86 | +1. **Apply Masking Columns**: The step `Apply column masking` applies the column masking to the database. It first parses all the changed column masking files, then loops over them and applies each one to its database. It calls the Bytebase API as follows: |
| 87 | + |
| 88 | + ```bash |
| 89 | + response=$(curl -s -w "\n%{http_code}" --request PATCH "${BYTEBASE_API_URL}/instances/${INSTANCE_NAME}/databases/${DATABASE_NAME}/policies/masking?allow_missing=true&update_mask=payload" \ |
| 90 | + --header "Authorization: Bearer ${BYTEBASE_TOKEN}" \ |
| 91 | + --header "Content-Type: application/json" \ |
| 92 | + --data @"$CHANGED_FILE") |
| 93 | + ``` |
| 94 | + |
| 95 | +1. **Apply Masking Exceptions**: The step `Apply masking exception` applies the masking exceptions to the project in a similar way. It calls the Bytebase API as follows: |
| 96 | + |
| 97 | + ```bash |
| 98 | + response=$(curl -s -w "\n%{http_code}" --request PATCH "${BYTEBASE_API_URL}/projects/${PROJECT_NAME}/policies/masking_exception?allow_missing=true&update_mask=payload" \ |
| 99 | + --header "Authorization: Bearer ${BYTEBASE_TOKEN}" \ |
| 100 | + --header "Content-Type: application/json" \ |
| 101 | + --data @"$CHANGED_FILE") |
| 102 | + ``` |
| 103 | + |
| 104 | +1. **PR Feedback**: The step `Comment on PR` comments on the merged pull request to report the result. |
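
The file detection and the per-file loop described above rely on turning a changed file path into the resource names used in the API URL. The following is a minimal sketch of that idea, not the workflow's actual code. The function name `classify_masking_file` is hypothetical, and it assumes the path layout is `masking/databases/<instance>/<database>/column-masking.json` and `masking/projects/<project>/masking-exception.json`.

```bash
# Hypothetical helper: classify a changed file and derive resource names from its path.
# Assumed layout (not confirmed by the tutorial text):
#   masking/databases/<instance>/<database>/column-masking.json
#   masking/projects/<project>/masking-exception.json
# Note: `*` in a case pattern also matches slashes, so deeper paths match as well.
classify_masking_file() {
  case "$1" in
    masking/databases/*/*/column-masking.json)
      rest=${1#masking/databases/}   # strip the fixed prefix
      instance=${rest%%/*}           # first remaining segment
      rest=${rest#*/}
      database=${rest%%/*}           # second remaining segment
      echo "column-masking $instance $database" ;;
    masking/projects/*/masking-exception.json)
      rest=${1#masking/projects/}
      echo "masking-exception ${rest%%/*}" ;;
    *)
      echo "ignored" ;;
  esac
}
```

A loop over the changed files could call this helper on each path and use the printed names to fill in values such as `INSTANCE_NAME`, `DATABASE_NAME`, or `PROJECT_NAME` before issuing the request.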
| 105 | + |
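
Both `curl` calls above use `-w "\n%{http_code}"`, which appends the HTTP status code as the last line of the captured output. The sketch below shows one way that combined `response` value could be split and checked. The helper names `split_response` and `check_response` are hypothetical and are not taken from the example repository.

```bash
# Hypothetical helpers: split the combined curl output into body and status code.
# Assumes the status code is the last line, as produced by -w "\n%{http_code}".
split_response() {
  http_code=$(printf '%s\n' "$1" | tail -n 1)   # last line is the status code
  body=$(printf '%s\n' "$1" | sed '$d')         # everything before the last line
}

# Print "ok" for any 2xx status; otherwise print the error and return non-zero.
check_response() {
  split_response "$1"
  case "$http_code" in
    2??) echo "ok" ;;
    *)   echo "failed with HTTP $http_code: $body"; return 1 ;;
  esac
}
```

In a workflow step, `check_response "$response"` could run right after each `curl` call so that a non-2xx status fails the step and the message can be reused in the PR comment.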
| 106 | +## Verifying the Setup |
| 107 | + |
| 108 | +1. Create and merge a test PR with masking changes. |
| 109 | + |
| 110 | +1. Log in to the Bytebase console. At the workspace level, click **Data Access > Data Masking**, then click **Explicit Masked Columns**. You can see that the column masking is applied to the database. |
| 111 | + |
| 112 | +  |
| 113 | + |
| 114 | +1. Go to the project `Sample Project` and click **Database > Masking Access**. You can see that the masking exception is applied. |
| 115 | + |
| 116 | +  |
| 117 | + |
| 118 | +## Next Steps |
| 119 | + |
| 120 | +You have now applied data masking policies using GitHub Actions and the Bytebase API. In the next part of this tutorial, you'll learn how to customize the masking algorithm. Stay tuned! |