feat(gcloud): Enhance CUJ framework and add advanced use cases
This commit builds upon the foundational CUJ framework by ingesting
battle-tested logic from numerous sources and implementing the initial set
of comprehensive, production-like Critical User Journeys.
The framework is now enhanced with a powerful, modular library and the
first advanced CUJs, making it a robust tool for end-to-end testing.
Key Enhancements:
* **Modular Library (`lib/`)**:
The monolithic `common.sh` is refactored into a modular library
with components organized by function (`_core.sh`, `_network.sh`,
`_dataproc.sh`, `_database.sh`, `_security.sh`). This incorporates
advanced, parameterized, and idempotent functions for managing a
wide range of GCP resources.
* **Advanced Onboarding (`onboarding/`)**:
New scripts are added to provision persistent, shared infrastructure,
including a High-Availability Cloud SQL instance with VPC Peering and
a dual-NIC Squid Proxy VM, following GCP best practices.
* **New Critical User Journeys (`cuj/`)**:
* `gce/standard`: This CUJ is enhanced to provision a full,
NAT-based network environment.
* `gce/proxy-egress`: A new CUJ is added to test Dataproc
clusters that use a proxy for all outbound traffic.
* `gke/standard`: A new CUJ is added for the standard Dataproc
on GKE use case.
* **Enhanced CI/CD (`ci/`)**:
`pristine_check.sh` is upgraded to use a robust, tag-based cleanup
strategy, making it scalable to any number of CUJs without
modification.
* **Finalized Configuration (`env.json`)**:
The `env.json.sample` file is finalized with a simplified structure
that defines the shared test environment and a `cuj_set` for test
orchestration, abstracting implementation details from the user.
* **Comprehensive Documentation (`README.md`)**:
The README is updated to be a complete guide for the new framework,
explaining its philosophy and providing a clear "Getting Started"
workflow for new users.
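To make the modular-library bullet concrete, here is a minimal sketch of how a CUJ script might pull in the library components. The `lib/` file names come from this commit; the loader loop itself is a hypothetical illustration, not code from the framework:

```shell
#!/usr/bin/env bash
# Hypothetical loader: source each module of the modular library.
# File names match the commit; the loop itself is illustrative only.
set -euo pipefail

# Allow callers to point at an alternate library directory.
LIB_DIR="${LIB_DIR:-$(dirname "${BASH_SOURCE[0]:-$0}")/lib}"

for module in _core.sh _network.sh _dataproc.sh _database.sh _security.sh; do
  if [[ -f "${LIB_DIR}/${module}" ]]; then
    # shellcheck source=/dev/null
    source "${LIB_DIR}/${module}"
  fi
done
```

A real `cuj/.../manage.sh` would then call the network, Dataproc, or database helpers that these modules define.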
@@ -16,200 +16,91 @@ limitations under the License.
-->

-## Introduction
-
-This README file describes how to use this collection of gcloud bash examples to
-reproduce common Dataproc cluster creation problems relating to the GCE startup
-script, Dataproc startup script, and Dataproc initialization-actions scripts.
+
+This README covers the framework's philosophy, a step-by-step guide for new users, and an overview of the available CUJs.
+
+This directory contains a collection of scripts that form a test framework for exercising Critical User Journeys (CUJs) on Google Cloud Dataproc. The goal of this framework is to provide a robust, maintainable, and automated way to reproduce and validate the common and complex use cases that are essential for our customers.
-First, copy `env.json.sample` to `env.json` and modify the environment
-variable names and their values in `env.json` to match your
-environment:
+This framework replaces the previous monolithic scripts with a modular, scalable, and self-documenting structure designed for both interactive use and CI/CD automation.
+
+## Framework Overview
+
+The framework is organized into several key directories, each with a distinct purpose:
+
+* **`onboarding/`**: Contains idempotent scripts to set up persistent, shared infrastructure that multiple CUJs might depend on. These are typically run once per project. Examples include setting up a shared Cloud SQL instance or a Squid proxy VM.
+
+* **`cuj/`**: The heart of the framework. This directory contains the individual, self-contained CUJs, grouped by the Dataproc platform (`gce`, `gke`, `s8s`). Each CUJ represents a specific, testable customer scenario.
+
+* **`lib/`**: A collection of modular bash script libraries (`_core.sh`, `_network.sh`, `_database.sh`, etc.). These files contain all the powerful, reusable functions for creating and managing GCP resources, forming a shared API for all `onboarding` and `cuj` scripts.
+
+* **`ci/`**: Includes scripts specifically for CI/CD automation. The `pristine_check.sh` script is designed to enforce a clean project state before and after test runs, preventing bitrot and ensuring reproducibility.
+
+## Getting Started
+
+Follow these steps to configure your environment and run your first CUJ.
+
+### 1. Prerequisites
+
+Ensure you have the following tools installed and configured:
+* `gcloud` CLI (authenticated to your Google account)
+* `jq`
+* A Google Cloud project with billing enabled.
+
+### 2. Configure Your Environment
+
+Copy the sample configuration file and edit it to match your environment.
+
+```bash
+cp gcloud/env.json.sample gcloud/env.json
+vi gcloud/env.json
```
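Scripts in the suite can read these settings with `jq`, which is already listed as a prerequisite. A minimal sketch, assuming a flat JSON structure; the helper name `get_config` and the `ENV_JSON` override are illustrative, not part of the framework:

```shell
#!/usr/bin/env bash
# Illustrative helper: read a single key from env.json with jq.
# ENV_JSON and the helper name are assumptions for this sketch.
set -euo pipefail

ENV_JSON="${ENV_JSON:-gcloud/env.json}"

get_config() {
  local key="$1"
  # Emit the value, or an empty string if the key is absent or null.
  jq -r --arg k "$key" '.[$k] // empty' "${ENV_JSON}"
}
```

For example, `REGION="$(get_config REGION)"` would populate a variable from the file.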
-{
-  "PROJECT_ID":"ldap-example-yyyy-nn",
-  "ORG_NUMBER":"100000000001",
-  "DOMAIN": "your-domain-goes-here.com",
-  "BILLING_ACCOUNT":"100000-000000-000001",
-  "FOLDER_NUMBER":"100000000001",
-  "REGION":"us-west4",
-  "RANGE":"10.0.1.0/24",
-  "IDLE_TIMEOUT":"30m",
-  "ASN_NUMBER":"65531",
-  "IMAGE_VERSION":"2.2",
-  "BIGTABLE_INSTANCE":"my-bigtable"
-}
-```
-
-The values that you enter here will be used to build reasonable defaults in
-`lib/env.sh`; you can view and modify `lib/env.sh` to more finely tune your
-environment. The code in `lib/env.sh` is sourced and executed at the head of many
-scripts in this suite to ensure that the environment is tuned for use with this
-reproduction.
-
-#### Dataproc on GCE
-
-To tune the reproduction environment for your (customer's) GCE use case, review
-the `create_dpgce_cluster` function in the `lib/shared-functions.sh` file. This
-is where you can select which arguments are passed to the `gcloud dataproc
-clusters create ${CLUSTER_NAME}` command. Many examples of common use cases
-appear in the comments below the call to gcloud itself.
-
-## Creation phase
-
-When reviewing `lib/shared-functions.sh`, pay attention to the
-`--metadata startup-script="..."` and `--initialization-actions
-"${INIT_ACTIONS_ROOT}/<script-name>"` arguments. These can be used to
-execute arbitrary code during the creation of Dataproc clusters. Many
-Google Cloud Support cases relate to failures during either a)
-Dataproc's internal startup script, which runs after the `--metadata
-startup-script="..."`, or b) scripts passed using the
-actions](https://github.com/GoogleCloudDataproc/initialization-actions) in order
-of specification.
-
-The Dataproc startup script runs before the initialization actions, and logs its
-output to `/var/log/dataproc-startup-script.log`. It is linked to by
-`/usr/local/share/google/dataproc/startup-script.sh` on all Dataproc nodes. The
-tasks which the startup script runs are influenced by the following arguments.
-This is not an exhaustive list. If you are troubleshooting startup errors,
-determine whether any arguments or properties are being supplied to the
-`clusters create` command, especially any similar to the following.
+You only need to edit the universal and onboarding settings. The `load_config` function in the library will dynamically generate a `PROJECT_ID` if the default value is present.
+
+### 3. Run Onboarding Scripts
+
+Before running any CUJs, you must set up the shared infrastructure for your project. These scripts are idempotent and can be run multiple times safely.
+
+```bash
+# Set up the shared Cloud SQL instance with VPC Peering
-on the filesystem of each cluster node, where ${INDEX} is the script number,
-starting with 0, and incrementing for each additional script. The URL of the
-script can be found by querying the metadata server for
-`attributes/dataproc-initialization-action-script-${INDEX}`. From within the
-script itself, you can refer to `attributes/$0`.
-
-Logs for each initialization action script are created under /var/log
+Each `manage.sh` script supports several commands:
+* **`up`**: Creates all resources for the CUJ.
+* **`down`**: Deletes all resources created by this CUJ.
+* **`rebuild`**: Runs `down` and then `up` for a full cycle.
+* **`validate`**: Checks for prerequisites, such as required APIs or shared infrastructure.
+
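The command set above maps naturally onto a small `case` dispatcher. A sketch of the pattern with the resource logic stubbed out (this is not the framework's actual `manage.sh`):

```shell
#!/usr/bin/env bash
# Sketch of a manage.sh-style dispatcher; real resource logic is stubbed.
set -euo pipefail

up()       { echo "creating CUJ resources"; }
down()     { echo "deleting CUJ resources"; }
validate() { echo "checking prerequisites"; }

main() {
  case "${1:-}" in
    up|down|validate) "$1" ;;
    rebuild)          down; up ;;  # full down/up cycle
    *) echo "usage: $0 {up|down|rebuild|validate}" >&2; return 1 ;;
  esac
}

# Dispatch only when an argument was given, so sourcing is harmless.
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

Keeping `rebuild` defined as `down` followed by `up` guarantees the two paths cannot drift apart.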
+## Available CUJs
+
+This framework includes the following initial CUJs:
+
+* **`gce/standard`**: Creates a standard Dataproc on GCE cluster in a dedicated VPC with a Cloud NAT gateway for secure internet egress.
+* **`gce/proxy-egress`**: Creates a Dataproc on GCE cluster in a private network configured to use the shared Squid proxy for all outbound internet traffic.
+* **`gke/standard`**: Creates a standard Dataproc on GKE virtual cluster on a new GKE cluster.