Commit 1df5bc6

Restructuring test environments strategy
1 parent 0b2674d commit 1df5bc6

21 files changed: +917 −892 lines

.gitignore

Lines changed: 1 addition & 3 deletions
@@ -134,9 +134,7 @@ python/
 
 # SAM
 .aws-sam
-testing/*parameters-*.properties
-testing/*requirements*.txt
-testing/coverage/*
+coverage/*
 building/*requirements*.txt
 building/arrow
 building/lambda/arrow

CONTRIBUTING.md

Lines changed: 99 additions & 17 deletions
@@ -58,52 +58,134 @@ See the [LICENSE](https://github.com/awslabs/aws-data-wrangler/blob/master/LICEN
 
 We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
 
-## Environment
+## Environments
 
-* AWS Data Wrangler practically only makes integrations with Databases and AWS APIs. So we prefer to dedicate our energy / time writing integration tests instead of unit tests. We really like an end-to-end approach for all features.
+We have hundreds of test functions that run against several AWS services. You don't need to test everything to open a Pull Request.
+You can choose from three different test environments, based on what makes sense for your case:
 
-* All integration tests are between the development environment and a remote and real AWS service.
+* [Mocked test environment](#mocked-test-environment)
+  * Based on [moto](https://github.com/spulec/moto).
+  * Does not require real AWS resources.
+  * Fastest approach.
+  * Basically limited to Amazon S3 tests.
 
-* We have a Cloudformation to set up the AWS end (testing/cloudformation.yaml).
+* [Data Lake test environment](#data-lake-test-environment)
+  * Requires some AWS services:
+    * Amazon S3, Amazon Athena, AWS Glue Catalog, AWS KMS.
+  * Enables real tests on typical Data Lake cases.
+
+* [Full test environment](#full-test-environment)
+  * Requires a bunch of real AWS services:
+    * Amazon S3, Amazon Athena, AWS Glue Catalog, AWS KMS, Amazon Redshift, Aurora PostgreSQL, Aurora MySQL, etc.
+  * Enables real tests on all use cases.
 
 ## Step-by-step
 
-**DISCLAIMER**: Make sure to know what you are doing. This steps will charge some services on your AWS account and requires a minimum security skill to keep your environment safe.
+### Mocked test environment
 
 * Pick a Linux or MacOS environment.
-
 * Install Python 3.6, 3.7 or 3.8.
+* Fork the AWS Data Wrangler repository and clone it into your development environment.
+* Go to the project's directory and create a Python virtual environment for the project:
+
+`python -m venv .venv && source .venv/bin/activate`
+
+* Then run the command below to install all dependencies:
+
+``./requirements.sh``
+
+* Run the validation script:
+
+``./validation.sh``
 
+* To run a specific test function:
+
+``pytest tests/test_moto.py::test_get_bucket_region_succeed``
+
+* To run all mocked test functions (using 8 parallel processes):
+
+``pytest -n 8 tests/test_moto.py``
+
+### Data Lake test environment
+
+**DISCLAIMER**: Make sure you know what you are doing. These steps will incur charges on your AWS account and require minimum security skills to keep your environment safe.
+
+* Pick a Linux or MacOS environment.
+* Install Python 3.6, 3.7 or 3.8.
 * Fork the AWS Data Wrangler repository and clone it into your development environment.
+* Go to the project's directory and create a Python virtual environment for the project:
+
+`python -m venv .venv && source .venv/bin/activate`
 
+* Then run the command below to install all dependencies:
+
+``./requirements.sh``
+
+* Go to the ``cloudformation`` directory:
+
+``cd cloudformation``
+
+* Deploy the CloudFormation template `base.yaml`:
+
+``./deploy-base.sh``
+
+* Return to the project root directory:
+
+``cd ..``
+
+* Run the validation script:
+
+``./validation.sh``
+
+* To run a specific test function:
+
+``pytest tests/test_s3_athena.py::test_to_parquet_modes``
+
+* To run all data lake test functions (using 8 parallel processes):
+
+``pytest -n 8 tests/test_s3_athena.py``
+
+### Full test environment
+
+**DISCLAIMER**: Make sure you know what you are doing. These steps will incur charges on your AWS account and require minimum security skills to keep your environment safe.
+
+* Pick a Linux or MacOS environment.
+* Install Python 3.6, 3.7 and 3.8.
+* Fork the AWS Data Wrangler repository and clone it into your development environment.
 * Go to the project's directory and create a Python virtual environment for the project:
 
 `python -m venv .venv && source .venv/bin/activate`
 
 * Then run the command below to install all dependencies:
 
-`./requirements.sh`
+``./requirements.sh``
 
-* Go to the ``testing`` directory
+* Go to the ``cloudformation`` directory:
 
-`cd testing`
+``cd cloudformation``
 
-* Deploy the Cloudformation stack
+* Deploy the CloudFormation templates `base.yaml` and `databases.yaml`:
 
-``./cloudformation.sh``
+``./deploy-base.sh``
+``./deploy-databases.sh``
 
 * Go to the `EC2 -> SecurityGroups` console, open the `aws-data-wrangler-*` security group and configure it to accept connections from your IP on any TCP port.
 
 ``P.S. Make sure that your security group is not open to the world! Configure your security group to give access only to your IP.``
 
-* To run the validations:
+* Return to the project root directory:
+
+``cd ..``
+
+* Run the validation script:
+
+``./validation.sh``
 
-``./validations.sh``
+* To run a specific test function:
 
-* To run the complete test:
+``pytest tests/test_s3_athena.py::test_to_parquet_modes``
 
-``./tests.sh``
+* To run all data lake test functions for all Python versions:
 
-* To run a specific test:
+``./test.sh``
 
-``pytest test_awswrangler/test_data_lake::test_athena_nested``
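For contributors who haven't used moto before, here is a minimal sketch of the shape of a mocked S3 test like the ones the new `tests/test_moto.py` module targets. The bucket name, fixture name, and `us-east-1` region are illustrative assumptions; the fixtures in the repository itself are the authoritative reference.

```python
import boto3
import moto
import pytest

import awswrangler as wr


@pytest.fixture()
def moto_s3():
    # Start an in-memory S3 backend; no real AWS resources are created or billed.
    with moto.mock_s3():
        s3 = boto3.resource("s3", region_name="us-east-1")
        s3.create_bucket(Bucket="bucket")  # hypothetical bucket name
        yield s3


def test_get_bucket_region_succeed(moto_s3):
    # Buckets created without a LocationConstraint report us-east-1.
    session = boto3.Session(region_name="us-east-1")
    region = wr.s3.get_bucket_region(bucket="bucket", boto3_session=session)
    assert region == "us-east-1"
```

Note that the `-n 8` flag in the commands above comes from the `pytest-xdist` plugin, which is what parallelizes the test run across 8 worker processes.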

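For the security-group step in the full environment, the snippet below is one way to script the lockdown instead of clicking through the console. It is only a hedged sketch: the `aws-data-wrangler-*` group-name filter comes from the step above, while the full TCP port range and the checkip lookup are assumptions, not part of the project's tooling.

```python
import urllib.request

import boto3

# Look up this machine's public IP (assumed approach, not project tooling).
my_ip = urllib.request.urlopen("https://checkip.amazonaws.com").read().decode().strip()

ec2 = boto3.client("ec2")

# Find the security groups created by the CloudFormation stacks.
groups = ec2.describe_security_groups(
    Filters=[{"Name": "group-name", "Values": ["aws-data-wrangler-*"]}]
)["SecurityGroups"]

for sg in groups:
    # Allow any TCP port, but only from this machine's /32, never 0.0.0.0/0.
    # Raises a Duplicate error if an identical rule already exists.
    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[
            {
                "IpProtocol": "tcp",
                "FromPort": 0,
                "ToPort": 65535,
                "IpRanges": [{"CidrIp": f"{my_ip}/32", "Description": "my dev machine"}],
            }
        ],
    )
```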