
Commit 534330c

Add Contributing section in the docs
1 parent 17635e7 commit 534330c

3 files changed: +83, -9 lines


README.md

Lines changed: 45 additions & 9 deletions
@@ -8,18 +8,18 @@
 
 ---
 
-*Contents:* **[Use Cases](#Use-Cases)** | **[Installation](#Installation)** | **[Examples](#Examples)** | **[Diving Deep](#Diving-Deep)**
+*Contents:* **[Use Cases](#Use-Cases)** | **[Installation](#Installation)** | **[Examples](#Examples)** | **[Diving Deep](#Diving-Deep)** | **[Contributing](#Contributing)**
 
 ---
 
 ## Use Cases
 
 ### Pandas
-* Pandas -> Parquet (S3) (Parallel :rocket:)
-* Pandas -> CSV (S3) (Parallel :rocket:)
+* Pandas -> Parquet (S3) (Parallel)
+* Pandas -> CSV (S3) (Parallel)
 * Pandas -> Glue Catalog
-* Pandas -> Athena (Parallel :rocket:)
-* Pandas -> Redshift (Parallel :rocket:)
+* Pandas -> Athena (Parallel)
+* Pandas -> Redshift (Parallel)
 * CSV (S3) -> Pandas (One shot or Batching)
 * Athena -> Pandas (One shot or Batching)
 * CloudWatch Logs Insights -> Pandas (NEW :star:)
@@ -29,10 +29,10 @@
 * PySpark -> Redshift (Parallel :rocket:) (NEW :star:)
 
 ### General
-* List S3 objects (Parallel :rocket:)
-* Delete S3 objects (Parallel :rocket:)
-* Delete listed S3 objects (Parallel :rocket:)
-* Delete NOT listed S3 objects (Parallel :rocket:)
+* List S3 objects (Parallel)
+* Delete S3 objects (Parallel)
+* Delete listed S3 objects (Parallel)
+* Delete NOT listed S3 objects (Parallel)
 * Copy listed S3 objects (Parallel :rocket:)
 * Get the size of S3 objects (Parallel :rocket:)
 * Get CloudWatch Logs Insights query results (NEW :star:)
@@ -194,3 +194,39 @@ results = session.cloudwatchlogs.query(
 ### Spark to Redshift Flow
 
 ![Spark to Redshift Flow](docs/source/_static/spark-to-redshift-flow.jpg?raw=true "Spark to Redshift Flow")
+
+## Contributing
+
+* AWS Data Wrangler is essentially a set of integrations, so we prefer to spend our time and energy writing integration tests rather than unit tests. We favor an end-to-end approach for every feature.
+
+* All integration tests run between a local Docker container and a remote/real AWS service.
+
+* We provide a Docker recipe to set up the local end (testing/Dockerfile).
+
+* We provide a CloudFormation template to set up the AWS end (testing/template.yaml).
+
+### Step-by-step
+
+**DISCLAIMER**: Make sure you know what you are doing. These steps will incur charges for some services on your AWS account and require basic security skills to keep your environment safe.
+
+* Use a Linux or macOS machine.
+
+* Install Python 3.6+.
+
+* Install Docker and give it at least 4 cores and 8 GB of memory.
+
+* Fork the AWS Data Wrangler repository and clone it into your development environment.
+
+* Go to the project directory and create a Python virtual environment for the project: **python -m venv venv && source venv/bin/activate**
+
+* Run **./install-dev.sh**
+
+* Go to the *testing* directory.
+
+* Configure the parameters.json file with your AWS environment details (make sure your Redshift cluster is not open to the world!).
+
+* Deploy the CloudFormation stack: **./deploy-cloudformation.sh**
+
+* Open the Docker image: **./open-image.sh**
+
+* Inside the image you can finally run **./run-tests.sh** (see the condensed shell recap below).
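For convenience, the step-by-step above boils down to roughly the following shell session. This is only a sketch, not part of the commit: the fork URL and local directory name are placeholders, and parameters.json must be edited by hand before deploying.

```bash
# Condensed recap of the contributing steps (sketch only; URLs and paths are placeholders)
git clone https://github.com/<your-user>/aws-data-wrangler.git   # your fork
cd aws-data-wrangler

# Create and activate a virtual environment, then install the dev dependencies
python -m venv venv
source venv/bin/activate
./install-dev.sh

# Provision the AWS side and run the integration tests
cd testing
# ... edit parameters.json with your AWS environment details first ...
./deploy-cloudformation.sh
./open-image.sh      # opens the local Docker container used for testing
# inside the container:
./run-tests.sh
```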

docs/source/contributing.rst

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+.. _doc_contributing:
+
+Contributing
+============
+
+* AWS Data Wrangler is essentially a set of integrations, so we prefer to spend our time and energy writing integration tests rather than unit tests. We favor an end-to-end approach for every feature.
+
+* All integration tests run between a local Docker container and a remote/real AWS service.
+
+* We provide a Docker recipe to set up the local end (testing/Dockerfile).
+
+* We provide a CloudFormation template to set up the AWS end (testing/template.yaml).
+
+Step-by-step
+------------
+
+**DISCLAIMER**: Make sure you know what you are doing. These steps will incur charges for some services on your AWS account and require basic security skills to keep your environment safe.
+
+* Use a Linux or macOS machine.
+
+* Install Python 3.6+.
+
+* Install Docker and give it at least 4 cores and 8 GB of memory.
+
+* Fork the AWS Data Wrangler repository and clone it into your development environment.
+
+* Go to the project directory and create a Python virtual environment for the project: **python -m venv venv && source venv/bin/activate**
+
+* Run **./install-dev.sh**
+
+* Go to the *testing* directory.
+
+* Configure the parameters.json file with your AWS environment details (make sure your Redshift cluster is not open to the world!); see the illustrative sketch after this diff.
+
+* Deploy the CloudFormation stack: **./deploy-cloudformation.sh**
+
+* Open the Docker image: **./open-image.sh**
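The exact keys expected in parameters.json are defined by the project's testing setup (testing/template.yaml and the checked-in parameters.json), not by this commit. The sketch below is purely hypothetical: every key name and value is made up to illustrate the kind of environment-specific details the configuration step refers to.

```bash
# Hypothetical illustration only -- consult testing/parameters.json in the repository
# for the real key names. Never commit real credentials.
cat > parameters.json <<'EOF'
{
  "VpcId": "vpc-0123456789abcdef0",
  "SubnetId": "subnet-0123456789abcdef0",
  "Password": "a-strong-password-for-the-test-redshift-cluster"
}
EOF
```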

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -47,5 +47,6 @@ Table Of Contents
 installation
 examples
 divingdeep
+contributing
 api/modules
 license
