Commit 5dbc716

readme documentation
1 parent 0764cb4 commit 5dbc716

1 file changed

+128
-2
lines changed

README.md

Lines changed: 128 additions & 2 deletions
@@ -1,7 +1,7 @@
# Rony - Data Engineering made simple

[![PyPI version fury.io](https://badge.fury.io/py/rony.svg)](https://pypi.python.org/pypi/rony/)
-![Rony](https://github.com/A3Data/rony/workflows/rony/badge.svg)
+![Test package](https://github.com/A3Data/rony/workflows/Test%20package/badge.svg)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![GitHub issues](https://img.shields.io/github/issues/a3data/rony.svg)](https://GitHub.com/a3data/rony/issues/)
[![GitHub issues-closed](https://img.shields.io/github/issues-closed/a3data/rony.svg)](https://GitHub.com/a3data/rony/issues?q=is%3Aissue+is%3Aclosed)
@@ -15,4 +15,130 @@ Developed with ❤️ by <a href="http://www.a3data.com.br/" target="_blank">A3D

## What is Rony

-Rony is an **open source** framework that helps Data Engineers setting up more organized code and build, test and deploy data pipelines faster.
+Rony is an **open source** framework that helps Data Engineers set up more organized code and build, test, and deploy data pipelines faster.

## Why Rony?

Rony is <a href="https://github.com/A3Data/hermione" target="_blank">Hermione</a>'s *best friend* (or so...).
That made it a perfect name for the second framework
released by A3Data, this one focusing on Data Engineering.

Over many years of helping companies build their data analytics projects and cloud infrastructure, we accumulated
a knowledge base that led to a collection of code snippets and automation procedures that speed things up
when developing data structures and data pipelines.

## Some choices we made

Rony builds on a few decisions that make sense for the majority of projects conducted by A3Data:

- <a href="https://www.terraform.io/intro/index.html" target="_blank">Terraform (>= 0.13)</a>
- <a href="https://www.docker.com/" target="_blank">Docker</a>
- <a href="https://airflow.apache.org/" target="_blank">Apache Airflow</a>
- <a href="https://aws.amazon.com/" target="_blank">AWS</a>

You are free to change these decisions as you wish (that's the whole point of the framework - **flexibility**).

# Installing

### Dependencies

- Python (>=3.6)

### Install

```bash
pip install -U rony
```

# How do I use Rony?

After installing Rony, you can check that the installation is OK by running:

```bash
rony info
```

and you should see a cute logo. Then,

1) Create a new project:

```bash
rony new project_rony
```

2) Rony already creates a virtual environment for the project.
Windows users can activate it with

```
<project_name>_env\Scripts\activate
```

Linux and macOS users can do

```bash
source <project_name>_env/bin/activate
```

3) After activating, you should install some libraries. There are a few suggestions in the `requirements.txt` file:

```bash
pip install -r requirements.txt
```

4) Rony also has some handy CLI commands to build and run Docker images locally. You can do

```bash
cd etl
rony build <image_name>:<tag>
```

to build an image and run it with

```bash
rony run <image_name>:<tag>
```

In this particular implementation, `run.py` has simple ETL code that accepts a parameter to filter the data based on the `Sex` column. To use that, you can do

```bash
docker run <image_name>:<tag> -s female
```
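
As a rough illustration of the kind of filter described above, here is a minimal sketch. It is not the actual `run.py` shipped with Rony: the sample data, the function names (`filter_by_sex`, `parse_args`, `main`) and the long option `--sex` are assumptions made for this example. Only the `-s` flag and the `Sex` column come from the text above.

```python
import argparse
import csv
import io

# Hypothetical sample data; a real project would read its own dataset.
SAMPLE_CSV = """Name,Sex,Age
Alice,female,29
Bob,male,35
Carol,female,41
"""


def filter_by_sex(rows, sex=None):
    """Keep rows whose `Sex` column equals `sex`; keep all rows if `sex` is None."""
    if sex is None:
        return list(rows)
    return [row for row in rows if row["Sex"] == sex]


def parse_args(argv=None):
    """Parse the command line; `-s` mirrors the flag used in the README."""
    parser = argparse.ArgumentParser(description="Toy ETL step (illustrative only)")
    parser.add_argument(
        "-s", "--sex",  # the long form `--sex` is an assumption
        default=None,
        help="keep only rows with this value in the Sex column",
    )
    return parser.parse_args(argv)


def main(argv=None):
    """Load the sample rows and apply the optional filter."""
    args = parse_args(argv)
    rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
    return filter_by_sex(rows, args.sex)


if __name__ == "__main__":
    for row in main():
        print(row)
```

Called with `-s female`, this sketch keeps only the rows whose `Sex` value is `female`; with no flag it returns every row.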

# Implementation suggestions

When you start a new `rony` project, you will find

- an `infrastructure` folder with Terraform code creating on AWS:
  - an S3 bucket
  - a Lambda function
  - a CloudWatch log group
  - an ECR repository
  - an AWS Glue Crawler
  - IAM roles and policies for Lambda and Glue

- an `etl` folder with:
  - a `Dockerfile` and a `run.py` example of ETL code
  - a `lambda_function.py` with a "Hello World" example

- a `tests` folder with unit tests for the Lambda function
- a `.github/workflow` folder with a GitHub Actions CI/CD pipeline suggestion. This pipeline
  - Tests the Lambda function
  - Builds and runs the Docker image
  - Sets AWS credentials
  - Makes a Terraform plan (but does not actually deploy anything)

- a `dags` folder with some **Airflow** example code.

You also have a `scripts` folder with a bash file that builds a Lambda deploy package.

**Feel free to adjust and adapt everything according to your needs.**
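
To give a feel for the "Hello World" Lambda function and its unit test mentioned above, here is a minimal sketch. The generated `lambda_function.py` and `tests` files are not reproduced in this README, so the handler name `lambda_handler`, the response shape, and the test name are assumptions that follow common AWS Lambda conventions for Python.

```python
import json


def lambda_handler(event, context):
    """Return a fixed greeting; `event` and `context` are ignored in this sketch."""
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello World"}),
    }


def test_lambda_handler_returns_hello_world():
    """A pytest-style unit test: call the handler directly with a dummy event."""
    response = lambda_handler({}, None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"message": "Hello World"}
```

Because the handler is a plain function, a test can call it directly with an empty event, with no AWS credentials or deployed resources required.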

## Contributing

Make a pull request with your implementation.

For suggestions, contact us: rony@a3data.com.br

## License
Rony is open source and released under the Apache 2.0 License: [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
