[Feature]: Containerize the ETL Pipeline with Docker #42

### So, what is it about?

I propose we containerize the ETL pipeline using **Docker** and **Docker Compose**.

This will create a consistent, isolated, and reproducible development environment for all contributors. Currently, the project setup relies on a contributor's local Python environment, which can lead to setup friction (OS differences, Python versions, package conflicts).

A containerized setup means anyone can clone the repo and get the pipeline running with a single command (`docker-compose up`, or `docker compose up` with Compose V2) without worrying about local Python setup.

### Acceptance Criteria

-   [ ] **Create a `Dockerfile`**
    -   Use a lightweight base image (e.g., `python:3.10-slim`).
    -   Copy and install dependencies from `requirements.txt`.
    -   Set the default command to run the pipeline (e.g., `CMD ["python", "main.py"]`).
-   [ ] **Create a `docker-compose.yml` file**
    -   Define a single service (e.g., `etl`).
    -   It should build from the local `Dockerfile`.
    -   It must use a **volume** (a bind mount) to mount the local code directory into the container (e.g., `volumes: ['.:/app']`). This is the most important part, as it allows contributors to fix the `TODO`s in the code and see their changes reflected *inside* the container without rebuilding the image. Changes to `requirements.txt` still require a rebuild, since dependencies are installed at build time.
-   [ ] **Create a `.dockerignore` file**
    -   Should ignore common files like `venv/`, `__pycache__/`, `.git`, `.vscode/`, `.idea/`, etc., to keep the build context clean and fast.
-   [ ] **Update `README.md`**
    -   Add a new "Running with Docker" section with the new setup instructions.
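
A minimal sketch of the `docker-compose.yml` described in the criteria above, not a final implementation. The service name `etl` and the `/app` path come from the examples in this issue; `working_dir` is an assumption and should match whatever `WORKDIR` the `Dockerfile` ends up using.

```yaml
# docker-compose.yml (sketch; names and paths are illustrative)
services:
  etl:
    # Build the image from the Dockerfile in the repository root.
    build: .
    # Bind-mount the local checkout so code edits show up in the
    # container without rebuilding the image.
    volumes:
      - .:/app
    # Assumption: the Dockerfile also uses /app as its working directory.
    working_dir: /app
```

With a file like this in place, `docker-compose up --build` would build the image and start the pipeline; the `--build` flag should only be needed after the `Dockerfile` or `requirements.txt` changes.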

Adding Docker is a valuable learning opportunity in itself, as it's a core tool in modern data engineering and lowers the barrier to entry for new contributors.

### Code of Conduct

- [x] I agree to follow this project's Code of Conduct
