This repository provides a streamlined setup for running AWS Glue ETL libraries locally with support for AWS SSO, based on Amazon Linux 2023, in a DevContainer. It resolves common challenges faced when configuring AWS Glue locally, as discussed in various resources.
- DevContainer Integration: Fully configured for use with Visual Studio Code or compatible tools.
- AWS SSO Support: Properly handles AWS SSO credential management to facilitate seamless local testing.
- Jupyter Environment: Preconfigured Jupyter server for interactive development.
- Spark Configuration: Includes PySpark setup for ETL development.
- References and Improvements: Inspired by community discussions and solutions for running AWS Glue locally.
- Visual Studio Code installed on your local machine.
- Docker installed and running.
- AWS SSO configuration in your ~/.aws/credentials or environment variables.
- Familiarity with AWS Glue and ETL processes.
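The prerequisites above can be checked from the host before opening the DevContainer. The following is a minimal sketch, not part of this repository: the function name `check_prerequisites` is hypothetical, and it assumes the Docker CLI is named `docker` and that VS Code's optional `code` shell launcher is installed. It only confirms that the tools are on `PATH` and that an AWS settings file exists; it does not confirm that the Docker daemon is running or that an SSO session is valid.

```python
import shutil
from pathlib import Path


def check_prerequisites(home=None):
    """Return a dict mapping each prerequisite name to True or False.

    `home` is the directory that should contain `.aws`; it defaults to the
    current user's home directory (hypothetical helper, for illustration).
    """
    home = Path(home) if home is not None else Path.home()
    aws_dir = home / ".aws"
    return {
        # The Docker CLI must be discoverable on PATH.
        "docker": shutil.which("docker") is not None,
        # Assumption: the `code` launcher is installed; VS Code may still be
        # present without it.
        "vscode": shutil.which("code") is not None,
        # Either file may hold AWS settings, depending on how SSO was set up.
        "aws_config": any(
            (aws_dir / name).is_file() for name in ("credentials", "config")
        ),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A "missing" result for `vscode` is not conclusive, since the launcher is optional; environment-variable credentials are also not covered by this check.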
- Clone this repository:
  git clone <repository-url>
  cd <repository-folder>
- Open the repository in Visual Studio Code.
- When prompted, open the folder in the DevContainer. Alternatively, you can manually rebuild the container:
  - Press Ctrl+Shift+P (or Cmd+Shift+P on Mac) to open the Command Palette.
  - Select Remote-Containers: Rebuild and Reopen in Container (newer versions of the extension label this command Dev Containers: Rebuild and Reopen in Container).
- The DevContainer will start and initialize:
  - AWS credentials will be mounted into the container.
  - The Jupyter server will be started automatically.
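To confirm that the mounted AWS settings actually contain an SSO profile, a small parser can list the sections that carry `sso_*` keys. This is a sketch under assumptions: `sso_profiles` is a hypothetical helper, the file is assumed to be INI-style (as `~/.aws/config` is, which is where `aws configure sso` writes its settings), and the path at which the file appears inside the container depends on this repository's mount configuration, so pass whichever path applies.

```python
import configparser


def sso_profiles(config_path):
    """Return the names of sections that contain at least one sso_* key.

    Hypothetical helper: it only inspects the file and does not validate
    or refresh the SSO session itself.
    """
    # Interpolation is disabled so that '%' characters in values are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is silently ignored
    names = []
    for section in parser.sections():
        if any(key.startswith("sso_") for key in parser[section]):
            names.append(section)
    return names


if __name__ == "__main__":
    from pathlib import Path

    # Assumption: the settings are mounted at the usual home-directory path.
    print(sso_profiles(Path.home() / ".aws" / "config"))
```

Section names are returned as written in the file, so a profile usually appears as `profile <name>`, and an `sso-session` section is listed as well.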
Once the DevContainer is running, JupyterLab will be accessible at http://localhost:8889. It is preconfigured with all necessary libraries and paths for AWS Glue development.
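A quick way to confirm that the libraries are on the Python path is to ask the import system whether it can locate them, without importing them. The sketch below can be pasted into a notebook cell; `glue_libraries_available` is a hypothetical helper, and the module names `awsglue`, `pyspark`, and `py4j` are the usual top-level package names for the Glue ETL library, PySpark, and Py4J rather than names confirmed by this repository.

```python
import importlib.util


def glue_libraries_available(modules=("awsglue", "pyspark", "py4j")):
    """Map each top-level module name to whether Python can locate it.

    find_spec() searches the import path without executing the module,
    so the check is cheap and does not start Spark.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in glue_libraries_available().items():
        print(f"{name}: {'importable' if found else 'not found'}")
```

If any entry reports "not found", the path configuration inside the container is the first thing to inspect.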
- DevContainer: Configuration is defined in devcontainer.json for a seamless setup.
- Jupyter: Automatically starts on container launch via the postStartCommand script.
- Spark: Configured with paths to PyGlue.zip and py4j for full Glue functionality.
- Network Ports:
  - 4040: Spark UI
  - 18080: Spark History Server
  - 8998: Livy Server
  - 8889: JupyterLab
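Whether the forwarded ports are reachable from the host can be checked with a plain TCP connection attempt. The following sketch assumes the ports are forwarded to `127.0.0.1`; `PORTS` and `is_port_open` are hypothetical names introduced for illustration. A closed port is not necessarily a fault: the Spark UI, for example, is typically only served while a Spark application is running.

```python
import socket

# Port numbers and service names as listed above.
PORTS = {
    4040: "Spark UI",
    18080: "Spark History Server",
    8998: "Livy Server",
    8889: "JupyterLab",
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port, service in PORTS.items():
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"{port} ({service}): {state}")
```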
Modify the devcontainer.json and Dockerfile to customize dependencies and environment configurations.
Feel free to open issues or submit pull requests to improve this repository.