    Repositories list

    • DAIVI is a reference solution with IaC modules to accelerate development of Data, Analytics, AI and Visualization applications on AWS using the next generation …
      HCL
      MIT No Attribution
      13000 Updated Feb 18, 2026
    • AWS Glue Configurable Test Data Generator
      Python
      MIT No Attribution
      6000 Updated Jun 2, 2023
    • JavaScript
      MIT No Attribution
      3100 Updated May 22, 2023
    • Python
      Other
      5100 Updated May 22, 2023
    • Python
      MIT No Attribution
      6000 Updated May 12, 2023
    • Free Data Engineering course!
      Jupyter Notebook
      8k000 Updated Apr 21, 2023
    • dbt-glue (Public)
      This repository contains the dbt-glue adapter
      Python
      Apache License 2.0
      93100 Updated Apr 7, 2023
    • ClickHouse® is a free analytics DBMS for big data
      C++
      Apache License 2.0
      8.3k000 Updated Mar 31, 2023
    • Scala
      MIT No Attribution
      5000 Updated Mar 20, 2023
    • You run a script to mimic multiple sensors publishing messages on an IoT MQTT topic, with one message published every second. The events get sent to AWS IoT, wh…
      Python
      Apache License 2.0
      11031 Updated Mar 19, 2023
    • Kinesis Data Analytics Blueprints are a curated collection of Apache Flink applications. Each blueprint will walk you through how to solve a practical problem r…
      TypeScript
      MIT No Attribution
      7000 Updated Mar 14, 2023
    • nextflow (Public)
      A DSL for data-driven computational pipelines
      Groovy
      Apache License 2.0
      781000 Updated Mar 11, 2023
    • An Awesome List of Open-Source Data Engineering Projects
      Other
      543200 Updated Mar 7, 2023
    • AI and Machine Learning with Kubeflow, Amazon EKS, and SageMaker
      Jupyter Notebook
      Apache License 2.0
      1.1k000 Updated Feb 20, 2023
    • Python
      MIT No Attribution
      39000 Updated Feb 19, 2023
    • Python
      MIT No Attribution
      8000 Updated Dec 31, 2022
    • Java
      MIT No Attribution
      2000 Updated Nov 10, 2022
    • Alerting and notification in a serverless data lake during failures
      Python
      MIT No Attribution
      3000 Updated Nov 4, 2022
    • Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines
      Python
      MIT No Attribution
      24000 Updated Oct 4, 2022
    • Python
      MIT No Attribution
      2000 Updated Oct 3, 2022
    • arvados (Public)
      An open source platform for managing and analyzing biomedical big data
      Go
      Other
      126000 Updated Sep 19, 2022
    • .github (Public)
      0000 Updated Sep 19, 2022
    • Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.
      Jupyter Notebook
      39000 Updated Sep 12, 2022
    • HCL
      MIT No Attribution
      12000 Updated Sep 9, 2022
    • AWS Data Engineering Project using Lambda, S3 and SNS
      Python
      4300 Updated Aug 29, 2022
    • querypal (Public)
      Web UI for Amazon Athena
      Vue
      Apache License 2.0
      25000 Updated Aug 29, 2022
    • These Terraform modules aggregate Security Hub findings to a centralized account using Amazon Kinesis Firehose and AWS Glue
      HCL
      Apache License 2.0
      5000 Updated Aug 23, 2022
    • A cross-platform (Windows, macOS, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, A…
      Java
      GNU General Public License v2.0
      57000 Updated Aug 12, 2022
    • A process that gathers streaming data from an airline API using NiFi and batch data from AWS Redshift using Sqoop, then builds a data pipeline to analyse the data using Ap…
      GNU General Public License v3.0
      11100 Updated Jul 20, 2022
    • This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.
      MIT No Attribution
      41000 Updated Jul 18, 2022