| Airbyte | An open-source data integration platform for building ELT data pipelines. It ships with more than 140 out-of-the-box connectors. |
| Apache Spark | A multi-language engine for executing data engineering, data science, and machine learning workloads on single-node machines or clusters. |
| Apache Flink | Real-time data ingestion and processing into ClickHouse through Flink's DataStream API, with support for batch writes. |
| Amazon Glue | A fully managed, serverless data integration service from Amazon Web Services (AWS) that simplifies discovering, preparing, and transforming data for analytics, machine learning, and application development. |
| Artie | A fully managed real-time data streaming platform that replicates production data into ClickHouse to support customer-facing analytics, operational workflows, and agentic AI in production. |
| Azure Synapse | A fully managed, cloud-based analytics service from Microsoft Azure that combines big data and data warehousing to simplify data integration, transformation, and analytics at scale using SQL, Apache Spark, and data pipelines. |
| Azure Data Factory | A cloud-based data integration service that enables you to create, schedule, and orchestrate data workflows at scale. |
| Apache Beam | An open-source, unified programming model that enables developers to define and execute both batch and stream (continuous) data processing pipelines. |
| BladePipe | A real-time, end-to-end data integration tool with sub-second latency for moving data across platforms. |
| dbt | Enables analytics engineers to transform data in their warehouses by writing SELECT statements. |
| dlt | An open-source library that you can add to your Python scripts to load data from varied and often messy data sources into well-structured, live datasets. |
| Estuary | A right-time data platform that enables millisecond-latency ETL pipelines with flexible deployment options. |
| Fivetran | An automated data movement platform that moves data out of, into, and across your cloud data platforms. |
| NiFi | Open-source workflow management software designed to automate data flow between software systems. |
| Vector | A high-performance observability data pipeline that puts organizations in control of their observability data. |