
Kafka Connect BigQuery Connector

This is an implementation of a sink connector from Apache Kafka to Google BigQuery, built on top of Apache Kafka Connect.

Documentation

The Kafka Connect BigQuery Connector documentation is available online at https://aiven-open.github.io/bigquery-connector-for-apache-kafka/. The site contains a complete list of the configuration options as well as information about the project.

Configuration notes

If the configuration includes a JSON GCP credential structure that uses a credential_source entry, one of the following environment variables must be set.

Source Type   Environment Variable
file          io.aiven.commons.envcheck.files
url           io.aiven.commons.envcheck.uri
executable    io.aiven.commons.envcheck.cmd

Each environment variable contains a comma-separated list of allowed entries for its source type. If the environment variable is not set, or the value from the JSON credential is not found in it, the value is rejected and an exception is thrown before the connector starts.
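
For context, a credential structure that uses a credential_source entry is typically a GCP external account (workload identity federation) credential. A minimal sketch, with placeholder audience and provider values, whose URL would be checked against io.aiven.commons.envcheck.uri:

{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/my-pool/providers/my-provider",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "url": "https://example.com/credentials.cgi"
  }
}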

As an example, to access https://example.com/credentials.cgi the environment variable io.aiven.commons.envcheck.uri would need to contain the URL:

export io.aiven.commons.envcheck.uri=https://example.com/credentials.cgi
# start the kafka processes

To allow an additional URL, for example https://example.net/credentials.cgi, the export would look like:

export io.aiven.commons.envcheck.uri=https://example.com/credentials.cgi,https://example.net/credentials.cgi 
# start the kafka processes
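
The other source types follow the same pattern. A hypothetical sketch allow-listing a credential file and a credential-fetching executable (both paths are placeholders, not defaults from the project):

# both paths below are placeholders
export io.aiven.commons.envcheck.files=/etc/gcp/credentials.json
export io.aiven.commons.envcheck.cmd=/usr/local/bin/fetch-credentials
# start the kafka processes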

History

This connector was originally developed by WePay. In late 2020 the project moved to Confluent, with both companies taking on maintenance duties. In 2024, Aiven created its own fork of the Confluent project in order to continue maintaining an open source, Apache 2-licensed version of the connector.

Configuration

Sample

A simple example connector configuration that reads records with JSON-encoded values from Kafka and writes those values to BigQuery:

{
  "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
  "topics": "users, clicks, payments",
  "tasks.max": "3",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",

  "project": "kafka-ingest-testing",
  "defaultDataset": "kcbq-example",
  "keyfile": "/tmp/bigquery-credentials.json"
}
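
As a usage sketch, a configuration like this can be submitted to a Kafka Connect worker over its REST API. The worker address (localhost:8083), the connector name (bigquery-sink), and the file holding the JSON above are all assumptions:

# register the connector, assuming the config above is saved as bigquery-sink.json
curl -X PUT -H "Content-Type: application/json" \
  --data @bigquery-sink.json \
  http://localhost:8083/connectors/bigquery-sink/config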

Complete docs

See the configuration documentation for a list of the connector's configuration properties.

Download

Download information is available on the project web site.

Building from source

This project uses the Maven build tool.

To compile the project without running the integration tests execute mvn package -DskipITs.
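
That is, from the project root:

# package the connector, skipping integration tests
mvn package -DskipITs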

To build the documentation execute the following steps:

mvn install -DskipITs
mvn -f tools
mvn -f docs

Once the documentation is built it can be run by executing mvn -f docs site:run.

Integration test setup

Integration tests require a live BigQuery and Kafka installation. Configuring those components is beyond the scope of this document.

Once you have the test environment ready, a few integration-specific environment variables must be set.

Local configuration

  • GOOGLE_APPLICATION_CREDENTIALS - the path to the JSON file that was downloaded when the GCP account key was created.
  • KCBQ_TEST_BUCKET - the name of the bucket to use for testing.
  • KCBQ_TEST_DATASET - the name of the dataset to use for testing.
  • KCBQ_TEST_KEYFILE - the same path as GOOGLE_APPLICATION_CREDENTIALS.
  • KCBQ_TEST_PROJECT - the name of the project to use.
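
For example, with placeholder values throughout:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/gcp-key.json
export KCBQ_TEST_KEYFILE=$GOOGLE_APPLICATION_CREDENTIALS
export KCBQ_TEST_BUCKET=my-test-bucket
export KCBQ_TEST_DATASET=kcbq_test
export KCBQ_TEST_PROJECT=my-gcp-project
# run the integration tests (assumption: the -DskipITs flag above implies the
# failsafe plugin, so mvn verify should run them)
mvn verify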

GitHub configuration

To run the integration tests from a GitHub action, the following variables must be set:

  • GCP_CREDENTIALS - the contents of the JSON file that was downloaded when the GCP account key was created.
  • KCBQ_TEST_BUCKET - the bucket to use for the tests.
  • KCBQ_TEST_DATASET - the dataset to use for the tests.
  • KCBQ_TEST_PROJECT - the project to use for the tests.
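
These values are typically stored as repository secrets. A hypothetical sketch using the GitHub CLI, with placeholder file path and values:

# the file path and values below are placeholders
gh secret set GCP_CREDENTIALS < /path/to/gcp-key.json
gh secret set KCBQ_TEST_BUCKET --body "my-test-bucket"
gh secret set KCBQ_TEST_DATASET --body "kcbq_test"
gh secret set KCBQ_TEST_PROJECT --body "my-gcp-project"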