This blueprint deploys a KDA app that reads from MSK Serverless using IAM auth and writes to S3 using the Python Table API:
- Flink version:
1.15.2 - Python version:
3.8
- New (in Flink 1.13)
KafkaSourceconnector (FlinkKafkaSourceis slated to be deprecated). FileSink(StreamingFileSinkis slated to be deprecated).
- Package and deploy Python to S3
- Deploy associated infra (MSK and KDA) using CDK script
- If using existing resources, you can simply update app properties in KDA.
- Perform data generation
- Maven
- AWS SDK v2
- AWS CDK v2 - for deploying associated infra (MSK and KDA app)
- First, let's set up some environment variables to make the deployment easier. Replace these values with your own S3 bucket, app name, etc.
export AWS_PROFILE=<<profile-name>>
export APP_NAME=<<name-of-your-app>>
export S3_BUCKET=<<your-s3-bucket-name>>
export S3_FILE_KEY=<<your-jar-name-on-s3>>- Package Python app folder into zip package
Please follow the instructions detailed here.
- Copy zip package to S3 so it can be referenced in CDK deployment
aws s3 cp target/<<your app zip>> ${S3_BUCKET}/{S3_FILE_KEY}-
Follow instructions in the
cdk-infrafolder to deploy the infrastructure associated with this app - such as MSK Serverless and the Kinesis Data Analytics application. -
Follow instructions in orders-datagen to create topic and generate data into MSK.
-
Start your Kinesis Data Analytics application from the AWS console.
-
Do a Flink query or S3 Select Query against S3 to view data written to S3.
