Insert high throughput events using iceberg #23592
allanbatista asked this question in Q&A (unanswered, 0 replies)
I am analyzing how to do massive inserts using Trino (with Iceberg): about 1 million events per minute, each event around 1 KB.
I tried doing these inserts with SQL through the Python connector, but the throughput is very slow.
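For reference, a minimal sketch of batching events into one multi-row `INSERT` per query, so each Iceberg commit covers a whole batch instead of a single row. The host, table, and column names here are hypothetical, and the batch size would need tuning:

```python
import trino

# Hypothetical connection details; the catalog must point at the Iceberg connector.
conn = trino.dbapi.connect(
    host="trino.example.com",
    port=8080,
    user="ingest",
    catalog="iceberg",
    schema="events",
)
cur = conn.cursor()

def insert_batch(rows):
    """Insert a list of (account_id, service_name, payload) tuples in one query."""
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    sql = f"INSERT INTO raw_events (account_id, service_name, payload) VALUES {placeholders}"
    flat = [value for row in rows for value in row]
    cur.execute(sql, flat)
    cur.fetchall()  # drain the result so the query (and its commit) finishes
```

Batches of a few thousand rows keep the number of Iceberg commits per minute low; very large batches can hit request-size limits, so the sweet spot is worth measuring.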
I tried to parallelize using multiple workers, but I get continuous errors from the Iceberg metadata.
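Those errors are consistent with Iceberg's optimistic concurrency: every `INSERT` commits a new table snapshot, and concurrent commits against the same table can conflict. A hedged sketch of retrying with backoff, reusing `insert_batch` from above (the exception type is an assumption; match it to the error actually raised):

```python
import random
import time

from trino.exceptions import TrinoQueryError  # assumption: commit conflicts surface as query errors

def insert_with_retry(rows, attempts=5):
    """Retry a batch insert, backing off when the Iceberg commit conflicts."""
    for attempt in range(attempts):
        try:
            insert_batch(rows)
            return
        except TrinoQueryError:
            if attempt == attempts - 1:
                raise
            # Exponential backoff with jitter so workers don't re-collide.
            time.sleep(2 ** attempt + random.random())
```

Fewer, larger concurrent writers (or a single writer consuming from Kafka) tend to reduce the conflict rate more than retries do.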
My current pipeline runs on AWS: Kafka + Firehose + S3 + Athena.
Is it possible to update a partition in Trino the way Athena does (`ALTER TABLE ADD PARTITION`), i.e. to add an already existing file under a partitioned path structure? Filepath example:
account_id=account-id-1/service_name=service-name-1/year=2021/month=01/day=01/hour=00/1727426921807104378_N_acc88e31-eb5e-4eac-be1e-871703dedbda.parquet
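On that last question: newer Trino releases added an `add_files` table procedure to the Iceberg connector that imports existing Parquet files into a table, which is the closest analogue to Athena's `ALTER TABLE ADD PARTITION` (worth verifying against the docs for your Trino version). A sketch with a hypothetical bucket and table, reusing the cursor from above:

```python
# Assumption: the Iceberg connector in this Trino version ships the
# add_files procedure; the bucket and table names are hypothetical.
cur.execute("""
    ALTER TABLE raw_events EXECUTE add_files(
        location => 's3://my-bucket/account_id=account-id-1/service_name=service-name-1/year=2021/month=01/day=01/hour=00/',
        format => 'PARQUET')
""")
cur.fetchall()
```

Unlike Hive-style `ADD PARTITION`, this records the files in the Iceberg table's metadata as a new snapshot, so they become queryable without rewriting the data.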