@@ -22,7 +22,7 @@ Introduce a command-line tool that enables bulk migration of Iceberg tables from
2222
2323There are various reasons why users may want to move their Iceberg tables to a different catalog. For instance,
2424* They were using the Hadoop catalog and later realized that it is not recommended for production, so they want to move their tables to a production-ready catalog.
25- * They just heard about the awesome Arctic catalog (or Nessie) and want to move their existing iceberg tables to Dremio Arctic .
25+ * They just heard about the awesome Apache Polaris catalog and want to move their existing Iceberg tables to it.
2626* They had an on-premise Hive catalog, but want to move tables to a cloud-based catalog as part of their cloud migration strategy.
2727
2828The CLI tool should support two commands
@@ -45,7 +45,7 @@ Need to have Java installed in your machine (Java 21 is recommended and the mini
4545
4646Below is the CLI syntax:
4747```
48- $ java -jar iceberg-catalog-migrator-cli-0.3.0.jar -h
48+ $ java -jar iceberg-catalog-migrator-cli-0.0.1.jar -h
4949Usage: iceberg-catalog-migrator [-hV] [COMMAND]
5050 -h, --help Show this help message and exit.
5151 -V, --version Print version information and exit.
@@ -56,7 +56,7 @@ Commands:
5656```
5757
5858```
59- $ java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate -h
59+ $ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate -h
6060Usage: iceberg-catalog-migrator migrate [-hV] [--disable-safety-prompts] [--dry-run] [--stacktrace] [--output-dir=<outputDirPath>]
6161 (--source-catalog-type=<type> --source-catalog-properties=<String=String>[,<String=String>...]
6262 [--source-catalog-properties=<String=String>[,<String=String>...]]...
@@ -130,83 +130,110 @@ Identifier options:
130130Note: The options for the register command are exactly the same as for the migrate command.
131131
132132# Sample Inputs
133- ## Bulk registering all the tables from Hadoop catalog to Nessie catalog (main branch)
133+
134+ Note:
135+ a) Before migrating tables to Apache Polaris, make sure the target catalog was created with a `default-base-location`
136+ that matches the source catalog's `warehouse` location.
137+
138+ ```
139+ {
140+   "catalog": {
141+     "name": "test",
142+     "type": "INTERNAL",
143+     "readOnly": false,
144+     "properties": {
145+       "default-base-location": "file:/path/to/source_catalog"
146+     },
147+     "storageConfigInfo": {
148+       "storageType": "FILE",
149+       "allowedLocations": [
150+         "file:/path/to/source_catalog"
151+       ]
152+     }
153+   }
154+ }
155+ ```
156+
157+ b) Get an OAuth token and export it as an environment variable
158+
134159``` shell
135- java -jar iceberg-catalog-migrator-cli-0.3.0.jar register \
136- --source-catalog-type HADOOP \
137- --source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
138- --target-catalog-type NESSIE \
139- --target-catalog-properties uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse
160+ curl -X POST http://localhost:8181/api/catalog/v1/oauth/tokens \
161+ -d "grant_type=client_credentials" \
162+ -d "client_id=my-client-id" \
163+ -d "client_secret=my-client-secret" \
164+ -d "scope=PRINCIPAL_ROLE:ALL"
165+
166+ export TOKEN=xxxxxxx
140167```
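
Rather than copying the token by hand, the `access_token` field can be pulled out of the JSON response of the request above. This is a minimal sketch, assuming the response follows the standard OAuth 2.0 shape with a string `access_token` field; `extract_access_token` is a hypothetical helper, not part of the migrator or of Polaris.

```shell
# Hypothetical helper: reads a JSON token response on stdin and prints the
# value of its "access_token" field (assumes a flat, standard OAuth 2.0 body).
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Demonstration with a canned response (the token value is made up).
sample='{"access_token":"abc123","token_type":"bearer","expires_in":3600}'
TOKEN=$(printf '%s\n' "$sample" | extract_access_token)
export TOKEN
echo "$TOKEN"
```

In practice the output of the `curl` call would be piped into `extract_access_token` in place of the canned sample. A proper JSON parser is more robust if the response contains nested or escaped values.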
141168
142- ## Register all the tables from Hadoop catalog to Arctic catalog (main branch)
169+ c) Export the required storage-related settings and reference them in the catalog configuration.
170+ For S3:
143171
144172``` shell
145- export PAT=xxxxxxx
146173export AWS_ACCESS_KEY_ID=xxxxxxx
147174export AWS_SECRET_ACCESS_KEY=xxxxxxx
148175export AWS_S3_ENDPOINT=xxxxxxx
149176```
150177
178+ For ADLS:
151179``` shell
152- java -jar iceberg-catalog-migrator-cli-0.3.0.jar register \
153- --source-catalog-type HADOOP \
154- --source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
155- --target-catalog-type NESSIE \
156- --target-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/8158e68a-5046-42c6-a7e4-c920d9ae2475,ref=main,warehouse=/tmp/warehouse,authentication.type=BEARER,authentication.token=$PAT
180+ export AZURE_SAS_TOKEN=<token>
157181```
158182
159- ## Migrate selected tables (t1,t2 in namespace foo) from Arctic catalog (main branch) to Hadoop catalog.
160-
183+ ## Bulk registering all the tables from Hadoop catalog to Polaris catalog
161184``` shell
162- export PAT=xxxxxxx
163- export AWS_ACCESS_KEY_ID=xxxxxxx
164- export AWS_SECRET_ACCESS_KEY=xxxxxxx
165- export AWS_S3_ENDPOINT=xxxxxxx
185+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar register \
186+ --source-catalog-type HADOOP \
187+ --source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
188+ --target-catalog-type REST \
189+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
166190```
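
The `--source-catalog-properties` and `--target-catalog-properties` values are single comma-separated lists of `key=value` pairs, so an unquoted space inside the list makes the shell split it into separate arguments. One way to avoid that is to assemble the list from individual pairs. This is a minimal sketch; `join_props` is a hypothetical helper, and the property values are the ones from the example above.

```shell
# Hypothetical helper: joins its arguments with commas (no spaces).
# The body runs in a subshell so the IFS change does not leak to the caller.
join_props() (
  IFS=,
  printf '%s\n' "$*"
)

TOKEN=${TOKEN:-xxxxxxx}   # placeholder when no token has been exported
TARGET_PROPS=$(join_props \
  "uri=http://localhost:60904/api/catalog" \
  "warehouse=test" \
  "token=$TOKEN")
echo "$TARGET_PROPS"
```

The result can then be passed as `--target-catalog-properties "$TARGET_PROPS"`.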
167191
192+ ## Migrate selected tables (t1,t2 in namespace foo) from Hadoop catalog to Polaris catalog
193+
168194``` shell
169- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
170- --source-catalog-type NESSIE \
171- --source-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/8158e68a-5046-42c6-a7e4-c920d9ae2475,ref=main,warehouse=/tmp/warehouse,authentication.type=BEARER,authentication.token=$PAT \
172- --target-catalog-type HADOOP \
195+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
196+ --source-catalog-type HADOOP \
197+ --source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
198+ --target-catalog-type REST \
199+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN \
173200--identifiers foo.t1,foo.t2
174201```
175202
176- ## Migrate all tables from GLUE catalog to Arctic catalog (main branch)
203+ ## Migrate all tables from GLUE catalog to Polaris catalog
177204``` shell
178- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
205+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
179206--source-catalog-type GLUE \
180207--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
181- --target-catalog-type NESSIE \
182- --target-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/612a4560-1178-493f-9c14-ab6b33dc31c5,ref=main,warehouse=s3a://some-other-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,authentication.type=BEARER,authentication.token=$PAT
208+ --target-catalog-type REST \
209+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
183210```
184211
185- ## Migrate all tables from HIVE catalog to Arctic catalog (main branch)
212+ ## Migrate all tables from HIVE catalog to Polaris catalog
186213``` shell
187- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
214+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
188215--source-catalog-type HIVE \
189216--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
190- --target-catalog-type NESSIE \
191- --target-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/612a4560-1178-493f-9c14-ab6b33dc31c5,ref=main,warehouse=s3a://some-other-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,authentication.type=BEARER,authentication.token=$PAT
217+ --target-catalog-type REST \
218+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
192219```
193220
194- ## Migrate all tables from DYNAMODB catalog to Arctic catalog (main branch)
221+ ## Migrate all tables from DYNAMODB catalog to Polaris catalog
195222``` shell
196- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
223+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
197224--source-catalog-type DYNAMODB \
198225--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
199- --target-catalog-type NESSIE \
200- --target-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/612a4560-1178-493f-9c14-ab6b33dc31c5,ref=main,warehouse=s3a://some-other-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,authentication.type=BEARER,authentication.token=$PAT
226+ --target-catalog-type REST \
227+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
201228```
202229
203- ## Migrate all tables from JDBC catalog to Arctic catalog (main branch)
230+ ## Migrate all tables from JDBC catalog to Polaris catalog
204231``` shell
205- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
232+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
206233--source-catalog-type JDBC \
207234--source-catalog-properties warehouse=/tmp/warehouseJdbc,jdbc.user=root,jdbc.password=pass,uri=jdbc:mysql://localhost:3306/db1,name=catalogName \
208- --target-catalog-type NESSIE \
209- --target-catalog-properties uri=https://nessie.dremio.cloud/v1/repositories/612a4560-1178-493f-9c14-ab6b33dc31c5,ref=main,warehouse=/tmp/nessiewarehouse,authentication.type=BEARER,authentication.token=$PAT
235+ --target-catalog-type REST \
236+ --target-catalog-properties uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
210237```
211238
212239# Scenarios
@@ -219,7 +246,7 @@ Users can use a new catalog by creating a fresh table to test the new catalog's
219246
220247Sample input:
221248``` shell
222- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
249+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
223250--source-catalog-type HIVE \
224251--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
225252--target-catalog-type NESSIE \
@@ -235,7 +262,7 @@ The list of table identifiers in `dry_run.txt` can be altered (if needed) and re
235262
236263Sample input:
237264``` shell
238- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
265+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
239266--source-catalog-type HIVE \
240267--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
241268--target-catalog-type NESSIE \
@@ -287,7 +314,7 @@ and also log any table level failures, if present.
287314
288315Sample input:
289316``` shell
290- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
317+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
291318--source-catalog-type HIVE \
292319--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
293320--target-catalog-type NESSIE \
@@ -331,7 +358,7 @@ Users can provide the selective list of identifiers to migrate using any of thes
331358
332359Sample input: (only migrate tables that start with "foo.")
333360``` shell
334- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
361+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
335362--source-catalog-type HIVE \
336363--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
337364--target-catalog-type NESSIE \
@@ -342,7 +369,7 @@ java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
342369
343370Sample input: (migrate all tables in the file ids.txt where each entry is delimited by newline)
344371``` shell
345- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
372+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
346373--source-catalog-type HIVE \
347374--source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
348375--target-catalog-type NESSIE \
@@ -352,7 +379,7 @@ java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
352379
353380Sample input: (migrate only two tables foo.tbl1, foo.tbl2)
354381``` shell
355- java -jar iceberg-catalog-migrator-cli-0.3.0.jar migrate \
382+ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
356383--source-catalog-type HIVE \
357384--source-catalog-properties warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083 \
358385--target-catalog-type NESSIE \