Description
Hi team,
The customer is facing issues running the migration tool. They have a jumpbox in the same AWS account as the source workspace.
We tested with a fresh PAT. We can hit this endpoint successfully with curl using the same PAT: https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/secrets/scopes/list HTTP/1.1"
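For reference, the curl check was roughly the following (the host is the one from the logs below; the PAT value is a placeholder):
$ curl -s -H "Authorization: Bearer <fresh PAT>" \
    https://dbc-b79f98ff-f7c9.cloud.databricks.com/api/2.0/secrets/scopes/list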
Here we can see the databricks CLI works with the configured source workspace profile:
$ databricks --profile Dosatsu-AWS clusters list
ID Name State
0429-184401-if1pioz1 DSV ML-CPU single-node cluster RUNNING
...
Here we do a fresh git pull of the repo
$ git pull
Already up to date.
Then, when we run the pipeline script, it fails with 403 errors on the API endpoints it needs to call:
$ python3 migration_pipeline.py --profile Dosatsu-AWS --export-pipeline --use-checkpoint --session try1 --debug --skip-failed --skip-missing-users
Using the session id: try1
2024-07-30,17:36:01;INFO;https://dbc-b79f98ff-f7c9.cloud.databricks.com/
2024-07-30,17:36:01;INFO;export_instance_profiles found in checkpoint
2024-07-30,17:36:01;INFO;Task export_instance_profiles already completed, found in checkpoint
2024-07-30,17:36:01;INFO;export_users found in checkpoint
2024-07-30,17:36:01;INFO;Task export_users already completed, found in checkpoint
2024-07-30,17:36:01;INFO;export_groups found in checkpoint
2024-07-30,17:36:01;INFO;Task export_groups already completed, found in checkpoint
2024-07-30,17:36:01;INFO;export_workspace_items_log found in checkpoint
2024-07-30,17:36:01;INFO;Task export_workspace_items_log already completed, found in checkpoint
2024-07-30,17:36:01;INFO;export_workspace_acls found in checkpoint
2024-07-30,17:36:01;INFO;Task export_workspace_acls already completed, found in checkpoint
2024-07-30,17:36:01;INFO;export_notebooks found in checkpoint
2024-07-30,17:36:01;INFO;Task export_notebooks already completed, found in checkpoint
2024-07-30,17:36:01;INFO;Start export_secrets
2024-07-30,17:36:01;DEBUG;Starting new HTTPS connection (1): dbc-b79f98ff-f7c9.cloud.databricks.com:443
2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/secrets/scopes/list HTTP/1.1" 403 52
2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}
2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/clusters/list HTTP/1.1" 403 52
2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}
2024-07-30,17:36:02;INFO;Starting cluster with name: Workspace_Migration_Work_Leave_Me_Alone
2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "POST /api/2.0/clusters/create HTTP/1.1" 403 52
2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}
...
Exception: Could not launch cluster. Verify that the --azure or --gcp flag or cluster config is correct.
Any ideas what could be causing this issue? Could it be a problem with how the databricks CLI profile is parsed, an environment variable taking precedence, or stale caching done by the script?
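In case it is useful, here is a rough sketch of the checks we can run on the jumpbox to see which credentials the script would actually pick up; it assumes the profile lives in the default ~/.databrickscfg location:
$ env | grep -i databricks                       # any DATABRICKS_HOST / DATABRICKS_TOKEN that would override the profile?
$ grep -A 2 '\[Dosatsu-AWS\]' ~/.databrickscfg   # confirm the profile's host and that its token is the fresh PAT
Repeating the curl call above with the token value taken from the profile (rather than the fresh PAT directly) should then tell us whether the profile itself holds a stale token.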
