
Migration script failing with 403 error when using a valid token  #301

@jcampabadal-db

Description


Hi team,

The customer is facing issues running the migration tool. They have a jumpbox in the same AWS account as the source workspace.

We tested with a fresh PAT. With curl and that same PAT we can hit this endpoint successfully: https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/secrets/scopes/list HTTP/1.1"

Here we can see that the Databricks CLI works with the configured source workspace profile.

$ databricks --profile Dosatsu-AWS clusters list

ID Name State

0429-184401-if1pioz1 DSV ML-CPU single-node cluster RUNNING

...

Here we do a fresh git pull of the repo

$ git pull

Already up to date.

Then, when we run the pipeline script, it fails with 403 errors on the API endpoints it calls.

$ python3 migration_pipeline.py --profile Dosatsu-AWS --export-pipeline --use-checkpoint --session try1 --debug --skip-failed --skip-missing-users

Using the session id: try1

2024-07-30,17:36:01;INFO;https://dbc-b79f98ff-f7c9.cloud.databricks.com/

2024-07-30,17:36:01;INFO;export_instance_profiles found in checkpoint

2024-07-30,17:36:01;INFO;Task export_instance_profiles already completed, found in checkpoint

2024-07-30,17:36:01;INFO;export_users found in checkpoint

2024-07-30,17:36:01;INFO;Task export_users already completed, found in checkpoint

2024-07-30,17:36:01;INFO;export_groups found in checkpoint

2024-07-30,17:36:01;INFO;Task export_groups already completed, found in checkpoint

2024-07-30,17:36:01;INFO;export_workspace_items_log found in checkpoint

2024-07-30,17:36:01;INFO;Task export_workspace_items_log already completed, found in checkpoint

2024-07-30,17:36:01;INFO;export_workspace_acls found in checkpoint

2024-07-30,17:36:01;INFO;Task export_workspace_acls already completed, found in checkpoint

2024-07-30,17:36:01;INFO;export_notebooks found in checkpoint

2024-07-30,17:36:01;INFO;Task export_notebooks already completed, found in checkpoint

2024-07-30,17:36:01;INFO;Start export_secrets

2024-07-30,17:36:01;DEBUG;Starting new HTTPS connection (1): dbc-b79f98ff-f7c9.cloud.databricks.com:443

2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/secrets/scopes/list HTTP/1.1" 403 52

2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}

2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "GET /api/2.0/clusters/list HTTP/1.1" 403 52

2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}

2024-07-30,17:36:02;INFO;Starting cluster with name: Workspace_Migration_Work_Leave_Me_Alone

2024-07-30,17:36:02;DEBUG;https://dbc-b79f98ff-f7c9.cloud.databricks.com:443 "POST /api/2.0/clusters/create HTTP/1.1" 403 52

2024-07-30,17:36:02;WARNING;{"error_code": 403, "message": "Invalid access token."}

...

Exception: Could not launch cluster. Verify that the --azure or --gcp flag or cluster config is correct.

(screenshot of the failure attached)

Any ideas what could be causing this issue? Could it be a problem with how the script parses the Databricks CLI profile, an environment variable overriding it, or stale caching done by the script?
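To help rule out the profile-parsing and environment-variable theories, here is a minimal diagnostic sketch we could run on the jumpbox. It is an assumption-laden helper (`resolve_credentials` is hypothetical, and it assumes the standard `~/.databrickscfg` INI format the CLI writes); the migration tool's actual lookup may differ:

```python
import configparser
import os

def resolve_credentials(profile: str,
                        cfg_path: str = os.path.expanduser("~/.databrickscfg")) -> dict:
    """Read host/token for a profile the way the CLI config stores them,
    and report whether environment variables would shadow them."""
    cfg = configparser.ConfigParser()
    cfg.read(cfg_path)
    if profile not in cfg:
        raise KeyError(f"profile {profile!r} not found in {cfg_path}")
    # Stray whitespace/newlines around the token are a classic cause of
    # "Invalid access token" even when the raw token itself is valid.
    host = cfg[profile].get("host", "").strip()
    token = cfg[profile].get("token", "").strip()
    return {
        "host": host,
        "token_len": len(token),  # report length only; never print the token
        "env_host_set": "DATABRICKS_HOST" in os.environ,
        "env_token_set": "DATABRICKS_TOKEN" in os.environ,
    }
```

If `env_token_set` comes back True, the script (or an SDK underneath it) may be picking up `DATABRICKS_TOKEN` instead of the profile's token; and if `token_len` doesn't match the length of the PAT that worked with curl, the config entry is likely truncated or padded.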
