Open
Description
The problem
I want to export/import only the notebooks from a workspace, but none of the suggested commands work.
Steps
- I tried the manual way first, since running the pipeline seems to be meant for admins, and we don't need to export everything else because it is managed as infrastructure as code with Terraform etc.
python export_db.py --profile $SRC_PROFILE --no-ssl-verification --export-home [email protected] --use-checkpoint --num-parallel 8 --retry-total 30 --retry-backoff 1.0
This gave me an error:
Note: running export_db.py directly is not recommended. Please use migration_pipeline.py
Exporting home directory: [email protected]
Traceback (most recent call last):
File "/path/to/migration/repo/migrate/export_db.py", line 332, in <module>
main()
File "/path/to/migration/repo/migrate/export_db.py", line 265, in main
ws_c = WorkspaceClient(client_config, checkpoint_service)
File "/path/to/migration/repo/migrate/dbclient/WorkspaceClient.py", line 32, in __init__
self.skip_large_nb = configs['skip_large_nb']
KeyError: 'skip_large_nb'
- So I decided to try running the pipeline, as the note at the top of the output suggests. Digging into the CLI help, I saw that you can keep only the notebooks task:
python3 migration_pipeline.py --profile $SRC_PROFILE --use-checkpoint --retry-total 30 --num-parallel 8 --retry-backoff 1.0 --keep-tasks notebooks --export-pipeline
But this failed with:
Using the session id: <the ID>
2025-03-21,08:51:13;INFO;Start export_instance_profiles
2025-03-21,08:51:13;INFO;export_instance_profiles Skipped.
2025-03-21,08:51:13;INFO;Start export_users
2025-03-21,08:51:13;INFO;export_users Skipped.
2025-03-21,08:51:13;INFO;Start export_groups
2025-03-21,08:51:13;INFO;export_groups Skipped.
2025-03-21,08:51:13;INFO;Start export_workspace_items_log
2025-03-21,08:51:13;INFO;export_workspace_items_log Skipped.
2025-03-21,08:51:13;INFO;Start export_workspace_acls
2025-03-21,08:51:13;INFO;export_workspace_acls Skipped.
2025-03-21,08:51:13;INFO;Start export_notebooks
Traceback (most recent call last):
File "/path/to/migration/repo/migrate/migration_pipeline.py", line 378, in <module>
main()
File "/path/to/migration/repo/migrate/migration_pipeline.py", line 374, in main
pipeline.run()
File "/path/to/migration/repo/migrate/pipeline/pipeline.py", line 64, in run
future.result()
File "/path/to/python/env/envs/bricks-migration/lib/python3.9/concurrent/futures/_base.py", line 446, in result
return self.__get_result()
File "/path/to/python/env/envs/bricks-migration/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/path/to/python/env/envs/bricks-migration/lib/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/path/to/migration/repo/migrate/pipeline/pipeline.py", line 73, in _run_task
task.run()
File "/path/to/migration/repo/migrate/tasks/tasks.py", line 144, in run
num_notebooks = ws_c.download_notebooks(num_parallel=self.client_config["num_parallel"])
File "/path/to/migration/repo/migrate/dbclient/WorkspaceClient.py", line 323, in download_notebooks
raise Exception("Run --workspace first to download full log of all notebooks.")
Exception: Run --workspace first to download full log of all notebooks.
- So I tried:
python3 migration_pipeline.py --profile $SRC_PROFILE --use-checkpoint --retry-total 30 --num-parallel 8 --retry-backoff 1.0 --keep-tasks notebooks --export-pipeline --workspace
but got:
migration_pipeline.py: error: unrecognized arguments: --workspace
Also, the pipeline has no option to export just your own notebooks, the way the --export-home flag does above.
So I decided to fix the export_db.py script myself by setting skip_large_nb to None. Details are in an incoming PR.
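For anyone hitting the same KeyError, here is a minimal sketch of what goes wrong and one way to tolerate it. It does not use the real migrate code: `client_config` is a hypothetical stand-in for the config dict that export_db.py passes to WorkspaceClient, and the two helper functions are made up for illustration. Defaulting to None matches the fix described above; whether None or False is the better default is an assumption on my side.

```python
# Hypothetical stand-in for the config dict built by export_db.py.
# On this code path the 'skip_large_nb' key is never set.
client_config = {"num_parallel": 8}


def read_skip_large_nb_strict(configs):
    # Same access pattern as the failing line in WorkspaceClient.__init__:
    # indexing a missing key raises KeyError.
    return configs['skip_large_nb']


def read_skip_large_nb_tolerant(configs):
    # dict.get returns None when the key is absent instead of raising.
    return configs.get('skip_large_nb')


try:
    read_skip_large_nb_strict(client_config)
    strict_error = None
except KeyError as exc:
    strict_error = exc

print("strict access raised:", repr(strict_error))
print("tolerant access returned:", read_skip_large_nb_tolerant(client_config))
```

The alternative is to leave WorkspaceClient untouched and make sure export_db.py always puts a skip_large_nb entry into the config before constructing the client, which keeps the default in one place.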