Support persist sandbox metadata to database #730
Open
zhangjaycee wants to merge 10 commits into alibaba:master
Add DatabaseConfig dataclass (url field) to rock/config.py and wire it into RockConfig both as a field and in the from_env() YAML parser.
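As a rough illustration of the wiring described above, the sketch below shows a config dataclass with a single `url` field nested inside a parent config. The empty-string default, the `from_mapping` helper, and the `database` YAML key are assumptions made for this example; the PR's actual `from_env()` parser is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass
class DatabaseConfig:
    # Connection URL; the empty default ("no database configured") is an assumption.
    url: str = ""


@dataclass
class RockConfig:
    # Only the database field is shown; the real RockConfig has more fields.
    database: DatabaseConfig = field(default_factory=DatabaseConfig)

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "RockConfig":
        # Hypothetical helper: `raw` stands in for the dict obtained by parsing the YAML file.
        db_section = raw.get("database") or {}
        return cls(database=DatabaseConfig(url=db_section.get("url", "")))


config = RockConfig.from_mapping({"database": {"url": "sqlite:///rock.db"}})
```

A missing `database` section falls back to the default `DatabaseConfig`, so callers can check `config.database.url` before creating an engine.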
- Add Base(DeclarativeBase) as the single SQLAlchemy declarative base
- Add SandboxRecord ORM model with all sandbox metadata columns
- Add LIST_BY_ALLOWLIST and _NOT_NULL_DEFAULTS class-level constants
- Add DatabaseProvider with async engine/session factory
- Add DatabaseConfig dataclass to RockConfig
- _convert_url handles sqlite://, postgresql://, and postgres:// (Heroku) shorthand; URLs with existing driver specifier pass through unchanged
- Default state column value uses string literal "pending" instead of State.PENDING enum instance for explicit column semantics
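The URL handling described in this commit could look roughly like the sketch below. The target async drivers (`aiosqlite`, `asyncpg`) are assumptions chosen because the provider uses an async engine; the commit message does not name them, and the function name here is illustrative rather than the PR's `_convert_url`.

```python
def convert_url(url: str) -> str:
    """Rewrite a plain database URL so it names an async driver (illustrative)."""
    scheme, sep, rest = url.partition("://")
    # Not a URL we recognise, or a driver is already specified (e.g. "postgresql+psycopg"):
    # pass it through unchanged.
    if not sep or "+" in scheme:
        return url
    if scheme == "sqlite":
        return "sqlite+aiosqlite://" + rest  # assumed async SQLite driver
    if scheme in ("postgresql", "postgres"):
        # "postgres://" is the Heroku-style shorthand; normalise it to "postgresql".
        return "postgresql+asyncpg://" + rest  # assumed async PostgreSQL driver
    return url
```

Checking for `+` in the scheme is what keeps a URL with an explicit driver untouched, so an operator can still pick a different driver by spelling it out.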
- Add SandboxTable with insert/get/update/delete/list_by/list_by_in
- _filter_data strips unknown keys; _NOT_NULL_DEFAULTS fills NOT NULL cols
- LIST_BY_ALLOWLIST prevents arbitrary column queries (injection guard)
- _record_to_sandbox_info uses lru_cache to avoid repeated get_type_hints calls in bulk list_by scenarios
- Add SandboxInfoField generated type and generation script
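A minimal sketch of the filtering and allowlist ideas from this commit, using plain dicts instead of the ORM. The column names, the allowlist contents, and the helper names are placeholders for illustration, not the values used in the PR.

```python
# Placeholder column set and constants; the real SandboxRecord defines many more columns.
KNOWN_COLUMNS = {"sandbox_id", "user_id", "state", "image"}
LIST_BY_ALLOWLIST = frozenset({"user_id", "state"})
NOT_NULL_DEFAULTS = {"state": "pending"}


def filter_data(data: dict) -> dict:
    """Drop keys that are not table columns, then fill NOT NULL columns with defaults."""
    cleaned = {key: value for key, value in data.items() if key in KNOWN_COLUMNS}
    for column, default in NOT_NULL_DEFAULTS.items():
        if cleaned.get(column) is None:
            cleaned[column] = default
    return cleaned


def check_list_by_column(column: str) -> str:
    """Reject query columns outside the allowlist before any SQL is built."""
    if column not in LIST_BY_ALLOWLIST:
        raise ValueError(f"list_by on column {column!r} is not allowed")
    return column


row = filter_data({"sandbox_id": "sb-1", "unknown_key": 1, "state": None})
```

Validating the column name up front means a caller-supplied string never reaches attribute lookup or query construction unless it is a known, indexed field.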
- Redis alive/timeout keys remain the source of truth for live state
- DB writes are fire-and-forget via asyncio.create_task + _safe_db_call
- batch_get: Redis hits served directly; DB fallback uses a single
list_by_in("sandbox_id", miss_ids) query instead of N serial gets,
so each missing row is resolved through the primary key index
- iter_alive_sandbox_ids queries DB by state IN (running, pending)
instead of Redis scan_iter, enabling indexed filtering
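The read and write paths listed above can be sketched with in-memory dicts standing in for Redis and the database table. Everything here (class name, method bodies, the `print`-based error handling) is an assumption for illustration; only the overall shape (fire-and-forget writes wrapped in a safe call, and one batched fallback query for cache misses) follows the description.

```python
import asyncio


class SketchRepository:
    def __init__(self) -> None:
        self.redis: dict = {}    # stand-in for Redis (source of truth for live state)
        self.db: dict = {}       # stand-in for the sandbox table
        self._tasks: set = set()  # keep references so background tasks are not garbage collected

    async def _safe_db_call(self, coro) -> None:
        # A failed DB write must never break the request path, so it is only reported.
        try:
            await coro
        except Exception as exc:
            print(f"db write failed: {exc!r}")

    async def _db_upsert(self, sandbox_id: str, info: dict) -> None:
        self.db[sandbox_id] = dict(info)

    async def _db_list_by_in(self, column: str, values: list) -> list:
        wanted = set(values)
        return [row for row in self.db.values() if row.get(column) in wanted]

    async def create(self, sandbox_id: str, info: dict) -> None:
        info = {**info, "sandbox_id": sandbox_id}
        self.redis[sandbox_id] = info
        # Fire-and-forget: schedule the DB write without awaiting it.
        task = asyncio.create_task(self._safe_db_call(self._db_upsert(sandbox_id, info)))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def batch_get(self, sandbox_ids: list) -> dict:
        found = {sid: self.redis[sid] for sid in sandbox_ids if sid in self.redis}
        miss_ids = [sid for sid in sandbox_ids if sid not in found]
        if miss_ids:
            # One batched query for all misses instead of one query per id.
            for row in await self._db_list_by_in("sandbox_id", miss_ids):
                found[row["sandbox_id"]] = row
        return found


async def demo() -> dict:
    repo = SketchRepository()
    await repo.create("a", {"state": "running"})
    repo.db["b"] = {"sandbox_id": "b", "state": "stopped"}  # exists only in the DB
    return await repo.batch_get(["a", "b", "c"])


result = asyncio.run(demo())
```

Holding task references in a set is a common precaution with `asyncio.create_task`, since the event loop itself keeps only weak references to tasks.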
…e to meta_repo
- Replace MetaStore with SandboxRepository throughout SandboxManager, GemManager, BaseManager, and SandboxProxyService
- Wire SandboxRepository (Redis + SandboxTable) in admin/main.py startup
- stop(): add early return after archive() in the ValueError except branch to prevent double archive when the Ray actor is already gone
Made-with: Cursor
- Add TestSandboxTableWithSQLite: full CRUD coverage using SQLite in-memory database (no external dependencies, runs in fast CI) including list_by_in, NOT NULL defaults, and noop-on-missing-id cases
- Add TestSandboxTableWithPostgres: PostgreSQL-specific tests (JSONB, real container) marked need_docker + need_database
- Add comprehensive SandboxRepository tests: create/update/delete/archive/get/exists/batch_get/list_by/refresh_timeout/is_expired
- Consistent lowercase "stopped" state string throughout test data, matching the State enum value convention (running/pending)
- Add single-column indexes on all commonly queried fields (user_id, state, namespace, experiment_id, cluster_name, image, host_ip, host_name, create_user_gray_flag)
- Add scripts/gen_ddl.py to emit CREATE TABLE / CREATE INDEX DDL
- Add *.db and ddl/ to .gitignore (generated artifacts)
OperatorContext was missing redis_provider, leaving RayOperator._redis_provider as None. This caused the use_rocklet get_status path to crash with 'NoneType object has no attribute get' because build_sandbox_from_redis skips the lookup entirely when redis_provider is None.
- Add update_version field to SandboxInfo/SandboxRecord and lock_sandbox_key() helper
- Add LockResult enum and lock operations to RedisProvider (create_and_acquire, acquire, optimistic_update, release)
- Add version-guarded update() to SandboxTable to skip stale writes
- Implement SandboxRepository lock context managers (create_and_acquire_lock, acquire_lock) and version-aware CRUD
- Wrap SandboxManager.start_async/stop with pessimistic lock context managers; remove _check_sandbox_exists_in_redis
- Add tests for optimistic update behaviour in SandboxRepository
- Add update_version to SandboxInfoField Literal type
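To illustrate what a version-guarded update that skips stale writes can look like, here is a small sketch using the standard-library `sqlite3` module rather than the PR's SQLAlchemy table. The table layout, the function name, and the rule that a write applies only when its version is strictly greater than the stored one are assumptions for this example.

```python
import sqlite3


def guarded_update(conn: sqlite3.Connection, sandbox_id: str, state: str, version: int) -> bool:
    """Apply the update only if `version` is newer than the stored update_version."""
    cursor = conn.execute(
        "UPDATE sandbox SET state = ?, update_version = ? "
        "WHERE sandbox_id = ? AND update_version < ?",
        (state, version, sandbox_id, version),
    )
    conn.commit()
    # rowcount is 0 when the row is missing or the write was stale.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sandbox ("
    "sandbox_id TEXT PRIMARY KEY, state TEXT NOT NULL, update_version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO sandbox VALUES ('sb-1', 'pending', 1)")

applied_new = guarded_update(conn, "sb-1", "running", 2)    # newer version
applied_stale = guarded_update(conn, "sb-1", "pending", 1)  # stale version
```

Putting the version comparison in the `WHERE` clause makes the check and the write a single statement, so a delayed fire-and-forget write cannot overwrite a newer row.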
- Document that create() does not write the lock key and update() no-ops without it; callers must use create_and_acquire_lock + create(..., version=).
- Add create_with_lock helper and use it in TestSandboxRepositoryWithDocker.
- Drop unused socket import in test_sandbox_repository.py.
close #729