-
Notifications
You must be signed in to change notification settings - Fork 165
feat(BA-3802): Migrate to SQLAlchemy 2.0 with comprehensive type safety improvements #7880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Update SQLAlchemy from 1.4.54 to 2.0.45 - Update python.lock with new dependencies This introduces type errors that need to be fixed in subsequent commits. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…count - Change row["column"] to row.column attribute access (SQLAlchemy 2.0) - Change sa.select([col1, col2]) to sa.select(col1, col2) unpacking - Change sa.insert(table, values) to sa.insert(table).values(values) - Change result.rowcount to cast(CursorResult, result).rowcount 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…count - Fix remaining row.column attribute access patterns - Fix remaining sa.select(*cols) unpacking patterns - Fix remaining cast(CursorResult, result).rowcount patterns 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix process_result_value first arg type to `Any | None` - Fix process_bind_param first arg type to `Any | None` - Fix copy() return type to `Self` - Fix VFolderHostPermissionColumn to pass dialect instead of None 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Replace `from sqlalchemy.orm import sessionmaker` with `from sqlalchemy.ext.asyncio import async_sessionmaker` - Update all sessionmaker() calls to async_sessionmaker() - Remove redundant class_=SASession parameter (AsyncSession is default) Files modified: - manager/models/utils.py - account_manager/models/utils.py - appproxy/coordinator/models/utils.py - manager/repositories/scheduler/db_source/db_source.py - manager/repositories/artifact/db_source/db_source.py - manager/repositories/deployment/db_source/db_source.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate Row classes to SQLAlchemy 2.0 Mapped/mapped_column style: - audit_log/row.py - network/row.py - container_registry/row.py - rbac_models/association_scopes_entities.py Changes: - sa.Column → mapped_column with Mapped[T] type hints - IDColumn() → mapped_column with GUID type - relationship() → Mapped[T] type hints for relationships 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR migrates the codebase from SQLAlchemy 1.4 to SQLAlchemy 2.0 with comprehensive type safety improvements. The changes include:
- Updated SQLAlchemy query construction patterns from legacy list-based syntax to modern syntax
- Migrated from
sa.Tabledefinitions to ORM-mapped classes withMappedtype annotations - Added extensive type hints with proper nullable handling
- Improved row access patterns from dict-style to attribute access
- Enhanced error handling with null checks
- Updated session and connection management to async patterns
Reviewed changes
Copilot reviewed 209 out of 211 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/manager/test_queryfilter.py | Removed deprecated list syntax from sa.select() calls |
| tests/unit/manager/test_idle_checker.py | Added cast(Row[Any], ...) for type safety in test fixtures |
| tests/unit/manager/services/ | Added assertions for non-null values before usage |
| tests/unit/manager/repositories/ | Updated sa.insert() to use .values(), added null checks |
| tests/unit/manager/models/ | Updated query syntax and added null assertions |
| src/ai/backend/manager/utils.py | Changed row access from dict-style to attribute access, added null checks |
| src/ai/backend/manager/sweeper/ | Updated to use begin_readonly_session() with scalars().all() |
| src/ai/backend/manager/services/ | Added extensive null checks for domain_name and other nullable fields |
| src/ai/backend/manager/scheduler/ | Added AgentId type wrapping and null handling |
| src/ai/backend/manager/repositories/ | Migrated to async_sessionmaker, updated query patterns |
| src/ai/backend/manager/models/vfolder/row.py | Migrated from Table to ORM class with Mapped annotations |
| src/ai/backend/manager/models/user/row.py | Migrated from Table to ORM class with comprehensive type annotations |
| src/ai/backend/manager/registry.py | Added extensive null checks and type conversions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import uuid |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module 'uuid' is imported with both 'import' and 'import from'.
|
|
||
| import logging | ||
| from collections.abc import Callable, Sequence | ||
| import uuid as uuid_mod |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module 'uuid' is imported with both 'import' and 'import from'.
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import uuid |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module 'uuid' is imported with both 'import' and 'import from'.
Migrate endpoint/row.py to SQLAlchemy 2.0 Mapped/mapped_column style: - EndpointRow: All columns and relationships - EndpointTokenRow: All columns and relationships - EndpointAutoScalingRuleRow: All columns and relationships 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate session/row.py and kernel/row.py to SQLAlchemy 2.0 Mapped/mapped_column style: - SessionRow, SessionDependencyRow: all columns and relationships converted - KernelRow: all columns and relationships converted - Added TYPE_CHECKING imports for relationship types to avoid circular imports - Changed IDColumn factory functions to IDColumnType usage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate 12 Row files to SQLAlchemy 2.0 Mapped/mapped_column style: - image/row.py (ImageRow, ImageAliasRow) - scaling_group/row.py (ScalingGroupRow, ScalingGroupFor*Row) - scheduling_history/row.py (4 history Row classes) - routing/row.py (RoutingRow) - user/row.py (UserRow - full declarative conversion) - notification/row.py (NotificationChannelRow, NotificationRuleRow) - artifact/row.py (ArtifactRow) - deployment_auto_scaling_policy/row.py - deployment_revision/row.py - agent/row.py (AgentRow - full declarative conversion) - group/row.py (relationships only) - domain/row.py (DomainRow - full declarative conversion) Changes include: - sa.Column() → mapped_column() with Mapped[T] type hints - IDColumn() → explicit mapped_column with GUID and server_default - relationship() with proper Mapped type annotations - TYPE_CHECKING imports for relationship types - Backward compatibility via table_name = RowClass.__table__ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate 15 Row files to SQLAlchemy 2.0 Mapped/mapped_column style:
- keypair/row.py (KeyPairRow)
- resource_policy/row.py (KeyPairResourcePolicyRow, UserResourcePolicyRow,
ProjectResourcePolicyRow)
- event_log/row.py (EventLogRow)
- object_storage/row.py (ObjectStorageRow)
- storage_namespace/row.py (StorageNamespaceRow)
- artifact_registries/row.py (ArtifactRegistryRow)
- huggingface_registry/row.py (HuggingfaceRegistryRow)
- reservoir_registry/row.py (ReservoirRegistryRow)
- app_config/row.py (AppConfigRow)
- vfs_storage/row.py (VFSStorageRow)
- artifact_revision/row.py (ArtifactRevisionRow)
- deployment_policy/row.py (DeploymentPolicyRow)
- association_artifacts_storages/row.py (AssociationArtifactsStorageRow)
- association_container_registries_groups/row.py
(AssociationContainerRegistriesGroupsRow)
- vfolder/row.py (VFolderRow, VFolderInvitationRow, VFolderPermissionRow)
Key changes:
- sa.Column() → mapped_column() with Mapped[T] type hints
- IDColumn() → explicit mapped_column("id", GUID, primary_key=True,
server_default=sa.text("uuid_generate_v4()"))
- sa.Table(...) + __table__ = table → declarative class with __tablename__
- Added backward compatibility aliases: table_name = RowClass.__table__
- Relationship annotations with Mapped[T], Mapped[T | None], Mapped[list[T]]
- TYPE_CHECKING imports for relationship types to avoid circular imports
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace IDColumn() patterns with explicit mapped_column definitions:
- scaling_group/row.py (ScalingGroupForDomainRow, ScalingGroupForProjectRow,
ScalingGroupForKeypairsRow)
- artifact/row.py (ArtifactRow)
- resource_preset/row.py (ResourcePresetRow)
- notification/row.py (NotificationChannelRow, NotificationRuleRow)
- deployment_auto_scaling_policy/row.py (DeploymentAutoScalingPolicyRow)
- group/row.py (AssocGroupUserRow, GroupRow - full sa.Table to declarative
class conversion)
Key changes:
- IDColumn() → mapped_column("id", GUID, primary_key=True,
server_default=sa.text("uuid_generate_v4()"))
- Removed IDColumn imports from base module
- group/row.py: Converted sa.Table patterns to declarative classes with
backward compatibility aliases
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
- errors/storage.py: Convert Row["column"] to attribute access - RBAC models (6 files): role.py, user_role.py, permission_group.py, permission.py, object_permission.py, entity_field.py - Replace IDColumn() with mapped_column() - Add Mapped[T] type hints for all columns - Add Mapped wrapper for relationships - resource_preset/row.py: Complete ORM 2.0 migration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix python_type property return type (T_Enum -> type[T_Enum]) - Fix python_type property typo (_enum_class -> _enum_cls) - Fix type_descriptor argument (sa.JSON -> sa.JSON()) - Rename stmt to insert_stmt/update_stmt to fix type conflicts - Remove explicit Table type annotation to allow None from get() 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- errors/storage.py: Fix VFolderRow import path - resource_preset/row.py: - Add WhereableStatement type alias for Select/Update/Delete - Fix QueryOption type to support where() on Update/Delete - Use `| None` instead of Optional - Fix delete() return type (no return value) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove explicit type annotations on scalar_one_or_none() results - Add None checks with BackendAIError exceptions for nullable fields - Change Sequence[T] return types for .scalars().all() methods - Update dataclass field types to accept nullable values (datetime | None) - Fix nullable relationship access with proper None guards Files modified: - repositories: db_source files for various storages - models: routing, kernel, user, keypair, artifact_revision, etc. - data: deployment, model_serving, object_storage, user types 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ass type fixes - Update QueryCondition and QueryOption type aliases in models/types.py to properly reflect Select[Any] -> Select[Any] transformation pattern - Update load_related_field to accept _AbstractLoad instead of sa.orm.Load - Fix user/row.py: Change load_* methods to return _AbstractLoad, update check_credential functions to use ExtendedAsyncSAEngine - Fix group/row.py: Update load_resource_policy return type, fix GroupData and ProjectModel nullable field types, fix WhereClauseType definition - Fix agent/row.py: Update by_scaling_group/by_status/by_schedulable inner function return types, update AgentData/AgentDataForHeartbeatUpdate fields for compute_plugins (Mapping) and first_contact (nullable) - Fix container_registry/row.py: Add None check for nullable project field - Fix deployment_revision/row.py: Convert kernel_path to str, update ModelMountConfigData nullable fields 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…e, and minilang - models/utils.py: Fix JSONCoalesceExpr type, reenter_txn_session return type, sa.literal for cast, version_str None check - repositories/base/types.py: Fix QueryOrder type to UnaryExpression | ColumnElement - repositories/base/purger.py: Cast table to sa.Table for primary_key access - repositories/base/upserter.py: Chain insert builder to avoid type reassignment - models/minilang/queryfilter.py: Fix atom and binary_expr return types - models/group/row.py: Add Any import, fix BooleanClauseList type argument - models/session/row.py: Fix load_option return types to _AbstractLoad, Sequence return types, cond variable annotations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes: - kernel/row.py: Fix None checks for status_history, status_data, sql_json_merge, to_kernel_info type conversions, recalc_concurrency_used scalar types - session/row.py: Fix to_dataclass/to_session_info type conversions, status_history None handling, scalars().all() return types - image/row.py: Add ImageID/ImageCanonical imports and conversions, fix resource key iteration, registry None check - scaling_group/row.py: Fix is_active/created_at None defaults, list return type for scalars().all() - utils.py: Add Row None checks for query results, convert resource_policy to dict - dotfile.py: Add Row None check - network/row.py: Fix options None assignment - rbac/__init__.py: Add domain_name None check - agent_cache.py: Fix _fetch_agent return type - container_registry/*: Fix registry_info.extra None access, scalar() result None check 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- session/row.py: Add SessionId wrapper for instance.id assignment - image/row.py: Fix SlotName key type handling in resources property - dotfile.py: Add type annotation for internal_data dict - gitlab.py: Fix headers dict type with conditional access_token - github.py: Handle None extra dict before accessing entity_type - vfolder/row.py: Fix select() list args, sql_json_merge column refs, VFolderMountPermission None handling, Container->Iterable type - endpoint/row.py: Fix Container->Iterable for in_() calls, add None checks for optional fields, fix service_ports cast, handle Row None 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove unused VFolderRow TypeAlias (Mapping[str, Any]) - Remove TypeAlias import - Remove VFolderDBRow alias, use VFolderRow directly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- schedule/repository.py: Fix CursorResult.rowcount, EndpointId import, route.session None checks, AgentId wrapping, session.connection() usage - api/stream.py: Fix urlparse().hostname bytes handling, service_ports cast, active_session_ids type annotation, add cast import - vfolder/repository.py: Fix VFolderHostPermissionMap return type, session.connection() usage, VFolderMountPermission import, sa.insert() syntax - session/service.py: Fix ImageIdentifier None checks, agent_row None check, LegacySessionInfo fields, scaling_group_name None check, await file.decode() - models/utils.py: Update sql_json_merge/sql_json_increment to accept InstrumentedAttribute 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…scheduler dispatcher - Fix ColumnElement[bool] type for condition lists - Fix scalar_one_or_none() assignments (remove explicit type annotations) - Fix Sequence vs list type assignments with list() - Fix AgentId wrapping for agent.id - Fix creation_id None handling with fallback to empty string - Fix KernelId to str conversion for repository methods 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes: - Fix UserData dataclass to accept Optional fields matching UserRow - Fix fetch_current_time to handle scalar() returning None - Fix simple_db_mutate to accept Delete statements - Fix generate_sql_info_for_gql_connection to accept InstrumentedAttribute - Fix sweeper session/kernel to use begin_readonly_session with scalars() - Fix api/service.py Mapping[str, Any] attribute access using dict syntax - Fix api/session.py dependency query and null checks - Fix cluster_template.py and session_template.py variable reuse issues - Fix session/repository.py access_key null check - Fix model_serving.py to use query_userinfo_from_session 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes: - Fix BinaryExpression to ColumnElement[bool] type annotation - Fix access_key type from UUID to str in BaseResourceUsageGroup - Fix status_history type from str to Mapping[str, Any] - Add null checks for kernel.agent, container_id, status_history - Fix stat_map access to use .get() method - Fix resource_opts null handling with nmget - Fix return type from Sequence to list 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
…sitory Changes: - Fix variable reuse issue (Delete vs Update query) - Add null check for owner user in permission check - Add null checks for session_owner, access_key, and role - Wrap access_key with AccessKey newtype - Convert role to string for UserScope - Add null checks for model, model_row, and image_row - Fix vfolder_mounts Sequence to list conversion 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix predicates.py: None checks for resource policy, Sequence to list - Fix agent_selector.py: AgentId wrapping, ResourceGroupID usage - Fix drf.py: AccessKey wrapping, None checks for access_key - Fix resource_preset/db_source: Row indexing, AgentId conversion - Fix event_dispatcher/handlers/session.py: Variable reuse, status_data check - Fix idle.py: get_db_now assertion, variable naming - Fix image/db_source: Sequence to list, alias None handling - Fix container_registry/repository.py: Variable naming, list conversion - Fix permission_controller: Sequence to list, None checks, return type - Fix auth/db_source: UserRole import, fallback values - Fix agent/db_source: Sequence to list 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
| identifier=ImageIdentifier( | ||
| canonical=self.image, | ||
| architecture=self.architecture, | ||
| architecture=self.architecture or "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty architecture does not make sense.
How about commenting out this kind of code separately?
| "status": status, | ||
| "status_history": sql_json_merge( | ||
| SessionRow.status_history, | ||
| sessions.c.status_history, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change required?
| case SessionStatus.PREPARED: | ||
| await self.event_producer.anycast_event(DoStartSessionEvent()) | ||
| case SessionStatus.RUNNING: | ||
| creation_id = session_row.creation_id or "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If creation_id is expected to become required in the future → add a comment.
If it remains optional → let’s throw an exception.
| result = await conn.execute(query) | ||
| row = result.first() | ||
| if row is None: | ||
| raise ValueError("Unknown owner access key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make a new exception
| result = await db_sess.execute(query) | ||
| row = result.first() | ||
| if row is None: | ||
| raise ValueError("Unknown owner access key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make a new exception
| open_to_public=self.open_to_public, | ||
| created_at=self.created_at, | ||
| open_to_public=self.open_to_public if self.open_to_public is not None else False, | ||
| created_at=self.created_at or datetime.now(timezone.utc), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the default values don’t make sense.
We’ll need to fix all of them along with a DB migration.
| "status_changed": now, | ||
| "status_history": sql_json_merge( | ||
| VFolderRow.status_history, | ||
| vfolders.c.status_history, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL JSON merge needs to operate on tables not ORM, so access to the .c files is required.
- test_idle_checker.py: Replace dict with mock_row() helper that supports attribute access, matching actual SQLAlchemy Row behavior - agent/row.py: Cast id to AgentId in to_data() for type consistency 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Use row._mapping["column"] instead of row["column"] for dict-style access - Add explicit null checks for row results - Add missing required_slots field to kernel test data 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
- test_check_presets.py: Restore resource_policy dict with total_resource_slots and default_for_unspecified keys (reverts incorrect simplification) - test_auth_repository.py: Add missing need_password_change field to UserRow fixture - repository.py, db_source.py: Fix resource_policy type from Mapping[str, str] to Mapping[str, Any] to match actual API contract 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
978e9fa to
c1edd80
Compare
Co-authored-by: octodog <[email protected]>
|
|
||
| route_id: UUID | ||
| session_id: UUID | ||
| kernel_host: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay kernel_host to be nullable?
| resource_policy_dict = { | ||
| "total_resource_slots": kp_policy.total_resource_slots.to_json(), | ||
| "default_for_unspecified": str(kp_policy.default_for_unspecified), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file default_for_unspecified field in resource_policy_dict has been gone. Is it okay?
| resource_policy_dict = { | ||
| "total_resource_slots": kp_policy.total_resource_slots.to_json(), | ||
| "default_for_unspecified": str(kp_policy.default_for_unspecified), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| case ResourcePresetData(): | ||
| shared_memory = str(row.shared_memory) if row.shared_memory is not None else None | ||
| return cls( | ||
| id=row.id, | ||
| name=row.name, | ||
| resource_slots=row.resource_slots.to_json(), | ||
| shared_memory=shared_memory, | ||
| scaling_group_name=row.scaling_group_name, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that data is not being received correctly in from_row. If it's for existing operation compatibility, it would be better to fix it in subsequent tasks.
| if not session.main_kernel.image or not session.main_kernel.architecture: | ||
| error_msg = f"Session {manifest.session_id} main kernel has no image or architecture" | ||
| log.error(error_msg) | ||
| return CommitSessionResult(error_message=error_msg) | ||
| image_row = await self._session_repository.resolve_image([ | ||
| ImageIdentifier(session.main_kernel.image, session.main_kernel.architecture) | ||
| ]) | ||
| base_image_ref = image_row.image_ref | ||
|
|
||
| # Build new image canonical name | ||
| filtered_tag_set = [ | ||
| x for x in base_image_ref.tag.split("-") if not x.startswith("customized_") | ||
| ] | ||
|
|
||
| if base_image_ref.name == "": | ||
| new_name = base_image_ref.project | ||
| else: | ||
| new_name = base_image_ref.name | ||
|
|
||
| new_canonical = f"{manifest.registry_hostname}/{manifest.registry_project}/{new_name}:{'-'.join(filtered_tag_set)}" | ||
|
|
||
| # Check for existing customized image | ||
| existing_row = await self._session_repository.get_existing_customized_image( | ||
| new_canonical, | ||
| manifest.image_visibility.value, | ||
| manifest.image_owner_id, | ||
| manifest.image_name, | ||
| ) | ||
|
|
||
| customized_image_id: str | ||
| kern_features: list[str] | ||
| if existing_row is not None: | ||
| existing_image: ImageRow = existing_row | ||
| labels = existing_image.labels or {} | ||
| kern_features_str = labels.get(LabelName.FEATURES, DEFAULT_KERNEL_FEATURE) | ||
| kern_features = ( | ||
| kern_features_str.split() if kern_features_str else [DEFAULT_KERNEL_FEATURE] | ||
| ) | ||
| customized_image_id = labels.get(LabelName.CUSTOMIZED_ID, str(uuid.uuid4())) | ||
| log.debug("reusing existing customized image ID {}", customized_image_id) | ||
| else: | ||
| kern_features = [DEFAULT_KERNEL_FEATURE] | ||
| customized_image_id = str(uuid.uuid4()) | ||
| # Remove PRIVATE label for customized images | ||
| kern_features = [ | ||
| feat for feat in kern_features if feat != KernelFeatures.PRIVATE.value | ||
| ] | ||
|
|
||
| new_canonical += f"-customized_{customized_image_id.replace('-', '')}" | ||
|
|
||
| from ai.backend.common.docker import ImageRef | ||
|
|
||
| new_image_ref = ImageRef.from_image_str( | ||
| new_canonical, | ||
| None, | ||
| manifest.registry_hostname, | ||
| architecture=base_image_ref.architecture, | ||
| is_local=base_image_ref.is_local, | ||
| ) | ||
|
|
||
| # Prepare image labels | ||
| image_labels: dict[str | LabelName, str] = { | ||
| LabelName.CUSTOMIZED_OWNER: f"{manifest.image_visibility.value}:{manifest.image_owner_id}", | ||
| LabelName.CUSTOMIZED_NAME: manifest.image_name, | ||
| LabelName.CUSTOMIZED_ID: customized_image_id, | ||
| LabelName.FEATURES: " ".join(kern_features), | ||
| } | ||
| match manifest.image_visibility: | ||
| case CustomizedImageVisibilityScope.USER: | ||
| image_labels[LabelName.CUSTOMIZED_USER_EMAIL] = manifest.user_email | ||
|
|
||
| # Commit session | ||
| log.info("Committing session {}", manifest.session_id) | ||
| resp = await self._agent_registry.commit_session( | ||
| session, | ||
| new_image_ref, | ||
| extra_labels=image_labels, | ||
| ) | ||
| bgtask_id = cast(uuid.UUID, resp["bgtask_id"]) | ||
|
|
||
| # Wait for commit to complete | ||
| await self._wait_for_agent_bgtask(bgtask_id, "Commit") | ||
|
|
||
| # Push image to registry if not local | ||
| if not new_image_ref.is_local: | ||
| log.info("Pushing image to registry") | ||
| image_registry = ImageRegistry( | ||
| name=manifest.registry_hostname, | ||
| url=str(registry_conf.url), | ||
| username=registry_conf.username, | ||
| password=registry_conf.password, | ||
| ) | ||
| if not session.main_kernel.agent: | ||
| error_msg = f"Session {manifest.session_id} main kernel has no agent assigned" | ||
| log.error(error_msg) | ||
| return CommitSessionResult(error_message=error_msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, wouldn't raising an error be the correct approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven’t touched the existing implementation unless absolutely necessary. I’ll be sending you requests one at a time.
| description=row.description, | ||
| is_active=row.status == UserStatus.ACTIVE, | ||
| status=row.status, | ||
| status=row.status or UserStatus.ACTIVE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be safer to set it to INACTIVE if there is no status?
| sa.select(PermissionGroupRow).where(PermissionGroupRow.scope_id == str(user_id)) | ||
| ) | ||
| if permission_group is None: | ||
| raise ValueError(f"Permission group not found for user_id={user_id}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to raise Custom Error instead of ValueError
| class UserData: | ||
| uuid: uuid.UUID | ||
| username: str | ||
| username: Optional[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can change it to str | None
| SessionScheduledAnycastEvent(sess_ctx.id, sess_ctx.creation_id), | ||
| SessionScheduledBroadcastEvent(sess_ctx.id, sess_ctx.creation_id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think creation_id should not be None here
| AgentId(kernel.agent) if kernel.agent else None, | ||
| kernel.agent_addr or "", | ||
| kernel.scaling_group or "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think agent_addr and scaling_group is also should not be None
| ) | ||
|
|
||
| if result.rowcount > 0: | ||
| if cast(CursorResult, result).rowcount > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cast need in such case?
| ) | ||
|
|
||
| deployment_policy = relationship( | ||
| deployment_policy: Mapped[DeploymentPolicyRow | None] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deployment_policies table has endpoint column which refers endpoint record. I think this type notation should be list[DeploymentPolicyRow], not DeploymentPolicyRow | None
| ) | ||
|
|
||
| auto_scaling_policy = relationship( | ||
| auto_scaling_policy: Mapped[DeploymentAutoScalingPolicyRow | None] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| back_populates="owned_endpoints", | ||
| foreign_keys=[session_owner], | ||
| primaryjoin=lambda: foreign(EndpointRow.session_owner) == UserRow.uuid, | ||
| primaryjoin="foreign(EndpointRow.session_owner) == UserRow.uuid", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not set a string value to primaryjoin
| back_populates="created_endpoints", | ||
| foreign_keys=[created_user], | ||
| primaryjoin=lambda: foreign(EndpointRow.created_user) == UserRow.uuid, | ||
| primaryjoin="foreign(EndpointRow.created_user) == UserRow.uuid", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| "ImageRow", | ||
| primaryjoin=lambda: foreign(EndpointRow.image) == ImageRow.id, | ||
| foreign_keys=[image], | ||
| primaryjoin="foreign(EndpointRow.image) == ImageRow.id", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
fregataa
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many false non-nullable fields.
# This is correct because the foreign key constraint
# ensures that `company_id` refers a valid company record.
# But we should check if the foreign key constraint has a ondelete=set null option.
company_id: Mapped[UUID] = mapped_column("company_id", ForeignKey("companies.id"))
company_row: Mapped[CompanyRow] = relationship()
# This is incorrect because the `company_id` can refer a record that has already removed.
# So, the `company_row` should be nullable.
company_id: Mapped[UUID] = mapped_column("company_id", GUID)
company_row: Mapped[CompanyRow] = relationship()| ) | ||
|
|
||
| huggingface_registry = relationship( | ||
| huggingface_registry: Mapped[HuggingFaceRegistryRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nullable (because there is no cascade option)
| ) | ||
|
|
||
| reservoir_registry = relationship( | ||
| reservoir_registry: Mapped[ReservoirRegistryRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
|
|
||
| # Relationships (without FK constraints) | ||
| endpoint_row = relationship( | ||
| endpoint_row: Mapped[EndpointRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nullalble
|
|
||
| # Relationships (without FK constraints) | ||
| endpoint_row = relationship( | ||
| endpoint_row: Mapped[EndpointRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| primaryjoin=_get_endpoint_join_condition, | ||
| ) | ||
| image_row = relationship( | ||
| image_row: Mapped[ImageRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
| foreign_keys=[channel_id], | ||
| ) | ||
| creator = relationship( | ||
| creator: Mapped[UserRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nullable
| ) | ||
|
|
||
| container_registry_row = relationship( | ||
| container_registry_row: Mapped[ContainerRegistryRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nullable
| ) | ||
|
|
||
| group_row = relationship( | ||
| group_row: Mapped[GroupRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
|
|
||
| @property | ||
| def mapping(self) -> dict[str, Any]: | ||
| def mapping(self) -> dict[str, object]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it dict[str, object] type?
| namespace: Mapped[str] = mapped_column("namespace", sa.String, nullable=False) | ||
|
|
||
| object_storage_row = relationship( | ||
| object_storage_row: Mapped[ObjectStorageRow] = relationship( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nullable
| await valkey_live.update_connection_tracker(str(kernel_id), service, stream_id) | ||
| await root_ctx.idle_checker_host.update_app_streaming_status( | ||
| kernel_id, | ||
| session_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay to change from kernel_id to session_id ?
| if remaining_count == 0: | ||
| await root_ctx.idle_checker_host.update_app_streaming_status( | ||
| kernel_id, | ||
| session_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| conn_tracker_lock: asyncio.Lock | ||
| conn_tracker_gc_task: asyncio.Task | ||
| active_session_ids: defaultdict[SessionId, int] | ||
| active_session_ids: defaultdict[KernelId, int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be active_session_ids: defaultdict[SessionId, int] or change variable name
| session_id=session_row.id, | ||
| access_key=session_row.access_key, | ||
| creation_id=session_row.creation_id, | ||
| access_key=AccessKey(session_row.access_key) if session_row.access_key else AccessKey(""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is right to raise error when access_key is None
| terminated_at=now, | ||
| status_history=sql_json_merge( | ||
| SessionRow.status_history, | ||
| SessionRow.__table__.c.status_history, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a pattern using __table__.c.in this file. Is it okay?
| terminated_at=now, | ||
| status_history=sql_json_merge( | ||
| SessionRow.status_history, | ||
| SessionRow.__table__.c.status_history, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| kern.status_info = destroy_reason | ||
| kern.status_history = sql_json_merge( | ||
| KernelRow.status_history, | ||
| KernelRow.__table__.c.status_history, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
fregataa
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up: we should update all plugins
…ic in ArtifactDBSource and tests
7362e63 to
90207f7
Compare
timezone.utcwithUTCfor consistency in datetime handling across multiple files.resolves #NNN (BA-MMM)
Checklist: (if applicable)
ai.backend.testdocsdirectory📚 Documentation preview 📚: https://sorna--7880.org.readthedocs.build/en/7880/
📚 Documentation preview 📚: https://sorna-ko--7880.org.readthedocs.build/ko/7880/