Skip to content

Conversation

@HyeockJinKim
Copy link
Collaborator

@HyeockJinKim HyeockJinKim commented Jan 8, 2026

  • Updated various files to enhance code formatting, including consistent use of inline conditionals and improved line breaks for better readability.
  • Replaced timezone.utc with UTC for consistency in datetime handling across multiple files.
  • Simplified query construction in several repository files for clarity.
  • Adjusted import statements to remove redundancies and improve organization.
  • Enhanced logging messages for better clarity in error handling.
  • Improved type hinting and annotations for better code comprehension.

resolves #NNN (BA-MMM)

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

📚 Documentation preview 📚: https://sorna--7880.org.readthedocs.build/en/7880/


📚 Documentation preview 📚: https://sorna-ko--7880.org.readthedocs.build/ko/7880/

Copilot AI review requested due to automatic review settings January 8, 2026 23:37
HyeockJinKim and others added 5 commits January 9, 2026 08:37
- Update SQLAlchemy from 1.4.54 to 2.0.45
- Update python.lock with new dependencies

This introduces type errors that need to be fixed in subsequent commits.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…count

- Change row["column"] to row.column attribute access (SQLAlchemy 2.0)
- Change sa.select([col1, col2]) to sa.select(col1, col2) unpacking
- Change sa.insert(table, values) to sa.insert(table).values(values)
- Change result.rowcount to cast(CursorResult, result).rowcount

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…count

- Fix remaining row.column attribute access patterns
- Fix remaining sa.select(*cols) unpacking patterns
- Fix remaining cast(CursorResult, result).rowcount patterns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix process_result_value first arg type to `Any | None`
- Fix process_bind_param first arg type to `Any | None`
- Fix copy() return type to `Self`
- Fix VFolderHostPermissionColumn to pass dialect instead of None

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Replace `from sqlalchemy.orm import sessionmaker` with
  `from sqlalchemy.ext.asyncio import async_sessionmaker`
- Update all sessionmaker() calls to async_sessionmaker()
- Remove redundant class_=SASession parameter (AsyncSession is default)

Files modified:
- manager/models/utils.py
- account_manager/models/utils.py
- appproxy/coordinator/models/utils.py
- manager/repositories/scheduler/db_source/db_source.py
- manager/repositories/artifact/db_source/db_source.py
- manager/repositories/deployment/db_source/db_source.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate Row classes to SQLAlchemy 2.0 Mapped/mapped_column style:
- audit_log/row.py
- network/row.py
- container_registry/row.py
- rbac_models/association_scopes_entities.py

Changes:
- sa.Column → mapped_column with Mapped[T] type hints
- IDColumn() → mapped_column with GUID type
- relationship() → Mapped[T] type hints for relationships

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the codebase from SQLAlchemy 1.4 to SQLAlchemy 2.0 with comprehensive type safety improvements. The changes include:

  • Updated SQLAlchemy query construction patterns from legacy list-based syntax to modern syntax
  • Migrated from sa.Table definitions to ORM-mapped classes with Mapped type annotations
  • Added extensive type hints with proper nullable handling
  • Improved row access patterns from dict-style to attribute access
  • Enhanced error handling with null checks
  • Updated session and connection management to async patterns

Reviewed changes

Copilot reviewed 209 out of 211 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/unit/manager/test_queryfilter.py Removed deprecated list syntax from sa.select() calls
tests/unit/manager/test_idle_checker.py Added cast(Row[Any], ...) for type safety in test fixtures
tests/unit/manager/services/ Added assertions for non-null values before usage
tests/unit/manager/repositories/ Updated sa.insert() to use .values(), added null checks
tests/unit/manager/models/ Updated query syntax and added null assertions
src/ai/backend/manager/utils.py Changed row access from dict-style to attribute access, added null checks
src/ai/backend/manager/sweeper/ Updated to use begin_readonly_session() with scalars().all()
src/ai/backend/manager/services/ Added extensive null checks for domain_name and other nullable fields
src/ai/backend/manager/scheduler/ Added AgentId type wrapping and null handling
src/ai/backend/manager/repositories/ Migrated to async_sessionmaker, updated query patterns
src/ai/backend/manager/models/vfolder/row.py Migrated from Table to ORM class with Mapped annotations
src/ai/backend/manager/models/user/row.py Migrated from Table to ORM class with comprehensive type annotations
src/ai/backend/manager/registry.py Added extensive null checks and type conversions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

from __future__ import annotations

import logging
import uuid
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module 'uuid' is imported with both 'import' and 'import from'.

Copilot uses AI. Check for mistakes.

import logging
from collections.abc import Callable, Sequence
import uuid as uuid_mod
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module 'uuid' is imported with both 'import' and 'import from'.

Copilot uses AI. Check for mistakes.
from __future__ import annotations

import logging
import uuid
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module 'uuid' is imported with both 'import' and 'import from'.

Copilot uses AI. Check for mistakes.
HyeockJinKim and others added 20 commits January 9, 2026 08:48
Migrate endpoint/row.py to SQLAlchemy 2.0 Mapped/mapped_column style:
- EndpointRow: All columns and relationships
- EndpointTokenRow: All columns and relationships
- EndpointAutoScalingRuleRow: All columns and relationships

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate session/row.py and kernel/row.py to SQLAlchemy 2.0 Mapped/mapped_column style:
- SessionRow, SessionDependencyRow: all columns and relationships converted
- KernelRow: all columns and relationships converted
- Added TYPE_CHECKING imports for relationship types to avoid circular imports
- Changed IDColumn factory functions to IDColumnType usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate 12 Row files to SQLAlchemy 2.0 Mapped/mapped_column style:
- image/row.py (ImageRow, ImageAliasRow)
- scaling_group/row.py (ScalingGroupRow, ScalingGroupFor*Row)
- scheduling_history/row.py (4 history Row classes)
- routing/row.py (RoutingRow)
- user/row.py (UserRow - full declarative conversion)
- notification/row.py (NotificationChannelRow, NotificationRuleRow)
- artifact/row.py (ArtifactRow)
- deployment_auto_scaling_policy/row.py
- deployment_revision/row.py
- agent/row.py (AgentRow - full declarative conversion)
- group/row.py (relationships only)
- domain/row.py (DomainRow - full declarative conversion)

Changes include:
- sa.Column() → mapped_column() with Mapped[T] type hints
- IDColumn() → explicit mapped_column with GUID and server_default
- relationship() with proper Mapped type annotations
- TYPE_CHECKING imports for relationship types
- Backward compatibility via table_name = RowClass.__table__

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Migrate 15 Row files to SQLAlchemy 2.0 Mapped/mapped_column style:

- keypair/row.py (KeyPairRow)
- resource_policy/row.py (KeyPairResourcePolicyRow, UserResourcePolicyRow,
  ProjectResourcePolicyRow)
- event_log/row.py (EventLogRow)
- object_storage/row.py (ObjectStorageRow)
- storage_namespace/row.py (StorageNamespaceRow)
- artifact_registries/row.py (ArtifactRegistryRow)
- huggingface_registry/row.py (HuggingfaceRegistryRow)
- reservoir_registry/row.py (ReservoirRegistryRow)
- app_config/row.py (AppConfigRow)
- vfs_storage/row.py (VFSStorageRow)
- artifact_revision/row.py (ArtifactRevisionRow)
- deployment_policy/row.py (DeploymentPolicyRow)
- association_artifacts_storages/row.py (AssociationArtifactsStorageRow)
- association_container_registries_groups/row.py
  (AssociationContainerRegistriesGroupsRow)
- vfolder/row.py (VFolderRow, VFolderInvitationRow, VFolderPermissionRow)

Key changes:
- sa.Column() → mapped_column() with Mapped[T] type hints
- IDColumn() → explicit mapped_column("id", GUID, primary_key=True,
  server_default=sa.text("uuid_generate_v4()"))
- sa.Table(...) + __table__ = table → declarative class with __tablename__
- Added backward compatibility aliases: table_name = RowClass.__table__
- Relationship annotations with Mapped[T], Mapped[T | None], Mapped[list[T]]
- TYPE_CHECKING imports for relationship types to avoid circular imports

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace IDColumn() patterns with explicit mapped_column definitions:

- scaling_group/row.py (ScalingGroupForDomainRow, ScalingGroupForProjectRow,
  ScalingGroupForKeypairsRow)
- artifact/row.py (ArtifactRow)
- resource_preset/row.py (ResourcePresetRow)
- notification/row.py (NotificationChannelRow, NotificationRuleRow)
- deployment_auto_scaling_policy/row.py (DeploymentAutoScalingPolicyRow)
- group/row.py (AssocGroupUserRow, GroupRow - full sa.Table to declarative
  class conversion)

Key changes:
- IDColumn() → mapped_column("id", GUID, primary_key=True,
  server_default=sa.text("uuid_generate_v4()"))
- Removed IDColumn imports from base module
- group/row.py: Converted sa.Table patterns to declarative classes with
  backward compatibility aliases

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- errors/storage.py: Convert Row["column"] to attribute access
- RBAC models (6 files): role.py, user_role.py, permission_group.py,
  permission.py, object_permission.py, entity_field.py
  - Replace IDColumn() with mapped_column()
  - Add Mapped[T] type hints for all columns
  - Add Mapped wrapper for relationships
- resource_preset/row.py: Complete ORM 2.0 migration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix python_type property return type (T_Enum -> type[T_Enum])
- Fix python_type property typo (_enum_class -> _enum_cls)
- Fix type_descriptor argument (sa.JSON -> sa.JSON())
- Rename stmt to insert_stmt/update_stmt to fix type conflicts
- Remove explicit Table type annotation to allow None from get()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- errors/storage.py: Fix VFolderRow import path
- resource_preset/row.py:
  - Add WhereableStatement type alias for Select/Update/Delete
  - Fix QueryOption type to support where() on Update/Delete
  - Use `| None` instead of Optional
  - Fix delete() return type (no return value)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove explicit type annotations on scalar_one_or_none() results
- Add None checks with BackendAIError exceptions for nullable fields
- Change Sequence[T] return types for .scalars().all() methods
- Update dataclass field types to accept nullable values (datetime | None)
- Fix nullable relationship access with proper None guards

Files modified:
- repositories: db_source files for various storages
- models: routing, kernel, user, keypair, artifact_revision, etc.
- data: deployment, model_serving, object_storage, user types

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ass type fixes

- Update QueryCondition and QueryOption type aliases in models/types.py
  to properly reflect Select[Any] -> Select[Any] transformation pattern
- Update load_related_field to accept _AbstractLoad instead of sa.orm.Load
- Fix user/row.py: Change load_* methods to return _AbstractLoad, update
  check_credential functions to use ExtendedAsyncSAEngine
- Fix group/row.py: Update load_resource_policy return type, fix GroupData
  and ProjectModel nullable field types, fix WhereClauseType definition
- Fix agent/row.py: Update by_scaling_group/by_status/by_schedulable inner
  function return types, update AgentData/AgentDataForHeartbeatUpdate fields
  for compute_plugins (Mapping) and first_contact (nullable)
- Fix container_registry/row.py: Add None check for nullable project field
- Fix deployment_revision/row.py: Convert kernel_path to str, update
  ModelMountConfigData nullable fields

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…e, and minilang

- models/utils.py: Fix JSONCoalesceExpr type, reenter_txn_session return type,
  sa.literal for cast, version_str None check
- repositories/base/types.py: Fix QueryOrder type to UnaryExpression | ColumnElement
- repositories/base/purger.py: Cast table to sa.Table for primary_key access
- repositories/base/upserter.py: Chain insert builder to avoid type reassignment
- models/minilang/queryfilter.py: Fix atom and binary_expr return types
- models/group/row.py: Add Any import, fix BooleanClauseList type argument
- models/session/row.py: Fix load_option return types to _AbstractLoad,
  Sequence return types, cond variable annotations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes:
- kernel/row.py: Fix None checks for status_history, status_data, sql_json_merge,
  to_kernel_info type conversions, recalc_concurrency_used scalar types
- session/row.py: Fix to_dataclass/to_session_info type conversions,
  status_history None handling, scalars().all() return types
- image/row.py: Add ImageID/ImageCanonical imports and conversions,
  fix resource key iteration, registry None check
- scaling_group/row.py: Fix is_active/created_at None defaults,
  list return type for scalars().all()
- utils.py: Add Row None checks for query results,
  convert resource_policy to dict
- dotfile.py: Add Row None check
- network/row.py: Fix options None assignment
- rbac/__init__.py: Add domain_name None check
- agent_cache.py: Fix _fetch_agent return type
- container_registry/*: Fix registry_info.extra None access,
  scalar() result None check

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- session/row.py: Add SessionId wrapper for instance.id assignment
- image/row.py: Fix SlotName key type handling in resources property
- dotfile.py: Add type annotation for internal_data dict
- gitlab.py: Fix headers dict type with conditional access_token
- github.py: Handle None extra dict before accessing entity_type
- vfolder/row.py: Fix select() list args, sql_json_merge column refs,
  VFolderMountPermission None handling, Container->Iterable type
- endpoint/row.py: Fix Container->Iterable for in_() calls, add None
  checks for optional fields, fix service_ports cast, handle Row None

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove unused VFolderRow TypeAlias (Mapping[str, Any])
- Remove TypeAlias import
- Remove VFolderDBRow alias, use VFolderRow directly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- schedule/repository.py: Fix CursorResult.rowcount, EndpointId import,
  route.session None checks, AgentId wrapping, session.connection() usage
- api/stream.py: Fix urlparse().hostname bytes handling, service_ports cast,
  active_session_ids type annotation, add cast import
- vfolder/repository.py: Fix VFolderHostPermissionMap return type,
  session.connection() usage, VFolderMountPermission import, sa.insert() syntax
- session/service.py: Fix ImageIdentifier None checks, agent_row None check,
  LegacySessionInfo fields, scaling_group_name None check, await file.decode()
- models/utils.py: Update sql_json_merge/sql_json_increment to accept
  InstrumentedAttribute

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…scheduler dispatcher

- Fix ColumnElement[bool] type for condition lists
- Fix scalar_one_or_none() assignments (remove explicit type annotations)
- Fix Sequence vs list type assignments with list()
- Fix AgentId wrapping for agent.id
- Fix creation_id None handling with fallback to empty string
- Fix KernelId to str conversion for repository methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes:
- Fix UserData dataclass to accept Optional fields matching UserRow
- Fix fetch_current_time to handle scalar() returning None
- Fix simple_db_mutate to accept Delete statements
- Fix generate_sql_info_for_gql_connection to accept InstrumentedAttribute
- Fix sweeper session/kernel to use begin_readonly_session with scalars()
- Fix api/service.py Mapping[str, Any] attribute access using dict syntax
- Fix api/session.py dependency query and null checks
- Fix cluster_template.py and session_template.py variable reuse issues
- Fix session/repository.py access_key null check
- Fix model_serving.py to use query_userinfo_from_session

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Changes:
- Fix BinaryExpression to ColumnElement[bool] type annotation
- Fix access_key type from UUID to str in BaseResourceUsageGroup
- Fix status_history type from str to Mapping[str, Any]
- Add null checks for kernel.agent, container_id, status_history
- Fix stat_map access to use .get() method
- Fix resource_opts null handling with nmget
- Fix return type from Sequence to list

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…sitory

Changes:
- Fix variable reuse issue (Delete vs Update query)
- Add null check for owner user in permission check
- Add null checks for session_owner, access_key, and role
- Wrap access_key with AccessKey newtype
- Convert role to string for UserScope
- Add null checks for model, model_row, and image_row
- Fix vfolder_mounts Sequence to list conversion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix predicates.py: None checks for resource policy, Sequence to list
- Fix agent_selector.py: AgentId wrapping, ResourceGroupID usage
- Fix drf.py: AccessKey wrapping, None checks for access_key
- Fix resource_preset/db_source: Row indexing, AgentId conversion
- Fix event_dispatcher/handlers/session.py: Variable reuse, status_data check
- Fix idle.py: get_db_now assertion, variable naming
- Fix image/db_source: Sequence to list, alias None handling
- Fix container_registry/repository.py: Variable naming, list conversion
- Fix permission_controller: Sequence to list, None checks, return type
- Fix auth/db_source: UserRole import, fallback values
- Fix agent/db_source: Sequence to list

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
identifier=ImageIdentifier(
canonical=self.image,
architecture=self.architecture,
architecture=self.architecture or "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty architecture does not make sense.
How about commenting out this kind of code separately?

"status": status,
"status_history": sql_json_merge(
SessionRow.status_history,
sessions.c.status_history,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change required?

case SessionStatus.PREPARED:
await self.event_producer.anycast_event(DoStartSessionEvent())
case SessionStatus.RUNNING:
creation_id = session_row.creation_id or ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If creation_id is expected to become required in the future → add a comment.
If it remains optional → let’s throw an exception.

result = await conn.execute(query)
row = result.first()
if row is None:
raise ValueError("Unknown owner access key")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make a new exception

result = await db_sess.execute(query)
row = result.first()
if row is None:
raise ValueError("Unknown owner access key")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make a new exception

open_to_public=self.open_to_public,
created_at=self.created_at,
open_to_public=self.open_to_public if self.open_to_public is not None else False,
created_at=self.created_at or datetime.now(timezone.utc),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of the default values don’t make sense.
We’ll need to fix all of them along with a DB migration.

"status_changed": now,
"status_history": sql_json_merge(
VFolderRow.status_history,
vfolders.c.status_history,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQL JSON merge needs to operate on tables not ORM, so access to the .c files is required.

HyeockJinKim and others added 4 commits January 9, 2026 12:40
- test_idle_checker.py: Replace dict with mock_row() helper that supports
  attribute access, matching actual SQLAlchemy Row behavior
- agent/row.py: Cast id to AgentId in to_data() for type consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Use row._mapping["column"] instead of row["column"] for dict-style access
- Add explicit null checks for row results
- Add missing required_slots field to kernel test data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- test_check_presets.py: Restore resource_policy dict with total_resource_slots
  and default_for_unspecified keys (reverts incorrect simplification)
- test_auth_repository.py: Add missing need_password_change field to UserRow fixture
- repository.py, db_source.py: Fix resource_policy type from Mapping[str, str]
  to Mapping[str, Any] to match actual API contract

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@HyeockJinKim HyeockJinKim force-pushed the deps/upgrade-sqlalchemy-v2 branch from 978e9fa to c1edd80 Compare January 9, 2026 04:34
@github-actions github-actions bot added size:L 100~500 LoC and removed size:XL 500~ LoC labels Jan 9, 2026

route_id: UUID
session_id: UUID
kernel_host: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay kernel_host to be nullable?

Comment on lines 628 to 631
resource_policy_dict = {
"total_resource_slots": kp_policy.total_resource_slots.to_json(),
"default_for_unspecified": str(kp_policy.default_for_unspecified),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file default_for_unspecified field in resource_policy_dict has been gone. Is it okay?

Comment on lines 739 to 742
resource_policy_dict = {
"total_resource_slots": kp_policy.total_resource_slots.to_json(),
"default_for_unspecified": str(kp_policy.default_for_unspecified),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Comment on lines +84 to +92
case ResourcePresetData():
shared_memory = str(row.shared_memory) if row.shared_memory is not None else None
return cls(
id=row.id,
name=row.name,
resource_slots=row.resource_slots.to_json(),
shared_memory=shared_memory,
scaling_group_name=row.scaling_group_name,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that data is not being received correctly in from_row. If it's for existing operation compatibility, it would be better to fix it in subsequent tasks.

Comment on lines 119 to 214
if not session.main_kernel.image or not session.main_kernel.architecture:
error_msg = f"Session {manifest.session_id} main kernel has no image or architecture"
log.error(error_msg)
return CommitSessionResult(error_message=error_msg)
image_row = await self._session_repository.resolve_image([
ImageIdentifier(session.main_kernel.image, session.main_kernel.architecture)
])
base_image_ref = image_row.image_ref

# Build new image canonical name
filtered_tag_set = [
x for x in base_image_ref.tag.split("-") if not x.startswith("customized_")
]

if base_image_ref.name == "":
new_name = base_image_ref.project
else:
new_name = base_image_ref.name

new_canonical = f"{manifest.registry_hostname}/{manifest.registry_project}/{new_name}:{'-'.join(filtered_tag_set)}"

# Check for existing customized image
existing_row = await self._session_repository.get_existing_customized_image(
new_canonical,
manifest.image_visibility.value,
manifest.image_owner_id,
manifest.image_name,
)

customized_image_id: str
kern_features: list[str]
if existing_row is not None:
existing_image: ImageRow = existing_row
labels = existing_image.labels or {}
kern_features_str = labels.get(LabelName.FEATURES, DEFAULT_KERNEL_FEATURE)
kern_features = (
kern_features_str.split() if kern_features_str else [DEFAULT_KERNEL_FEATURE]
)
customized_image_id = labels.get(LabelName.CUSTOMIZED_ID, str(uuid.uuid4()))
log.debug("reusing existing customized image ID {}", customized_image_id)
else:
kern_features = [DEFAULT_KERNEL_FEATURE]
customized_image_id = str(uuid.uuid4())
# Remove PRIVATE label for customized images
kern_features = [
feat for feat in kern_features if feat != KernelFeatures.PRIVATE.value
]

new_canonical += f"-customized_{customized_image_id.replace('-', '')}"

from ai.backend.common.docker import ImageRef

new_image_ref = ImageRef.from_image_str(
new_canonical,
None,
manifest.registry_hostname,
architecture=base_image_ref.architecture,
is_local=base_image_ref.is_local,
)

# Prepare image labels
image_labels: dict[str | LabelName, str] = {
LabelName.CUSTOMIZED_OWNER: f"{manifest.image_visibility.value}:{manifest.image_owner_id}",
LabelName.CUSTOMIZED_NAME: manifest.image_name,
LabelName.CUSTOMIZED_ID: customized_image_id,
LabelName.FEATURES: " ".join(kern_features),
}
match manifest.image_visibility:
case CustomizedImageVisibilityScope.USER:
image_labels[LabelName.CUSTOMIZED_USER_EMAIL] = manifest.user_email

# Commit session
log.info("Committing session {}", manifest.session_id)
resp = await self._agent_registry.commit_session(
session,
new_image_ref,
extra_labels=image_labels,
)
bgtask_id = cast(uuid.UUID, resp["bgtask_id"])

# Wait for commit to complete
await self._wait_for_agent_bgtask(bgtask_id, "Commit")

# Push image to registry if not local
if not new_image_ref.is_local:
log.info("Pushing image to registry")
image_registry = ImageRegistry(
name=manifest.registry_hostname,
url=str(registry_conf.url),
username=registry_conf.username,
password=registry_conf.password,
)
if not session.main_kernel.agent:
error_msg = f"Session {manifest.session_id} main kernel has no agent assigned"
log.error(error_msg)
return CommitSessionResult(error_message=error_msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, wouldn't raising an error be the correct approach?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven’t touched the existing implementation unless absolutely necessary. I’ll be sending you requests one at a time.

description=row.description,
is_active=row.status == UserStatus.ACTIVE,
status=row.status,
status=row.status or UserStatus.ACTIVE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be safer to set it to INACTIVE if there is no status?

sa.select(PermissionGroupRow).where(PermissionGroupRow.scope_id == str(user_id))
)
if permission_group is None:
raise ValueError(f"Permission group not found for user_id={user_id}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to raise Custom Error instead of ValueError

class UserData:
uuid: uuid.UUID
username: str
username: Optional[str]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change it to str | None

Comment on lines -827 to -828
SessionScheduledAnycastEvent(sess_ctx.id, sess_ctx.creation_id),
SessionScheduledBroadcastEvent(sess_ctx.id, sess_ctx.creation_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think creation_id should not be None here

Comment on lines +1006 to +1008
AgentId(kernel.agent) if kernel.agent else None,
kernel.agent_addr or "",
kernel.scaling_group or "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think agent_addr and scaling_group is also should not be None

)

if result.rowcount > 0:
if cast(CursorResult, result).rowcount > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cast need in such case?

)

deployment_policy = relationship(
deployment_policy: Mapped[DeploymentPolicyRow | None] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deployment_policies table has endpoint column which refers endpoint record. I think this type notation should be list[DeploymentPolicyRow], not DeploymentPolicyRow | None

)

auto_scaling_policy = relationship(
auto_scaling_policy: Mapped[DeploymentAutoScalingPolicyRow | None] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

back_populates="owned_endpoints",
foreign_keys=[session_owner],
primaryjoin=lambda: foreign(EndpointRow.session_owner) == UserRow.uuid,
primaryjoin="foreign(EndpointRow.session_owner) == UserRow.uuid",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not set a string value to primaryjoin

back_populates="created_endpoints",
foreign_keys=[created_user],
primaryjoin=lambda: foreign(EndpointRow.created_user) == UserRow.uuid,
primaryjoin="foreign(EndpointRow.created_user) == UserRow.uuid",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

"ImageRow",
primaryjoin=lambda: foreign(EndpointRow.image) == ImageRow.id,
foreign_keys=[image],
primaryjoin="foreign(EndpointRow.image) == ImageRow.id",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Member

@fregataa fregataa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many false non-nullable fields.

# This is correct because the foreign key constraint
# ensures that `company_id` refers a valid company record.
# But we should check if the foreign key constraint has a ondelete=set null option.
company_id: Mapped[UUID] = mapped_column("company_id", ForeignKey("companies.id"))
company_row: Mapped[CompanyRow] = relationship()

# This is incorrect because the `company_id` can refer a record that has already removed.
# So, the `company_row` should be nullable.
company_id: Mapped[UUID] = mapped_column("company_id", GUID)
company_row: Mapped[CompanyRow] = relationship()

)

huggingface_registry = relationship(
huggingface_registry: Mapped[HuggingFaceRegistryRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nullable (because there is no cascade option)

)

reservoir_registry = relationship(
reservoir_registry: Mapped[ReservoirRegistryRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here


# Relationships (without FK constraints)
endpoint_row = relationship(
endpoint_row: Mapped[EndpointRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nullalble


# Relationships (without FK constraints)
endpoint_row = relationship(
endpoint_row: Mapped[EndpointRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

primaryjoin=_get_endpoint_join_condition,
)
image_row = relationship(
image_row: Mapped[ImageRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

foreign_keys=[channel_id],
)
creator = relationship(
creator: Mapped[UserRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nullable

)

container_registry_row = relationship(
container_registry_row: Mapped[ContainerRegistryRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nullable

)

group_row = relationship(
group_row: Mapped[GroupRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here


@property
def mapping(self) -> dict[str, Any]:
def mapping(self) -> dict[str, object]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it dict[str, object] type?

namespace: Mapped[str] = mapped_column("namespace", sa.String, nullable=False)

object_storage_row = relationship(
object_storage_row: Mapped[ObjectStorageRow] = relationship(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nullable

await valkey_live.update_connection_tracker(str(kernel_id), service, stream_id)
await root_ctx.idle_checker_host.update_app_streaming_status(
kernel_id,
session_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay to change from kernel_id to session_id ?

if remaining_count == 0:
await root_ctx.idle_checker_host.update_app_streaming_status(
kernel_id,
session_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

conn_tracker_lock: asyncio.Lock
conn_tracker_gc_task: asyncio.Task
active_session_ids: defaultdict[SessionId, int]
active_session_ids: defaultdict[KernelId, int]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be active_session_ids: defaultdict[SessionId, int] or change variable name

session_id=session_row.id,
access_key=session_row.access_key,
creation_id=session_row.creation_id,
access_key=AccessKey(session_row.access_key) if session_row.access_key else AccessKey(""),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is right to raise error when access_key is None

terminated_at=now,
status_history=sql_json_merge(
SessionRow.status_history,
SessionRow.__table__.c.status_history,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a pattern using __table__.c.in this file. Is it okay?

terminated_at=now,
status_history=sql_json_merge(
SessionRow.status_history,
SessionRow.__table__.c.status_history,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

kern.status_info = destroy_reason
kern.status_history = sql_json_merge(
KernelRow.status_history,
KernelRow.__table__.c.status_history,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Member

@fregataa fregataa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up: we should update all plugins

@HyeockJinKim HyeockJinKim enabled auto-merge January 9, 2026 07:09
@HyeockJinKim HyeockJinKim added this pull request to the merge queue Jan 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 9, 2026
@HyeockJinKim HyeockJinKim force-pushed the deps/upgrade-sqlalchemy-v2 branch from 7362e63 to 90207f7 Compare January 9, 2026 08:11
@HyeockJinKim HyeockJinKim added this pull request to the merge queue Jan 9, 2026
Merged via the queue into main with commit e58da41 Jan 9, 2026
31 checks passed
@HyeockJinKim HyeockJinKim deleted the deps/upgrade-sqlalchemy-v2 branch January 9, 2026 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:docs Documentations comp:app-proxy Related to App Proxy component comp:cli Related to CLI component comp:manager Related to Manager component size:L 100~500 LoC