Skip to content

hook: refactor to avoid write status to disk#12

Merged
gaius-qi merged 1 commit intomodelpack:mainfrom
imeoer:refactor-progress-status
Nov 25, 2025
Merged

hook: refactor to avoid write status to disk#12
gaius-qi merged 1 commit intomodelpack:mainfrom
imeoer:refactor-progress-status

Conversation

@imeoer
Copy link
Collaborator

@imeoer imeoer commented Nov 25, 2025

The hook currently executes during each layer pull, updating progress info to the status file, which is a disk write operation that may impact model pull concurrency on lower-performance disks.

This PR refactors the hook implementation to avoid writing per-layer progress to the disk-based status file, instead keeping it cached in memory, users can still get the model pull progress via the API as before.


This pull request refactors how progress tracking for model pulls is handled, moving responsibility from the service package to a new status.Hook and introducing a centralized HookManager for managing pull progress state. The changes simplify the codebase, improve separation of concerns, and ensure more consistent and thread-safe access to pull progress. The most important changes are grouped below.

Refactoring and Code Organization

  • Moved the Hook implementation from pkg/service/puller.go to a new file pkg/status/hook.go, and replaced all references to service.Hook with status.Hook throughout the codebase. This includes updating the Puller interface, constructors, and related usages. [1] [2] [3] [4] [5] [6]
  • Removed the progress callback mechanism from the pull logic, and now progress is tracked and accessed via the new HookManager. [1] [2] [3]

Progress Tracking and Status Management

  • Introduced HookManager in pkg/status/hook.go and added it to StatusManager in pkg/status/status.go for centralized management of pull progress hooks, including methods to set, get, and delete hooks by key (typically the path to the status file). [1] [2] [3]
  • Updated the status retrieval logic so that when StatusManager.Get() is called, it fetches the latest progress from the HookManager and attaches it to the returned status. [1] [2]

Worker and Model Pull Logic

  • Refactored the worker logic to use the new status.Hook and HookManager for progress tracking; progress is no longer passed around manually but is managed centrally. Also, progress is no longer set directly during status updates, but is attached automatically when status is retrieved. [1] [2]
  • Ensured that when a model is deleted, the corresponding progress hook is also removed from the HookManager, preventing stale progress data.

These changes improve modularity, thread safety, and maintainability of the codebase, especially around model pull progress tracking.

@imeoer imeoer changed the title [WIP] hook: refactor to avoid write status to disk hook: refactor to avoid write status to disk Nov 25, 2025
@imeoer imeoer marked this pull request as ready for review November 25, 2025 03:58
@imeoer
Copy link
Collaborator Author

imeoer commented Nov 25, 2025

@gemini-code-assist review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively refactors the model pull progress tracking to improve performance by avoiding disk writes for each layer. The introduction of HookManager to cache progress in memory is a good approach. The code is cleaner and the concerns are better separated.

I've found a few issues:

  1. A critical bug in deleteModel that will cause a memory leak due to incorrect statusPath calculation when deleting a hook from HookManager.
  2. A high-severity regression where the functionality to detect and log long-running layer pulls (checkLongPulling) has been removed.
  3. A medium-severity issue with some dead code in StatusManager.Get due to an incorrect nil check.

I've provided detailed comments and suggestions for each of these points. Once these are addressed, the PR should be in great shape.

@imeoer imeoer force-pushed the refactor-progress-status branch 2 times, most recently from cbdeb93 to 0a63e33 Compare November 25, 2025 06:08
The hook currently executes during each layer pull, updating
progress info to the status file, which is a disk write operation
that may impact model pull concurrency on lower-performance disks.

This PR refactors the hook implementation to avoid writing per-layer
progress to the disk-based status file, instead keeping it cached
in memory, users can still get the model pull progress via the API
as before.

Signed-off-by: imeoer <[email protected]>
@imeoer imeoer force-pushed the refactor-progress-status branch from 0a63e33 to a937c62 Compare November 25, 2025 09:05
Copy link
Member

@gaius-qi gaius-qi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@chlins chlins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@gaius-qi gaius-qi merged commit 0658ae6 into modelpack:main Nov 25, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants