Closed
Labels
💪 enhancement: New feature or request
Description
Self Checks
- I have read the Contributing Guide and Language Policy.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report, otherwise it will be closed.
- Please do not modify this template :) and fill in all the required fields.
1. Is this request related to a challenge you're experiencing? Tell me about your story.
When running file cleanup/maintenance commands, it’s hard to understand why a file is considered “in use”, and where it is referenced.
Today we have two commands in `api/commands.py` that define the scope of “in-use” files:
- `flask clear-orphaned-file-records` (DB-level)
  - Base tables: `upload_files (id, key)` and `tool_files (id, file_key)`
  - A file is treated as referenced if its UUID appears in one of these places (either equality match or UUID found in text/JSON):
    - `message_files.upload_file_id` (and relation to `messages.id`)
    - `documents.data_source_info`
    - `document_segments.content`
    - `messages.answer`, `messages.inputs` (json), `messages.message` (json)
    - `workflow_node_executions.inputs`, `workflow_node_executions.process_data`, `workflow_node_executions.outputs` (json)
    - `conversations.introduction`, `conversations.system_instruction`
    - `accounts.avatar`
    - `apps.icon`
    - `sites.icon`
  - It deletes file records that are not referenced anywhere above.
- `flask remove-orphaned-files-on-storage` (storage-level)
  - Base tables: `upload_files`, `tool_files` (keys only)
  - Scans storage directories: `image_files`, `tools`, `upload_files`
  - Deletes storage objects that do not exist in DB base tables (does not check business references).
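To make the "equality match or UUID found in text/JSON" rule concrete, here is a minimal sketch of how such matching could work. It is not the code in `api/commands.py`; the function name `find_file_ids`, the regex, and the assumption that file ids are stored as canonical lowercase UUID strings are all mine, for illustration only.

```python
import json
import re

# Assumption: file ids are canonical 8-4-4-4-12 hex UUIDs.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def find_file_ids(value, known_ids):
    """Return the known file ids that appear in a column value.

    `value` may be a plain string (an exact id or an id embedded in text)
    or an already-decoded JSON structure, which is serialized before scanning.
    `known_ids` is a set of lowercase UUID strings from the base tables.
    """
    if value is None:
        return set()
    text = value if isinstance(value, str) else json.dumps(value)
    # Collect every UUID-shaped substring, then keep only real file ids.
    candidates = {match.lower() for match in UUID_RE.findall(text)}
    return candidates & known_ids
```

The same helper covers both cases in the list above: an equality column such as `message_files.upload_file_id` is just a string that contains exactly one UUID, while text/JSON columns may contain several.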
Problem: There is no command to inspect usage, i.e. to show which table/field and which record id reference a given file.
This makes troubleshooting difficult when files are unexpectedly removed or unexpectedly retained.
Related context: #11835
2. Additional context or comments
Proposed enhancement
Add a new Flask CLI command (similar style to existing commands) to query file usages and print a human-readable list.
Example command name (suggestion):
- `flask query-file-usages`
- or `flask show-file-usages`
- or `flask file-usage`
Expected behavior
- Reuse the same “in-use” definition as `clear-orphaned-file-records` (same reference columns, same matching strategy).
- Output a list of references with consistent columns:
  - `src` (table.column, e.g. `documents.data_source_info`)
  - `record_id` (primary key of the referencing row)
  - `file_id` (matched file UUID)
  - `key` (resolved storage key from `upload_files.key` or `tool_files.file_key`)
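As a rough illustration of the output columns, the sketch below models one usage row and builds rows for a single source column. `FileUsage`, `collect_usages`, and the `(record_id, text)` row shape are hypothetical names and stand-ins for a real query result, and the plain substring check is a simplification of whatever matching `clear-orphaned-file-records` actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileUsage:
    src: str        # "table.column", e.g. "documents.data_source_info"
    record_id: str  # primary key of the referencing row
    file_id: str    # matched file UUID
    key: str        # storage key from upload_files.key or tool_files.file_key


def collect_usages(src, rows, key_by_file_id):
    """Build usage records for one source column.

    rows: iterable of (record_id, text) pairs for that column.
    key_by_file_id: dict mapping file UUID -> storage key.
    """
    usages = []
    for record_id, text in rows:
        if not text:
            continue  # NULL or empty column values cannot reference a file
        for file_id, key in key_by_file_id.items():
            if file_id in text:
                usages.append(FileUsage(src, record_id, file_id, key))
    return usages
```

Running this once per reference column and concatenating the results would give the flat list of `src` / `record_id` / `file_id` / `key` rows described above.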
Suggested options (nice to have)
- `--file-id <uuid>`: show usage for a specific file UUID
- `--key <file_key>`: show usage for a specific storage key
- `--src <pattern>`: filter by a specific table/field (optional)
- `--limit N` / `--offset N` (optional)
- Output format:
  - default: table-like console output
  - optional: `--json` for scripting
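For the two output formats, something along these lines might be enough. This is only a sketch: `render_usages` and its `as_json` parameter (mirroring the suggested `--json` flag) are hypothetical, and each usage is assumed to be a plain dict with the four columns.

```python
import json


def render_usages(usages, as_json=False):
    """Render usage rows (dicts with src, record_id, file_id, key) as text."""
    columns = ["src", "record_id", "file_id", "key"]
    if as_json:
        # Machine-readable output for scripting.
        return json.dumps([{c: u[c] for c in columns} for u in usages], indent=2)
    # Table-like output: pad each column to its widest value.
    widths = {c: max([len(c)] + [len(str(u[c])) for u in usages]) for c in columns}
    lines = ["  ".join(c.ljust(widths[c]) for c in columns)]
    for u in usages:
        lines.append("  ".join(str(u[c]).ljust(widths[c]) for c in columns))
    return "\n".join(line.rstrip() for line in lines)
```

Keeping rendering separate from the query logic would let `--limit` / `--offset` and `--src` filtering happen before formatting, so both formats always show the same rows.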
Why this helps
- Operators can quickly locate unexpected references (especially UUIDs embedded in text/JSON).
- Safer cleanup workflows: verify a file’s real usage before deletion.
- Easier debugging when `clear-orphaned-file-records` finds something “still in use” but the reason is unclear.
3. Can you help us with this feature?
- I am interested in contributing to this feature.