@shaohuzhang1
Contributor

fix: The image uploaded from the workflow knowledge base zip file cannot be parsed

@f2c-ci-robot

f2c-ci-robot bot commented Dec 12, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@f2c-ci-robot

f2c-ci-robot bot commented Dec 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

        if any([True for item in end if lower_name.endswith(item)]):
            return False
        buffer = get_buffer(file)
        result = detect(buffer)
Contributor Author

The support method contains an incorrect condition. The line if '.' in file_name: only tests whether the name contains a dot; it does not check the file against the supported document types specified later in the class.
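
To illustrate the point, here is a minimal sketch of why a dot test alone cannot tell file types apart. The original support method is not shown in this thread, so dot_only_check is a hypothetical stand-in for the quoted condition, and extension_of is an assumed helper name, not code from the pull request.

```python
import os


def dot_only_check(file_name: str) -> bool:
    # Hypothetical stand-in for the quoted condition: true for any dotted name.
    return '.' in file_name


def extension_of(file_name: str) -> str:
    # os.path.splitext returns (root, ext); ext keeps its leading dot.
    return os.path.splitext(file_name)[1].lower()


names = ["notes.md", "photo.PNG", "clip.mp4", "README"]
dot_results = {name: dot_only_check(name) for name in names}
extensions = {name: extension_of(name) for name in names}

print(dot_results)
print(extensions)
```

The dot test gives the same answer for a Markdown file, an image, and a video, whereas the extracted extension can be compared against per-type lists.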

Here's a corrected version of the code:

#!/usr/bin/env python
from typing import List

import os

class BaseSplitHandle:
    pass

def detect(buffer: bytes):
    # Placeholder function to mimic actual detection logic, should be implemented
    raise NotImplementedError("Implement text detection")

file_extensions_support_md: List[str] = [
    ".md",  ".txt",  ".TXT",  ".MD"
]

file_extensions_support_video: List[str] = [
    ".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".webm", ".mpeg", ".mpg", ".3gp", ".ts", ".rmvb",
    ".mp3", ".wav", ".flac", ".aac", ".ogg", ".m4a", ".wma", ".opus", ".alac", ".aiff", ".amr",
]

file_extensions_support_images: List[str] = [
    ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp", ".heif", ".raw", ".ico", ".svg", ".pdf"
]

class TextSplitHandle(BaseSplitHandle):
    def support(self, file, get_buffer):
        # os.path.splitext keeps the leading dot, so the value can be
        # compared directly with the dotted entries in the lists above.
        original_file_extension = os.path.splitext(file.name)[1].lower()

        # Office documents and Python sources are left to other handlers.
        if original_file_extension in (".docx", ".xlsx", ".py"):
            return False

        # Audio and video files are not plain text.
        for ext in file_extensions_support_video:
            if original_file_extension == ext:
                return False

        # Image files are not plain text either.
        for ext in file_extensions_support_images:
            if original_file_extension == ext:
                return False

        return True

Changes made:

  1. Corrected Condition: Removed the condition that only checked for a dot (.) in file_name. The file_extensions_support_md list records the Markdown and plain-text extensions (.md, .txt, .TXT, .MD), although support does not consult it.

  2. Improved Logic:

    • Stored the lowercased extension of file.name in original_file_extension, so every check compares against the same value.
    • Returned False when the extension matches an entry in the audio/video list.
    • Returned False likewise when the extension matches an entry in the image list.

The intent is that support returns False for files whose extensions belong to the document, audio/video, or image types, so those files are left to their own handlers, and returns True for everything else.
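
As a point of comparison, the same exclusion can be sketched with a suffix test on the lowercased file name, in the spirit of the lower_name.endswith(item) line in the diff context above. This is only a sketch under stated assumptions: EXCLUDED_SUFFIXES is a hypothetical, shortened list and is_text_candidate is an illustrative name, neither taken from the merged code.

```python
from typing import Tuple

# Hypothetical, shortened suffix list used only for this illustration.
EXCLUDED_SUFFIXES: Tuple[str, ...] = (
    ".docx", ".xlsx", ".mp4", ".mp3", ".png", ".jpg",
)


def is_text_candidate(file_name: str) -> bool:
    # Lowercase once so "IMAGE.PNG" and "image.png" are treated alike.
    lower_name = file_name.lower()
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return not lower_name.endswith(EXCLUDED_SUFFIXES)


print(is_text_candidate("notes.md"))
print(is_text_candidate("IMAGE.PNG"))
```

Passing a tuple to str.endswith avoids building an intermediate list, and only the final suffix counts, so a name such as archive.png.txt is still treated as a text candidate.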

@shaohuzhang1 shaohuzhang1 merged commit d5e04a1 into v2 Dec 12, 2025
3 of 6 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@v2@fix_text_split_handle branch December 12, 2025 07:08
liuruibin pushed a commit that referenced this pull request Dec 12, 2025
2 participants