Merged
2 changes: 1 addition & 1 deletion apps/common/handle/impl/text/text_split_handle.py
@@ -32,7 +32,7 @@ def support(self, file, get_buffer):
         if file_name.endswith(".md") or file_name.endswith('.txt') or file_name.endswith('.TXT') or file_name.endswith(
                 '.MD'):
             return True
-        if file_name.index('.') > 0:
+        if '.' in file_name:
             return False
         buffer = get_buffer(file)
         result = detect(buffer)

The snippet comes from a method named support that decides whether a file can be handled as plain text. It returns True when the file name ends with .md, .txt, .TXT, or .MD. For any other name that contains a period it returns False. Only names without a period fall through to content-based detection via get_buffer and detect.

There are no immediate issues with this logic, but here are some general suggestions for improving readability and maintainability:

  1. Consistent Extension Handling: Compare extensions case-insensitively. The current check accepts .md, .MD, .txt, and .TXT but rejects mixed-case variants such as .Md or .Txt, which is probably not intended.

  2. String Methods: The diff replaces file_name.index('.') > 0 with '.' in file_name. This is more than a readability change: str.index raises ValueError when the name contains no period, and the > 0 comparison is False for names that start with a period.
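
The difference between the two checks in the diff can be seen in a short sketch. The helper names has_dot_index and has_dot_in are hypothetical and exist only for illustration; they are not part of the project.

```python
def has_dot_index(file_name: str) -> bool:
    # Old check: str.index raises ValueError when '.' is absent,
    # and returns 0 for names like '.gitignore', so '> 0' is False.
    return file_name.index('.') > 0


def has_dot_in(file_name: str) -> bool:
    # New check: a plain membership test that never raises.
    return '.' in file_name


for name in ("report.pdf", ".gitignore", "README"):
    try:
        old = has_dot_index(name)
    except ValueError:
        old = "ValueError"
    print(f"{name!r}: index-based={old}, in-based={has_dot_in(name)}")
```

A name without any period, such as README, is the case that made the old check fail before the content-based detection could run.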

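Here's a revised version of the method with a case-insensitive extension check:

def support(self, file, get_buffer):
    # Valid text file extensions, compared in lowercase
    valid_extensions = ('.md', '.txt')

    # Get the file name (assumes the file object exposes a name attribute)
    file_name = file.name

    # str.endswith accepts a tuple, so one call covers every extension
    if file_name.lower().endswith(valid_extensions):
        return True

    # Any other name that contains a period is rejected, as in the diff
    if '.' in file_name:
        return False

    # No extension: inspect the content instead.
    # detect is assumed to be imported by the surrounding module.
    buffer = get_buffer(file)
    result = detect(buffer)

    # The diff does not show what the original method does with result
    return result

Key Improvements:

  • Case Insensitivity: Use file_name.lower() so that .Md and .Txt are accepted alongside .md, .MD, .txt, and .TXT.
  • Tuple of Valid Extensions: Store the valid extensions in a tuple and pass it to a single endswith call instead of chaining four checks with or.
  • No Splitting by Dot: Avoid file_name.split('.')[1]. It yields the extension without its leading period, picks the wrong segment for names with several periods, and raises IndexError when there is no period at all.

These changes keep the behavior of the merged diff and additionally accept mixed-case extensions.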
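
To try the name-based part of this logic in isolation, here is a minimal sketch. The helper name_based_support and the constant TEXT_EXTENSIONS are hypothetical; the sketch returns None where the real method would go on to inspect the file content, since that part depends on get_buffer and detect, which are not shown in full in the diff.

```python
from typing import Optional

# Hypothetical constant: the extensions accepted by name alone.
TEXT_EXTENSIONS = ('.md', '.txt')


def name_based_support(file_name: str) -> Optional[bool]:
    """Return True or False when the name alone decides the outcome,
    and None when the file content would have to be inspected."""
    # Case-insensitive suffix check; endswith accepts a tuple.
    if file_name.lower().endswith(TEXT_EXTENSIONS):
        return True
    # Any other name containing a period is rejected.
    if '.' in file_name:
        return False
    # No period at all: the caller would fall back to content detection.
    return None


for name in ("notes.md", "NOTES.Txt", "archive.tar.gz", "README"):
    print(name, name_based_support(name))
```

Returning None for the undecided case keeps the sketch free of any assumption about how the detection result is interpreted.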
