Version 2.0.0b5 #114

Open
cdgriffith wants to merge 9 commits into master from develop

Conversation

cdgriffith (Owner) commented Jul 5, 2025

Add sndhdr tests #85
Add dynamic text type tests

Copilot AI review requested due to automatic review settings December 18, 2025 11:57
Copilot AI left a comment

Pull request overview

This PR, targeting version 2.0.0b5, adds sndhdr-based audio file tests and extends text type detection with dynamic checks. It also renames internal functions from private (underscore-prefixed) to public names, adds new scanner implementations, and improves test coverage.

Key Changes:

  • Added new sndhdr_scanner for audio file detection
  • Enhanced text scanner with CSV detection and format-specific checks
  • Improved Python scanner with better code pattern recognition
  • Added JSON array support to json_scanner
  • Refactored internal functions to use public naming
  • Added new test cases for audio formats and dynamic text types

Reviewed changes

Copilot reviewed 13 out of 28 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| test/test_scanners.py | Added sndhdr scanner tests and updated text scanner assertions to use lowercase text type names |
| test/test_main.py | Removed deprecated imghdr compatibility tests |
| test/test_common_extensions.py | Updated function calls to match renamed public API |
| test/resources/system/test_list.json | Added test JSON file with array structure |
| test/resources/audio/*.mp3 | Added MP3 test files for audio detection |
| test/resources/audio/test.sndr | Added SNDR format test file |
| test/resources/audio/test.opus | Added Opus format test file |
| pyproject.toml | Disabled black and flake8 linting in favor of Ruff |
| puremagic/scanners/text_scanner.py | Major refactoring with CSV detection and dynamic format checks |
| puremagic/scanners/sndhdr_scanner.py | New scanner for audio formats |
| puremagic/scanners/python_scanner.py | Enhanced Python code detection with keyword/pattern matching |
| puremagic/scanners/json_scanner.py | Added support for JSON arrays |
| puremagic/main.py | Version bump, function renames, new scanner integration, error type changes |
| CHANGELOG.md | Updated with issue #85 reference |
| .gitignore | Added VS Code exclusions |

Comment on lines 99 to +104

#
# # Check for quotes that might indicate CSV
# has_quoted_fields = '"' in text and (f'"{best_delimiter}' in text or f'{best_delimiter}\"')
#

Copilot AI Dec 18, 2025

Commented-out code has been left in production code. Remove it if it is no longer needed, or complete and uncomment it if it serves a purpose. Leaving dead code in place makes the module harder to read and maintain.

Suggested change
- #
- # # Check for quotes that might indicate CSV
- # has_quoted_fields = '"' in text and (f'"{best_delimiter}' in text or f'{best_delimiter}\"')
- #
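
If the commented-out check is ever restored rather than deleted, note that the second operand of its `or` is a bare f-string with no `in text`, so that part of the expression is always truthy. A minimal sketch of the difference, assuming `text` and `best_delimiter` are plain strings; both helper names are hypothetical and not part of puremagic:

```python
def has_quoted_fields_as_written(text: str, best_delimiter: str) -> bool:
    # Mirrors the commented-out expression: the right-hand operand of `or`
    # is a non-empty string, so the parenthesised part is always truthy.
    return '"' in text and bool(f'"{best_delimiter}' in text or f'{best_delimiter}"')


def has_quoted_fields_fixed(text: str, best_delimiter: str) -> bool:
    # Both operands are membership tests against `text`.
    return '"' in text and (f'"{best_delimiter}' in text or f'{best_delimiter}"' in text)


sample = 'say "hi" to them'  # quotes present, but none adjacent to a delimiter
print(has_quoted_fields_as_written(sample, ","))  # True
print(has_quoted_fields_fixed(sample, ","))       # False
```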

Copilot uses AI. Check for mistakes.

      result = catch_all_deep_scan(filename, head, foot)
  except Exception:
-     pass
+     raise
Copilot AI Dec 18, 2025

The handler now re-raises every exception instead of silently passing, so errors that were previously swallowed will propagate to callers. That helps debugging, but it is a behavior change that could break existing code. The diff replaces 'pass' with 'raise'; if that is intentional, it should be documented.

Suggested change
- raise
+ # Intentionally suppress exceptions from catch-all deep scan to
+ # maintain backward-compatible behavior and continue fallback logic.
+ pass
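
The trade-off between the two versions can be sketched as follows. `deep_scan`, `identify_suppressing`, and `identify_reraising` are hypothetical stand-ins for illustration, not puremagic functions. The logging call is an assumption too: it shows one way to keep the fallback while still leaving a trace of the failure.

```python
import logging

logger = logging.getLogger("example")


def deep_scan(data: bytes) -> str:
    """Hypothetical scanner that fails on empty input."""
    if not data:
        raise ValueError("input was empty")
    return "match"


def identify_suppressing(data: bytes, fallback: str = "unknown") -> str:
    # Suppressing variant: swallow the error and continue with the fallback,
    # but log it so the failure is not completely silent.
    try:
        return deep_scan(data)
    except Exception:
        logger.debug("deep scan failed, using fallback", exc_info=True)
        return fallback


def identify_reraising(data: bytes) -> str:
    # Re-raising variant: the bare `raise` propagates the original exception.
    try:
        return deep_scan(data)
    except Exception:
        raise
```

Callers of the re-raising variant need their own `try`/`except` around the call, which is the compatibility concern raised above.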

      filename: os.PathLike | str,
      head=None,
      foot=None,
+     confidence=0,
Copilot AI Dec 18, 2025

The signature gains a new parameter, 'confidence=0'. A parameter with a default is normally backward compatible, but if this function is called elsewhere with positional arguments, or the parameter was inserted ahead of existing ones, those calls could change meaning. Verify that every call site still passes the arguments it intends.
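
A small sketch of the compatibility question, assuming the parameter was appended after the existing ones as the hunk suggests. `scan_before` and `scan_after` are hypothetical names, not the real function:

```python
def scan_before(filename, head=None, foot=None):
    """Hypothetical pre-change signature."""
    return (filename, head, foot)


def scan_after(filename, head=None, foot=None, confidence=0):
    """Hypothetical post-change signature: new parameter appended with a default."""
    return (filename, head, foot, confidence)


# Existing positional and keyword calls keep working unchanged.
assert scan_after("a.bin", b"head", b"foot")[:3] == scan_before("a.bin", b"head", b"foot")
assert scan_after("a.bin", foot=b"foot")[:3] == scan_before("a.bin", foot=b"foot")

# New callers can opt in explicitly.
print(scan_after("a.bin", confidence=0.5))
```

Had the parameter been inserted before `head` instead, the first positional call above would silently bind `b"head"` to `confidence`.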

  for i, result in enumerate(matches):
      if i == 0:
-         print("\n\tBest Match")
+         print("\n\tDeepscan Match" if int(result.confidence == 1) else "\n\tBest Match")
Copilot AI Dec 18, 2025

The condition 'int(result.confidence == 1)' casts a boolean to an int, so it is 1 (truthy) when confidence equals 1 and 0 (falsy) otherwise. It therefore selects "Deepscan Match" only when confidence is exactly 1, which appears to be the intent, but the 'int()' cast is redundant and obscures the comparison. Using 'result.confidence == 1.0' directly is clearer.

Suggested change
- print("\n\tDeepscan Match" if int(result.confidence == 1) else "\n\tBest Match")
+ print("\n\tDeepscan Match" if result.confidence == 1.0 else "\n\tBest Match")
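
The two forms of the condition can be compared directly. In this sketch `Result` is a hypothetical stand-in for the match object, reduced to the single `confidence` attribute used above:

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in for a match object with a confidence score."""
    confidence: float


def label_with_cast(result: Result) -> str:
    # Form used in the diff: int(True) == 1 and int(False) == 0.
    return "Deepscan Match" if int(result.confidence == 1) else "Best Match"


def label_direct(result: Result) -> str:
    # Same comparison without the int() cast.
    return "Deepscan Match" if result.confidence == 1.0 else "Best Match"


for value in (0.1, 0.9, 1, 1.0):
    r = Result(value)
    assert label_with_cast(r) == label_direct(r)
```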

    if 1700 <= year_int <= 3000:
        self.tags.append("ID3v1")
        return True
except ValueError:
Copilot AI Dec 18, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except ValueError:
+ except ValueError:
+     # If the year string cannot be converted to an int, treat it as an invalid/missing year.
+     # We intentionally ignore this and fall through to returning None below.
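
Another way to make the intent explicit is to isolate the year check in a helper that returns a boolean. This is only a sketch under assumptions: `is_plausible_id3v1_year` and `make_tag` are hypothetical names, and the code relies on the standard ID3v1 layout (128 bytes, 4-byte year at offset 93) rather than on the scanner's actual implementation.

```python
def is_plausible_id3v1_year(tag: bytes) -> bool:
    """Return True if a 128-byte ID3v1 tag carries a plausible year."""
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        return False
    year_field = tag[93:97].decode("ascii", errors="replace").strip("\x00 ")
    try:
        year_int = int(year_field)
    except ValueError:
        # Non-numeric or empty year: treat as "no valid year", not as an error.
        return False
    return 1700 <= year_int <= 3000


def make_tag(year: bytes) -> bytes:
    # Minimal tag: header, 90 bytes of text fields, year, comment, genre.
    return b"TAG" + b"\x00" * 90 + year + b"\x00" * 30 + b"\xff"


print(is_plausible_id3v1_year(make_tag(b"1999")))
print(is_plausible_id3v1_year(make_tag(b"\x00\x00\x00\x00")))
```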

        confidence=0.1,  # Lower confidence due to simple format
    )
except (IndexError, struct.error):
    pass
Copilot AI Dec 18, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- pass
+ # If the header is too short or malformed for an SNDR rate, treat it as
+ # not matching SNDR and fall through to other format detection.
+ return test_hcom(head)
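
The "too short or malformed" case can be illustrated in isolation. The sketch below assumes a little-endian 16-bit rate at offset 2 and an accepted range of 4000 to 25000; both the layout and the name `read_rate` are assumptions for illustration, not taken from the scanner.

```python
import struct
from typing import Optional


def read_rate(head: bytes) -> Optional[int]:
    """Return a little-endian 16-bit rate at offset 2, or None if unreadable."""
    try:
        (rate,) = struct.unpack_from("<H", head, 2)
    except struct.error:
        # Header too short to contain the field: report "no match" so the
        # caller can move on to the next format test.
        return None
    # Assumed plausibility range for a sample rate.
    return rate if 4000 <= rate <= 25000 else None
```

Returning `None` from the handler keeps the "no match" outcome visible in the code, which is the same goal the explanatory comment serves.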

try:
    dialect = csv.Sniffer().sniff(text, delimiters=''.join(potential_delimiters))
    csv_sniffer_result = dialect.delimiter
except Exception:
Copilot AI Dec 18, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except Exception:
+ except Exception:
+     # csv.Sniffer can fail on non-standard or truncated input; in that case
+     # we ignore the error and rely solely on the heuristic delimiter detection below.
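
A sketch of the sniff-then-fall-back pattern described in the suggested comment. The helper names and the count-based fallback are hypothetical, not the scanner's actual heuristic. The sketch catches `csv.Error`, which `csv.Sniffer` raises when it cannot determine a delimiter; whether other exceptions can occur for the scanner's inputs is not established here, so narrowing the `except` in the real code would need checking first.

```python
import csv
from typing import Optional


def sniff_delimiter(text: str, candidates: str = ",;\t|") -> Optional[str]:
    """Return the delimiter csv.Sniffer detects, or None if sniffing fails."""
    try:
        return csv.Sniffer().sniff(text, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not classify the input; callers fall back to
        # their own heuristics.
        return None


def count_based_delimiter(text: str, candidates: str = ",;\t|") -> Optional[str]:
    # Simple fallback heuristic: most frequent candidate character, if any.
    best = max(candidates, key=text.count)
    return best if text.count(best) else None


sample = "a;b;c\n1;2;3\n4;5;6\n"
delimiter = sniff_delimiter(sample) or count_based_delimiter(sample)
print(repr(delimiter))
```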
