Commit 79bc3f7

fix: make code detection more forgiving for technical text
Removed ambiguous keywords (var, public, private) that match common words in technical discussions and transcripts. Increased threshold from 3 to 5 keyword hits and tightened combination logic to require both indented code AND structural tokens, not just one. This allows technical content, podcasts, and documentation to be summarized while still catching actual code files.
1 parent 1ae6d41 commit 79bc3f7
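
As a rough illustration of why the removed keywords were a problem, the sketch below counts substring hits the same way the diff does (`lowered.count(keyword)`). It uses only the keywords visible in this diff; the full `code_keywords` tuple is partly hidden between hunks, so the two lists are an assumption, and `count_hits` and the sample sentence are hypothetical.

```python
# Keywords visible in the diff only; the real tuple in detect.py has
# additional entries that the hunks do not show.
KEPT = ("def ", "#!/", "import ", "const ")
REMOVED = (" var ", "public ", "private ")


def count_hits(text: str, keywords: tuple[str, ...]) -> int:
    """Count case-insensitive substring occurrences, as detect.py does."""
    lowered = text.lower()
    return sum(lowered.count(keyword) for keyword in keywords)


# Ordinary technical prose, not code.
prose = (
    "The public API keeps private state, and the public docs "
    "say a var declaration is discouraged."
)

old_hits = count_hits(prose, KEPT + REMOVED)  # keyword list before the commit
new_hits = count_hits(prose, KEPT)            # keyword list after the commit

print(old_hits, new_hits)
```

With the old list, this single sentence already reaches the previous `code_hits >= 3` threshold; with the trimmed list it registers no hits at all.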

3 files changed

+5
-7
lines changed

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "clipdrop"
-version = "1.6.6"
+version = "1.6.7"
 description = "Save clipboard content (text and images) to files with smart format detection"
 readme = "README.md"
 authors = [

src/clipdrop/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 """ClipDrop - Save clipboard content (text and images) to files with smart format detection."""

-__version__ = "1.6.0"
+__version__ = "1.6.7"

src/clipdrop/detect.py

Lines changed: 3 additions & 5 deletions
@@ -29,6 +29,7 @@ def is_summarizable_content(content: str, detected_format: str) -> tuple[bool, s
         return False, SINGLE_PASS_LIMIT_REASON

     # Heuristic filter to avoid passing obvious code snippets.
+    # Use strong indicators only - be forgiving to allow technical text.
     lowered = content.lower()
     code_keywords = (
         "def ",
@@ -38,9 +39,6 @@ def is_summarizable_content(content: str, detected_format: str) -> tuple[bool, s
         "#!/",
         "import ",
         "const ",
-        " var ",
-        "public ",
-        "private ",
     )
     code_hits = sum(lowered.count(keyword) for keyword in code_keywords)
     fenced_code = "```" in content or "</code>" in lowered
@@ -50,8 +48,8 @@ def is_summarizable_content(content: str, detected_format: str) -> tuple[bool, s
     )
     structural_tokens = any(token in content for token in ("{", "};", "=>", "#include"))

-    if fenced_code or code_hits >= 3 or (
-        code_hits >= 2 and (indented_code or structural_tokens)
+    if fenced_code or code_hits >= 5 or (
+        code_hits >= 3 and indented_code and structural_tokens
     ):
         return False, "Content appears to be code"

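The change to the final condition can be compared in isolation. The following is a minimal sketch, not the project's code: `looks_like_code_old` and `looks_like_code_new` are hypothetical names, and the four inputs stand in for the values `detect.py` computes (`code_hits`, `fenced_code`, `indented_code`, `structural_tokens`).

```python
def looks_like_code_old(code_hits: int, fenced_code: bool,
                        indented_code: bool, structural_tokens: bool) -> bool:
    """Condition before the commit: 3 hits, or 2 hits plus either signal."""
    return bool(
        fenced_code
        or code_hits >= 3
        or (code_hits >= 2 and (indented_code or structural_tokens))
    )


def looks_like_code_new(code_hits: int, fenced_code: bool,
                        indented_code: bool, structural_tokens: bool) -> bool:
    """Condition after the commit: 5 hits, or 3 hits plus both signals."""
    return bool(
        fenced_code
        or code_hits >= 5
        or (code_hits >= 3 and indented_code and structural_tokens)
    )


# Hypothetical transcript: three keyword hits and a stray "{",
# but no indented code lines.
transcript = dict(code_hits=3, fenced_code=False,
                  indented_code=False, structural_tokens=True)

# Hypothetical source file: three hits, indentation, and braces.
source_file = dict(code_hits=3, fenced_code=False,
                   indented_code=True, structural_tokens=True)

print(looks_like_code_old(**transcript), looks_like_code_new(**transcript))
print(looks_like_code_old(**source_file), looks_like_code_new(**source_file))
```

Under these assumed inputs the transcript is rejected by the old condition and accepted by the new one, while the source file is rejected by both. Fenced code still short-circuits to a rejection in either version.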