Skip to content

Commit c7d648c

Browse files
Copilotwannaphong
andcommitted
Fix memory management and type annotation issues
- Fixed double free bug in lookahead logic (was freeing next_prefixes twice) - Corrected return type annotation for _get_default_dict_path() to Optional[str] - All tests still pass (12/12) - Tokenization results unchanged Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>
1 parent a7652a5 commit c7d648c

File tree

2 files changed

+10
-9
lines changed

2 files changed

+10
-9
lines changed

cthainlp/tokenize.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
_cthainlp = None
1212

1313

14-
def _get_default_dict_path() -> str:
14+
def _get_default_dict_path() -> Optional[str]:
1515
"""
1616
Get the default dictionary file path.
1717

src/newmm.c

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -162,18 +162,19 @@ static int segment_text(const char* text, Trie* trie, char*** tokens) {
162162
/* Prefer it */
163163
best_len = lengths[i];
164164
best_end_pos = end_pos;
165-
166-
/* Free and break */
167-
for (int j = 0; j < num_next; j++) {
168-
free(next_prefixes[j]);
169-
}
170-
free(next_prefixes);
171-
free(next_lengths);
172-
break;
173165
}
174166

167+
/* Free lookahead results */
168+
for (int j = 0; j < num_next; j++) {
169+
free(next_prefixes[j]);
170+
}
175171
free(next_prefixes);
176172
free(next_lengths);
173+
174+
if (num_next > 0) {
175+
/* We found a better match, stop looking */
176+
break;
177+
}
177178
}
178179
}
179180
}

0 commit comments

Comments
 (0)