Commit 708ca73

added chardet encoding detection in filter_check (#187)
* added chardet to detect encoding before decoding
* added testing data under zayo10*, ran poetry commands for proper dependency solve
* add types-chardet to poetry deps
* import order changed for chardet
* adds types-chardet to correct group, testing locally completes
1 parent af823b8 commit 708ca73

6 files changed: +497 -66 lines changed

circuit_maintenance_parser/provider.py

Lines changed: 3 additions & 1 deletion
@@ -4,6 +4,7 @@
 import traceback

 from typing import Iterable, List, Dict
+import chardet

 from pydantic import BaseModel

@@ -95,7 +96,8 @@ def filter_check(filter_dict: Dict, data: NotificationData, filter_type: str) ->
 if filter_data_type not in filter_dict:
     continue

-data_part_content = data_part.content.decode().replace("\r", "").replace("\n", "")
+data_part_encoding = chardet.detect(data_part.content).get("encoding", "utf-8")
+data_part_content = data_part.content.decode(data_part_encoding).replace("\r", "").replace("\n", "")
 if any(re.search(filter_re, data_part_content) for filter_re in filter_dict[filter_data_type]):
     logger.debug("Matching %s filter expression for %s.", filter_type, data_part_content)
     return True
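
One thing worth checking in the detect-then-decode step above: `chardet.detect` always returns a dict with an `"encoding"` key, and the value can be `None` (for example on empty input). In that case the `"utf-8"` default passed to `.get()` would not apply, and `bytes.decode(None)` raises a `TypeError`. The following is a minimal sketch of one possible way to guard against that, not the change made in this commit. The helper name `decode_with_fallback` and its `detect` parameter are hypothetical; the detector is passed in so the example runs without third-party packages.

```python
from typing import Any, Callable, Dict


def decode_with_fallback(
    content: bytes,
    detect: Callable[[bytes], Dict[str, Any]],
    default: str = "utf-8",
) -> str:
    """Decode bytes with a detected encoding, falling back to `default`.

    Hypothetical helper for illustration; `detect` is expected to behave
    like `chardet.detect`, returning a dict with an "encoding" entry.
    """
    # `or default` covers both a missing key and an explicit None value.
    encoding = detect(content).get("encoding") or default
    try:
        return content.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: decode leniently instead of raising.
        return content.decode(default, errors="replace")


if __name__ == "__main__":
    # Stub detector standing in for chardet.detect when nothing is detected.
    print(repr(decode_with_fallback(b"", lambda _: {"encoding": None})))
```

With chardet installed, the call inside `filter_check` could then look like `decode_with_fallback(data_part.content, chardet.detect)`, followed by the existing `.replace("\r", "").replace("\n", "")` chain.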
