Commit 51c3310

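fix: Fix web content extraction when the character set does not match
--bug=1048607 --user=刘瑞斌 [github#1577] Importing a certain website into the web knowledge base raises an error https://www.tapd.cn/57709429/s/1623295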
1 parent 14ee62a commit 51c3310

1 file changed: +4 -1 lines changed

apps/common/util/fork.py

Lines changed: 4 additions & 1 deletion
@@ -142,7 +142,10 @@ def get_beautiful_soup(response):
     if len(charset_list) > 0:
         charset = charset_list[0]
         if charset != encoding:
-            html_content = response.content.decode(charset)
+            try:
+                html_content = response.content.decode(charset)
+            except Exception as e:
+                logging.getLogger("max_kb").error(f'{e}')
             return BeautifulSoup(html_content, "html.parser")
     return beautiful_soup
 
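
The change above guards the second decode: if the charset declared in the page's `<meta>` tag cannot decode the response bytes, the error is logged and the previously decoded `html_content` is kept. A minimal standalone sketch of that fallback pattern follows. The helper name `decode_with_fallback` and its parameters are hypothetical and not part of `fork.py`, and it catches only `LookupError` and `UnicodeDecodeError`, whereas the commit itself catches `Exception`.

```python
import logging


def decode_with_fallback(raw: bytes, declared_charset: str, fallback_text: str) -> str:
    """Decode raw bytes with the declared charset.

    Hypothetical helper for illustration only. If the charset name is
    unknown (LookupError) or the bytes are not valid in that charset
    (UnicodeDecodeError), log the error and return the text that was
    already decoded with the originally detected encoding.
    """
    try:
        return raw.decode(declared_charset)
    except (LookupError, UnicodeDecodeError) as e:
        # Same logger name as in the diff; the message is just the exception text.
        logging.getLogger("max_kb").error(f"{e}")
        return fallback_text


# Bytes that are valid GBK but not valid UTF-8.
gbk_bytes = "中文".encode("gbk")

ok = decode_with_fallback(gbk_bytes, "gbk", fallback_text="<already decoded>")
mismatch = decode_with_fallback(gbk_bytes, "utf-8", fallback_text="<already decoded>")
unknown = decode_with_fallback(b"abc", "no-such-charset", fallback_text="<already decoded>")
```

Here `ok` holds the correctly decoded text, while `mismatch` and `unknown` both fall back to the supplied text instead of raising, which mirrors how the patched function keeps going with the first decode result.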

0 commit comments
