This repository was archived by the owner on Jun 20, 2023. It is now read-only.

@tohoku tohoku commented Jan 26, 2022

#142 reports the following error, which is raised intermittently:

[ERROR] UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 276: invalid continuation byte
Traceback (most recent call last):
  File "/var/task/scan.py", line 236, in lambda_handler
    scan_result, scan_signature = clamav.scan_file(file_path)
  File "/var/task/clamav.py", line 195, in scan_file
    output = av_proc.communicate()[0].decode()

Challenge: av_proc.communicate() returns the output of the clamscan call as bytes, but that output is sometimes encoded in a charset other than UTF-8, which is the default that decode() assumes.

As a workaround, this PR will use chardet to try to determine the charset and decode accordingly.
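For illustration only, the approach described above might look roughly like the sketch below. This is not the PR's actual diff: the helper name `decode_scan_output` is hypothetical, and the choices to try UTF-8 first, to treat chardet as an optional import, and to fall back to `errors="replace"` are assumptions made so the example runs on its own.

```python
try:
    import chardet  # third-party package; treated as optional in this sketch
except ImportError:
    chardet = None


def decode_scan_output(raw: bytes) -> str:
    """Decode subprocess output, guessing the charset when it is not UTF-8.

    Hypothetical helper, not part of the original clamav.py.
    """
    # Most output is plain UTF-8 (or ASCII), so try that first.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Ask chardet for its best guess; "encoding" may be None.
    if chardet is not None:
        guess = chardet.detect(raw)["encoding"]
        if guess:
            try:
                return raw.decode(guess)
            except (UnicodeDecodeError, LookupError):
                pass  # wrong guess or unknown codec name

    # Last resort (an assumption of this sketch): never raise,
    # replace undecodable bytes with U+FFFD instead.
    return raw.decode("utf-8", errors="replace")
```

With a helper like this, the failing line in scan_file could become `output = decode_scan_output(av_proc.communicate()[0])`. Charset detection is a heuristic, so a guessed encoding can still render some characters incorrectly; the point is only that the Lambda no longer aborts on a single undecodable byte.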

Issue #183 mentions removing the -a flag from the clamscan call as another possible workaround, but I have not tried it.

CLAassistant commented Jan 26, 2022

CLA assistant check
All committers have signed the CLA.

@tohoku tohoku marked this pull request as ready for review January 26, 2022 04:42
@tohoku tohoku closed this by deleting the head repository Jan 10, 2026