Skip to content

Better handling of empty result from curl #42

@bac0id

Description

@bac0id

curl from Internet Archive frequently returns empty, which results in meaningless empty lines being printed on the log. In most cases the output is like this:

2025-02-18 12:16:43 [Status request failed] https://example.com
(an empty line here)
2025-02-18 12:16:53 [Request failed] https://example.com
(another empty line here)

There lines are for showing info/error from IA. Can we print something other than empty lines here? Related codes:

if [[ -z "$message" ]]; then
if [[ "$request" =~ "429 Too Many Requests" ]] || [[ "$request" == "" ]]; then
echo "$request"

request=$(curl "${curl_args[@]}" -s -m 60 "https://web.archive.org/save/status/$job_id")
status=$(echo "$request" | grep -Eo '"status":"([^"\\]|\\["\\])*"' | head -1)
if [[ -z "$status" ]]; then
echo "$(date -u '+%Y-%m-%d %H:%M:%S') [Status request failed] $1"
if [[ "$request" =~ "429 Too Many Requests" ]] || [[ "$request" == "" ]]; then
echo "$request"

Extra info:

I did some research. There empty response seems comes from SSL Error before/during HTTP request. I tried sending frequent HTTP requests to IA by Python and kept getting this error:

HTTPSConnectionPool(host='web.archive.org', port=443): Max retries exceeded with url: /save/status/user (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')))

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions