Behaviour of the /submit/ endpoint #22

@antonalekseev

The behaviour of the https://archive.md/submit/ endpoint has changed recently. It now returns a WIP page in the Refresh header (https://archive.md/wip/Z6uhm) that shows capture progress, and it expects the client to retry until the page is captured and the proper memento URL (https://archive.md/Z6uhm) is returned via Location. As a result, archiveis.capture() always returns the URL of the WIP page.

This can be fixed either by retrying until the proper URL is available (and somehow handling errors if it never becomes available), or by just stripping /wip/ from the URL and hoping for the best.
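
For the second option, a minimal sketch of the stripping step might look like the following. `strip_wip` is a hypothetical helper name, not something archiveis provides, and it assumes the WIP URL always has the form `/wip/<id>` as in the example above.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_wip(url: str) -> str:
    """Turn https://archive.md/wip/Z6uhm into https://archive.md/Z6uhm.

    URLs whose path does not start with /wip/ are returned unchanged.
    Hypothetical helper; it does not check that the capture has finished.
    """
    parts = urlsplit(url)
    if parts.path.startswith("/wip/"):
        # Drop the "/wip" prefix and keep the slash that precedes the ID.
        parts = parts._replace(path=parts.path[len("/wip"):])
    return urlunsplit(parts)
```

This only rewrites the string, so the returned URL may not resolve yet if the capture is still in progress or has failed.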

>>> archive_url = archiveis.capture("https://example.com")
DEBUG:archiveis.api:Requesting https://archive.md/
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): archive.md:443
DEBUG:urllib3.connectionpool:https://archive.md:443 "GET / HTTP/1.1" 200 4997
DEBUG:archiveis.api:Unique identifier: QxbCURgTX9qqOlJsvO7Qnp6OpwoRYUx3YErVZz1eLx4aUht3+iuOB+6Ili4WD2Y2
DEBUG:archiveis.api:Requesting https://archive.md/submit/
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): archive.md:443
DEBUG:urllib3.connectionpool:https://archive.md:443 "POST /submit/ HTTP/1.1" 200 244
DEBUG:archiveis.api:Memento from Refresh header: https://archive.md/wip/Z6uhm
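
The retry approach could be sketched roughly as below. This is not archiveis code: `wait_for_memento` and the `fetch` hook are hypothetical names, and the sketch assumes that re-requesting the WIP URL eventually yields a response with a Location header pointing at the finished memento, which is my reading of the behaviour described above rather than something confirmed.

```python
import time
from typing import Callable, Mapping, Tuple

# fetch(url) -> (status_code, headers). Hypothetical hook, not part of archiveis.
Fetch = Callable[[str], Tuple[int, Mapping[str, str]]]


def wait_for_memento(wip_url: str, fetch: Fetch, attempts: int = 10,
                     delay: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll wip_url until a response carries a Location header.

    Raises TimeoutError if no Location header shows up within `attempts` tries.
    """
    for attempt in range(attempts):
        _status, headers = fetch(wip_url)
        # Assumption: the finished memento URL arrives in the Location header.
        location = headers.get("Location")
        if location:
            return location
        if attempt < attempts - 1:
            # Wait before the next try, but not after the last one.
            sleep(delay)
    raise TimeoutError(f"{wip_url} was not captured after {attempts} attempts")
```

With requests, `fetch` could wrap `requests.get(url, allow_redirects=False)` and return `(response.status_code, response.headers)`, so that the redirect is inspected rather than followed. The `attempts` and `delay` defaults are arbitrary placeholders.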
