Avoid spidering PDF URLs - causes crash or incomplete record #29

@wswtizer

Problem: I managed to crash the devCenter Uploader while trying to add a link to a PDF with the 'Create New Document' tab. (The 'Create provisional document' tab behaves differently: it creates a record but doesn't populate the title, so I can't access that record via the UI. Issue #28 has been opened for that.)

The problem is related to the fact that the tool tries to crawl (spider) the PDF URL, but there is no data. Every time a document is edited with a blank body, it tries to fetch the content again.

Glynn suggested that the URL could be 'pre-fetched' to get its content type, and if that is not text/html, the crawler could simply be skipped.
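A rough sketch of what that pre-fetch check might look like, in Python with only the standard library. This is not taken from the Uploader code base: the function names `is_html` and `should_crawl` are hypothetical, and it assumes the target server answers HEAD requests (some do not, in which case a GET would be needed). Treating a failed request as "do not crawl" is also an assumption.

```python
from urllib.request import Request, urlopen


def is_html(content_type_header):
    """Return True only when the header's media type is text/html."""
    if not content_type_header:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


def should_crawl(url, timeout=10):
    """Hypothetical pre-fetch: send a HEAD request and check Content-Type.

    Returns False when the request fails, so the crawler is skipped
    rather than run against a URL that cannot be reached.
    """
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return is_html(response.headers.get("Content-Type"))
    except OSError:
        # URLError and HTTPError are both subclasses of OSError.
        return False
```

With something like this in place, the record could be saved with only the user-supplied title and URL whenever `should_crawl(url)` returns False, instead of handing the PDF to the spider.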

Suggested workaround: for a PDF URL, make sure the record has a title and put some words in the body field when creating it with the 'Create New Document' tab. This avoids the attempt to fetch the content again.
