extractor.py should ignore already extracted PDFs  #38

@skorasaurus

Currently, the extractor runs on every qualifying PDF, including ones that have already been extracted.

Since we add newly released files incrementally, re-extracting the existing ones on every run is a waste of time.
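
One possible approach, as a rough sketch only: before extracting, check whether an output file already exists for each PDF and skip it if so. Everything here is an assumption rather than how extractor.py currently works: the helper name `pdfs_needing_extraction`, the one-output-file-per-PDF layout, the `.txt` suffix, and the use of modification times to catch PDFs that were replaced after extraction. It also does not model whatever check decides that a PDF is "qualified".

```python
from pathlib import Path


def pdfs_needing_extraction(pdf_dir, out_dir, out_suffix=".txt"):
    """Return the PDFs in pdf_dir that have no up-to-date output in out_dir.

    Hypothetical helper: assumes each PDF maps to one output file named
    <pdf stem><out_suffix> inside out_dir.
    """
    pdf_dir, out_dir = Path(pdf_dir), Path(out_dir)
    pending = []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        target = out_dir / (pdf.stem + out_suffix)
        # Extract when the output is missing, or when the PDF was
        # modified after the output was written.
        if not target.exists() or target.stat().st_mtime < pdf.stat().st_mtime:
            pending.append(pdf)
    return pending
```

The extractor could then loop over the returned list instead of over every PDF. A flag to force a full re-extraction would probably still be useful for cases where the extraction logic itself changes.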
