Remove file extension checking #646

@zhengliw

A PDF stored in a named temporary file has no file extension marking it as a PDF, yet its content is still valid for parsing. When the path is passed as a string, camelot refuses to parse the file because of the missing extension. I propose the following change:

diff --git a/camelot/handlers.py b/camelot/handlers.py
index 14e29dd..f4e2263 100644
--- a/camelot/handlers.py
+++ b/camelot/handlers.py
@@ -68,9 +68,6 @@ class PDFHandler:
             filepath = download_url(str(filepath))
         self.filepath: StrByteType | Path | str = filepath
 
-        if isinstance(filepath, str) and not filepath.lower().endswith(".pdf"):
-            raise NotImplementedError("File format not supported")
-
         if password is None:
             self.password = ""  # noqa: S105
         else:

As a workaround for now, I pass pathlib.Path("path_to_pdf") instead of the string itself. This bypasses the check, because the check only applies to str instances. Unless there is a good reason to keep the file extension check, I would like to open a PR with this change. Thank you in advance!
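
To make the behavior concrete, here is a minimal standalone sketch that reproduces only the two lines removed in the diff above. The helper name extension_check is hypothetical and is not part of camelot, and the temporary file holds just a placeholder header rather than a complete PDF. It illustrates why a str path without a ".pdf" suffix is rejected while the same path wrapped in pathlib.Path passes.

```python
import tempfile
from pathlib import Path


def extension_check(filepath):
    """Hypothetical helper: copies only the two lines the diff removes,
    not the rest of camelot's PDFHandler constructor."""
    if isinstance(filepath, str) and not filepath.lower().endswith(".pdf"):
        raise NotImplementedError("File format not supported")
    return filepath


# Named temporary file without any suffix; the bytes are a placeholder
# header only, not a complete PDF document.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"%PDF-1.4\n")
    tmp_path = tmp.name

# Passing the path as a plain string triggers the extension check.
try:
    extension_check(tmp_path)
    str_rejected = False
except NotImplementedError:
    str_rejected = True

# Wrapping the same path in pathlib.Path skips the isinstance(..., str) branch.
path_accepted = extension_check(Path(tmp_path)) == Path(tmp_path)

# Clean up the temporary file created above.
Path(tmp_path).unlink()
```

With camelot installed, the equivalent workaround would presumably be to hand pathlib.Path(tmp_path) to the library instead of tmp_path itself; that call is left out here because it requires a real PDF.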
