This repository was archived by the owner on Feb 16, 2023. It is now read-only.

Problem with adding documents other than PDF: Not Found for url: http://gotenberg:3000/convert/office #1455

@IscreaMan

Hi. I am having a problem adding .docx files, which need Gotenberg/Tika conversion.

The error is: Error while converting document to PDF: 404 Client Error: Not Found for url: http://gotenberg:3000/convert/office

Here's the log entry:

documents.parsers.ParseError: Error while converting document to PDF: 404 Client Error: Not Found for url: http://gotenberg:3000/convert/office

[2021-11-28 16:59:22,417] [INFO] [paperless.consumer] Consuming dodruk na dolnym marginesie.docx

[2021-11-28 16:59:22,418] [DEBUG] [paperless.consumer] Detected mime type: application/vnd.openxmlformats-officedocument.wordprocessingml.document

[2021-11-28 16:59:22,420] [DEBUG] [paperless.consumer] Parser: TikaDocumentParser

[2021-11-28 16:59:22,423] [DEBUG] [paperless.consumer] Parsing dodruk na dolnym marginesie.docx...

[2021-11-28 16:59:22,424] [INFO] [paperless.parsing.tika] Sending /tmp/paperless/paperless-upload-mw87l9ig to Tika server

[2021-11-28 16:59:22,473] [INFO] [paperless.parsing.tika] Converting /tmp/paperless/paperless-upload-mw87l9ig to PDF as /tmp/paperless/paperless-1_45e99n/convert.pdf

[2021-11-28 16:59:22,476] [DEBUG] [paperless.parsing.tika] Deleting directory /tmp/paperless/paperless-1_45e99n

[2021-11-28 16:59:22,479] [ERROR] [paperless.consumer] Error while consuming document dodruk na dolnym marginesie.docx: Error while converting document to PDF: 404 Client Error: Not Found for url: http://gotenberg:3000/convert/office

Traceback (most recent call last):
  File "/usr/src/paperless/src/paperless_tika/parsers.py", line 79, in convert_to_pdf
    response.raise_for_status()  # ensure we notice bad responses
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://gotenberg:3000/convert/office

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/paperless/src/documents/consumer.py", line 248, in try_consume_file
    document_parser.parse(self.path, mime_type, self.filename)
  File "/usr/src/paperless/src/paperless_tika/parsers.py", line 65, in parse
    self.archive_path = self.convert_to_pdf(document_path, file_name)
  File "/usr/src/paperless/src/paperless_tika/parsers.py", line 81, in convert_to_pdf
    raise ParseError(
documents.parsers.ParseError: Error while converting document to PDF: 404 Client Error: Not Found for url: http://gotenberg:3000/convert/office

Here are the running containers:

CONTAINER ID   IMAGE                              COMMAND                  CREATED         STATUS                   PORTS                    NAMES
a801212ef883   jonaswinkler/paperless-ng:latest   "/sbin/docker-entryp…"   8 minutes ago   Up 8 minutes (healthy)   0.0.0.0:8000->8000/tcp   paperless_webserver_1
3c3d4bae4b3b   apache/tika                        "/bin/sh -c 'exec ja…"   9 minutes ago   Up 9 minutes             9998/tcp                 paperless_tika_1
8a46d9b2f7b1   redis:6.0                          "docker-entrypoint.s…"   9 minutes ago   Up 9 minutes             6379/tcp                 paperless_broker_1
1cb833cf1d0a   postgres:13                        "docker-entrypoint.s…"   9 minutes ago   Up 9 minutes             5432/tcp                 paperless_db_1
02bae6dac385   thecodingmachine/gotenberg         "/usr/bin/tini -- go…"   9 minutes ago   Up 9 minutes             3000/tcp                 paperless_gotenberg_1
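
To narrow down whether the 404 comes from the Gotenberg container itself rather than from Paperless, the URL from the error message could be probed directly from a container on the same Docker network. The following is only a minimal sketch under assumptions: it uses the Python standard library instead of the `requests` call shown in the traceback, `probe_status` and `explain` are hypothetical helper names, and an empty POST is not a valid conversion request, so any status other than 404 only suggests that the route exists.

```python
import urllib.error
import urllib.request

# URL copied from the error message in the log.
GOTENBERG_URL = "http://gotenberg:3000/convert/office"


def probe_status(url: str, timeout: float = 5.0) -> int:
    """POST an empty body to `url` and return the HTTP status code.

    Hypothetical helper. Raises urllib.error.URLError if the host
    cannot be resolved or reached at all.
    """
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; keep the status code.
        return err.code


def explain(status: int) -> str:
    """Hypothetical helper: turn a status code into a rough hint."""
    if status == 404:
        return "route not found: this server does not expose the path"
    if 400 <= status < 500:
        return "route seems to exist but rejected the empty request"
    if status >= 500:
        return "route seems to exist but the server failed"
    return "route exists and accepted the request"


# Usage, from a container that can resolve the `gotenberg` hostname:
#     code = probe_status(GOTENBERG_URL)
#     print(code, explain(code))
```

If the probe also returns 404, the running Gotenberg image most likely does not serve `/convert/office`, and the image tag in use would be the next thing to check.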
