Replies: 16 comments 4 replies
-
You could set up a post-consume script: https://paperless-ng.readthedocs.io/en/latest/advanced_usage.html#post-consumption-script This receives the ID of the newly created document. That's all you need to connect to your database, get the document's content, do some regex matching on the content, and update the document. However, you'd have to wait for the next version for that. Currently, the script is called during a database transaction, not after it, so the document won't be in the database yet by the time the script runs. (It has been that way for years in paperless; I wonder why no one noticed.)
-
Thank you (again ;-)). The link you posted points to what I meant by post-hook. But do I have to handle the DB connection myself, with no support from paperless-ng? I have basic knowledge of Python (which I'd prefer for this kind of solution), but is there no paperless Django module I can use for this? I only need to send one SQL statement. Any hint is appreciated.
-
There's no support in paperless for this. If you want to modify the source directly, I can give you some directions on where to make these changes.
-
Can you point me to where I can find the database settings (URL, database, user, password) in the Docker container? When I connect to the database, I think it is better to use this information instead of hardcoding it.
-
These settings should be available as environment variables, as specified in the docker-compose.env file or in the environment section of the docker-compose.yml file.
-
That's what I hoped for, but unfortunately they are not there (only PAPERLESS_DBHOST is set):
-
Oh, right.
Database name, username and password all have default values, see https://github.com/jonaswinkler/paperless-ng/blob/master/src/paperless/settings.py#L253
-
So, when an environment variable is not set, the default applies; otherwise I will find the value in the given environment variable.
-
Yes.
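The lookup described above (use the environment variable if present, otherwise fall back to the default) could be sketched in Python as follows. This is only an illustration: the variable names other than PAPERLESS_DBHOST and the fallback value "paperless" are assumptions and should be checked against the linked settings.py for the version in use.

```python
import os

# Assumed variable names and fallback values; verify them against the
# settings.py of the paperless-ng version you are running.
ASSUMED_DEFAULTS = {
    "PAPERLESS_DBNAME": "paperless",
    "PAPERLESS_DBUSER": "paperless",
    "PAPERLESS_DBPASS": "paperless",
}


def db_settings(env=os.environ):
    """Return connection settings, preferring the environment over defaults."""
    host = env.get("PAPERLESS_DBHOST")
    if not host:
        raise RuntimeError("PAPERLESS_DBHOST is not set, so no database host is known")
    return {
        "host": host,
        "dbname": env.get("PAPERLESS_DBNAME", ASSUMED_DEFAULTS["PAPERLESS_DBNAME"]),
        "user": env.get("PAPERLESS_DBUSER", ASSUMED_DEFAULTS["PAPERLESS_DBUSER"]),
        "password": env.get("PAPERLESS_DBPASS", ASSUMED_DEFAULTS["PAPERLESS_DBPASS"]),
    }
```

The returned dictionary uses key names that a PostgreSQL driver such as psycopg2 accepts as keyword arguments to its connect function, so nothing needs to be hardcoded in the hook script.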
-
I am finished with the post-hook script. Unfortunately, the script is either not called or there are permission problems.
Questions:
-
OK, worked it out. The filesystem where the script resides was mounted with noexec! That is why the bash script got "Permission denied". I bound a new folder, mounted with exec, into the Docker container, and now the script is executed. Information about this issue can be found only in the Docker logs. Now only my questions 3 and 4 remain.
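A quick way to tell a noexec mount apart from a missing executable bit is to inspect the mount flags of the filesystem that holds the script. The following is a minimal sketch, assuming a Linux container where Python exposes `os.ST_NOEXEC`; the function name `exec_problem` is made up for this example.

```python
import os


def exec_problem(path):
    """Return a short description of why `path` cannot be executed, or None."""
    if not os.path.isfile(path):
        return "file does not exist"
    # f_flag holds the mount flags of the filesystem containing `path`.
    # os.ST_NOEXEC is only defined on some platforms (e.g. Linux), hence getattr.
    flags = os.statvfs(path).f_flag
    if flags & getattr(os, "ST_NOEXEC", 0):
        return "filesystem is mounted noexec"
    if not os.access(path, os.X_OK):
        return "file is not executable for the current user"
    return None
```

Run inside the container as the same user that runs the consumer, this points directly at the mount option instead of leaving only a generic "Permission denied" in the Docker logs.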
-
You can also bind mount single files into the container if you do not wish to put the script into the consumption directory.
-
Finished. Everything works. My first post-hook script. It tries to extract a 6-digit number from the title and content of the document. If it finds a matching number that is not already used by any other document, it writes it to the ASN field in the database (currently only PostgreSQL is supported). If somebody is interested, I can provide the file. Maybe a repository can be found to host such hook scripts. Thx for your support.
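The "only write the ASN if no other document uses it" step described above might look roughly like this. It is a sketch, not the attached script: the table name `documents_document` and the column `archive_serial_number` are assumptions about the paperless schema, and the demo uses an in-memory SQLite database only so that it runs without a PostgreSQL server (psycopg2 uses `%s` placeholders, sqlite3 uses `?`).

```python
import sqlite3


def set_asn_if_free(conn, doc_id, asn, placeholder="%s"):
    """Write `asn` to the document unless another document already uses it.

    `conn` is any DB-API 2.0 connection. Table and column names are
    assumptions about the paperless schema, not confirmed by the thread.
    """
    p = placeholder
    cur = conn.cursor()
    cur.execute(
        f"SELECT 1 FROM documents_document "
        f"WHERE archive_serial_number = {p} AND id <> {p}",
        (asn, doc_id),
    )
    if cur.fetchone() is not None:
        return False  # number already taken, leave the document untouched
    cur.execute(
        f"UPDATE documents_document SET archive_serial_number = {p} WHERE id = {p}",
        (asn, doc_id),
    )
    conn.commit()
    return True


# Demo against a throwaway SQLite table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents_document "
    "(id INTEGER PRIMARY KEY, archive_serial_number INTEGER)"
)
conn.executemany(
    "INSERT INTO documents_document VALUES (?, ?)", [(1, 1234), (2, None)]
)
print(set_asn_if_free(conn, 2, 1234, placeholder="?"))  # False: used by document 1
print(set_asn_if_free(conn, 2, 1235, placeholder="?"))  # True: number was free
```

Passing the values as query parameters rather than formatting them into the SQL string keeps OCR output, which is untrusted text, out of the statement itself.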
-
asn.py.zip
Sure, no problem. Please adjust the database name, database user and database password for your environment. The script still works for me, with just a few false positives when OCR is not working 100%. But that happens rarely. Best,
-
Hi, has anyone in this thread (@e-patrick, @andbez) experienced issues with Paperless not being able to run the post-consume script in a Docker container? I've been struggling for a long time with a simple script that calls Home Assistant (using curl) after consumption, but I keep getting the message "Configured post-consume script "/usr/src/paperless/post_consume.sh" does not exist.", which is odd since I can ls, cat and execute that particular script inside the container. The script is mounted using a bind mount and is passed to Paperless using an environment variable with an absolute path.
Paperless log:
Docker log:
Test using bash inside container:
Can anyone understand why the script can't be executed by Paperless?
-
@e-patrick and @andbez - thanks for helping out! I solved the problem - it wasn't the hashbang, permissions or noexec... It turned out that the quotes in the value of the ENV variable were the culprit.
This didn't work:
But this worked:
(Fun fact: I realized this after testing
The pre-consumption hook example in the docs uses quotes in the environment, and I assumed the .env file was supposed to have the same syntax. I'll try to submit a change to the docs to make this part clearer.
@jonaswinkler Is there any particular reason why document type isn't available as an argument to the post-consumption script?
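The quoting problem reported above can be illustrated in a few lines: if the quotes from the env file end up as part of the variable's value, the path no longer names an existing file. The helper below is a hypothetical diagnostic for this illustration, not something Paperless does itself.

```python
def strip_env_quotes(value):
    """Remove one pair of matching surrounding quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


# Value as it would arrive if the env file line was written with quotes
# and the quotes were passed through literally.
raw = '"/usr/src/paperless/post_consume.sh"'
cleaned = strip_env_quotes(raw)

# The two strings differ, so a file-existence check on `raw` looks for a
# path whose name literally starts and ends with a quote character.
print(repr(raw))
print(repr(cleaned))
```

Printing the value with `repr` from inside the container is a cheap way to spot stray quotes before digging into permissions or mount options.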
-
Every scanned document that is archived on paper gets a number (always 6 digits; the first 2 digits will be 0 for a long time ;-)). This number ends up in the file content after consumption and OCR by Paperless. Now I would like to extract the number with the regex 00\d{4} and set it as the ASN of the consumed document. I did not find any solution using the manage.py script.
Any idea how to handle this? Is it possible to use a self-developed SQL statement?
Thx.
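For reference, the extraction step with the regex from the question could be sketched as below. The lookarounds are an addition that is not in the original pattern: they keep the match from landing inside a longer run of digits, such as a phone number. The function name `find_asn` is made up for this example.

```python
import re

# 00 followed by four digits, not embedded in a longer digit sequence.
ASN_PATTERN = re.compile(r"(?<!\d)00\d{4}(?!\d)")


def find_asn(text):
    """Return the first 6-digit number starting with 00 as an int, or None."""
    match = ASN_PATTERN.search(text)
    return int(match.group()) if match else None


print(find_asn("Invoice no. 4711, archive number 001234"))  # 1234
print(find_asn("Phone: 1001234567"))                        # None
```

Converting the match to an integer drops the leading zeros, which is convenient if the ASN is stored in a numeric column.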