feat(refactor): Ansible Role for Improved Code Quality and Maintainability #275

Open
Tealk wants to merge 5 commits into paperless-ngx:main from RollenspielMonster:fix/linting

Conversation

@Tealk
Contributor

@Tealk Tealk commented Dec 19, 2025

Overview

This pull request comprehensively refactors the Paperless-ngx Ansible role to improve code quality, readability, and maintainability. The changes focus on better organization, consistent formatting, and modernized Ansible syntax patterns.

This PR already includes the changes from #271, #272, and #273.

Key Changes

1. Default Variables Reorganization (defaults/main.yml)

  • Grouped configuration variables into logical sections with clear headers
  • Added comprehensive section comments for easier navigation
  • Improved variable ordering and documentation
  • Enhanced readability for future maintenance

2. Code Quality Improvements

  • Refactored all task files for consistency and clarity
  • Modernized Ansible syntax (replaced deprecated patterns)
  • Simplified conditional logic and loop structures
  • Added standardized YAML file terminators (...)

3. Specific Enhancements

  • jbig2enc.yml: Improved version checking and installation logic
  • python.yml: Better dependency management and error handling
  • repo_packages.yml: Cleaner apt package declarations
  • systemd_services.yml: Simplified service configuration loops
  • configuration.yml: Extracted common service options
  • release_files.yml: Streamlined file operations and permissions handling

4. Ansible Best Practices

  • Removed redundant tag patterns (simplified apply/tags structure)
  • Improved task naming for better clarity
  • Enhanced check mode handling
  • Better use of loops and conditionals

@Tealk Tealk changed the title from "Refactor Ansible Role for Improved Code Quality and Maintainability" to "feat(refactor): Ansible Role for Improved Code Quality and Maintainability" Dec 19, 2025
@github-actions github-actions bot added the enhancement New feature or request label Dec 19, 2025
@stevenengland
Collaborator

Hi @Tealk, thanks for your huge effort. Would you mind submitting each logical set of changes as a separate PR so that each can be reviewed and documented on its own? This is a huge PR that needs some dissection.

From a quick overview: Personally, I am against the logical ordering of variables. In the past it caused a lot of trouble to keep the blocks of vars consistent and cleanly separated. I'd like to stick with alphabetical order (which I use for the variables in multiple files), since that will always be valid ;) Also the logic comes from the ansible main repo, in this repo it should only be of interest if a var is already supported.

@Tealk
Contributor Author

Tealk commented Dec 25, 2025

Hi,

I don't understand what you mean by:

Also the logic comes from the ansible main repo, in this repo it should only be of interest if a var is already supported.

What do you mean by splitting it up?
These are all the changes I made so it runs 100% on my system and doesn't crash midway due to formatting errors or system discrepancies.

@stevenengland
Collaborator

Hi @Tealk, sorry for the confusion. What I meant was that the meaning of each variable, and the broader scope (database etc.) it belongs to, is given by the help page of the paperless-ngx main repo. This, paired with the experience I gathered trying to keep such logical blocks/scopes for variables consistent over time in these ansible files, makes me not a fan of reintroducing it :-) I would like to keep it alphabetically sorted.

Given this example and my wish to keep traceability of which PR introduced which particular change, I would like to ask you to provide a separate PR for each attempt at making the repo better. The NLTK PR is clearly a fix. Others are improvements to dedicated portions of the code. Some might be worth discussing, others will be merged easily (like the python fix you provided). Do you think this is possible?

@Tealk
Contributor Author

Tealk commented Dec 26, 2025

I currently wouldn't know how to break this down. I first ran a strict linting process over it and fixed all errors and warnings, and adjusted dependencies.
During test runs, I then noticed problems which I also fixed. Since I only executed the original role once and then immediately started with the improvements, I would have to roll back every change and test multiple times.

I recreated the config because I couldn't figure it out with the little info in the readme and I had to search through the code to see what each variable actually does.

You're welcome to take out what you like and leave what you don't. I'm happy to test again, but currently the version in this repo isn't stable enough for a production environment for me.

@stevenengland
Collaborator

Yes, I completely feel you. But the PR is huge and combines a lot of different improvements.

If you say it is not stable enough for production, can you elaborate a bit more on this? I merged the two other PRs you submitted. That would leave the NLTK thing as a third that is needed. For that one I will see why the test suites did not report this issue and will adapt. Besides that, is anything else needed for production usage?

Regarding the variables: Would it have helped you if I referenced the paperless manual page for quick lookups? Because there it is clearly stated what each var does. The ones specific to the role (which are few) are documented in the Readme. Or am I missing a point here?

@Tealk
Contributor Author

Tealk commented Dec 26, 2025

Unfortunately, I can't tell you exactly which tasks caused problems anymore, as it's been too long. There were a few scenarios that didn't run cleanly and where manual intervention was needed to get the playbook running again.
They're not common errors, but I wanted to make it resilient against unusual cases as well.

I only adjusted the variables because I had cross-checked them while making adjustments and it was too confusing for me; especially since only the first 50% or so were sorted alphabetically.

I can't understand why you had problems with the sorting there, I also have some larger config files that I sort exactly the same way and haven't had that problem before.
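
The alphabetical ordering discussed here could be checked mechanically instead of by eye. The following is a minimal sketch, not part of this PR or the role: it assumes the variables in defaults/main.yml are unindented `name: value` lines, and the function names and sample variable names are hypothetical.

```python
import re

# Assumption: a top-level variable is an unindented "name:" line.
# Indented (nested) keys, comments, and the "---" / "..." markers are skipped.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:")


def top_level_keys(yaml_text):
    """Collect top-level key names in the order they appear."""
    keys = []
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def first_unsorted_key(yaml_text):
    """Return the first key that sorts before its predecessor, or None."""
    keys = top_level_keys(yaml_text)
    for previous, current in zip(keys, keys[1:]):
        if current < previous:
            return current
    return None


# Hypothetical variable names, for illustration only.
sample = """\
---
paperless_ngx_conf_admin_user: admin
paperless_ngx_conf_dbhost: localhost
paperless_ngx_conf_consumer_polling: 0
...
"""

print(first_unsorted_key(sample))  # prints paperless_ngx_conf_consumer_polling
```

A check like this only reports where the order first breaks; it does not decide between alphabetical and grouped ordering.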

@stevenengland
Collaborator

I can't understand why you had problems with the sorting there, I also have some larger config files that I sort exactly the same way and haven't had that problem before.

I wouldn't call it problems in the first place. It is a recurring effort, because if I introduce an ordering system, then it must be consistent with the categories here: https://docs.paperless-ngx.com/configuration/

And these switched at least two times over the years on a bigger scale. That's why I gave up following them. And I think most people just look up the vars there.

@stevenengland
Collaborator

Unfortunately, I can't tell you exactly which tasks caused problems anymore, as it's been too long. There were a few scenarios that didn't run cleanly and where manual intervention was needed to get the playbook running again.
They're not common errors, but I wanted to make it resilient against unusual cases as well.

An even bigger reason to keep track of each single one ;)
I am very sorry, but I cannot foresee what side effects each of your hundreds of changed lines might cause somewhere else at some point.

For the NLTK thing: Seems like it really boils down to switching the package from punkt to punkt_tab? Or is uninstall of the old needed? Anything else?

@Tealk
Contributor Author

Tealk commented Dec 26, 2025

For the NLTK thing: Seems like it really boils down to switching the package from punkt to punkt_tab? Or is uninstall of the old needed? Anything else?

Since I've deleted the directories quite often, I can't answer that at all right now, but I believe it ignored the old package.

ls -lah /usr/share/nltk_data/
total 20K
drwxr-x---   5 paperlessngx paperlessngx 4.0K Dec 11 18:09 .
drwxr-xr-x 120 root         root         4.0K Dec 19 12:51 ..
drwxrwxr-x   3 paperlessngx paperlessngx 4.0K Dec 11 18:09 corpora
drwxrwxr-x   2 paperlessngx paperlessngx 4.0K Dec 11 18:09 stemmers
drwxrwxr-x   4 paperlessngx paperlessngx 4.0K Dec 11 19:58 tokenizers
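
The listing above shows the tokenizers directory but not what is inside it. A minimal sketch for checking which tokenizer packages are present, assuming the usual NLTK data layout in which each package unpacks to its own subdirectory (tokenizers/punkt, tokenizers/punkt_tab); the function names are hypothetical and not part of the role:

```python
from pathlib import Path


def installed_tokenizers(nltk_data_dir):
    """List the subdirectories of <nltk_data_dir>/tokenizers, sorted by name."""
    tokenizers = Path(nltk_data_dir) / "tokenizers"
    if not tokenizers.is_dir():
        return []
    return sorted(entry.name for entry in tokenizers.iterdir() if entry.is_dir())


def punkt_status(nltk_data_dir):
    """Report whether the old (punkt) and new (punkt_tab) packages are present."""
    found = installed_tokenizers(nltk_data_dir)
    return {"punkt": "punkt" in found, "punkt_tab": "punkt_tab" in found}


# Example call; run it as a user that can read the directory:
# print(punkt_status("/usr/share/nltk_data"))
```

If both entries come back true, the old package is still on disk next to the new one. Whether it has to be removed is a separate question that this check does not answer.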

Tealk added a commit to Tealk/paperless-ngx_ansible that referenced this pull request Jan 13, 2026
Tealk added a commit to Tealk/paperless-ngx_ansible that referenced this pull request Jan 13, 2026