Conversation
8a345e5 to
a13fe6b
Compare
thesamesam
left a comment
There was a problem hiding this comment.
Just some initial comments. I'd consider rebasing and cleanly splitting the refactoring before adding the ebuild backend as well, like the changes to the Debian side & md5 parts
| AS_HELP_STRING([--with-ebuild],[Use the ebuild database as a trust source]), | ||
| use_ebuild=$withval,use_ebuild=no) | ||
|
|
||
| if test x$use_ebuild = xyes ; then |
There was a problem hiding this comment.
Use AS_IF. Bare if has issues with quoting and is no longer recommended in the autoconf manual.
a13fe6b to
dc272c2
Compare
ff73fd5 to
1796a0f
Compare
|
Here's the other work I'm hoping to land before the next release, but if we don't make it I can snapshot the code for Gentoo after it's merged. #277 has been raised to separate out the deb-backend changes; I'll rebase once that's merged so that there's less to review here. Would you like me to split out this commit too, or should we squash all of the ebuild-related mess in the middle down into a few logical commits? Edit: I have my TODO up the top, but aside from getting derailed by some 'not caused by my changes' ASAN stuff that I'm debugging in the background, I need to:
On my todo, but not part of this PR, is adding a Gentoo CI/CD stage - there's some work that I'll need to do to come up with something suitable for downstream use. I suppose a short-term alternative, as the ebuild backend adds no new dependencies, would be to tarball that up as test data and just extract it in the test container? I'll ping you again once I think the code is ready for review, but if you have any early feedback I'm willing to address it. :) Edit the second: It won't make this PR / release, but would you be receptive to a Meson build system as an alternative to (or replacement for) autotools? It's starting to get to me a bit, so I'm willing to put in the porting work. |
bffb243 to
e4eabce
Compare
|
PR #277 appears ready to merge. Do we want to do that first and then rework this patch (if needed)? I have not yet looked this patch over. |
7c256e3 to
2f3e330
Compare
Yes, please - that's a prereq for this PR. I've had some people better at C than I provide some feedback offline and I'll be addressing it over the next few days. |
b239ec1 to
08b0dfa
Compare
|
I've adapted the deb test to provide some basic assurances that the ebuild backend can read a vdb. I spent a bit of time today working on generating a psuedo-vdb that would enable us to test this without relying on a real Portage installation / database, but I didn't get there in the end. I'll keep plugging away at it in the background, but it may not make it in here. I'm reasonably happy with the current state, reading the VDB is fast enough, though I could stand to work out if there's anything else on a Gentoo system can be safely filtered to reduce the amount of md5ing that happens when populating the trust db. Happy to address any feedback! |
|
OK, the other patch was merged which appears to create a couple conflicts that need fixing. I'll see if I can look this PR over soon. |
3149256 to
65041a5
Compare
|
Was it really necessary to rename init.d to extra? That added a whole lot of churn. There is a lot of code in /usr/share. It's in the HFS docs as the place for purely interpreted libraries. Have you compiled and run this with address/undefined behavior sanitizers? |
|
Apologies for the delayed response, I got deployed to help out with the response to the local severe weather event - today is my first day back at the keyboard.
It seemed like the best way to separate the systemd init scripts, new openrc init stuff, and the application data / configuration that was all stored together under the 'init' folder. If you have a better way of organising it (or just one that you prefer) I'm happy to amend that commit.
In Gentoo it's 'architecture independent application data which is not modified at runtime.', but I do see some python, JS, and... C header files in amongst all the other binary files installed there on my development box. I'll have to come up with a better filter. I'll force push and drop that for the short term.
Yes. I actually have a branch where I'm tracking down some weirdness that I thought I caused but apparently existed in the code before I started touching it. I have a minimal broken example so if I can't work that out I'll log a bug: I am currently leaking 13 bytes allocated here that I haven't been able to track down, I'll try and look into that soon but I'm still recovering from working a week and a half of my vacation! |
|
Hello, sounds like you've been busy. I think it might be best to not change the name of the directory as part of this patch set. That can/should be done separately. The idea is to minimize the changes in case we need to bisect back to this. The original idea was that everything related to config and initialization would live in the init directory. Let's address this after this patch is merged. Also, I think we should schedule this patch for after the 1.3.3 release so that we can get other bug fixes out to the community. Maybe since this is a new platform being on-boarded, we call the next release 1.4. Regarding /usr/share, you might look at how we do this for rpm. We use a config file, fapolicyd-filter.conf. There is a library/filter.h file that shows the API that uses it. We had to resort to this because we found that we may be getting rid of files that the user really wants to trust. So, we gave them a way to take control. As for the address sanitizer issue, I think that should be filed separately and treated as a 1.3.3 release blocker. I can see an issue it my testing too. Thanks for spotting it. Issue #281 was opened for this. @radosroka , have you seen the above address sanitizer issue? |
Would you be happy for a patch/PR that moves the existing files into the new structure so that the 'gentoo PR' can just add the openrc files, or is there a strong preference for sticking with the existing approach?
No worries, less time pressure to look at filtering. 😄 Gentoo has a binary package host now, so it will be sane to add a containerised CI/CD option where it still wasn't late last year - I'll look into that too if I have time!
Will do. |
Is this from the daemon itself or CLI? |
|
It's from the daemon itself. |
And was it on this branch or main? |
|
Not forgotten, just had a few other projects eat up my time. I'll aim to get this finished off for late April / Early may; I have some leave coming up :) |
f513fbf to
396803f
Compare
|
OK I put fvwm3 to bed and finally have some time for this. Thanks for the pointer on the filter - I'm almost there with it's implementation - I seem to be having some difficulty getting anything to match the filters though, which is annoying. I'll try and track that down over the next little bit. All that I can tell for now is that the filter file is loaded (just a slightly modified fedora config because our kernel path is My "sophisticated debugging" techniques here seem to reveal that I also note that Otherwise we seem to be looking pretty good; after 15 minutes or so in permissive mode; it's even happy with the package manager (at least in terms of it removing things!): |
| typedef struct contents { | ||
| char *md5; | ||
| char *path; | ||
| } ebuildfiles; |
There was a problem hiding this comment.
Nit, but this describes a single file, so it should properly be ebuildfile - otherwise it is confusing if the object is an array of file objects or just a single file.
| #include "fapolicyd-backend.h" // for SRC_EBUILD, backend | ||
| #include "filter.h" // for filter_destroy, filter_init, filter_l... | ||
| #include "llist.h" // for list_empty, list_init | ||
| #include "md5-backend.h" // for add_file_to_backend_by_md5 | ||
| #include "message.h" // for msg |
There was a problem hiding this comment.
From other code in this repository, 8 column tabs are used; these comments don't align with the others when using 8 column tabs.
156e552 to
1d9c07d
Compare
This commit introduces a new backend for handling Gentoo ebuilds via the paortage var database (VDB), allowing fapolicyd to trust files managed by the Portage package manager. The implementation is modeled after the existing Debian (MD5) and RPM backends. Key changes: - Implemented `ebuild-backend` to ingest trust data from the ebuild VDB. - Integrated the new backend into the backend manager and database structures. - Added OpenRC init scripts (`init.d` and `conf.d`) to support Gentoo and other non-systemd environments. - Updated the build system to exclude OpenRC files when building RPMs. This enables fapolicyd to function on Gentoo-based systems by automatically trusting installed packages. Signed-off-by: Matt Jolly <kangie@gentoo.org>
If the main event loop and decision thread try to write out at the same time, this can result in interleaved output like this: ``` 12/03/25 10:28:29 [ DEBUG ]: Main poll interrupted 12/03/25 10:28:29 [ DEBUG12/03/25 10:28:29 [ DEBUG ]: Main poll interrupted ]: rule=6 dec=allow perm=open auid=1000 pid=15627 exe=/usr/bin/zsh : path=/usr/lib64/ld-linux-x86-64.so.2 ftype=application/x-sharedlib trust=1 . . . 12/03/25 10:28:29 [ DEBUG ]: Main poll interrupted 12/03/25 10:28:29 [ 12/03/25 10:28:29 [ DEBUG ]: rule=29 dec=allow perm=open auid=1000 pid=15627 exe=/usr/bin/env : path=/etc/ld.so.cache ftype=application/octet-stream trust=0 DEBUG ]: Main poll interrupted ``` Add a `pthread_mutex` to serialise access to `stderr` and ensure that output comes out as desired. Signed-off-by: Matt Jolly <kangie@gentoo.org>
My initial implementation of colour logging output unconditionally put ANSI escape sequences in log output. Improve this by respecting `NO_COLOR` and detecting if `stderr` is a tty, and only adding colours if that's not the case (i.e. interactive operation). Signed-off-by: Matt Jolly <kangie@gentoo.org>
1d9c07d to
ac778b0
Compare
Lower level APIs, like `md5` have been deprecated since OpenSSL 3.0. Use the modern high-level `EVP` (envelope) API instead. Signed-off-by: Matt Jolly <kangie@gentoo.org>
The `sanity_check_array` function in `src/library/object.c` was accepting a mutable `o_array *` but was being called with `const o_array *` from `object_access` and `object_find_file`. Since the function only reads from the array, the signature was updated to accept `const o_array *` to resolve the "discards 'const' qualifier" warning. Signed-off-by: Matt Jolly <kangie@gentoo.org>
When adding items to the subject or object arrays, the code previously unconditionally incremented the item count (`cnt`). If an entry already existed at the target index (e.g., overwriting an existing attribute), this caused the count to drift from the actual number of elements. This desynchronisation triggered the `sanity_check_array` assertion: `[ DEBUG ]: subject_add 2 - array corruption 3!=4` This patch modifies `subject_add` and `object_add` to check for existing entries. If an entry exists, it is properly freed and replaced without incrementing the count. The count is now only incremented when a new slot is filled. Signed-off-by: Matt Jolly <kangie@gentoo.org>
If we exit during early init (say by pressing `^C` during the ebuild backend's long init before the queue has been initialised), it could cause a crash. Modify `nudge_queue` to safely handle this case and properly exit early. Signed-off-by: Matt Jolly <kangie@gentoo.org>
ac778b0 to
986b4d1
Compare
This commit introduces a meson build system option for fapolicyd. It has the same functionality as the existing autotools implementation, and tests are wired up. Signed-off-by: Matt Jolly <kangie@gentoo.org>
986b4d1 to
5371858
Compare
|
OK it's been a hot minute. Revisited the code and fixed my issue with filters (oops, it was trivial). The ebuild backend is not quick to initialise (and will never be while we have to re-hash every file to get a The ebuild backend is self contained and does not rely on any package-manager dependencies. Some minimal VDB parsing tests have been included; since these tests should pass on any build (and currently are) I don't see too much value in adding additional Gentoo CI, but I'm happy to do so if you want "modern compiler" checks as I will be packaging this downstream and we tend to run bleeding-edge GCC - let me know if that's desirable. ASAN and Valgrind are happy, and I've included a few bugfixes that I encountered along the way: Logging:
Stability:
QA:
While reviewing the Additionally, while I was waiting for ASAN and Valgrind in some very long ebuild backend inits I implemented a quick meson port. It's fast, has feature-parity with the existing autotools build, has tests wired up, and provides a nice summary at the end of the configure stage. If you're interested in merging this I'm happy to update CI to cater for it, too. This is a big PR, and although I've done my best to isolate changes into their own commits I'm happy to split any change off into an individual PR for separate review, drop any changes that you're not interested in, or respond to any feedback on the code. As-is we're passing tests and I've been able to run I'm helping out with recovery from some severe storms we had on this side of the timecube, so my availability will be a little limited in the near future, but I'll do my best to respond to any feedback promptly. @stevegrubb - When you have time, let me know what changes you'd like. No rush given that it's been about a year since I last had time to progress this one :) |
|
This patch is trying to do a lot:
These should be separate patches so they can be reviewed and accepted on their own merit. The ebuild backend should probably be done with a new PR with an up to date commit message so that we have a merge with an appropriate introduction. I would suggest making the PR's in the common code/moving files around first so that changes I mention below don't cause this whole patch to have conflicts. Better to make smaller PRs to reduce merge surface area. There are going to be some patches landing soon that may conflict with this, so we should probably get through this asap. ebuild-backend - it would be preferable if you followed the project code style so that it has a common look and feel. It follows the linux kernel programming style. Also, it's nice that you followed the memfd approach, that is much faster and doesn't waste as much memory. On my fedora system with a little over 6000 rpms, startup time is around 3 seconds. Of course it doesn't have to rehash everything. We did a whole lot of performance work on the rpm backend to get it there. You might look at the commit history of that file to see what performance changes we had to make. In the rpm-backend, filter_check is called when processing a file in the package. If it is not a keeper, you do not need to malloc, hash, and save it on a list. Just move on to the next file. The point is to call the filter when the path is assembled and it's known to be a regular file to see if there's any reason to continue with it. Other notes, SHA256() is on the hot path. This patch replaces it with EVP_ functions which breaks the hashing up into multiple calls. Does this make it faster? Anything changing crypto should be it's own patch with the rationale for doing so in it's commit message. How to measure performance of fapolicyd is here: https://github.com/linux-application-whitelisting/fapolicyd-extras/tree/main/performance Updates in md5-backend.c could be its own patch The interleaving TTY output is a direct result of using multiple calls to fputs which is much faster than the original which was a monster, big fprintf. This should be it's own patch with a commit message explaining it. But I'm considering to defer a patch addressing this issue because I've got plans for next year to redesign a major part of the application such that there is more than one decision thread. That is going to involve doing a lot of isolation and synchronization work. I'd rather have this to be part of that. (Also, this code is on the critical path when using the debug mode.) So, maybe consider breaking this up into multiple patches with the mutex locking the last in the series. Colorization changes could go in on it's own merit. What is the code change in object for? Should be its own patch with its own commit message. What is the code change in subject for? Should be its own patch with its own commit message. |
|
Thanks for the feedback. I'm finally back from travel and relief work so I'll get this stuff split up over the next couple of days for review - the commits should already be independent; I anticipated that this would need to be broken up. WRT the SHA256/EVP stuff - it's the only non-deprecated way to access those hashing functions in modern OpenSSL, but we can discuss that on its own PR. :) |
|
ok, just for planning purposes...we are planning to do a release sometime in January. Probably mid to late just to roll up everything since the last release. (The magic lookup optimization is the main driver for the release.) |

I stumbled across fapolicyd and thought "that sounds pretty cool" and "I bet I could teach it to read the portage database".
After a month or so of "Fun" this PR is the result.
It's currently a WIP but I figured that good enough to request feedback and see if (once I write some tests and validate that I didn't break anything else...) we can sneak it in before the next release.
I am a C novice, so please let me know what can be improved!
Highlights:
init->extra(and categorised some files)TODOs:
libmdunescape_shellduring the refactoring (I didn't, just noticed something extant with asan enabled, will investigate further)