feat: extend phonehome facts with pseudonymization #1034
DavidePrincipi merged 11 commits into main on Jan 26, 2026
Send the free-text label of cluster entities.
- Initialize a random seed for string pseudonymization. Save it into Redis for stable results.
- Enforce pseudonymization if a subscription is not active.
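The "generate once, then reuse" behavior of the seed can be sketched as follows. This is only an illustration under stated assumptions: `ensure_seed`, `pseudonymization_enforced`, the dict-like `store` standing in for the Redis client, and the 32-character hex seed format are all hypothetical; only the key name `cluster/anon_seed` comes from this pull request.

```python
import secrets

SEED_KEY = "cluster/anon_seed"  # key name documented by this PR


def ensure_seed(store):
    """Return the stored seed, creating it only if it is missing.

    `store` is a hypothetical dict-like stand-in for the Redis client.
    """
    seed = store.get(SEED_KEY)
    if seed is None:
        seed = secrets.token_hex(16)  # assumed format: 32 hex characters
        store[SEED_KEY] = seed
    return seed


def pseudonymization_enforced(subscription_active):
    """Without an active subscription, pseudonymization cannot be skipped."""
    return not subscription_active
```

Against a real Redis server, a conditional write such as redis-py's `set(name, value, nx=True)` would avoid two writers racing to create the seed; whether the PR does this is not stated.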
Return new ansible facts as attributes of node get-facts action response:
- fqdn (pseudonymized with new pseudo_domain() function)
- timezone
- uptime seconds
- kernel version
- Add a list of detailed user domain objects.
- Add module reference to user domains.
- Import agent.facts library under cluster and node get-facts actions
- Return user_domains facts from cluster/get-facts
- Clean up print-phonehome script
Pull request overview
This pull request extends the phonehome facts collection functionality with comprehensive pseudonymization support to protect sensitive data. The implementation ensures that systems without a subscription have their sensitive information (domain names, IP addresses, hostnames) pseudonymized using a stable seed, while maintaining data utility for telemetry purposes.
Changes:
- Added new `agent.facts` module implementing pseudonymization functions with MD5-based hashing
- Extended cluster, node, and module facts with additional fields including ui_name, timezone, kernel version, uptime, FQDN, and user domain information
- Introduced stable pseudonymization seed stored in Redis (cluster/anon_seed) to ensure consistent hashing across collection runs
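A minimal sketch of seeded, MD5-based pseudonymization, to make the stability property concrete. The function bodies are assumptions, not the code in `agent/facts.py`: `pseudo_string` is a hypothetical name, the 12-character truncation is arbitrary, and hashing each domain label separately is just one plausible way to implement `pseudo_domain()`.

```python
import hashlib


def pseudo_string(value, seed):
    """Return a stable pseudonym: MD5 over seed + value, hex-encoded."""
    data = (seed + value).encode("utf-8")
    digest = hashlib.md5(data, usedforsecurity=False).hexdigest()
    return digest[:12]  # truncation length is an arbitrary choice here


def pseudo_domain(fqdn, seed):
    """Pseudonymize each dot-separated label, keeping the label count.

    Assumption: labels are lowercased first so that case differences
    do not produce different pseudonyms.
    """
    labels = fqdn.lower().split(".")
    return ".".join(pseudo_string(label, seed) for label in labels)
```

The same input and seed always give the same pseudonym, so values stay comparable across collection runs, while a different seed gives unrelated output.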
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| core/imageroot/usr/local/agent/pypkg/agent/facts.py | New module implementing pseudonymization functions for strings, domains, and IP addresses |
| core/imageroot/var/lib/nethserver/cluster/bin/print-phonehome | Enhanced to collect ui_name, user_domains, FQDNs from Traefik, and module certification/update status |
| core/imageroot/var/lib/nethserver/cluster/actions/get-facts/50get | Added user domain facts collection with counters and update schedule information |
| core/imageroot/var/lib/nethserver/node/actions/get-facts/50get | Extended node facts with cluster_leader flag, FQDN, IP addresses, timezone, kernel version, and uptime |
| core/imageroot/usr/local/agent/pypkg/cluster/inventory.py | Added fact_user_domain_counters function to count LDAP users and groups per domain |
| docs/core/database.md | Documented new cluster/anon_seed Redis key for pseudonymization |
- add update_available flag to modules and nodes
- module certification level attribute (from repo metadata)
- report cluster update disabled status and reason
- report update schedule status flag
- Extract IPv4 and IPv6 from default node IP route.
- Implement a pseudo_ip() helper that returns a stable, random hash of the given IP address from the global anon_seed.
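One way a `pseudo_ip()` helper could behave is sketched below. The commit message does not say whether the real helper returns an address-shaped value or a plain digest; this sketch assumes the former, assumes MD5 as the hash, and takes the seed as an explicit parameter instead of reading a global.

```python
import hashlib
import ipaddress


def pseudo_ip(address, seed):
    """Map an IP address to a stable pseudonymous address of the same version."""
    ip = ipaddress.ip_address(address)
    # Hash the normalized form so equivalent spellings give the same result.
    data = (seed + ip.compressed).encode("utf-8")
    digest = hashlib.md5(data, usedforsecurity=False).digest()  # 16 bytes
    if ip.version == 4:
        return str(ipaddress.IPv4Address(int.from_bytes(digest[:4], "big")))
    return str(ipaddress.IPv6Address(int.from_bytes(digest, "big")))
```

Keeping the IP version in the output preserves one useful bit of telemetry (IPv4 versus IPv6) without revealing the original address.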
- Add the list of FQDNs to the application facts. FQDNs are obtained from Traefik instances, by looking at the name_module_map fact.
- Traefik knows an application name from set_route() and set_certificate() calls.
- Print a warning if the global cluster seed is unset
- Use a temporary self-generated fallback seed to calculate hashes
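The fallback behavior might look roughly like this. `get_seed`, the warning text, and the module-level cache are hypothetical; the sketch only assumes that the fallback seed is generated once per process and never persisted.

```python
import secrets
import sys

_fallback_seed = None  # generated lazily, lives only for this process


def get_seed(stored_seed):
    """Return the cluster seed, or a temporary one if the cluster seed is unset."""
    global _fallback_seed
    if stored_seed:
        return stored_seed
    if _fallback_seed is None:
        print("warning: cluster seed is unset, using a temporary seed",
              file=sys.stderr)
        _fallback_seed = secrets.token_hex(16)
    return _fallback_seed
```

Hashes computed with a temporary seed stay consistent within one run but are not comparable with hashes from other runs, which is why the warning matters.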
Extract the node creation date by looking at the birth date of some files created at NS8 installation time. Use the system's "stat" command, because it provides better filesystem birth date compatibility than Python 3.11.
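Reading a birth date through the `stat` command can be sketched as below. `birth_timestamp` and its `path` parameter are hypothetical, and which files the PR actually inspects is not stated here. The sketch assumes GNU coreutils `stat`, whose `%W` format prints the birth time in seconds since the epoch, or `0` when it is unknown.

```python
import subprocess


def birth_timestamp(path):
    """Return the file birth time in epoch seconds, or None if unavailable."""
    try:
        proc = subprocess.run(
            ["stat", "--format=%W", path],
            capture_output=True, text=True, check=True,
        )
        value = int(proc.stdout.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # stat is missing, the path does not exist, or the output is not a number
        return None
    return value if value > 0 else None  # 0 means "unknown"
```

Returning `None` for every failure mode lets the caller omit the fact rather than report a bogus date.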
Sensitive strings, domain and host names are pseudonymized with a random seed. The seed is generated once in cluster lifetime to ensure hash stability.
Refs NethServer/dev#7829