Skip to content

Conversation

@manoharan-nexthop
Copy link

@manoharan-nexthop manoharan-nexthop commented Sep 8, 2025

This implementation addresses the User Management HLD requirements for centralized user administration in SONiC. sonic-net/SONiC#2018

1. User Management Daemon (userd):

  • Implemented C++ daemon using SWSS framework for CONFIG_DB integration
  • Added user lifecycle management (create/update/delete/enable/disable)
  • Implemented role-based group assignment (administrator, operator roles)
  • Added SSH key management with proper file permissions and ownership
  • Used posix_spawn() for secure command execution without shell interpretation
  • Integrated PAM faillock configuration using Jinja2 templates

2. Build System Integration:

  • Added CMakeLists.txt for C++ compilation with SWSS dependencies
  • Created debian packaging with proper control files and dependencies
  • Added systemd service configuration for userd daemon
  • Integrated Makefile for building and installation

3. Security Features:

  • Secure password handling using system's native hashing methods
  • Role-based group assignments with predefined security groups
  • PAM faillock integration for login attempt limiting

@mssonicbld
Copy link

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@manoharan-nexthop manoharan-nexthop changed the title [userd] Add user management daemon feat: Add user management daemon Sep 8, 2025
%:
dh $@

override_dh_auto_build:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe these changes have no effect?

You override the default behaviour (dh_auto_build) and then invoke the default behaviour (dh_auto_build).

@mssonicbld
Copy link

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines +494 to +651
// Add user to role-specific groups
for (const std::string& group : it->second) {
if (!is_user_in_group(username, group)) {
std::vector<std::string> cmd = {"/usr/sbin/usermod", "-a", "-G", group, username};
SystemCommand::execute(cmd); // Don't fail if group doesn't exist
SWSS_LOG_DEBUG("Added user %s to group %s", username.c_str(), group.c_str());
} else {
SWSS_LOG_DEBUG("User %s already in group %s", username.c_str(), group.c_str());
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we ever need to remove users from a group? For example if one is downgraded from admin to operator?

bool set_user_password(const std::string& username, const std::string& password_hash) {
std::vector<std::string> cmd = {"/usr/sbin/usermod", "-p", password_hash, username};

if (!SystemCommand::execute(cmd)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SystemCommand::execute with log the hash as well. Do we want that?

SONiC needed a centralized user management daemon that can:
- Monitor CONFIG_DB for user configuration changes
- Manage local system users based on SONiC configuration
- Provide role-based access control with predefined groups
- Handle SSH key management with proper security
- Integrate with PAM for authentication policies
- Support efficient change detection to minimize system calls

**1. User Management Daemon (userd):**
- Implemented C++ daemon using SWSS framework for CONFIG_DB integration
- Added comprehensive user lifecycle management (create/update/delete/enable/disable)
- Implemented role-based group assignment (administrator, operator roles)
- Added SSH key management with proper file permissions and ownership
- Used posix_spawn() for secure command execution without shell interpretation
- Added efficient change detection using UserInfo comparison to avoid unnecessary system calls
- Integrated PAM faillock configuration using Jinja2 templates

**2. Build System Integration:**
- Added CMakeLists.txt for C++ compilation with SWSS dependencies
- Created debian packaging with proper control files and dependencies
- Added systemd service configuration for userd daemon
- Integrated Makefile for building and installation

**3. Security Features:**
- Secure password handling using system's native hashing methods
- Proper file permissions for SSH keys (600) and directories (700)
- Role-based group assignments with predefined security groups
- PAM faillock integration for login attempt limiting
- Input validation and sanitization for all user operations

**4. Testing Framework:**
- Added comprehensive unit tests for userd functionality
- Integration tests for CONFIG_DB interaction
- User lifecycle testing with proper cleanup
- SSH key management testing
- Role-based access control validation
@mssonicbld
Copy link

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

# Role-based security policies
{% for role, policy in security_policies.items() %}
{% if policy.max_login_attempts %}
# {{ role }} role settings

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does faillock support individual config per group/role/user ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is applied per role.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a user management daemon (userd) for SONiC that provides centralized user administration capabilities. The daemon integrates with CONFIG_DB to manage local users, roles, SSH keys, and security policies, replacing manual user management with an automated, database-driven approach.

Key Changes

  • Core daemon implementation (userd.cpp): 1190-line C++ daemon using SWSS framework for CONFIG_DB integration, implementing full user lifecycle management (create/update/delete/enable/disable), role-based access control, SSH key management, and PAM faillock configuration
  • Build and packaging infrastructure: Complete CMake build system, Debian packaging with systemd service configuration, and Makefile for compilation and installation
  • Security policy templates: Jinja2 template for rendering PAM faillock configuration based on role-specific security policies

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 25 comments.

Show a summary per file
File Description
userd/src/userd.cpp Main daemon implementation with UserManager class for CONFIG_DB-driven user lifecycle management, SSH key handling, PAM configuration, and security policy enforcement
userd/src/CMakeLists.txt CMake build configuration with C++17, compiler flags, SWSS dependencies, and installation rules
userd/Makefile Top-level Makefile with build-cpp, install, and clean targets for development and packaging
userd/debian/control Debian package metadata with dependencies on SWSS, nlohmann-json, and crypt libraries
userd/debian/sonic-host-userd.service Systemd service unit with dependencies on config-setup and sonic.target
userd/debian/rules Debian packaging rules integrating with Makefile build system
userd/debian/install Installation manifest placing userd binary in /usr/local/bin
userd/debian/changelog Package changelog with initial release information
userd/debian/compat Debhelper compatibility level 11
userd/debian/copyright Apache 2.0 license information
data/templates/faillock.conf.j2 Jinja2 template for rendering PAM faillock configuration from security policies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +926 to +928
std::string j2_command = "j2 " + PAM_FAILLOCK_TEMPLATE + " " + temp_json_file;
std::string rendered_content;
FILE* pipe = popen(j2_command.c_str(), "r");
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The popen() function executes the command through a shell, which is a security risk. Since the template path and temp file are constructed from variables, this could potentially be exploited if those paths are manipulated. Consider using posix_spawn() with proper argument parsing (similar to SystemCommand::execute()) or at least validate/sanitize the paths before using them in the shell command.

Copilot uses AI. Check for mistakes.
template_data["security_policies"] = policies;

// Write JSON to temporary file
std::string temp_json_file = "/tmp/security_policies.json";
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The temporary JSON file /tmp/security_policies.json is created in a predictable location without using mkstemp() or similar secure temporary file creation. This could lead to symlink attacks or race conditions where an attacker could manipulate the file before it's used. Use mkstemp() or a similar secure method to create temporary files with unpredictable names.

Copilot uses AI. Check for mistakes.
Comment on lines +862 to +890
if (!user_config.password_hash.empty() &&
current_info.password_hash != user_config.password_hash) {
if (!set_user_password(username, user_config.password_hash)) {
return false;
}
}

if (current_info.shell != expected_config.shell) {
if (!set_user_shell(username, user_config.enabled)) {
return false;
}
}

if (!user_config.role.empty() && current_info.role != user_config.role) {
SWSS_LOG_INFO("Changing user %s role from '%s' to '%s'",
username.c_str(), current_info.role.c_str(), user_config.role.c_str());
if (!set_user_groups(username, user_config.role)) {
return false;
}
}

if (current_info.ssh_keys != user_config.ssh_keys) {
if (!setup_ssh_keys(username, user_config.ssh_keys)) {
return false;
}
}

SWSS_LOG_INFO("Updated user %s", username.c_str());
return true;
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent error handling: when updating the password hash fails (line 863), the function continues checking for other differences but then returns false if password update failed. However, when shell or groups update fails (lines 869-881), the function also returns false but may have already applied partial changes (e.g., password was updated). This could leave users in an inconsistent state. Consider using a transaction-like approach or at least log which updates succeeded before failing.

Suggested change
if (!user_config.password_hash.empty() &&
current_info.password_hash != user_config.password_hash) {
if (!set_user_password(username, user_config.password_hash)) {
return false;
}
}
if (current_info.shell != expected_config.shell) {
if (!set_user_shell(username, user_config.enabled)) {
return false;
}
}
if (!user_config.role.empty() && current_info.role != user_config.role) {
SWSS_LOG_INFO("Changing user %s role from '%s' to '%s'",
username.c_str(), current_info.role.c_str(), user_config.role.c_str());
if (!set_user_groups(username, user_config.role)) {
return false;
}
}
if (current_info.ssh_keys != user_config.ssh_keys) {
if (!setup_ssh_keys(username, user_config.ssh_keys)) {
return false;
}
}
SWSS_LOG_INFO("Updated user %s", username.c_str());
return true;
bool password_updated = true;
bool shell_updated = true;
bool groups_updated = true;
bool ssh_keys_updated = true;
if (!user_config.password_hash.empty() &&
current_info.password_hash != user_config.password_hash) {
password_updated = set_user_password(username, user_config.password_hash);
if (!password_updated) {
SWSS_LOG_ERROR("Failed to update password for user %s", username.c_str());
}
}
if (current_info.shell != expected_config.shell) {
shell_updated = set_user_shell(username, user_config.enabled);
if (!shell_updated) {
SWSS_LOG_ERROR("Failed to update shell for user %s", username.c_str());
}
}
if (!user_config.role.empty() && current_info.role != user_config.role) {
SWSS_LOG_INFO("Changing user %s role from '%s' to '%s'",
username.c_str(), current_info.role.c_str(), user_config.role.c_str());
groups_updated = set_user_groups(username, user_config.role);
if (!groups_updated) {
SWSS_LOG_ERROR("Failed to update groups for user %s", username.c_str());
}
}
if (current_info.ssh_keys != user_config.ssh_keys) {
ssh_keys_updated = setup_ssh_keys(username, user_config.ssh_keys);
if (!ssh_keys_updated) {
SWSS_LOG_ERROR("Failed to update SSH keys for user %s", username.c_str());
}
}
if (password_updated && shell_updated && groups_updated && ssh_keys_updated) {
SWSS_LOG_INFO("Updated user %s", username.c_str());
return true;
} else {
std::ostringstream oss;
oss << "Partial update for user " << username << ": ";
oss << "password=" << (password_updated ? "ok" : "fail") << ", ";
oss << "shell=" << (shell_updated ? "ok" : "fail") << ", ";
oss << "groups=" << (groups_updated ? "ok" : "fail") << ", ";
oss << "ssh_keys=" << (ssh_keys_updated ? "ok" : "fail");
SWSS_LOG_ERROR("%s", oss.str().c_str());
return false;
}

Copilot uses AI. Check for mistakes.

for (const std::string& key : ssh_keys) {
file << key << "\n";
}
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file stream is not explicitly checked for write errors. The file << key << "\n" operation could fail silently (e.g., disk full). Check the stream state after writing or use file.flush() with error checking before closing.

Suggested change
}
}
if (!file) {
SWSS_LOG_ERROR("Failed to write SSH keys to authorized_keys file for user %s", username.c_str());
file.close();
return false;
}

Copilot uses AI. Check for mistakes.
Comment on lines +763 to +769
std::vector<gid_t> groups(ngroups);
if (getgrouplist(username.c_str(), pw->pw_gid, groups.data(), &ngroups) != -1) {
for (gid_t gid : groups) {
struct group* grp = getgrgid(gid);
if (grp && std::string(grp->gr_name) == groupname) {
return true;
}
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getgrouplist() function is called twice: first with nullptr to get the count, then with the actual buffer. However, the second call could return -1 if the group list changed between calls (race condition) or if the buffer is too small. Check the return value of the second getgrouplist() call and handle the -1 case appropriately.

Suggested change
std::vector<gid_t> groups(ngroups);
if (getgrouplist(username.c_str(), pw->pw_gid, groups.data(), &ngroups) != -1) {
for (gid_t gid : groups) {
struct group* grp = getgrgid(gid);
if (grp && std::string(grp->gr_name) == groupname) {
return true;
}
int max_attempts = 5;
int attempts = 0;
while (attempts < max_attempts) {
std::vector<gid_t> groups(ngroups);
int ret = getgrouplist(username.c_str(), pw->pw_gid, groups.data(), &ngroups);
if (ret != -1) {
for (gid_t gid : groups) {
struct group* grp = getgrgid(gid);
if (grp && std::string(grp->gr_name) == groupname) {
return true;
}
}
break;
} else {
// Buffer was too small, ngroups updated to required size
attempts++;

Copilot uses AI. Check for mistakes.

* Initial release

-- SONiC Maintainers <maintainers@sonic.net> Mon, 11 Aug 2025 08:00:00 +0000
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The changelog indicates a future date (Mon, 11 Aug 2025) which is unusual. Typically changelog dates should reflect the actual release date or use the current date. Consider using the current date or a placeholder that will be updated upon release.

Suggested change
-- SONiC Maintainers <maintainers@sonic.net> Mon, 11 Aug 2025 08:00:00 +0000
-- SONiC Maintainers <maintainers@sonic.net> TODO: update upon release

Copilot uses AI. Check for mistakes.
Comment on lines +862 to +867
if (!user_config.password_hash.empty() &&
current_info.password_hash != user_config.password_hash) {
if (!set_user_password(username, user_config.password_hash)) {
return false;
}
}
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Password hash comparison is attempted but current_info.password_hash is never populated from system data. The get_existing_users() function doesn't retrieve password hashes from /etc/shadow, so this comparison (line 863) will always show a difference if a password_hash is provided in the config. Either populate the password hash when reading existing users (using getspnam()) or remove this comparison since password hashes can't be reliably compared.

Copilot uses AI. Check for mistakes.
Comment on lines +852 to +859
UserInfo expected_config = user_config;
expected_config.shell = user_config.enabled ? "/bin/bash" : "/usr/sbin/nologin";

// Compare the configurations
if (current_info == expected_config) {
SWSS_LOG_DEBUG("User %s configuration is already up to date", username.c_str());
return true;
}
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison current_info == expected_config will always fail because current_info.password_hash is never populated (it's empty), while expected_config.password_hash contains the hash from the configuration. This means users will be unnecessarily updated on every consistency check. Consider excluding password_hash from the comparison or properly populate it in get_existing_users().

Copilot uses AI. Check for mistakes.
Comment on lines +999 to +1000
UserInfo user;
user.username = key;
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Username validation is missing. Usernames from CONFIG_DB are used directly in system commands without validation. Malicious usernames containing special characters (e.g., semicolons, spaces, shell metacharacters) could potentially cause issues or be used for injection attacks. Add validation to ensure usernames match expected patterns (e.g., ^[a-z_][a-z0-9_-]*[$]?$).

Copilot uses AI. Check for mistakes.
std::string role;
std::string password_hash;
std::vector<std::string> ssh_keys;
bool enabled;
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enabled field is not initialized in the default constructor of UserInfo. If a user entry in CONFIG_DB doesn't have an explicit "enabled" field, the value will be uninitialized, leading to undefined behavior. Add a default value initialization in the struct definition (e.g., bool enabled = true;).

Suggested change
bool enabled;
bool enabled = true;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants