Skip to content

Conversation

@JanZachmann
Copy link
Contributor

@JanZachmann JanZachmann commented Nov 14, 2025

Summary

This PR fixes network configuration rollback and improves error handling throughout the application, building on top of the API separation refactoring from upstream/main.

Changes

Fixed network configuration rollback implementation

  • Corrected rollback processing logic to properly handle pending rollbacks
  • Fixed rollback file structure and serialization/deserialization
  • Improved rollback timing and cancellation handling
  • Added proper error context throughout network service

Reverted device service client builder pattern

  • Removed OmnectDeviceServiceClientBuilder in favor of simpler initialization
  • Moved certificate creation and endpoint registration to main startup flow
  • Made startup sequence more explicit and easier to follow
  • Changed has_publish_endpoint to public field for direct access

Optimized application startup

  • Reorganized startup sequence for better clarity
  • Service client now created once and reused across restarts
  • Certificate and centrifugo setup moved to run_until_shutdown
  • Added warning for unexpected pending rollback on startup

Improved error handling in main.rs

  • Replaced all .expect() calls with proper anyhow::Result and .context()
  • Added top-level error handler in main() with proper logging
  • All functions now return Result with descriptive error context
  • Error chains display full context with {e:#} formatting
  • Standardized error messages: lowercase, concise, no trailing periods

Additional improvements

  • Added complete function documentation for network service
  • Fixed shutdown sequence to only unregister service client on full shutdown
  • Removed redundant "backup:" prefixes from log messages
  • Fixed typo: "publish} endpoint" -> "publish endpoint"
  • Added centrifugo log level configuration support

Signed-off-by: Jan Zachmann [email protected]

Replace hyper-util with reqwest for all HTTP/Unix socket communication,
remove Arc wrappers, and eliminate the socket_client module to simplify
the codebase.

Changes:
- Remove Arc wrappers from Api struct fields (OmnectDeviceServiceClient
  and SingleSignOnProvider are already cheap to clone via internal pools)
- Replace hyper-util Client with reqwest for Unix socket communication
  using reqwest 0.12.23's native unix_socket() support
- Remove socket_client.rs module entirely (~109 lines removed)
- Inline HTTP request logic directly in OmnectDeviceServiceClient with
  cleaner helper methods: get(), post(), post_json()
- Convert keycloak_client from blocking to async reqwest
- Update certificate.rs to use reqwest::Client directly

Dependencies:
- Remove: http-body-util, hyper, hyperlocal, hyper-util
- Update: reqwest from 0.12 to 0.12.23 (minimum for Unix socket support)
- Remove: blocking feature from reqwest (no longer needed)

Benefits:
- Single HTTP client library throughout codebase
- Simpler, more maintainable code with direct reqwest usage
- Better error messages and logging
- Removed unnecessary abstraction layers
- Zero runtime overhead (no Arc, reqwest uses internal Arc)

Signed-off-by: Jan Zachmann <[email protected]>
This commit implements comprehensive refactorings across the codebase:

Performance Optimizations:
- Cache Keycloak HTTP client using OnceLock to avoid repeated instantiation
- Consolidate HTTP client usage patterns

Code Quality Improvements:
- Add HTTP response handling helper to reduce duplication
- Consolidate path macros for consistency
- Add API error handling helper with standardized logging
- Use endpoint constants instead of string literals
- Remove unnecessary .map(|_| ()) patterns
- Standardize error logging format to {e:#}
- Standardize all error messages to use "failed to..." format

Maintainability:
- Replace magic string "omnect-ui" with env!("CARGO_PKG_NAME")
- Improve string interpolation to use direct variable references
- Optimize imports (remove wildcard imports, use explicit imports)
- Fix inconsistent crate imports in main.rs

Signed-off-by: Jan Zachmann <[email protected]>
Problems addressed:
- Inconsistent cleanup across different shutdown paths (ctrl-c, SIGTERM, server crash, centrifugo crash)
- Race conditions between service client shutdown and centrifugo termination
- Mixed error handling (panics vs logging)
- No guaranteed cleanup order

Solution:
- Unified shutdown sequence that runs regardless of exit reason
- Consistent cleanup order: service client → server → centrifugo
- Graceful error handling throughout (no panics)
- Clear debug logging at each step

Benefits:
- Predictable shutdown behavior in all scenarios
- Proper resource cleanup guaranteed
- Better observability with consistent logging
- Safer error handling

Signed-off-by: Jan Zachmann <[email protected]>
This commit introduces new infrastructure modules without changing existing
code, preparing for gradual migration in future PRs.

New Modules:
- config.rs: Centralized configuration management with validation
  - AppConfig struct with all environment variable configuration
  - Explicit validation at startup instead of lazy initialization
  - IoT Edge configuration detection
  - Path configuration helpers

- errors.rs: Structured error types with HTTP status mapping
  - OmnectUiError enum with specific error categories
  - Automatic HTTP status code mapping via ResponseError trait
  - ErrorContext trait for adding typed error context
  - Better error diagnostics for API responses

Benefits:
- Foundation for removing global OnceLock state in future PRs
- Type-safe configuration with clear documentation
- Structured error handling improves debugging and API responses
- No breaking changes to existing code

Future Work:
- PR #2: Consolidate HTTP clients using config
- PR #3: Migrate authentication and testing to use new infrastructure
This commit introduces HttpClientFactory to centralize HTTP client creation,
eliminating code duplication and ensuring consistent configuration.

New Module: http_client.rs
- HttpClientFactory with three client types:
  - https_client(): Cached HTTPS client for external APIs
  - unix_socket_client(): For local Unix socket communication
  - workload_client(): For IoT Edge workload API

Testing:
- Unit tests: 8 tests covering all three client factory methods
  - test_https_client_is_cached
  - test_https_client_returns_valid_client
  - test_https_client_multiple_calls_same_instance
  - test_workload_client_parses_uri
  - test_workload_client_rejects_invalid_scheme
  - test_unix_socket_client_creates_client
  - test_unix_socket_client_with_relative_path
  - test_unix_socket_client_with_empty_path

- Integration tests: 14 tests in tests/http_client.rs
  - 6 tests for https_client (using httpbin.org and example.com)
  - 4 tests for unix_socket_client (using mock Unix socket servers)
  - 4 tests for workload_client (using mock IoT Edge workload API)

- Test configuration: Added .cargo/config.toml
  - Set RUST_TEST_THREADS=1 to prevent rate-limiting issues
  - Ensures reliable test execution for external API tests

- Fixed dead_code warning for workload_client when using mock feature
  by adding #[cfg_attr(feature = "mock", allow(dead_code))]

Benefits:
- Eliminates 3 separate OnceLock instances across modules
- Single source of truth for HTTP client configuration
- Shared connection pools improve performance
- Easier to add timeout/retry configuration in future
- Better testability with centralized mocking point
- Comprehensive test coverage for all client types

Changes by Module:
- keycloak_client.rs: Use HttpClientFactory::https_client()
  - Removed local OnceLock<Client>
  - 4 lines removed, cleaner code

- omnect_device_service_client.rs: Use HttpClientFactory::unix_socket_client()
  - Simplified client creation
  - 3 lines removed

- certificate.rs: Use HttpClientFactory::workload_client()
  - Removed manual URI parsing
  - 8 lines removed, moved to factory

Signed-off-by: Jan Zachmann <[email protected]>
This commit introduces a centralized TokenManager for handling JWT token
creation and verification, eliminating scattered token logic.

New Module: auth/token.rs
- TokenManager: Centralized token management
  - create_token(): Generate JWT tokens with configured expiration
  - verify_token(): Validate tokens with proper verification options
  - Configurable expiration, subject, and secret

Benefits:
- Single source of truth for token handling
- Eliminates duplicate token creation/verification logic
- Consistent verification options (expiration, subject, tolerance)
- Better testability with 4 comprehensive tests
- Easier to change token strategy in future

Changes by Module:
- middleware.rs: Use TokenManager for verify_token()
  - Removed inline HS256Key creation
  - Removed manual VerificationOptions setup
  - 15 lines simplified to 1-line call

- api.rs: Use TokenManager for session_token()
  - Removed inline HS256Key and Claims creation
  - Cleaner error handling
  - 8 lines simplified to 6 lines

Signed-off-by: Jan Zachmann <[email protected]>
Add network configuration management UI and backend:

Backend:
- Add network_config module with support for static IP and DHCP configuration
- Implement network rollback mechanism with 90-second timeout
- Add API endpoint for setting network configuration with validation
- Refactor certificate management to use DeviceServiceClient reference
- Move certificate creation to server startup after service client initialization
- Remove certificate regeneration from network rollback flow
- Add server restart channel for handling network changes
- Cancel pending rollback on successful authentication

Frontend:
- Add Network page with network settings UI
- Add NetworkSettings component for editing network configuration
- Add NetworkActions component for network-related actions
- Add useWaitForNewIp composable for handling IP changes
- Update DeviceNetworks component to show network details
- Refactor DeviceActions to use composable pattern
- Update Vuetify and Biome configurations
- Add route for /network page

Dependencies:
- Add rust-ini for network config file management
- Add serde_valid for request validation
- Add trait-variant for async trait support
- Add actix-cors for CORS support
- Bump version to 1.1.0

Signed-off-by: Jan Zachmann <[email protected]>
…e common.rs

- Introduce AppConfig with OnceLock for thread-safe singleton configuration
- Replace all direct environment variable access with AppConfig::get()
- All config fields now used (centrifugo, iot_edge, paths)
- Add test-mode support with conditional validation and defaults

- Remove common.rs by moving functions to appropriate modules:
  - validate_password → auth/mod.rs
  - create_frontend_config_file → keycloak_client.rs
  - handle_http_response → http_client.rs
- Consolidate HTTP response handling across all clients
- Remove unnecessary wrapper functions (cert_path, key_path, keycloak_url macro)
- Remove duplicate CentrifugoConfig from common.rs

- main.rs: Use AppConfig for UI_PORT, config paths
- api.rs: Use AppConfig for tenant, paths
- keycloak_client.rs: Remove keycloak_url! macro, use AppConfig directly
- omnect_device_service_client.rs: Use AppConfig for socket_path, centrifugo config
- certificate.rs: Use AppConfig for cert/key paths, iot_edge config
- middleware.rs: Use AppConfig for centrifugo client_token in tests

- Fix test isolation issues with cached AppConfig
- Update middleware tests to work with cached config paths
- Add test helper function create_password_file()
- All 69 tests passing with zero warnings

- Make iot_edge mandatory (no longer Option)
- Add cfg_attr for mock feature dead_code suppression
- Remove unnecessary pub from test modules
- Direct calls to handle_http_response (remove wrapper methods)
- Conditional bail import for test vs non-test builds

Signed-off-by: Jan Zachmann <[email protected]>
This commit refactors the codebase to separate business logic from HTTP
concerns by introducing a dedicated services layer.

Introduces a new `services/` module that encapsulates all business logic:

- **AuthorizationService**: Token validation and role-based access control
  (FleetAdministrator, FleetOperator) with tenant/fleet authorization
- **TokenManager**: Session token creation and validation with configurable
  expiration and HMAC signing
- **CertificateService**: Module certificate creation via IoT Edge workload
  API with conditional compilation for mock/production
- **FirmwareService**: Update file persistence and data folder management
- **NetworkConfigService**: Network configuration with automatic rollback,
  server restart coordination, and certificate renewal
- **PasswordService**: Password hashing (Argon2id) and secure storage

- Clear separation of concerns between HTTP handlers and business logic
- Improved testability through dependency injection
- Services accept client traits, enabling easy mocking
- Stateless service structs with static methods
- Reusable functions across different contexts

- Moves `certificate.rs` to `services/certificate.rs`
- Moves `network.rs` to `services/network.rs` (also introduces NetworkConfigService pattern)
- Moves `auth/` to `services/auth/` with new authorization module
- API handlers (`api.rs`) now delegate to services
- Backward compatibility maintained via re-exports in `lib.rs`

- Separates public HTTP handlers from private helper methods
- Reduces `Api` struct responsibility to HTTP concerns only
- Consistent error handling and logging patterns
- Session management delegated to auth service

This refactoring improves maintainability, testability, and code clarity
while maintaining all existing functionality.

Signed-off-by: Jan Zachmann <[email protected]>
Signed-off-by: Jan Zachmann <[email protected]>
Fixed network configuration rollback implementation:
- Corrected rollback processing logic to properly handle pending rollbacks
- Fixed rollback file structure and serialization/deserialization
- Improved rollback timing and cancellation handling
- Added proper error context throughout network service

Reverted device service client builder pattern:
- Removed OmnectDeviceServiceClientBuilder in favor of simpler initialization
- Moved certificate creation and endpoint registration to main startup flow
- Made startup sequence more explicit and easier to follow
- Changed has_publish_endpoint to public field for direct access

Optimized application startup:
- Reorganized startup sequence for better clarity
- Service client now created once and reused across restarts
- Certificate and centrifugo setup moved to run_until_shutdown
- Added warning for unexpected pending rollback on startup

Improved error handling in main.rs:
- Replaced all .expect() calls with proper anyhow::Result and .context()
- Added top-level error handler in main() with proper logging
- All functions now return Result with descriptive error context
- Error chains display full context with {e:#} formatting
- Standardized error messages: lowercase, concise, no trailing periods

Additional improvements:
- Added complete function documentation for network service
- Fixed shutdown sequence to only unregister service client on full shutdown
- Removed redundant "backup:" prefixes from log messages
- Fixed typo: "publish} endpoint" -> "publish endpoint"
- Added centrifugo log level configuration support

Signed-off-by: Jan Zachmann <[email protected]>
Copy link
Contributor

@empwilli empwilli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits, LGTM overall

- Fix comment to match field name: rollback_time -> deadline
- Simplify rollback_exists() to use exists() instead of try_exists().unwrap_or(false)

Signed-off-by: Jan Zachmann <[email protected]>
@JanZachmann
Copy link
Contributor Author

Fixed both review findings in a7ba05d:

  • Changed comment to match field name 'deadline' (was 'rollback_time')
  • Simplified rollback_exists() to use exists() instead of try_exists().unwrap_or(false)

@JanZachmann JanZachmann merged commit 7120671 into omnect:main Nov 19, 2025
2 checks passed
@JanZachmann JanZachmann deleted the refactor-api-separation branch December 18, 2025 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants