Skip to content

Add ServerCleaning controller#704

Draft
stefanhipfel wants to merge 12 commits intomainfrom
server_cleaning
Draft

Add ServerCleaning controller#704
stefanhipfel wants to merge 12 commits intomainfrom
server_cleaning

Conversation

@stefanhipfel
Copy link
Copy Markdown
Contributor

Proposed Changes

  • Introduced comprehensive unit tests for the ServerCleaning controller, covering various scenarios including creation, reconciliation, and state transitions of ServerCleaning resources.
  • Implemented tests to ensure proper handling of server states, cleaning tasks tracking, and finalizer management.
  • Updated the test suite setup to include the ServerCleaningReconciler, ensuring it is properly initialized with the test manager.

- Introduced comprehensive unit tests for the ServerCleaning controller, covering various scenarios including creation, reconciliation, and state transitions of ServerCleaning resources.
- Implemented tests to ensure proper handling of server states, cleaning tasks tracking, and finalizer management.
- Updated the test suite setup to include the ServerCleaningReconciler, ensuring it is properly initialized with the test manager.
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 23, 2026

// Taints is a list of taints that affect this server.
// +optional
Taints []v1.Taint `json:"taints,omitempty"`
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be removed once #672 is merged

if server.Spec.ServerClaimRef == nil {
if modified, err := r.patchServerState(ctx, server, metalv1alpha1.ServerStateAvailable); err != nil || modified {
return true, err
// Check if server has taints
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be removed as well once #672 is merged

@stefanhipfel stefanhipfel linked an issue Feb 27, 2026 that may be closed by this pull request
Implements a dedicated controller for monitoring BMC task status with
configurable polling intervals. This controller provides consistent
task polling separate from controller lifecycle events.

Features:
- Dedicated BMCTaskReconciler with configurable poll interval (default 30s)
- Event filter to only reconcile BMCs with active tasks
- Automatic requeue for in-progress tasks
- Comprehensive test suite (15 test cases)

The controller is implemented but not yet integrated with existing
controllers, allowing for independent review and gradual rollout.

Components:
- internal/controller/bmctask_controller.go (188 lines)
- internal/controller/bmctask_controller_test.go (650+ lines)
- docs/bmc-task-tracking.md (architecture documentation)
Registers the BMCTaskReconciler with the controller manager and adds
configuration flag for poll interval.

Changes:
- cmd/main.go: Add --task-poll-interval flag (default 30s)
- cmd/main.go: Register BMCTaskReconciler with manager
- internal/controller/suite_test.go: Add controller to test suite

The controller is registered and will run but currently has no effect
as no controllers are creating tasks for it to monitor yet.
Add note indicating that BMCTask controller requires the Tasks API changes
to be functional. The controller can be merged independently and will
activate once the API changes are available.
Adds the necessary API changes for BMC task tracking:
- Add Tasks field to BMC.Status for tracking operations
- Add BMCTask type with TaskURI, TaskType, State, etc.
- Add BMCTaskType enum for different task types
- Add GetTaskStatus method to BMC interface
- Implement GetTaskStatus in Redfish implementations

These API changes enable the BMCTask controller to function.

Changes:
- api/v1alpha1/bmc_types.go: Add Tasks field and BMCTask types
- api/v1alpha1/zz_generated.deepcopy.go: Generated DeepCopy methods
- bmc/bmc.go: Add GetTaskStatus to BMC interface
- bmc/redfish.go: Implement GetTaskStatus in RedfishBaseBMC
- bmc/redfish_kube.go: Delegate GetTaskStatus to RedfishBaseBMC
- config/crd/bases/*.yaml: Generated CRD with Tasks field
- internal/controller/bmctask_controller.go: Update to use schemas.Task fields
After rebasing server_cleaning onto feature/bmctask-controller, resolved
conflicts between the two GetTaskStatus implementations:

- Removed duplicate GetTaskStatus declaration in BMC interface
- Unified on *schemas.Task return type (more general than CleaningTaskStatus)
- Fixed RedfishBMC -> RedfishBaseBMC type references
- Updated ServerCleaning controller to extract fields from schemas.Task
- Updated RedfishLocal mock to return schemas.Task

All build errors resolved, project compiles successfully.
**Problem:**
ServerCleaning controller had its own monitorBMCTasks() method that
polled task status every 30s, duplicating the functionality of the
BMCTask controller. This violates separation of concerns and makes
task polling inconsistent across controllers.

**Solution:**
Refactored ServerCleaning controller to use the centralized BMCTask
controller pattern:

1. **Removed CleaningTasks from API** (api/v1alpha1/servercleaning_types.go):
   - Deleted CleaningTasks field from ServerCleaningStatusEntry
   - Deleted CleaningTaskStatus type definition
   - Tasks are now tracked centrally in BMC.Status.Tasks

2. **Added BMC task helper methods** (servercleaning_controller.go):
   - addTaskToBMC() - adds BMCTask to BMC.Status.Tasks
   - getTasksForServer() - retrieves tasks from BMC for a server
   - checkTasksComplete() - checks if all tasks are terminal

3. **Refactored task creation** (initiateBMCCleaning):
   - Changed from creating CleaningTaskStatus to BMCTask
   - Tasks added to BMC.Status.Tasks via addTaskToBMC()
   - Uses BMCTaskType constants (DiskErase, BIOSReset, etc.)
   - BMCTask controller automatically polls these tasks

4. **Removed monitorBMCTasks() polling method**:
   - Deleted monitorBMCTasks() that polled BMC API every 30s
   - Deleted updateServerTasks() that stored tasks locally
   - Polling is now handled by dedicated BMCTask controller

5. **Refactored handleInProgressState**:
   - Changed from calling monitorBMCTasks() to checking BMC.Status.Tasks
   - Uses getTasksForServer() and checkTasksComplete() helpers
   - Calculates progress from BMC task status
   - Updates ServerCleaning status based on task completion

6. **Added BMC watch to trigger reconciliation**:
   - Added Watches(&metalv1alpha1.BMC{}) to SetupWithManager
   - Created mapBMCToServerCleaning() to map BMC updates to ServerCleaning
   - When BMCTask controller updates BMC.Status.Tasks, ServerCleaning reconciles
   - Enables reactive reconciliation instead of 30s polling

7. **Updated tests** (servercleaning_controller_test.go):
   - Changed test to check BMC.Status.Tasks instead of ServerCleaning status
   - Removed CleaningTasks assertions
   - Added BMC task verification

**Benefits:**
- Single source of truth for task polling (BMCTask controller)
- Consistent 30s polling interval across all controllers
- Reactive reconciliation via BMC watch (no RequeueAfter needed)
- Cleaner separation of concerns
- Reduced code duplication

**Architecture:**
- Controllers create tasks → add to BMC.Status.Tasks
- BMCTask controller polls tasks → updates BMC.Status.Tasks
- BMC watch triggers → ServerCleaning reconciles with latest status
All Go files in bmc/ must belong to the same package.
Changed cleaning-related files from 'package oem' to 'package bmc'
to resolve compilation errors.

Also removed duplicate DiskWipeMethod type definition and helper
functions to avoid redeclaration errors. The helper functions
(getDellWipePasses, getHPEWipeType, getLenovoWipeMethod) remain
in redfish.go where they are used by the wipe implementation.

Fixed unused parameter warnings in servercleaning_controller.go
and improved error handling in redfish.go.

Fixes: #704 CI failures
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change area/metal-automation documentation Improvements or additions to documentation size/XXL

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Node Storage Cleanup

2 participants