-
Notifications
You must be signed in to change notification settings - Fork 2
Multigres controller draft #78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Completely redesigns the API based on the latest design doc, moving from individual component CRDs to a single "MultigresCluster" source of truth supported by reusable templates and read-only child resources. Key Changes: - **MultigresCluster**: Re-implemented as the Root CR. strictly ordered structs and added "templateDefaults" and "overrides" logic. - **Templates**: Added `CoreTemplate`, `CellTemplate`, and `ShardTemplate` types to support reusable configuration. - **Shared Types**: Created `shared_types.go` to centralize common structs (`StatelessSpec`, `ComponentConfig`, `StorageSpec`) and prevent circular deps. - **TopoServer**: Refactored `toposerver_types.go` to include `EtcdSpec` and `GlobalTopoServerRef`, acting as the authoritative source for topology config. - **Shard**: Moved `MultiOrchSpec`, `PoolSpec`, and `ShardImages` into `shard_types.go` to localize domain-specific logic. - **Children**: Updated `Cell`, `TableGroup`, and `Shard` to reflect their status as read-only, fully resolved child resources. Refactoring & Cleanup: - Deleted obsolete component types: `deploymenttemplate`, `etcd`, `multigateway`, and `multiorch`. - Applied strict "definition-before-use" ordering and added ASCII headers to all API files for better readability. - Standardized naming conventions (e.g., using `Spec` over `Config` for managed resources).
This commit hardens the v1alpha1 API by adding extensive CEL validations
to ensure data integrity and compliance with Kubernetes validation cost budgets.
Changes:
- **Budget Safety**: Added `+kubebuilder:validation:MaxLength`, `MaxItems`, and
`MaxProperties` to all string, slice, and map fields. This prevents
"worst-case" cost assumption errors in the API server.
- **String Safety**: Added `MinLength=1` to critical identifiers and image fields
to prevent silent failures with empty strings.
- **Logic Fixes**:
- Replaced invalid regex pattern on `Endpoints` slice with a CEL rule
(`self.all(x, x.matches(...))`) to correctly validate lists of URLs.
- Added CEL validation to `Databases` and `TableGroups` lists to ensure
only one entry can be marked as `default: true`.
- **Constraint Enforcement**:
- Added `Minimum` validation to replica counts.
- Enforced mutual exclusivity for `GlobalTopoServer` (etcd/external/template)
and `CellConfig` (zone/region).
- **Cleanup**: Removed experimental immutability rules to maintain flexibility
during early development.
This commit hardens the v1alpha1 API to comply with Kubernetes CEL validation cost budgets and prevents "worst-case" assumption errors in the API server. Key Changes: - **Budget Safety:** Introduced `CellName` (63 chars) and `EndpointUrl` (2048 chars) types to bound the validation cost of `[]string` lists, which otherwise trigger expensive 3MB-per-item worst-case assumptions. - **Limits:** Added `+kubebuilder:validation:MaxItems`, `MaxLength`, and `MaxProperties` to all lists, strings, and maps. - **Validation Logic:** - Replaced regex on `Endpoints` with a CEL rule (`self.all(x, x.matches(...))`). - Added CEL rule to ensure only one Database/TableGroup can be `default: true`. - Added strict `Minimum` and `MinLength` checks to prevent silent empty failures. - **Refactor:** Moved domain-specific types (`CellName`, `EndpointUrl`) to their respective files (`cell_types.go`, `toposerver_types.go`) while keeping them exported for shared use.
| if cluster.ObjectMeta.DeletionTimestamp.IsZero() { | ||
| if !controllerutil.ContainsFinalizer(cluster, finalizerName) { | ||
| controllerutil.AddFinalizer(cluster, finalizerName) | ||
| if err := r.Update(ctx, cluster); err != nil { | ||
| return ctrl.Result{}, err | ||
| } | ||
| } | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the deletion timestamp is there, we have to complete the reconciliation. Otherwise it would try to update the "deleting" resource. Once the resource becomes "deleting", you cannot "undelete", and thus this must be corrected.
Also, that means you will not need the else keyword, because once it hits the if statement, it MUST return, and there is no need to else. This goes with the idiomatic Go code of error path being indented, so that the main path stays non-indented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed on 190410c
I apologize it is a big change, I had to also change tests and I was struggling to get it to 100%, so far I am on 99.3%:
ok github.com/numtide/multigres-operator/pkg/cluster-handler/controller/multigrescluster (cached) coverage: 99.3% of statements
ok github.com/numtide/multigres-operator/pkg/cluster-handler/controller/tablegroup (cached) coverage: 100.0% of statements
…tion This commit refactors the Reconcile loop to handle resource deletion at the very beginning of the function. - Moves the deletion timestamp check to the top. - Returns early if the object is being deleted to prevent race conditions and ensure no further reconciliation logic runs. - Removes the `else` block and unindents the main reconciliation path for better readability and to follow idiomatic Go style. - Changes have been made to the tests to improve coverage and adapt to new change
|
Closing this PR in favour of PR 80 |
No description provided.