
Day 2 flow of erasure coding #2614

Draft
NIKHITHAVADDEMPUDI wants to merge 1 commit into red-hat-storage:master from NIKHITHAVADDEMPUDI:DAY2

Conversation


@NIKHITHAVADDEMPUDI NIKHITHAVADDEMPUDI commented Mar 12, 2026

Day 2 flow of erasure coding for RBD and CephFS

https://issues.redhat.com/browse/RHSTOR-8544

Types & constants

packages/ocs/types.ts – Defines OCS/ODF types (e.g. StoragePoolKind, DataPoolErasureCoding, CephFilesystemKind) used for pool and cluster resources.
packages/ocs/constants/common.ts – Holds OCS pool constants (e.g. pool states, paths, ERASURE_CODING_FAILURE_DOMAIN, compression).
packages/odf/types/erasure-coding.ts – Defines the erasure-coding schema type { k, m }.
packages/odf/constants/erasure-coding.ts – Defines supported EC schemes and minimum node count for erasure coding.
Utils

packages/odf/utils/erasure-coding.ts – Helpers for erasure coding (overhead %, recommended scheme, node validation).
packages/odf/utils/ocs.ts – ODF/OCS helpers (e.g. node count for EC, CR lookups).
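To make the role of the EC utils concrete, here is a minimal sketch of the two helpers the list above describes: overhead for a k+m scheme and the minimum-node check. The names, the overhead formula (m parity chunks per k data chunks), and the "at least k+m nodes" rule are illustrative assumptions, not the actual code in packages/odf/utils/erasure-coding.ts.

```typescript
// Hypothetical sketch, not the real odf-console helpers.
type ECSchema = { k: number; m: number };

// Storage overhead of a k+m scheme relative to usable capacity:
// m parity chunks are written per k data chunks, i.e. m/k.
const getECOverheadPercent = ({ k, m }: ECSchema): number =>
  Math.round((m / k) * 100);

// Assumption: placing k+m chunks in distinct failure domains (nodes)
// requires at least k+m nodes.
const isSchemaSupported = ({ k, m }: ECSchema, nodeCount: number): boolean =>
  nodeCount >= k + m;
```

For example, under these assumptions a 4+2 scheme has 50% overhead and needs at least 6 nodes.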

Day 2 storage pool

packages/ocs/storage-pool/reducer.ts – Reducer and state shape for the storage pool form (name, replica, EC, compression, etc.).
packages/ocs/storage-pool/body.tsx – Renders the pool form (name, data protection, replication/EC, schema table, compression) and wires dispatch.
packages/ocs/storage-pool/footer.tsx – Pool form footer with Create/Cancel and required-value checks (e.g. EC schema).
packages/ocs/storage-pool/CreateStoragePool.tsx – Builds CephBlockPool/CephFS pool specs, runs create/patch, and hosts the standalone create-pool page and EC helpers.
packages/ocs/storage-pool/StoragePoolListPage.tsx – Lists block and filesystem pools and shows pool details (including EC k+m).
packages/ocs/modals/storage-pool/modal-footer.tsx – Modal footer with dynamic buttons (Create/Cancel, Finish, etc.) and primary-action disable logic.
packages/ocs/modals/storage-pool/create-storage-pool-modal.tsx – Modal that creates a pool (block or CephFS) and watches for success/failure.
packages/ocs/modals/storage-pool/update-storage-pool-modal.tsx – Modal that loads an existing pool into the form and runs update (replica/compression; EC read-only).

Erasure coding UI

packages/odf/.../erasure-coding/erasure-coding-schema-table.tsx – Table of EC schemes (k+m) with overhead and “Recommended,” and calls onSelectSchema on row select.
packages/odf/.../erasure-coding/erasure-coding-schema-table.scss – Styles for the EC schema table.

Attach storage

packages/odf/.../attach-storage-storagesystem/state.ts – Attach-storage state, reducer, and createPayload (including poolDetails/EC for the backend).
packages/odf/.../attach-storage-storagesystem/attach-storage.tsx – Attach-storage page: LSO SC, device class, pool form, and submit that POSTs the payload.
packages/odf/.../attach-storage-storagesystem/storage-pool-form.tsx – Wraps pool data (cluster, existing names) and renders StoragePoolBody for the attach flow.
packages/odf/.../attach-storage-storagesystem/utils.ts – checkRequiredValues for the attach form (e.g. EC schema when EC is selected).
packages/odf/.../attach-storage-storagesystem/attach-storage-footer.tsx – Attach page footer with Attach/Cancel and disabled state from checkRequiredValues.

(Screenshots of the Day 2 erasure-coding UI are attached to the PR.)


openshift-ci bot commented Mar 12, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: NIKHITHAVADDEMPUDI
Once this PR has been reviewed and has the lgtm label, please assign bipuladh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@NIKHITHAVADDEMPUDI (Contributor, Author):

The total nodes considered for validating and displaying the possible schemas are determined as follows.

STORAGE POOL FLOW:

ODF creates each OSD with an RWO PVC; that PVC is bound to a PV, and RWO means the PV is attached to a single node (the one where the pod runs), so every node with a running OSD pod has at least one PV (the one backing that OSD's PVC) on it.

How we find OSD pods: We treat a pod as an OSD pod if it has the label app=rook-ceph-osd (and it’s in the cluster namespace and in Running phase).
How we treat it as SSD: We treat that OSD as SSD if either (1) its label ceph.rook.io/DeviceSet matches a device set in the StorageCluster whose deviceClass is ssd, or (2) the pod has the label device-class=ssd.

Why both checks are required

1. CR and pods can get out of sync
The StorageCluster is desired state; pod labels are actual state. After upgrades, CR edits (e.g. a device set removed or renamed, or deviceClass removed), or delayed reconciliation, the CR may no longer have a device set that matches the pod's ceph.rook.io/DeviceSet. If we only used the CR path, those OSDs would be dropped from the count even though they are still running and still SSD. The pod device-class label is the fallback that still identifies them as SSD when the CR lookup fails.

2. Older or not-yet-reconciled pods may lack the label
In older ODF/Rook versions the operator did not set device-class on OSD pods; only the DeviceSet label existed. So for those clusters, the only way to know the device class is to resolve it from the CR via ceph.rook.io/DeviceSet. Relying only on the pod label would undercount or miss those OSDs.

So: the CR path handles "the CR is the source of truth and pods may not have device-class"; the pod label handles "the CR is missing or no longer matches, but the pod still has the correct device class". Using both keeps the SSD OSD count correct across versions, CR edits, and temporary divergence between CR and pod state. That's the reason to keep both checks.
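The dual check described above can be sketched as follows. The label keys (app=rook-ceph-osd, ceph.rook.io/DeviceSet, device-class) come from the discussion in this PR; the type shapes, function names, and the targetClass parameter are simplified illustrations, not the actual helpers linked below.

```typescript
// Illustrative sketch of the dual SSD check; simplified types, hypothetical names.
type OSDPod = {
  metadata: { labels: Record<string, string> };
  spec: { nodeName: string };
  status: { phase: string };
};
type DeviceSet = { name: string; deviceClass: string };

const OSD_APP_LABEL = 'rook-ceph-osd';

// Check 1: resolve the pod's DeviceSet in the StorageCluster CR.
// Check 2 (fallback): trust the pod's own device-class label.
const hasTargetDeviceClass = (
  pod: OSDPod,
  deviceSets: DeviceSet[],
  targetClass: string
): boolean => {
  const setName = pod.metadata.labels['ceph.rook.io/DeviceSet'];
  const fromCR = deviceSets.find((ds) => ds.name === setName)?.deviceClass;
  return fromCR === targetClass || pod.metadata.labels['device-class'] === targetClass;
};

// Count distinct nodes hosting at least one running OSD pod of the target
// device class; pod.spec.nodeName provides the pod-to-node mapping.
const getNodeCountWithOSDs = (
  pods: OSDPod[],
  deviceSets: DeviceSet[],
  targetClass: string
): number =>
  new Set(
    pods
      .filter((p) => p.metadata.labels['app'] === OSD_APP_LABEL)
      .filter((p) => p.status.phase === 'Running')
      .filter((p) => hasTargetDeviceClass(p, deviceSets, targetClass))
      .map((p) => p.spec.nodeName)
  ).size;
```

Note that the device class is a parameter here rather than a hardcoded "ssd", and two OSD pods on the same node count as one node.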

https://github.com/red-hat-storage/odf-console/pull/2614/changes#diff-d7ac0f36554ace4e170c388e693ecf106bb5a415766e02b93548c79726ce4244:~:text=export%20const%20getNodeCountWithOSDsAndSSDDeviceClass,%7D%3B

https://github.com/red-hat-storage/odf-console/pull/2614/changes#diff-0c5dba22a3edadadb4ff07c5610de68279608e0a6218f615ad682ee8a0870007R24:~:text=export%20const%20OSD_APP_LABEL_KEY,ceph%2Dosd%27%3B
ATTACHED STORAGE FLOW:

Total nodes for validating and displaying possible schemas in Attach Storage (detailed):
In the attach storage flow, the total nodes used to validate and show possible erasure coding schemes are the number of nodes that have PersistentVolumes (PVs) backed by the LSO storage class the user selected in the “Storage class” dropdown on the Attach Storage page.

https://github.com/red-hat-storage/odf-console/pull/2614/changes#diff-57e3096b45231cf1c1e8e0a908e39252ced8e85d949031b392084e07d95eedbcR240-R244:~:text=//%20Total%20number%20of,247
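A minimal sketch of that count, under assumptions: the PV type is simplified, the function name is hypothetical, and the node is assumed to be exposed via a kubernetes.io/hostname label on the LSO-provisioned PV (the real logic in state.ts may resolve the node differently, e.g. via PV node affinity).

```typescript
// Illustrative sketch; not the actual attach-storage implementation.
type LocalPV = {
  metadata: { labels: Record<string, string> };
  spec: { storageClassName: string };
};

// Distinct nodes that have at least one PV backed by the selected LSO
// storage class; this is the node count used to validate EC schemes.
const getNodeCountForSC = (pvs: LocalPV[], scName: string): number =>
  new Set(
    pvs
      .filter((pv) => pv.spec.storageClassName === scName)
      .map((pv) => pv.metadata.labels['kubernetes.io/hostname'])
  ).size;
```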


SanjalKatiyar commented Mar 16, 2026

How we find OSD pods: We treat a pod as an OSD pod if it has the label app=rook-ceph-osd (and it’s in the cluster namespace and in Running phase). How we treat it as SSD: We treat that OSD as SSD if either (1) its label ceph.rook.io/DeviceSet matches a device set in the StorageCluster whose deviceClass is ssd, or (2) the pod has the label device-class=ssd.

@NIKHITHAVADDEMPUDI assuming app=rook-ceph-osd and device-class are reliable labels which the UI can use to get OSD pods, how will you ultimately map them to the node on which these OSDs are running (the pod spec)? That's the final piece of information that we need.

Also, please don't rely on a hardcoded "ssd" value (it's usually "ssd", but that's not a guarantee); we already fetch the default blockpool and use its deviceClass value as the default (check: https://github.com/red-hat-storage/odf-console/blob/master/packages/ocs/storage-pool/CreateStoragePool.tsx#L184-L196), so use that same value here.


SanjalKatiyar commented Mar 16, 2026


@parth-gr can you please confirm how reliable the labels app=rook-ceph-osd and device-class are on OSD pods?
Are they always added? Will they get reconciled if someone removes them manually? etc.


parth-gr commented Mar 17, 2026

@parth-gr can you please confirm how reliable these labels app=rook-ceph-osd and device-class are on OSD pods? Are they always added? Will they get reconciled if someone removes them manually? etc.

Yes, they are always added.
How are we doing the mapping?

