Skip to content

Conversation

cemakd
Copy link
Contributor

@cemakd cemakd commented Aug 4, 2025

This is a copy of PR: #2097

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
This PR adds a cache that periodically (configured to every minute currently) looks at the /dev/disk/by-id/ directory and evaluates the symlinks there. It maintains a cache of the symlink and the real path it points to.

This will help with debugging future filesystem issues. In a past OMG, we found that our insight into changes in symlinks for specific disks hampered our ability to debug. Logging marked the real path of the disk at mount and unmount, but the change in between couldn't be detected.

This PR will print those links every minute, also logging when elements of the cache change.

An example:

I0521 00:23:51.418261      12 cache.go:62] periodic symlink cache read: /dev/disk/by-id/google-persistent-disk-0 -> /dev/sda; /dev/disk/by-id/google-pvc-f5418f78-dc07-4d69-9487-6c4a7232dd67 -> /dev/sdb; /dev/disk/by-id/scsi-0Google_PersistentDisk_persistent-disk-0 -> /dev/sda; /dev/disk/by-id/scsi-0Google_PersistentDisk_pvc-f5418f78-dc07-4d69-9487-6c4a7232dd67 -> /dev/sdb

The cache will also note if a symlink is broken.

NOTE: Currently this filters out any thing in by-id/ that ends with -part[0-9]*$. This removes partitions, which are noise. Mounting partitions directly isn't well supported in GKE, but we may want to test that in the future.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 4, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 4, 2025
@tonyzhc
Copy link
Contributor

tonyzhc commented Aug 4, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Aug 4, 2025
Copy link
Contributor

@msau42 msau42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe the testing that has been done for this change?

return newDeviceCacheForNode(period, node), nil
}

func TestDeviceCache(period time.Duration, node *v1.Node) *DeviceCache {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the test functions to the _test.go file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were not real test function but rather creating a DeviceCache for testing purposes used by other packages. I've renamed them to NewTestDeviceCache to distinguish them from actual test functions.

// of the string (after the last "/") and call AddVolume for that
for _, volume := range node.Status.VolumesInUse {
klog.Infof("Adding volume %s to cache", string(volume))
vID, err := pvNameFromVolumeID(string(volume))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this ignore volume types that are not PD/hyperdisk?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not ignoring them, I've pushed a commit that checks if the volume is for the driver pd.csi.storage.gke.io. IIUC that should be sufficient to do the filtering.

// Evaluate the symlink
realPath, err := filepath.EvalSymlinks(device.symlink)
if err != nil {
klog.Warningf("Error evaluating symlink for volume %s: %v", volumeID, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful of logging this since this is called repeatedly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the polling period is 10 mins, would it be okay to log this? Otherwise it might be hard to figure out any problems with realpath evaluation?
I'm not sure if that's a realistic problem to log, I can remove this warning log if you think it's not necessary

klog.Fatalf("Failed to get node info from API server: %v", err.Error())
}

deviceCache, err := linkcache.NewDeviceCacheForNode(ctx, 1*time.Minute, *nodeName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the time period configurable so that we can adjust it if it ends up spamming too many logs? In fact I would probably start the logging at 10 mins initially.

In the future, if we want to turn this into a metric, I would be more comfortable reducing the polling period but removing logs to avoid log spam.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added a flag disk-cache-sync-period with a default value of 10 minutes when unspecified.

}

// Look at the dir for a symlink that matches the pvName
symlink := filepath.Join(d.dir, "google-"+deviceName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been tested with nvme disks as well? It would be better to reuse functions in https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/pkg/deviceutils/device-utils.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done specific testing for nvme disks but I've pushed a new commit that utilizes device-utils.go for evaluating the symlinks for a volumeID

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've completed the testing for both NVMe and SCSI disk interface types.

I used the following machine types for the tests:

  • SCSI: e2-standard-2
  • NVMe: c3-standard-4

Here are the log outputs:

SCSI: The log shows both the google- and scsi- prefixed symlinks successfully resolving to the same device path (/dev/sdb).

Cache contents:

map[
  /dev/disk/by-id/google-pvc-b08e725b-7bdf-499a-af8f-9bfba4c9ef69:{volumeID:projects/enginakdemir-joonix/zones/us-central1-c/disks/pvc-b08e725b-7bdf-499a-af8f-9bfba4c9ef69 realPath:/dev/sdb} 
  /dev/disk/by-id/scsi-0Google_PersistentDisk_pvc-b08e725b-7bdf-499a-af8f-9bfba4c9ef69:{volumeID:projects/enginakdemir-joonix/zones/us-central1-c/disks/pvc-b08e725b-7bdf-499a-af8f-9bfba4c9ef69 realPath:/dev/sdb}
]

NVMe: The log shows the google- prefixed symlink resolving to the NVMe device path (/dev/nvme0n2), while the scsi- symlink has an empty realPath.

Cache contents:

map[
  /dev/disk/by-id/google-pvc-289e075f-e731-4e75-9519-4f0e5e4832ae:{volumeID:projects/enginakdemir-joonix/zones/us-central1-c/disks/pvc-289e075f-e731-4e75-9519-4f0e5e4832ae realPath:/dev/nvme0n2} 
  /dev/disk/by-id/scsi-0Google_PersistentDisk_pvc-289e075f-e731-4e75-9519-4f0e5e4832ae:{volumeID:projects/enginakdemir-joonix/zones/us-central1-c/disks/pvc-289e075f-e731-4e75-9519-4f0e5e4832ae realPath:}
]

@cemakd
Copy link
Contributor Author

cemakd commented Aug 7, 2025

Can you describe the testing that has been done for this change?

I haven't done any additional testing on this PR since I took it over from Julian. As far as I know he was struggling to get the e2e tests passing, which I fixed with a few nil pointers checks.

I'll do some more testing on this and get back to you

@cemakd cemakd force-pushed the logs-for-device-mappings branch from 61ea81b to 61aeffd Compare August 7, 2025 20:58
}

// Look at the status.volumesInUse field. For each, take the last section
// of the string (after the last "/") and call AddVolume for that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment with what the expected syntax is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// The first token is of the form "kubernetes.io/csi/<driver-name>" or just "<driver-name>".
// We should check if it contains the driver name we are interested in.
if !strings.Contains(tokens[0], driverName) {
klog.V(5).Infof("Skipping volume %q because it is not a %s volume.", volumeName, driverName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be common if a customer has multiple csi drivers. We probably don't need this log.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this log line, done

@cemakd cemakd force-pushed the logs-for-device-mappings branch from 61aeffd to ada0422 Compare September 26, 2025 18:55
@cemakd cemakd force-pushed the logs-for-device-mappings branch from ada0422 to f1b9c24 Compare September 26, 2025 20:06
Change volumes map to use symlink as key and add volumeID and real
path to the deviceMappings object, this is because there can be
multiple symlinks per volume

Filter out caching and logging info on non pd/hyperdisk volumes

Failure to get real path at add volume time is no longer an error
the real path may be unavailable at this time

Fix the device_windows.go NewDeviceCacheForNode parameters

Add clarifying comments and remove excessive logging

Move wg.Add out of goroutine to fix -verify presubmit
@cemakd cemakd force-pushed the logs-for-device-mappings branch from f1b9c24 to 9460d9d Compare September 26, 2025 20:30
@msau42
Copy link
Contributor

msau42 commented Sep 29, 2025

/lgm
/approve

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 29, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cemakd, msau42

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 29, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 29, 2025
@k8s-ci-robot
Copy link
Contributor

@cemakd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2019 9460d9d link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2019
pull-gcp-compute-persistent-disk-csi-driver-e2e-arm dbd2883 link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-arm

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@msau42
Copy link
Contributor

msau42 commented Sep 29, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 29, 2025
@k8s-ci-robot k8s-ci-robot merged commit d285089 into kubernetes-sigs:master Sep 30, 2025
7 of 10 checks passed
@cemakd
Copy link
Contributor Author

cemakd commented Sep 30, 2025

/cherry-pick release-1.21

@cemakd cemakd deleted the logs-for-device-mappings branch September 30, 2025 17:42
@k8s-infra-cherrypick-robot

@cemakd: new pull request created: #2185

In response to this:

/cherry-pick release-1.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants