
Conversation

@ikegami-t
Contributor

This is to monitor system changes.

@ikegami-t
Contributor Author

I will consider changing the PR to add a --continue or --delay option instead of the nvme top command. (Or just rename it to an nvme watch command instead.)

@ikegami-t force-pushed the nvme-top branch 2 times, most recently from 2a0fb7a to 50ece42 on December 29, 2025 07:23
@ikegami-t
Contributor Author

Hi @igaw, @shroffni and @martin-belanger,

Just updated the PR to add a --delay option instead of the nvme top command (but not a --continuous option), as discussed in the Matrix chat.

@shroffni
Contributor

shroffni commented Jan 7, 2026

After looking at the code, it does not seem appropriate to add a --delay option at the global configuration level.

The main issue is that a global --delay does not make sense for all nvme-cli commands. For example, commands such as create-ns or delete-ns should not accept a --delay option. If create-ns were invoked with a non-zero delay, it could repeatedly create namespaces in a loop until the device runs out of space. Similarly, commands like delete-ns, set-feature, format, and several others have no meaningful semantics when executed periodically with a delay.

If the goal is simply to monitor or repeatedly execute a command, the existing watch utility already provides this functionality. While watch may not be the most efficient solution, it is sufficient for user-space tooling and works reasonably well today.
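
For reference, the plain-shell equivalent of what is being proposed is something like (interval chosen arbitrarily here):

    watch -n 2 nvme show-topology

which simply re-runs the command and redraws the full output every two seconds, without any changes to nvme-cli.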

If we still believe that a --delay option is necessary, then it should be added selectively, only to those nvme-cli sub-commands where periodic execution actually makes sense. One such example is show-topology, and this could also be useful for upcoming tools like nvme-top.

In summary, rather than introducing a global --delay option, it would be better to:

  • Identify the specific sub-commands where delayed or periodic execution is meaningful, and
  • Add support for --delay only to those commands.

@igaw
Collaborator

igaw commented Jan 7, 2026

I agree with @shroffni's arguments.

@ikegami-t
Contributor Author

Just changed the PR commit to add the option only to the show-topology command for now. Thank you.

@shroffni
Contributor

shroffni commented Jan 9, 2026

It’s good that the --delay option is now limited to show-topology. However, the current implementation lacks scrolling support, which makes it difficult to use in systems with many subsystems or namespaces. In such cases, the show-topology output may not fit on a single screen, and with the current behavior it becomes inconvenient—or effectively impossible—to view the full output.

For comparison, it would be useful to look at how tools like top handle scrolling and screen updates.

I applied your patch on my system and found that the screen flickers noticeably and the rendering does not appear smooth when the output is repainted on every --delay interval. This also needs improvement. One possible optimization would be to compare the output between two intervals and avoid repainting the screen if the output has not changed.

To further minimize flickering, it would also help to collect the entire command output into a single buffer and render it in one operation, rather than printing each line individually as it is generated.
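
Just to illustrate, here is a minimal sketch of both ideas (collect the whole output into one buffer, then repaint only when it differs from the previous interval), assuming POSIX open_memstream() and a hypothetical render_topology() helper that does the actual printing:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical hook: writes the topology output to the given stream. */
    extern void render_topology(FILE *out);

    static void show_topology_loop(unsigned int delay)
    {
            char *prev = NULL, *cur;
            size_t len;

            for (;;) {
                    FILE *mem = open_memstream(&cur, &len);

                    if (!mem)
                            break;
                    render_topology(mem);       /* collect everything first */
                    fclose(mem);                /* finalizes cur/len */

                    if (!prev || strcmp(prev, cur)) {
                            fputs(cur, stdout); /* one write, less flicker */
                            fflush(stdout);
                            free(prev);
                            prev = cur;
                    } else {
                            free(cur);          /* unchanged: skip the repaint */
                    }
                    sleep(delay);
            }
            free(prev);
    }

This is not meant as the actual patch, only as the shape of the buffering/compare approach.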

There are other alternatives to avoid flickering as well. I'd suggest reviewing the implementation of tools like watch, top, etc.

This is to monitor the system changes.

Signed-off-by: Tokunori Ikegami <[email protected]>
@ikegami-t
Contributor Author

Just reworked the patch as suggested, so please review it again. Thank you.

@shroffni
Contributor

The recent change that compares the current and previous output buffers and only prints the output when they differ is a good improvement. It significantly reduces screen flickering by avoiding unnecessary repaints, since the output is refreshed only when something has changed during the last interval.

I pulled this change and rebuilt nvme-cli locally. On my system, I have multiple NVMe disks with several namespaces, so the show-topology output spans multiple screens. When running show-topology with the --delay option, the output naturally extends beyond one screen.

With the current change, each time the output is refreshed it automatically scrolls to the last screen/page. If I want to inspect the first screen, I have to scroll back manually, which is acceptable initially. However, this behavior differs from tools such as top. In top, even when the output spans multiple screens, the display remains anchored at the first screen by default. If the user navigates using the arrow keys or Page Up/Page Down, top stays at the selected position and continues updating only the visible portion of the output, rather than jumping back to the end on each refresh.

In contrast, with show-topology, even if I scroll back to the first screen/page, the next refresh (when the output changes) automatically scrolls me back to the last screen/page. This makes it difficult to focus on a specific portion of the output, as the view keeps jumping away from where I am looking. From a usability standpoint, this behavior is inconvenient and should be addressed.
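
For illustration only (this is not necessarily how nvme-cli should implement it), one common way to get that top-like behavior is to draw into a curses pad and copy only the part the user is currently looking at to the terminal, keeping the scroll offset across refreshes:

    #include <curses.h>

    /* Sketch: 'buf' is the already-collected topology text and 'nlines'
     * its line count; lines are assumed to fit the terminal width. */
    static void interactive_view(const char *buf, int nlines, int delay)
    {
            int top = 0;            /* first visible line, kept across refreshes */
            WINDOW *pad;

            initscr();
            cbreak();
            noecho();
            keypad(stdscr, TRUE);
            timeout(delay * 1000);  /* getch() doubles as the refresh timer */
            refresh();              /* settle stdscr so getch() won't repaint over the pad */

            pad = newpad(nlines + 1, COLS);
            mvwaddstr(pad, 0, 0, buf);

            for (;;) {
                    /* Copy only the visible slice; 'top' is not reset on refresh. */
                    prefresh(pad, top, 0, 0, 0, LINES - 1, COLS - 1);

                    switch (getch()) {
                    case KEY_UP:
                            if (top > 0)
                                    top--;
                            break;
                    case KEY_DOWN:
                            if (top < nlines - LINES)
                                    top++;
                            break;
                    case 'q':
                            goto out;
                    case ERR:       /* timeout: re-collect and redraw the pad here */
                            break;
                    }
            }
    out:
            delwin(pad);
            endwin();
    }

The point is that prefresh() takes the viewer's current offset, so a refresh repaints only what is on screen instead of dumping everything and ending up at the last page.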

Additionally, while testing this change, I noticed that some junk characters occasionally appear or get appended—typically on the last screen—when show-topology detects a mismatch between the previous and current output and dumps the updated output.
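
If the stray characters turn out to be leftovers from the previous frame (for example when the new output is shorter than the old one), the usual fix is to home the cursor before dumping the buffer and then clear from the end of the new output to the end of the display. With plain ANSI escape sequences that is roughly:

    printf("\033[H");       /* move the cursor to the top-left corner */
    fputs(cur, stdout);     /* the freshly collected output buffer */
    printf("\033[J");       /* erase whatever the previous frame left below */
    fflush(stdout);

This is only a guess at the cause; it may equally be an issue in how the buffers are sized or terminated.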

I would strongly encourage testing this change on a system with multiple disks and namespaces, where show-topology produces multiple screens of output. That setup makes both of these issues—auto-scrolling and stray characters—much easier to observe and debug.

I may be asking too much here, but we have to make this usable for the end user, so please bear with me.

@ikegami-t
Contributor Author

Understood. I will study this further, referring to top, watch, etc. Thank you.

@igaw
Collaborator

igaw commented Jan 13, 2026

If it helps, you can create a bunch of nvmet-tcp targets on your system and connect to them. Maybe we should add a test setup for this from the beginning, e.g. a script which sets up a bunch of nvmet targets against null device backends. Just an idea.
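
A rough configfs sketch of such a script (the NQN, address, and device names below are just placeholders, and error handling is omitted):

    #!/bin/sh
    # Create one nvmet-tcp subsystem backed by a null_blk device and
    # connect to it locally, so show-topology has something to display.
    modprobe null_blk nr_devices=1
    modprobe nvmet nvmet-tcp

    cfs=/sys/kernel/config/nvmet
    mkdir -p $cfs/subsystems/testnqn
    echo 1 > $cfs/subsystems/testnqn/attr_allow_any_host

    mkdir -p $cfs/subsystems/testnqn/namespaces/1
    echo -n /dev/nullb0 > $cfs/subsystems/testnqn/namespaces/1/device_path
    echo 1 > $cfs/subsystems/testnqn/namespaces/1/enable

    mkdir -p $cfs/ports/1
    echo tcp       > $cfs/ports/1/addr_trtype
    echo ipv4      > $cfs/ports/1/addr_adrfam
    echo 127.0.0.1 > $cfs/ports/1/addr_traddr
    echo 4420      > $cfs/ports/1/addr_trsvcid
    ln -s $cfs/subsystems/testnqn $cfs/ports/1/subsystems/testnqn

    nvme connect -t tcp -a 127.0.0.1 -s 4420 -n testnqn

Looping the subsystem/port creation would give multiple screens of topology output for exercising the scrolling behavior.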

@ikegami-t
Contributor Author

Thank you. Noted.
