-
Notifications
You must be signed in to change notification settings - Fork 309
flame: fixed perf->flamegraph to handle terminated processes #5992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
873be1c
to
092950e
Compare
092950e
to
67c934e
Compare
@@ -58,7 +71,7 @@ flame_cmd_fn( args_t * args, | |||
install_parent_signals(); | |||
|
|||
fd_topo_join_workspaces( &config->topo, FD_SHMEM_JOIN_MODE_READ_ONLY ); | |||
fd_topo_fill( &config->topo ); | |||
fd_topo_fill_resilient( &config->topo ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this? The workspaces should still be valid?
if( FD_UNLIKELY( -1==kill( (int)pid, 0 ) ) ) { | ||
if( FD_UNLIKELY( errno==ESRCH ) ) FD_LOG_ERR(( "tile %s:%lu is not running", config->topo.tiles[ i ].name, config->topo.tiles[ i ].kind_id )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like changing this to WARNING should be pretty much the only change needed? Why all the other pid / tid bounds checks + resilient thing etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fd_topo_workspace_fill does FD_TEST's itself that fail (fseq checks).
For PID checks, I was getting many PID/TID's = 0, and a tid=438430872235253 for writer:1, and filtered around them so perf wouln't err due to a failed attachment.
I also think you previously had a check for it, so I included a similar filter to just continue on without them.
FD_TEST( pid<=INT_MAX );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What FD_TEST would fail? I don't think that seems right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's failing on:
main src/disco/topo/fd_topo.c(178): tile net in_link_fseq[0] = (nil) main src/disco/topo/fd_topo.c(179): FAIL: tile->in_link_fseq[ j ]
I'll take a deeper look into why it might be failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you providing the config file to the flamegraph tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I just realized I was using the wrong one. That makes a lot more sense. Thank you, ill make the changes so it just fixes the warnings.
No description provided.