@tylerjereddy
Collaborator

Fixes #831

  • prevent log_get_generic_record() from attempting to access the LUSTRE fcounters field, which
    apparently does not exist

@tylerjereddy added the bug and pydarshan labels Oct 17, 2022
@shanedsnyder

Looking at the code, it looks like log_get_generic_record() isn't intended to be used on Lustre module data. See this line in mod_read_all_records():

        unsupported =  ['DXT_POSIX', 'DXT_MPIIO', 'LUSTRE', 'APMPI', 'APXC', 'HEATMAP']

To be honest, I can't really wrap my head around all the logic here, but it seems pretty clear that things need some work. My understanding is that log_get_generic_record() was intended for use with modules that have fixed-length records and that don't require module-specific logic for getting records, but there's a clear discrepancy in how (and whether) modules are gated from calling this function in various places in the Report interface:

  1. mod_read_all_records() returns an error if user calls with an unsupported module from above
  2. mod_records() allows any module to be used, ultimately leading to error in the CFFI backend (as the original user reports)

This code needs more thorough auditing, but I think something like the following might make more sense:

  1. mod_read_all_records() should read literally every record as its name implies, multiplexing to module-specific code rather than "generic" records if need be
  2. mod_records() should similarly support any given module, with multiplexing to module-specific code if need be
  3. log_get_generic_record() should probably enforce any unsupported modules internally, rather than relying on higher levels to reproduce this logic
  4. log_get_generic_record() ought to support reading of APMPI and APXC data. They are listed in the "unsupported" list, but they are fixed length so this should be fine -- I wonder if this was just a hack around early issues in PyDarshan in terms of integration with APXC/APMPI.
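
To make points 1-3 above concrete, here is a rough sketch of what "multiplexing to module-specific code" with the unsupported-module check living inside the generic reader could look like. This is only an illustration: every name in it (`NON_GENERIC_MODULES`, `read_generic_record`, `read_lustre_record`, `SPECIFIC_READERS`, `read_record`) is hypothetical, the record dictionaries are toy stand-ins, and none of it is PyDarshan's actual code. APMPI and APXC are left out of the non-generic set on the assumption from point 4 that their records are fixed length.

```python
# Hypothetical sketch only; these are not PyDarshan's real functions.

# Modules assumed (for illustration) to need module-specific readers.
NON_GENERIC_MODULES = {"DXT_POSIX", "DXT_MPIIO", "LUSTRE", "HEATMAP"}


def read_generic_record(mod_name, raw):
    """Stand-in generic reader that enforces its own module limits."""
    if mod_name in NON_GENERIC_MODULES:
        # The check lives here, so callers cannot bypass it.
        raise ValueError(f"{mod_name} has no generic record layout")
    return {
        "module": mod_name,
        "counters": raw["counters"],
        "fcounters": raw["fcounters"],
    }


def read_lustre_record(mod_name, raw):
    """Stand-in Lustre reader; it never touches an fcounters field."""
    return {
        "module": mod_name,
        "counters": raw["counters"],
        "ost_ids": raw["ost_ids"],  # illustrative field name
    }


# Dispatch table: module name -> module-specific reader.
SPECIFIC_READERS = {"LUSTRE": read_lustre_record}


def read_record(mod_name, raw):
    """Use a module-specific reader if one exists, else the generic one."""
    reader = SPECIFIC_READERS.get(mod_name, read_generic_record)
    return reader(mod_name, raw)


posix = read_record("POSIX", {"counters": [1, 2], "fcounters": [0.5]})
lustre = read_record("LUSTRE", {"counters": [4], "ost_ids": [0, 1]})
```

With this shape, both a "read everything" entry point and a per-module entry point could call `read_record()`, so they would not each need their own copy of the unsupported list. A module with no specific reader that is also not generic (for example `DXT_POSIX` in this toy table) would fail with a clear `ValueError` instead of an error deep in the backend.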

Successfully merging this pull request may close these issues.

PyDarshan mod_records fails on LUSTRE records
