Add labels to disease module parameters by mnjowe · Pull Request #1610 · UCL/TLOmodel

mnjowe · 2025-04-23T08:12:00Z

This PR is a first step in addressing issue #1036. We aim to have all parameters assigned a label as @marghe-molaro has clearly explained here

marghe-molaro

Hi @mnjowe,

Many thanks for this, it's looking good! I just had a couple of comments:

I think we need to both ensure and test that label, MIN, and MAX for each parameter are either loaded from the relevant resource file, or correctly default to 'unassigned', 'None' and 'None' when not specified. This is to anticipate a situation where developers progressively start to fill these values in, but this is not done simultaneously for all module/parameters and/or modules;
It would be amazing if we could have, at the time of the initialisation of the simulation, a log entry of the breakdown of the different parameter types, for each module and then for all combined. Eventually, this would look something like this (for the example case where only Rti and Alri modules are loaded in the sim, and where numbers are obviously made up):

'unassigned', 'free', 'constant', 'context_specific', 'total'
Rti: 5, 15, 3, 7, 30
Alri: 3, 16, 9, 11, 39
TLO total: 8, 31, 12, 18, 69

It will be incredibly exciting to be able to see this breakdown, as it will give us both an initial insight into the model's predictive validity and an idea of the effort required to generalise the model to other countries, both from a calibration and parameter harvesting perspective!

src/tlo/core.py

marghe-molaro · 2025-04-23T08:49:38Z

tests/test_core.py

+        Parameter(Types.REAL, 'parameter description', label='other')

 class TestLoadParametersFromDataframe:
    """Tests for the load_parameters_from_dataframe method


I think we might need to modify the load_parameters_from_dataframe function too, to ensure that if resource files do include the label and a MIN/MAX range this is loaded properly?

I think with the lines I have numbered below, load_parameters_from_dataframe will always load properly, regardless of some extra columns added to the resource file.

--1-- resource.set_index('parameter_name', inplace=True) skipped_data_types = ('DATA_FRAME', 'SERIES') # for each supported parameter, convert to the correct type for parameter_name in resource.index[resource.index.notnull()]: parameter_definition = self.PARAMETERS[parameter_name] if parameter_definition.type_.name in skipped_data_types: continue # For each parameter, raise error if the value can't be coerced --2-- parameter_value = resource.at[parameter_name, 'value']

It will load the 'value' correctly, but will it assign label, MIN, and MAX correctly too?

But I think we have to decide if we want to capture module parameter stats with regard to the labels here in which case we need to modify load_parameters? i.e. like you said

unassigned free constant context_specific total module_name: 5 15 3 7 30

Again, should we initialise labels, min, and max in both module and resource files? Or should we initialise these during load parameters as we are doing with module parameters?

Labels, min, and max should be defined in the resource files only.
IF they are not specified in the resource files, they should be assigned default values - of 'unassigned', 'None', and 'None' respectively.

It will load the 'value' correctly, but will it assign label, MIN, and MAX correctly too?

No. Needs to be modified to accommodate that

tests/test_core.py

tbhallett · 2025-04-23T10:05:25Z

Thanks @mnjowe and @marghe-molaro

Looping in @tamuri here as this is deep inside the framework.

tamuri · 2025-04-23T11:37:09Z

Thanks for pinging me. First thought: these changes should be on the Parameter class, not the base Specifiable class. Because these changes do not apply to a Property, which also inherits from Specifiable.

mnjowe · 2025-04-23T11:44:09Z

Absolutely true. Thanks Asif

mnjowe · 2025-04-23T14:04:08Z

'unassigned', 'free', 'constant', 'context_specific', 'total'
Rti: 5, 15, 3, 7, 30
Alri: 3, 16, 9, 11, 39
TLO total: 8, 31, 12, 18, 69

Just to be clear on how this is to be logged, we are only interested in logging parameter label statistics per each disease module, right? Not including min and max values

marghe-molaro · 2025-04-23T15:23:01Z

'unassigned', 'free', 'constant', 'context_specific', 'total'
Rti: 5, 15, 3, 7, 30
Alri: 3, 16, 9, 11, 39
TLO total: 8, 31, 12, 18, 69

Just to be clear on how this is to be logged, we are only interested in logging parameter label statistics per each disease module, right? Not including min and max values

Yes that's correct, we are only interested in logging parameter label statistics

mnjowe · 2025-05-05T10:55:02Z

Hi @marghe-molaro. Following our discussion, I’ve made some updates to how we’ll capture label statistics. A key change is that I’ve reverted the recent modifications to the Parameters class. The main reasons for this are:

The fact that modelers won't define labels or their MIN/MAX values in the parameters section of the module; instead, these will be specified in the resource file.
These label definitions will not be used during simulation runtime within disease modules. They will only be referenced at the end of the simulation when logging statistics.

mnjowe · 2025-05-05T10:59:46Z

I’d however appreciate some clarification on the preferred way of capturing label statistics. Should these be written to the log file in the outputs folder, or is it sufficient to simply print a statement at the end of the simulation(using logger.info)? Currently, I’m doing the latter. This will inform the checks to be included in test_parameter_label in test_core.py

marghe-molaro · 2025-05-05T15:11:36Z

I’d however appreciate some clarification on the preferred way of capturing label statistics. Should these be written to the log file in the outputs folder, or is it sufficient to simply print a statement at the end of the simulation(using logger.info)? Currently, I’m doing the latter. This will inform the checks to be included in test_parameter_label in test_core.py

Anywhere is fine for now.

… update labels dictionary in simulation

mnjowe · 2025-05-07T08:28:00Z

Hi @marghe-molaro or @tamuri. Can you advise on the best way to connect this call in core to this function here in simulation. On my machine, it's running just fine, but it's crashing here, something I think I know the reason for. It seems by the time I'm calling the load_parameters function, self.sim hasn't been initialised yet, which results in the following error here

Also, any thoughts on how I can pass the data from core to simulation for computing label stats, but also logging the stats at the end of the simulation? Or if there is another good way of doing this, you can advise, I'm flexible to change.

After this is suggested @marghe-molaro , I will finish by adding Min and Max labels to this function , perhaps also renaming it to something like param_labels_data

tamuri · 2025-05-07T19:57:13Z

Can you advise on the best way to connect this call in core to this function here in simulation.

Is there a reason this information needs to be stored in Simulation rather than being kept inside the module that's loading the resource file? To log, you would iterate over each of the modules and get the module's labels as necessary.

You could add any logging for simulation end to simulation.finalise(). Again, I would call function inside of Module rather than bringing in the logic into Simulation.

mnjowe · 2025-05-08T06:44:32Z

That's a good question @tamuri. I think the idea is to load the information about modules and their parameter labels once then compute and log their statistics once after simulation end in sim.finalise.

The thought of doing this completely in Module crossed my mind, and I wanted to use the on_simulation_end function, but then that function is sometimes also called by disease modules, and I wouldn't want the compute_label_stats function to be executed in those disease modules.

Apart from simulation_end, I didn't see any other place in the Module where I can do this and can be executed at the end of the simulation. That's why I opted to transfer the labels info to the finalise method in simulation

tamuri · 2025-05-08T06:57:36Z

Yes, so I think not much needs to change. Instead of storing all the modules' labels together, store each module's labels in a field in the module itself. Then, in Simulation.finalise put the minimum amount of logic to collect the required information together. It looks like the key for the param_labels_data dict is simply the module name? So this is the same as looping over each of the modules and collecting the information. You could do that in the same loop that iterates over the modules and calls on_simulation_end on each of them. I mean not to put the logic inside on_simulation_end but rather something like:

def finalise(self, wall_clock_time: Optional[float] = None) -> None:
        for module in self.modules.values():
            module.on_simulation_end()
            labels = module.get_labels()
            # collect labels
        # log labels

mnjowe · 2025-05-08T07:06:06Z

Ah, I see. Thanks Asif, and yes, the param_labels_data dict key is simply the module name

marghe-molaro · 2025-05-09T08:33:15Z

Hi @mnjowe, I think we need to include an additional parameter label category, called something like "scenario", which would indicate parameters that are used by developers to define hypothetical future scenarios (and which therefore don't fall into any of the previously defined categories). For example, a parameter like ART_coverage_from_2025 used to compare hypothetical future trends in ART coverage would classify as a "scenario" parameter.

Ideally, these "scenario" parameters should eventually be separated from standard parameters— stored in separate files/kept outside the resources directory + handled in a distinct variable class, but I think we should do this at a later stage. Let's first get all module developers to declare these parameters as "scenario", and then someone can tidy up the structure of the code at a later stage. What do you think?

…ng label data to simulation. updated test

mnjowe · 2025-05-09T13:02:54Z

Hi everyone. Now the tests below are failing because of resource files like the attached ones. My assumption is that the lower and upper columns will only take values that are integers, floats, or None. The attached resource file has some values that do not align with my assumption. Should I contact Tara and see how we can resolve them?
parameters.csv

mnjowe · 2025-05-09T13:09:05Z

Hi @mnjowe, I think we need to include an additional parameter label category, called something like "scenario", which would indicate parameters that are used by developers to define hypothetical future scenarios (and which therefore don't fall into any of the previously defined categories). For example, a parameter like ART_coverage_from_2025 used to compare hypothetical future trends in ART coverage would classify as a "scenario" parameter.

Ideally, these "scenario" parameters should eventually be separated from standard parameters— stored in separate files/kept outside the resources directory + handled in a distinct variable class, but I think we should do this at a later stage. Let's first get all module developers to declare these parameters as "scenario", and then someone can tidy up the structure of the code at a later stage. What do you think?

Looks like a good idea. Just wondering though, do we have enough/more of these parameters that will require a separate folder outside the resources folder and a separate class?

marghe-molaro · 2025-05-09T13:09:35Z

Hi @mnjowe, it looks like Tara is already defining some columns that most likely overlap with our definition.
My advise would be, for now, to ensure that our column names can't be confused with existing ones - so renaming them something like prior_MIN and prior_MAX - and then let the module developers themselves decide how to merge their existing columns with our new standardised ones. This should circumvent the error.

mnjowe · 2025-05-09T14:25:52Z

Great, thanks!

marghe-molaro · 2025-05-16T15:25:21Z

@mnjowe noting here what we decided at today's meeting:

Replace labels used so far as follows:

free --> undetermined
constant --> universal
context_specific --> local
scenario --> scenario

Small check that parameter type is respected in [MIN,MAX] columns, to ensure that if parameter is list, there will be a list of MIN values and MAX values respectively.
[In the future] Add a test performing a cold read of the parameter labels, to prevent any merges with master with parameters are unassigned
Current timeline for roll-out: merge this PR into master by the 27th of May so that we can share these changes with developers at that day's Stand Up meeting!

…romDataframe class

…ameters

mnjowe · 2025-05-26T08:49:34Z

Hi @tamuri and @marghe-molaro, kindly find above my latest commits reflecting the discussions we've had so far.

One observation: some disease modules and their parameters are not initializing values via the load_parameters_from_dataframe function. As a result, metadata for those parameters is not being captured.

To flag this, I’ve introduced a not_init_via_load_param label in the logging output. For example, an entry like {'Demography': {'not_init_via_load_param': 7, 'unassigned': 1}} indicates that 7 parameters in the Demography module were initialized outside of load_parameters_from_dataframe, only one parameter was initialised via load_parameters_from_dataframe and was assigned a label of unassigned.

Let me know your thoughts. Thanks

marghe-molaro · 2025-05-26T09:25:56Z

Hi @mnjowe,

Thank you so much for these latest changes (which I will review asap) and this very interesting observation. I really like your proposed solution of keeping tabs on parameters not initialised through theload_parameters_from_dataframe function. Can I ask, if not initialised through that function, how are they initialised?

mnjowe · 2025-05-26T09:39:55Z

some modules are initialising parameter values like the below

mnjowe · 2025-05-26T09:42:06Z

and then there are modules like Demography who are combining both ways

marghe-molaro · 2025-05-26T09:54:30Z

Thank you very much for providing these examples @mnjowe. These instances all seem to be linked with 'operational' modules - HealthBurden, Demography, etc - rather than disease modules, so it's something that we can address ourselves later without requiring the input of disease module developers. So for now your logging solution is perfect.

marghe-molaro · 2025-05-27T11:54:27Z

Here is the presentation describing how we would like developers to update their reference files: https://docs.google.com/presentation/d/1VvkJiwEUrSxjS9f49IU6KT5aVYFbNqK1qpvKxOSGmuY/edit?usp=sharing

marghe-molaro · 2025-06-03T11:09:28Z

Hi @tamuri, just a quick reminder that your review on this PR would be highly appreciated, as we're hoping to get it into master before merging PR #1287 and PR #1609 which are almost ready

tamuri · 2025-06-03T11:30:47Z

Sorry for the delay - I've blocked all of tomorrow to work through this (and the other PRs). (Inspired by Tim)

…ameters # Conflicts: # src/tlo/simulation.py

tbhallett

I'm happy with the overall design here and will try it out on CC module (#1287) immediately. But, defer to @tamuri on the implementation.
Also, do we not need a test of it?

…_module_parameters

src/tlo/simulation.py

src/tlo/core.py

Co-authored-by: Asif Tamuri <tamuri@gmail.com>

mnjowe · 2025-06-05T08:40:26Z

I'm happy with the overall design here and will try it out on CC module (#1287) immediately. But, defer to @tamuri on the implementation. Also, do we not need a test of it?

Thanks @tbhallett, I have added some tests from here. Or do you mean some other tests?

tbhallett · 2025-06-05T08:45:20Z

I'm happy with the overall design here and will try it out on CC module (#1287) immediately. But, defer to @tamuri on the implementation. Also, do we not need a test of it?

Thanks @tbhallett, I have added some tests from here. Or do you mean some other tests?

Sorry --- I had missed those: all looks good to me.

added label argument to parameter class. added a test

8efeb3e

mnjowe requested review from marghe-molaro and tbhallett April 23, 2025 08:12

mnjowe self-assigned this Apr 23, 2025

marghe-molaro reviewed Apr 23, 2025

View reviewed changes

tbhallett requested a review from tamuri April 23, 2025 10:05

marghe-molaro mentioned this pull request Apr 23, 2025

generalising the model for any country #1043

Open

mnjowe added 2 commits April 30, 2025 09:09

added a method in simulation to log label stats

444af80

logging label stats

99adc2f

added test, updated label type check in core, added a new function to…

545b438

… update labels dictionary in simulation

marghe-molaro mentioned this pull request May 7, 2025

Formalising a validation framework #1615

Open

mnjowe added 2 commits May 9, 2025 13:12

updated function to include min and max in core. updated way of passi…

3539884

…ng label data to simulation. updated test

fixing imports order

68d3126

mnjowe added 2 commits May 26, 2025 11:26

modified logging of metadata. added some tests in TestLoadParametersF…

1dedbcb

…romDataframe class

Merge branch 'master' into mnjowe/assign_labels_to_disease_module_par…

fa60a62

…ameters

marghe-molaro mentioned this pull request Jun 3, 2025

Wasting Module #1609

Merged

tbhallett mentioned this pull request Jun 3, 2025

Cervical cancer #1287

Merged

tamuri added 2 commits June 4, 2025 14:49

Merge branch 'master' into mnjowe/assign_labels_to_disease_module_par…

9bc67a4

…ameters # Conflicts: # src/tlo/simulation.py

Remove unused field

63ceec6

tbhallett approved these changes Jun 4, 2025

View reviewed changes

tamuri added 2 commits June 4, 2025 21:29

Merge branch 'refs/heads/master' into mnjowe/assign_labels_to_disease…

e43da53

…_module_parameters

Collect the parameter labels once as we loop over each module

2e201e3

tamuri reviewed Jun 4, 2025

View reviewed changes

src/tlo/simulation.py Outdated Show resolved Hide resolved

tamuri reviewed Jun 4, 2025

View reviewed changes

src/tlo/core.py Outdated Show resolved Hide resolved

tamuri and others added 3 commits June 4, 2025 23:09

Print which module & log key failed in assert

ce5fe2f

update logging of labels

4f2d184

Co-authored-by: Asif Tamuri <tamuri@gmail.com>

remove set metadata method. update the metadata dictionary directly

bcb0b6e

Co-authored-by: Asif Tamuri <tamuri@gmail.com>

Remove unused function

80c6a8e

tamuri approved these changes Jun 5, 2025

View reviewed changes

tamuri merged commit 79faf00 into master Jun 5, 2025
62 checks passed

tamuri deleted the mnjowe/assign_labels_to_disease_module_parameters branch June 5, 2025 21:25

Conversation

mnjowe commented Apr 23, 2025

Uh oh!

marghe-molaro left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

marghe-molaro Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

mnjowe Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

marghe-molaro Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

mnjowe Apr 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mnjowe Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

marghe-molaro Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

mnjowe Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tbhallett commented Apr 23, 2025

Uh oh!

tamuri commented Apr 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mnjowe commented Apr 23, 2025

Uh oh!

mnjowe commented Apr 23, 2025

Uh oh!

marghe-molaro commented Apr 23, 2025

Uh oh!

mnjowe commented May 5, 2025

Uh oh!

mnjowe commented May 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

marghe-molaro commented May 5, 2025

Uh oh!

mnjowe commented May 7, 2025

Uh oh!

tamuri commented May 7, 2025

Uh oh!

mnjowe commented May 8, 2025

Uh oh!

tamuri commented May 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mnjowe commented May 8, 2025

Uh oh!

marghe-molaro commented May 9, 2025

Uh oh!

mnjowe commented May 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mnjowe commented May 9, 2025

Uh oh!

marghe-molaro commented May 9, 2025

Uh oh!

mnjowe commented May 9, 2025

Uh oh!

marghe-molaro commented May 16, 2025

Uh oh!

mnjowe commented May 26, 2025

Uh oh!

marghe-molaro commented May 26, 2025

Uh oh!

mnjowe commented May 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mnjowe Apr 23, 2025 •

edited

Loading

tamuri commented Apr 23, 2025 •

edited

Loading

mnjowe commented May 5, 2025 •

edited

Loading

tamuri commented May 8, 2025 •

edited

Loading

mnjowe commented May 9, 2025 •

edited

Loading

mnjowe commented May 26, 2025 •

edited

Loading

mnjowe commented May 26, 2025 •

edited

Loading