Description
We want to move away from reading the old-style, massive json.gz results files and towards the new-style parquets + params.json, which are lighter and faster. See https://github.com/The-Strategy-Unit/nhp_planning/issues/190. This requires converting some older results files, which are tagged with run_stage metadata, into the new format.
I'm working on this in sprint 12 (techdebt). To get this moving quickly, I'm working in a folder of nhp_analysis. The work is in a PR at the time of writing: https://github.com/The-Strategy-Unit/nhp_analysis/pull/181.
However, these functions could be quite useful in future for converting arbitrary json.gz results into their *.parquet and params.json form. Housed in {reskit}, they would also get the package's other benefits, such as defensive programming and tests. I've set this to a low priority for now.