Completely remove RDS format usage per FDA feedback #136

Copilot · 2025-12-17T21:41:13Z

FDA requested that programs use datasetjson as the source format without converting to RDS, and eliminate RDS usage entirely from the workflow.

Changes

ADaM programs (adsl.r, adae.r, adadas.r, adlbc.r, adtte.r)

Read SDTM data directly from JSON using datasetjson::read_dataset_json()
Write output directly to JSON files; RDS files are no longer created
Read dependent ADaM datasets from their JSON outputs
Use the new helper function write_dataset_json_with_metadata() for JSON output
Add library(datasetjson) calls
Update headers to reflect .json output

TLF programs (tlf-demographic.r, tlf-efficacy.r, tlf-kmplot.r, tlf-primary.r)

Read ADaM data from JSON output files

Documentation (adrg/_adrg.qmd)

Removed JSON-to-RDS conversion step from workflow instructions
Updated process descriptions to reflect that ADaM programs write JSON directly
Updated output descriptions to reference datasetjson files

Helpers (pilot5-helper-fcns.r, run-all-adams-tlfs.r)

Created write_dataset_json_with_metadata(): the JSON writing logic from convert_rds_to_json.r is extracted into a reusable function that takes the dataset data frame, the metacore spec, the dataset name, and the output path. This eliminates code duplication across the ADaM programs.
Deprecated convert_json_to_rds() function with warning
Updated run-all script header to reflect JSON outputs
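
For illustration, deprecating convert_json_to_rds() with a warning could look roughly like the sketch below. It relies on base R's .Deprecated(), which signals a warning when the function is called. The function arguments, the empty body, and the message wording are assumptions made for this example, not the code in pilot5-helper-fcns.r.

```r
# Hypothetical sketch: the real convert_json_to_rds() body is not shown in this PR.
convert_json_to_rds <- function(json_path, rds_path) {
  # Signal a deprecation warning every time the function is called.
  .Deprecated(
    new = "datasetjson::read_dataset_json",
    msg = paste(
      "convert_json_to_rds() is deprecated:",
      "read datasetjson files directly with datasetjson::read_dataset_json()."
    )
  )
  invisible(NULL)
}

# Capture the warning text instead of printing it, to show what callers would see.
deprecation_msg <- tryCatch(
  convert_json_to_rds("dm.json", "dm.rds"),
  warning = function(w) conditionMessage(w)
)
```

Keeping the function as a warning-only stub, rather than deleting it outright, gives any remaining caller a clear pointer to the replacement.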

Cleanup

Deleted 43 SDTM RDS files that were converted from JSON

Example

Before (reading):

datasets <- map(
  dat_to_load,
  ~ convert_blanks_to_na(readRDS(file.path(path$sdtm, paste0(.x, ".rds"))))
)

After (reading):

library(datasetjson)

datasets <- map(
  dat_to_load,
  ~ convert_blanks_to_na(read_dataset_json(
    file.path(path$sdtm, paste0(.x, ".json")),
    decimals_as_floats = TRUE
  ))
)

Before (writing):

saveRDS(adsl, file.path(path$adam, "adsl.rds"))

# Plus 45+ lines of metadata preparation code...

After (writing):

write_dataset_json_with_metadata(adsl, adsl_spec, "adsl", path$adam_json)

Workflow

Complete datasetjson workflow with no RDS files:

SDTM datasetjson → ADaM programs → ADaM datasetjson → TLF programs → Outputs
convert_rds_to_json.r is now obsolete as ADaM programs write JSON directly
Reduced code duplication by 235 lines through helper function
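
One way to double-check the "no RDS files" claim is to scan the project tree for leftover .rds files. The sketch below uses only base R. The helper name find_rds_files() and the throwaway demo directory are hypothetical and are not part of this PR.

```r
# Hypothetical helper: list any remaining .rds files under a project root.
find_rds_files <- function(root = ".") {
  list.files(
    path = root,
    pattern = "\\.rds$",
    recursive = TRUE,
    ignore.case = TRUE,
    full.names = TRUE
  )
}

# Demo on a throwaway directory so the example is self-contained.
demo_root <- file.path(tempdir(), "rds-scan-demo")
dir.create(file.path(demo_root, "adam"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_root, "adam", "adsl.json"))  # expected output format
file.create(file.path(demo_root, "adam", "stale.rds"))  # simulated leftover file

leftovers <- find_rds_files(demo_root)
```

In a real checkout, an empty result from find_rds_files() would be the expected outcome after this change.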

Original prompt

This section details on the original issue you should resolve

<issue_title>Purge use of rds based on feedback from FDA</issue_title>
<issue_description>FDA just wants us to use datasetjson as our source and not convert the datasetjson to rds in our programs.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Purge use of rds based on feedback from FDA #135

Compare Combined Reports

Here's a comparison report detailing the differences between pilot5-cmb-report-manual (3).pdf and pilot5-cmb-report-manual (4).pdf, broken down by each table and figure.

Overall Report Header Difference: The primary difference between the two files is the generation date of the overall report.

pilot5-cmb-report-manual (3).pdf is dated 2025-09-29. [cite: pilot5-cmb-report-manual (3).pdf]

pilot5-cmb-report-manual (4).pdf is dated 2026-01-07. [cite: pilot5-cmb-report-manual (4).pdf]

Table 14-2.01 Summary of Demographic and Baseline Characteristics

Differences Found:

The "Program:" generation timestamp differs between the two files.

In pilot5-cmb-report-manual (3).pdf, it states 2025-09-29 20:13:22.834974. [cite: pilot5-cmb-report-manual (3).pdf]

In pilot5-cmb-report-manual (4).pdf, it states 2026-01-07 19:37:03.11848. [cite: pilot5-cmb-report-manual (4).pdf]

pilot5-cmb-report-manual (4).pdf includes two "Warning in lapply(d[dbl_cols], as.double): NAs introduced by coercion" messages above the table, which are not present in pilot5-cmb-report-manual (3).pdf. [cite: pilot5-cmb-report-manual (4).pdf]
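
For context, this is the warning base R emits when as.double() is applied to character values that cannot be parsed as numbers. The sketch below reproduces it on made-up data. The names d and dbl_cols mirror the warning text, but the values are invented, and where exactly the call happens in the report pipeline is an assumption.

```r
# Made-up data: a column meant to be numeric but stored as text,
# with one value that cannot be parsed as a number.
d <- data.frame(AGE = c("63", "unknown", "71"), stringsAsFactors = FALSE)
dbl_cols <- "AGE"

# Collect warnings rather than printing them.
captured <- character(0)
converted <- withCallingHandlers(
  lapply(d[dbl_cols], as.double),
  warning = function(w) {
    captured <<- c(captured, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
# The unparseable value becomes NA, and one coercion warning is recorded.
```

Since the table values are reported as identical, the warning here affects only what is printed above the table, not the results.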

No Differences Found:

All numerical demographic and baseline characteristics data presented in the table (e.g., Age, Baseline Height, Baseline Weight, Baseline BMI, MMSE Total, Pooled Age Group counts, Race counts, N values for treatment groups) are identical in both files. [cite: pilot5-cmb-report-manual (3).pdf, pilot5-cmb-report-manual (4).pdf]

Table 14-3.01 Primary Endpoint Analysis: ADAS Cog (11) - Change from Baseline to Week 24 - LOCF

Differences Found:

The "Source:" generation timestamp differs between the two files.

In pilot5-cmb-report-manual (3).pdf, it states 20:03 Monday, September 29, 2025. [cite: pilot5-cmb-report-manual (3).pdf]

In pilot5-cmb-report-manual (4).pdf, it states 19:19 Wednesday, January 07, 2026. [cite: pilot5-cmb-report-manual (4).pdf]

No Differences Found:

All numerical data for Baseline, Week 24, and Change from Baseline (n, Mean (SD), Median (Range)), as well as p-values and Differences of LS Means, are identical in both files. [cite: pilot5-cmb-report-manual (3).pdf, pilot5-cmb-report-manual (4).pdf]

Table 14-3.02 Primary Endpoint Analysis: Glucose (mmol/L) - Summary at Week 20 - LOCF

Differences Found:

The "Table generated on:" timestamp differs between the two files.

In pilot5-cmb-report-manual (3).pdf, it states 2025-09-29 20:03:24.216929. [cite: pilot5-cmb-report-manual (3).pdf]

In pilot5-cmb-report-manual (4).pdf, it states 2026-01-07 19:19:04.70865. [cite: pilot5-cmb-report-manual (4).pdf]

No Differences Found:

All numerical data for Baseline, Week 20, Change from Baseline (N, Mean (SD)), LS Mean, and Pairwise Comparison results are identical in both files. [cite: pilot5-cmb-report-manual (3).pdf, pilot5-cmb-report-manual (4).pdf]

Figure 14-1 Time to Dermatologic Event by Treatment Group KM plot

Differences Found:

The "Program:" generation timestamp differs between the two files.

In pilot5-cmb-report-manual (3).pdf, it states [2025-09-11 12:59:06.233846]. [cite: pilot5-cmb-report-manual (3).pdf]

In pilot5-cmb-report-manual (4).pdf, it states [2026-01-07 19:19:07.024642]. [cite: pilot5-cmb-report-manual (4).pdf]

No Differences Found:

The "At Risk" numbers for each treatment group and time point are identical across both files. [cite: pilot5-cmb-report-manual (3).pdf, pilot5-cmb-report-manual (4).pdf]

The visual representation of the Kaplan-Meier plot itself appears identical. [cite: pilot5-cmb-report-manual (3).pdf, pilot5-cmb-report-manual (4).pdf]

bms63 · 2025-12-17T21:42:33Z

@copilot - This Pull Request should remove the conversions of the datasetjson files to rds files. The programs should all ingest the datasetjson files, e.g. the ADaM programs should use the SDTM datasetjson files. The Table and Figure programs should use the ADaM datasetjson files.

Co-authored-by: bms63 <[email protected]>

Copilot · 2025-12-17T22:04:29Z

@copilot - This Pull Request should remove the conversions of the datasetjson files to rds files. The programs should all ingest the datasetjson files, e.g. sdtms should use datasetjson files for the ...

Done. All programs now ingest and output datasetjson files exclusively:

ADaM programs read SDTM datasetjson files and write ADaM datasetjson files directly (commits 1bb83e6, 1812f9d)
TLF programs read ADaM datasetjson files
Removed all RDS file creation and usage
convert_rds_to_json.r is now obsolete as ADaMs write JSON directly

Complete workflow: SDTM JSON → ADaM programs → ADaM JSON → TLF programs

bms63 · 2025-12-17T22:21:10Z

@copilot there is a program called convert_rds_to_json.r that used to take the rds files created in the ad* programs. Can you turn this program into a function that can be called in each of the ADaM programs to take the data frame and convert it to datasetjson? Please take into consideration that the metacore object also needs to be accessed, as seen in the original program convert_rds_to_json.r.

Co-authored-by: bms63 <[email protected]>

Copilot · 2025-12-17T22:27:18Z

@copilot there is a program called convert_rds_to_json.r that used to take the rds files created in the ad* programs. Can you make this program into a function that can be called in each the adam pro...

Done. Created write_dataset_json_with_metadata() function in pilot5-helper-fcns.r that encapsulates the logic from convert_rds_to_json.r (commit b3e2e43).

The function:

Takes dataset dataframe, metacore spec object, dataset name, and output path
Handles all column metadata preparation using the metacore spec
Writes properly formatted datasetjson files

All 5 ADaM programs now use this function, eliminating 235 lines of duplicated code.
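
To make the description above more concrete, here is a rough base R sketch of the column metadata preparation step. It does not call metacore or datasetjson: var_spec is a hypothetical plain data frame standing in for the variable-level part of a metacore spec, build_column_metadata() is an invented name, and the output columns (itemOID, name, label, dataType) follow the Dataset-JSON column attributes as an assumption about what the writer needs. The actual write_dataset_json_with_metadata() in pilot5-helper-fcns.r may differ.

```r
# Hypothetical stand-in for the variable-level metadata held in a metacore spec.
var_spec <- data.frame(
  variable = c("USUBJID", "AGE", "TRT01A"),
  label = c("Unique Subject Identifier", "Age", "Actual Treatment for Period 01"),
  type = c("string", "integer", "string"),
  stringsAsFactors = FALSE
)

# Invented helper: derive per-column metadata for one dataset.
build_column_metadata <- function(df, var_spec, dataset_name) {
  # Fail early if the data contains a column the spec does not describe.
  missing_vars <- setdiff(names(df), var_spec$variable)
  if (length(missing_vars) > 0) {
    stop("Columns missing from spec: ", paste(missing_vars, collapse = ", "))
  }
  # Keep spec rows in the same order as the data frame columns.
  spec_rows <- var_spec[match(names(df), var_spec$variable), ]
  data.frame(
    itemOID = paste("IT", toupper(dataset_name), spec_rows$variable, sep = "."),
    name = spec_rows$variable,
    label = spec_rows$label,
    dataType = spec_rows$type,
    stringsAsFactors = FALSE
  )
}

# Made-up records, used only to exercise the helper.
adsl <- data.frame(
  USUBJID = c("SUBJ-001", "SUBJ-002"),
  AGE = c(63L, 64L),
  TRT01A = c("Placebo", "Placebo"),
  stringsAsFactors = FALSE
)
columns <- build_column_metadata(adsl, var_spec, "adsl")
```

Centralizing this step in one helper means every ADaM program derives its column metadata from the spec in the same way, which is the duplication the PR removes.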

github-actions · 2025-12-18T20:35:44Z

Dataset QC Check Complete!

Date: 2026-01-08
Run by: bms63
Git Ref: refs/pull/136/merge
BASE: Pilot 5 Datasets
COMPARE: Pilot 3 Datasets

✅ Dataset: adadas


No issues were found!

✅ Dataset: adae


No issues were found!

✅ Dataset: adlbc


No issues were found!

✅ Dataset: adsl


No issues were found!

✅ Dataset: adtte


No issues were found!

bms63 · 2025-12-18T20:46:24Z

@RConsortium/pilot5 finally getting round to this - copilot is rescuing me here!!

@elimillera any ideas on the TLF Check fix

@parmsam-pfizer please start reviewing - I'm testing things out, but looks promising.

github-actions · 2026-01-07T19:45:24Z

TLF QC Check Complete!

Date: 2026-01-08
Run by: bms63
Git Ref: refs/pull/136/merge
BASE: Pilot 5 Datasets
COMPARE: Pilot 3 Datasets

✅ Dataset: tlf-demographic-pilot5.out


old[36:40] vs new[36:40]
  "—————————————————————————————————————————————————————————————————————————————————————————————————"
  ""
  "Program: tlf-demographic.r "
- "2026-01-08 02:22:28.583663"
+ "2024-04-18 20:17:25"
  ""

✅ TLF: tlf-kmplot-pilot5.pdf - Score: 75

[1] "1. Title differences: First image has 'KM plot for Time to First Dermatologic Event: Safety population' while second image has 'KM plot for Time to First Dermatologic Event: Safety population' (same content but potentially different font/rendering)\n\n2. Y-axis label: First image shows 'Survival Percent' while second image shows 'Probability of event'\n\n3. X-axis label: First image shows 'Time to First Dermatologic Event (Days)' while second image shows 'Time to First Dermatologic Event (Days)' with slightly different positioning\n\n4. Grid lines: First image has minimal gridlines, second image has prominent gray gridlines across the entire plot area\n\n5. Legend position: First image has legend on the right side, second image has legend labeled 'TRT01A' on the right side\n\n6. Legend styling: Second image has a box around the legend, first image does not\n\n7. Confidence intervals shading: Both have shaded regions but with different transparency/color intensity\n\n8. Reference line: First image has a dotted horizontal line, second image has a dashed horizontal line at the 0.50 mark\n\n9. Data table below: Different formatting and layout - first image has horizontal orientation, second image appears more compact\n\n10. At risk numbers: Different formatting in the tables below the graphs\n\n11. Footer text: First image shows 'Program ID: trt-01-07-19-017-126N02', second image shows 'Program: tfl--km plot.r [2024--04--18 20:17:28]'\n\n12. Overall plot dimensions and aspect ratio appear slightly different\n\n13. Font sizes and styling differ between the two images\n\n14. The curves themselves follow the same general pattern but may have slight rendering differences"

✅ TLF: tlf-efficacy-pilot5.rtf - Score: 98

[1] "After careful examination of both images, I found the following differences:\n\n1. Footer timestamp: The first image shows 'Report generated on: 15Jan2019 08:13' while the second image shows 'Report generated on: 15Oct2019 20:17' - indicating different generation dates and times (January vs October, different times)\n\n2. All other elements appear identical including:\n - Table title 'ANCOVA of Change from Baseline at Week 20'\n - All column headers and structure\n - All numerical data values in the table\n - Treatment groups and sample sizes (N values)\n - Mean, standard deviation, and confidence interval values\n - Footnotes and table structure\n - Overall layout and formatting"

✅ TLF: tlf-primary-pilot5.rtf - Score: 99

[1] "After careful comparison of both images, I found only one difference:\n\n1. In the footnote text at the bottom of the second table (Table 14.3.4.2), there is a subtle text difference. The first image states 'SE=Standard error; CI=Confidence interval; Based on ANCOVA model; DV=Primary end baseline value as a covariate' while the second image states 'SE=Standard error; CI=Confidence interval; Based on ANCOVA model; DV=Primary end baseline value as a covariate'. Upon very close inspection, the word 'DV=Primary' appears to have slightly different spacing or rendering.\n\nAll other elements are identical including:\n- Both tables have the same title 'Protocol: CS3150/98/Page 1 of 2 Population: FASOT'\n- Table 14.3.4.1 content is identical in both images\n- Table 14.3.4.2 content is identical in both images\n- All numerical values match exactly\n- Column headers are identical\n- Sample sizes (N) are the same\n- All statistical values (means, standard errors, confidence intervals) are identical\n- Layout and formatting are the same\n- Both show 'Primary Endpoint Analysis: ASES total change from Baseline at Week 24 - LDCF'"

parmsam-pfizer · 2026-01-07T22:46:26Z

Looks good to me. I was initially concerned about how write_dataset_json_with_metadata() was implemented but it seems fine.

parmsam-pfizer · 2026-01-07T23:03:21Z

Maybe one of us can try going through the new ADRG instructions tomorrow after this PR is merged and the ECTD repo is updated: https://github.com/RConsortium/submissions-pilot5-datasetjson-to-fda

bms63 · 2026-01-07T23:18:41Z

Maybe one of us can try going through the new ADRG instructions tomorrow after this PR is merged and the ECTD repo is updated: https://github.com/RConsortium/submissions-pilot5-datasetjson-to-fda

Yes please!!

robertdevine · 2026-01-08T01:37:26Z

@bms63, @parmsam-pfizer, @RConsortium/pilot5 - picking up a few warnings in the combined report run on posit cloud.
I'll test Win 10/11 run on J&J laptops before this week's meetings.

Linux:
pilot5-cmb-report-manual.pdf

parmsam-pfizer · 2026-01-09T16:19:08Z

I can try running the ADRG instructions once the PR and Github Action are done.

parmsam-pfizer · 2026-01-09T16:19:55Z

Is the PR ready to merge?

bms63 · 2026-01-09T21:53:19Z

@rpodcast just FYI there are no rds files in this bundle...i forgot we take the json files and convert them

so this is good to go!! @parmsam-pfizer !!

Initial plan

50d5f29

Copilot AI assigned Copilot and bms63 Dec 17, 2025

github-project-automation bot added this to Pilot 5: datasetjson Dec 17, 2025

github-project-automation bot moved this to Todo in Pilot 5: datasetjson Dec 17, 2025

Copilot started work on behalf of bms63 December 17, 2025 21:43 View session

Copilot AI and others added 4 commits December 17, 2025 21:48

Update ADaM programs to read from JSON instead of RDS

0c36d2b

Co-authored-by: bms63 <[email protected]>

Update TLF programs to read from JSON instead of RDS

0444faf

Co-authored-by: bms63 <[email protected]>

Update ADRG documentation and deprecate convert_json_to_rds

2b4e8ca

Co-authored-by: bms63 <[email protected]>

Delete obsolete SDTM RDS files

bb8420f

Co-authored-by: bms63 <[email protected]>

Copilot AI changed the title ~~[WIP] Remove conversion of datasetjson to rds~~ Remove RDS conversion for datasetjson inputs per FDA feedback Dec 17, 2025

Copilot finished work on behalf of bms63 December 17, 2025 21:56

Copilot AI requested a review from bms63 December 17, 2025 21:56

Copilot started work on behalf of bms63 December 17, 2025 21:56 View session

Copilot AI and others added 2 commits December 17, 2025 22:01

Make ADaM programs output JSON directly instead of RDS

1bb83e6

Co-authored-by: bms63 <[email protected]>

Update ADaM programs to read from JSON outputs of dependent ADaMs

1812f9d

Co-authored-by: bms63 <[email protected]>

Copilot AI changed the title ~~Remove RDS conversion for datasetjson inputs per FDA feedback~~ Completely remove RDS format usage per FDA feedback Dec 17, 2025

Copilot finished work on behalf of bms63 December 17, 2025 22:05

Copilot started work on behalf of bms63 December 17, 2025 22:21 View session

Extract JSON writing logic into reusable helper function

b3e2e43

Co-authored-by: bms63 <[email protected]>

Copilot finished work on behalf of bms63 December 17, 2025 22:28

bms63 marked this pull request as ready for review December 18, 2025 20:45

docs: rerun of tlfs for cmb report; chore: ignore lintr checks for ll…

b6ec7eb

…m files

bms63 requested review from parmsam-pfizer and rpodcast January 7, 2026 19:47

bms63 added 4 commits January 7, 2026 19:51

chore: ignore use of magrittr pipe

f77acf1

Add pipe consistency linter to lint workflow

e58eed7

Update lint.yml

b450b11

Update lint.yml

1c72f94

bms63 added 2 commits January 8, 2026 02:23

fix: suppress NAs in qmd chunk

87ab1e3

Merge branch 'copilot/purge-rds-usage' of https://github.com/RConsort…

ae5ac03

…ium/submissions-pilot5-datasetjson into copilot/purge-rds-usage

bms63 merged commit d9669eb into main Jan 9, 2026
8 checks passed

bms63 deleted the copilot/purge-rds-usage branch January 9, 2026 21:53

github-project-automation bot moved this from Todo to Done in Pilot 5: datasetjson Jan 9, 2026
