- Multi-valued ordinals fully functional: All 33 tests passing in
test-Entity-multivalued-ordinals.R - STF roundtrip: Multi-valued ordinals correctly import/export through STF format
- Validation: New validator
validate_entity_multivalued_ordinal_levels()ensures expanded values match ordinal_levels - Vocabulary fix: Critical bug fixed in
get_hydrated_variable_and_category_metadata()- ordinal variables now get vocabulary fromordinal_levelsmetadata (canonical source) rather than from data
- R/Entity-methods.R:1454-1457: Removed blocking code that prevented ordinal + multi-valued combination
- R/Entity-methods.R:1589-1593:
sync_ordinal_data()now filters out multi-valued ordinals (they stay as character strings, not factors) - R/Entity-methods.R:1256-1293:
set_variable_ordinal_levels()now handles multi-valued by expanding values and validating against levels
- R/entity-validators-data-types.R:211-220:
validate_entity_ordinal_factors()skips multi-valued ordinals - R/entity-validators-data-types.R:305-371: New
validate_entity_multivalued_ordinal_levels()validator - R/validators.R:169-170: Registered new validator in baseline profile
- R/Entity-methods.R:1174-1196: Split categorical/binary and ordinal processing
- Categorical/binary: vocabulary from
as.factor(data) %>% levels()(line 1157) - Ordinal: vocabulary from
ordinal_levelsmetadata (line 1178) - this ensures correct order and includes user-defined levels not in data
- Categorical/binary: vocabulary from
- R/Study-export-VDI.R:344-399: String variables - separate single/multi-valued, use pmap for multi-valued
- R/Study-export-VDI.R:405-462: Number variables - same pattern
- R/Study-export-VDI.R:466-525: Date variables - same pattern
- Key: Filters use
is_multi_valued == 1(it's converted to integer at line 275) - Performance: Single-valued variables processed in batch (critical for gene expression with 1000s of variables)
- Multi-valued expansion: Each variable independently expanded with ID replication using rowwise/unnest pattern
- inst/extdata/toy_example/householdsMultiValuedOrdinals.tsv: Test data with multi-valued ordinals
- tests/testthat/test-Entity-multivalued-ordinals.R: 33 tests, all passing ✅
- tests/testthat/test-Study-export-VDI-multivalued.R: Tests written but need fixture fixes
Problem: Test entities from householdsMultiValuedOrdinals.tsv have ID detection issues
Distances.to.wellgets detected as ID column (all unique values in small dataset)- Tests fail validation before VDI export even runs
Why This Happened:
- Unlucky test data - "Distances to well" ended up with all unique values
- Real-world data wouldn't have this issue
- Existing VDI tests use
make_study()which creates properly structured test data
Solutions for Next Session:
Create better test data or use existing make_study() pattern:
# Instead of entity_from_file, consider:
study <- make_study(name = 'multivalued test')
households <- study %>% get_root_entity() %>%
set_variables_multivalued('SomeVar' = ';')Each test needs proper setup. Pattern that works:
households <- entity_from_file(file_path, name='household') %>%
quiet() %>%
set_variables_multivalued('Construction.materials' = ';') %>%
redetect_columns_as_variables('Distances.to.well') # Fix ID detectionCurrent State of VDI Tests:
- Lines 1-80: test 1 (string) - partially fixed, needs validation
- Lines 83-127: test 2 (number) - partially fixed
- Lines 140-192: test 3 (date) - needs redetect fix
- Lines 194-246: test 4 (integer) - needs redetect fix
- Lines 248-326: test 5 (ordinals) - needs redetect fix (also line ~260)
- Lines 332-407: test 6 (mixed) - fixed syntax, needs validation
-
Fix VDI test fixtures (30 min)
- Option A: Rewrite to use
make_study()pattern - Option B: Add
redetect_columns_as_variables()to remaining tests (lines 140, 194, 248-260)
- Option A: Rewrite to use
-
Run full test suite (10 min)
docker exec --user rstudio study-wrangler-dev R -e "setwd('/study.wrangler'); library(devtools); test()"
- Check for regressions in existing tests
- Verify ordinal tests still pass
- Verify existing VDI tests still pass
-
Test existing multi-valued functionality (5 min)
- Run
test-Entity-multivalued.Rto ensure we didn't break existing multi-valued (non-ordinal) behavior - Especially check line 18-25 which previously tested that ordinals can't be multi-valued
- Run
-
Documentation (10 min)
- Update CLAUDE.md if needed with multi-valued ordinal guidance
- Consider adding example to vignettes
- Character strings with delimiters:
"Low;High;Medium" - NOT converted to factors: Multi-valued ordinals must remain character type
- Validation: Expanded values must all exist in
ordinal_levels
# For each multi-valued variable
data %>%
select(id = {{id_col}}, value = all_of(var)) %>%
rowwise() %>%
mutate(values = list(strsplit(value, delim)[[1]])) %>%
ungroup() %>%
tidyr::unnest(values) %>%
mutate("{id_col_vdi}" := id,
attribute_stable_id = stable_ids[var],
string_value = values, # or number_value, date_value
.keep = "none")R/Entity-methods.R: ~100 lines changed (ordinals, hydrated metadata)R/entity-validators-data-types.R: ~70 lines added (new validator)R/validators.R: 2 lines added (register validator)R/Study-export-VDI.R: ~180 lines changed (multi-valued expansion)- Tests: 2 new files, ~400 lines
- Do you want to use
make_study()pattern for VDI tests, or fix existingentity_from_file()approach? - Should we add multi-valued ordinal examples to vignettes?
- Any specific edge cases to test before considering this complete?