-
-
Notifications
You must be signed in to change notification settings - Fork 3
picks #270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
gogonzo
wants to merge
151
commits into
main
Choose a base branch
from
redesign_extraction@main
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
picks #270
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- multiple variables concatenates values
- tests
testing coverage
This was referenced Oct 30, 2025
Code Coverage SummaryDiff against mainResults for commit: 621603d Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Unit Tests Summary 1 files 28 suites 17s ⏱️ Results for commit 621603d. ♻️ This comment has been updated with latest results. |
Unit Test Performance Difference
Additional test case details
Results for commit 74816b8 ♻️ This comment has been updated with latest results. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes insightsengineering/NEST-roadmap#36
Related issues in the milestone:
See related PRs:
Installation
Table of content (not interactive)
Motivation
Merging relational data
Consider following tables
orders,order_items,products,customersconnected with join keysfollowing sql convention
{child}.{$parent}_id = {$parent+"s"}.id, for exampleorders.customer_id = customers.id.teal.datasetup would look like this:Sample data setup with relational tables and join keys
Imagine now a scenario of a
ggplotwhere one wants to selectx,y,color,facet_rows,facet_colsfrom any variable in any dataset.Example ggplot with dynamic variable selection
In order to create above visualization, datasets need to be merged as
ggplot::aesis related to singledata object. Problem is solvable as
teal.datahas enough information to determine correctmerge call based and selected variables and
join_keys(describing relationships between datasets).Using
dplyronly we need to perform following merge operation given that following variables havebeen selected:
orders.order_dateorder_items.order_items.total_priceproducts.categorycustomers.countryMerge operation using dplyr joins
Now
anlcan produce desired visualizationCreating visualization with merged dataset
Handling ambiguous variables
When merging datasets containing duplicated variable names
dplyr::*_join(suffix = c(".x", ".y"))automaticallyadds a suffix to the columns, so that names are alway unique.
Merging interactively
Developing system which can interactively handle merge is a challenging task not only for a reasons described above
but also due to additional layers which need to control described operation. These layers include:
1. Configure possible selection
picks and choices/selected
We came with an idea of
pickswhich allows app-developer to specifydatasets,variablesandvaluesto beselected by app-user during an app run. Each of them is based on the idea of
choices/selectedwhere app-developerprovides
choicesand what isselectedby default. App-user though changesselectedinteractively.graph TB subgraph AppDeveloper["👨💻 App Developer (Configuration Time)"] Dev[App Developer] end subgraph Variables["variables()"] VAR_Choices["choices"] VAR_Selected["selected"] end subgraph AppUser["👤 App User (Runtime)"] User[App User] UI["selectInput<br/>(UI Component)"] end Dev -->|"sets choices"| VAR_Choices Dev -->|"sets default selected"| VAR_Selected VAR_Choices -->|"displayed in"| UI VAR_Selected -->|"initial value in"| UI UI -->|"presents choices"| User User -->|"changes selection"| VAR_Selected classDef devStyle fill:#e1f5ff,stroke:#0066cc,stroke-width:2px classDef userStyle fill:#fff4e1,stroke:#cc6600,stroke-width:2px classDef choicesStyle fill:#d4edda,stroke:#28a745,stroke-width:2px classDef selectedStyle fill:#fff3cd,stroke:#ffc107,stroke-width:2px class Dev devStyle class User,UI userStyle class VAR_Choices choicesStyle class VAR_Selected selectedStyleNew design bases on an idea that a module can consume its arguments referring to any variable in any dataset. Consider following example, where:
x,yandfacetarguments to create an interactive inputs,x,y,facetTo provide choices and default selection for
x,yandfacetwe propose following api:Proposed API using
picks()for variable selectionWhere each function creates an object which holds the information consumed by the framework.
choicesandselectedcan be either:tidyselectselection_helpers (?tidyselect::language)Relationship between
pickselementsEach
pickselement is evaluated in a sequence starting fromdatasets.selectedin one of them determines possible choices of the next one. For example:If
datasetsis selected to beiris, then following variables'schoiceswill be variables of iris.selectedcan't be something else than achoicesand so on.graph TB subgraph "picks()" subgraph "datasets()" DS_Choices["choices<br/>(available datasets)"] DS_Selected["selected<br/>(chosen dataset)"] end subgraph "variables()" VAR_Choices["choices<br/>(available variables)"] VAR_Selected["selected<br/>(chosen variable)"] end end DS_Choices -->|"user selects from"| DS_Selected DS_Selected -->|"determines"| VAR_Choices VAR_Choices -->|"user selects from"| VAR_Selected DS_Selected -.->|"e.g., if 'iris' selected"| VAR_Choices VAR_Choices -.->|"then choices become<br/>iris columns"| VAR_Selected classDef choicesStyle fill:#d4edda,stroke:#28a745,stroke-width:2px classDef selectedStyle fill:#fff3cd,stroke:#ffc107,stroke-width:2px classDef flowStyle fill:#e8f4f8,stroke:#0066cc,stroke-width:1px,stroke-dasharray: 5 5 class DS_Choices,VAR_Choices choicesStyle class DS_Selected,VAR_Selected selectedStyle style DS_Selected stroke-width:3px style VAR_Choices stroke-width:3pxExample settings
Please read carefully the code and see the description to understand how
pickswork.Strict variables picks
picksbelow will create an input in the module where single variable can be selected fromc("Sepal.Length", "Sepal.Width").multiple = FALSEdisallow user to select more than one choice.Example: Strict variable picks with single selection
Dynamic variables choices
Following
pickswill create an input in the module where user will be able to select any variable from iris (any =everything()) and by default1-st will be selected. Be careful, setting explicitselectedwhenchoicesthrows a warning as it is not certain for example that"Species" %in% everything().Example: Dynamic variable choices with tidyselect
Dynamic variables from multiple datasets
Consider a situation when one wants to select a variable from either
irisormtcars. Instead of forcing app-developer to enumerate all possible choices foririsandmtcars. Following picks will create two related inputs for datasets and for variables. Input for variables will automatically update when dataset selection changes.Example: Multiple datasets with dynamic variables
Dynamic everything
In extreme scenario also lists of datasets could be unknown. Or to avoid writing too much text, one can specify following
picks.Example: Fully dynamic dataset and variable selection
Implementation in
teal_moduleteal_modulewill acceptx,yandfacetand hand-over them to bothuiandserver.Module definition with picks arguments
On the
uipart it is necessary to callpicks_uifor eachpicksobject.UI implementation with picks_ui
In the
server,picksare utilized inpicks_srvwhich can be called per each pick, or for all at once (as in the example below).picks_srvis used only to resolve dynamic choices/selected and to handle interaction between inputs.selectorscontain a list of selected datasets/variables for eachpick. In this exampleselectorsstructure looks like this:To create a merged-dataset using information from app-user selection one needs to call
merge_srv.picks_srvdoesn't do anything else than controlling a selection for a number of reasons:variablesto perform merge. For example, some might be controlled withpicks_ui/srvand have an UI element and some might be fixed and added optionally to themerge_srv(selectors).merge_srvreturns a list with two reactives:data:teal_dataobject with merged dataset andmerge_vars: named list of variables. List is named after selector name, for examplemerge_vars()$facetServer implementation with merge_srv
App example
Complete working app example with relational data and dynamic merging
Select specific variable(s) from a specific dataset
Select specific variable(s) from a selected dataset
Select unknown variable(s) from a selected dataset
filtering by any variable
picksprovides no equivalent tofilter_specfeature. To achieve this, please create ateal_transform_modulewitha filtering mechanism.
Conversion from to data-extract-spec
pickswill completelly replacedata_extract_spec(des) and will be the only tool to select-and-mergein
tealframework. So far des will be supported as soft deprecated.help("as.picks")contains the information how to convert des into picks but in a limited scope.
data_extract_spec(or a list of des) containing onlyselect_specare convertible 1:1 to thepicksfilter_specis not convertible topicksas it variables used in filter can be different than variables selected inselect_spec, thus hierarchical form ofpickscan't handle this case.filter_speccan be converted toteal_transform_moduleand used intransformatorsargument instead and we recommend to do so.teal.transform::teal_transform_filterprovides a simplified way to create suchtransformator.Issues with
data_extract_specAPI of
data_extract_specis bad encapsulation, hard to extend, confusing and easy to break.Bad encapsulation
In
filter_specone can specifychoices = value_choices(dataname, subset = function(data)), this is vulnerablefor multiple failures:
value_choices("dataname")can be completelly different thandata_extract_spec(dataname), guess what happens? ;]subset = function(data)is a function of a dataset, it means that even ifvars = "Species"has been provideed one still have to dolevels(data$Species)instead oflevels(column)As you can see, there are few places where scope of the classes is not well separated which leads to:
data_extract_spec(dataname)andvalue_choices(dataname),vars = "Species",levels(data$Species)Conclusion:
value_choices(data)shouldn't have an argumentdataas it is "given" by thedata_extract_spec. Same applies tovariable_choices(data).value_choices(subset = function(data))should be a function of column which is set infilter_spec(vars)Hard to extend
Recently one user asked to make
data_extract_spec(datanames)delayed, so that it will adjust automatically to the existing datasets when resolved. Problem is API allow to have des for multiple datasets only when one knows their names. It is just done by making a list-of-des. Following code will produce dropdown with "iris" and "mtcars" and app-user will be able to switch between datasets (switching between des)Proposition was that
dataname = dataset_choices(), similar to thefilter_spec(vars = variable_choices()). Let's consider how would it look like:To achive this, package would have to be seriously refactored, to be able to do following:
Let's just use above example and change function names: