Conversation
Unit Tests Summary 1 files 37 suites 26m 51s ⏱️ Results for commit 2594752. ♻️ This comment has been updated with latest results. |
Unit Test Performance Difference
Additional test case details
Results for commit 620595b ♻️ This comment has been updated with latest results. |
gogonzo
left a comment
There was a problem hiding this comment.
This branch needs a proper code reducing.
R/tm_g_scatterplotmatrix.R
Outdated
| vi <- col_names[i] | ||
| vj <- col_names[j] |
There was a problem hiding this comment.
what about more descriptive object names?
| vi <- col_names[i] | |
| vj <- col_names[j] | |
| colname_i <- col_names[i] | |
| colname_j <- col_names[j] |
There was a problem hiding this comment.
Currently I changed to
xi <- ANL[[col_names[i]]]
xj <- ANL[[col_names[j]]]so that you can understand it's an x-th variable in a matrix.
There was a problem hiding this comment.
Plots generated with lots of duplicate code. I think there are several ways to simplify them:
- Reuse a base plot and then update it for each specific purpose.
- Apply the theme and general characteristics after the plots are wrapped by patchwork ( use the right operator to apply to all plots).
Sorry I didn't assign myself and we have duplicated efforts but I agree with the previous review
I ran the example and text is too small to read:
Code Coverage SummaryDiff against mainResults for commit: 2594752 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
|
Hey @gogonzo and @llrs-roche - thanks for the review. I mainly aimed to get a quick prototype of the output, so we could compare different approaches. Will work on improving the PR and making it more polished and production-ready. |
| checkmate::assert_multi_class(post_output, c("shiny.tag", "shiny.tag.list", "html"), null.ok = TRUE) | ||
| checkmate::assert_choice(default_plot_type, seq.int(1L, length(plot_choices))) | ||
| checkmate::assert_string(default_outlier_label) | ||
| assert_decorators(decorators) |
There was a problem hiding this comment.
this is repeated in line 245/246. so I deleted duplication
| ~ if (length(.) <= 1) "Please select at least 2 columns.", | ||
| ~ if (length(.) > 5) "Please select at most 5 columns." |
There was a problem hiding this comment.
I added this limitation so we don't create too much plots. It already takes some time to create 25 plots when we have 5 variables and Add correlation is not checked.
| bslib::popover( | ||
| icon("circle-info"), | ||
| title = "NA handling options", | ||
| tags$dl( | ||
| tags$dt("Everything"), | ||
| tags$dd("Return NA for a pair if either variable contains any missing value."), | ||
| tags$dt("All observations"), | ||
| tags$dd("Assume no NAs are present; throws an error if any are found."), | ||
| tags$dt("Complete observations"), | ||
| tags$dd("Listwise deletion – only rows with no NAs across all selected variables."), | ||
| tags$dt("NA or complete"), | ||
| tags$dd("Like complete observations but returns NA instead of an error when no complete cases exist."), | ||
| tags$dt("Pairwise complete"), | ||
| tags$dd("Use all rows where both variables in a pair are non-missing (maximises available data).") | ||
| ), | ||
| options = list(trigger = "hover focus") |
| title = "NA handling", | ||
| tags$p(tags$b("Checked:"), "use pairwise complete observations (each pair correlated over rows where both values are present)."), | ||
| tags$p(tags$b("Unchecked:"), "reveals a dropdown with all five", tags$code("stats::cor()"), "use= options."), | ||
| options = list(trigger = "hover focus") | ||
| ) |
| datasets = data, | ||
| selector_list = selector_list | ||
| ) | ||
| ) |> debounce(500) |
There was a problem hiding this comment.
Added debounce, so that when 3 variables out of 5 are selected to be plotted, and when we want to quickly add the 4th and the 5th variable, we dont get a plot for 4 variables and then for 5 variables, but we get the plot for 5 variables
R/tm_g_scatterplotmatrix.R
Outdated
| add_cor_value = add_cor, | ||
| cor_method_value = cor_method, | ||
| cor_use_value = cor_use, | ||
| alpha_value = alpha_val, | ||
| varnames_value = varnames |
There was a problem hiding this comment.
we need to pass reactive values so that you see add_cor <- TRUE instead of code like i < j && TRUE
| if (is.numeric(xi)) { | ||
| p <- p + ggplot2::geom_density(fill = "steelblue", alpha = alpha) | ||
| } else { | ||
| p <- p + ggplot2::geom_bar(fill = "steelblue", alpha = alpha) | ||
| } |
There was a problem hiding this comment.
This can be even shorter if we dont want to pass fill = "steelblue" or opactiy (alpha)
I was thinking if something like this is possible
custom_geom <- ifelse(is.numeric(x1), geom_density, geom_bar)
p + custom_geom(list_of_parameters)
or
p + do.call(custom_geom, list_of_parameters)
| | `tm_g_response` | plot (ggplot) | | ||
| | `tm_g_scatterplot` | plot (ggplot) | | ||
| | `tm_g_scatterplotmatrix` | plot (trellis) | | ||
| | `tm_g_scatterplotmatrix` | plot (patchwork) | |
There was a problem hiding this comment.
| | `tm_g_scatterplotmatrix` | plot (patchwork) | | |
| | `tm_g_scatterplotmatrix` | plot (patchwork/ggplot) | |
R/tm_g_scatterplotmatrix.R
Outdated
| p <- ggplot2::ggplot(data.frame(x = xj, y = xi)) + ggplot2::labs(x = NULL, y = NULL) | ||
| n_num <- is.numeric(xi) + is.numeric(xj) | ||
| if (n_num == 2) p <- p + ggplot2::aes(x = x, y = y) + ggplot2::geom_point(color = "steelblue", alpha = alpha) | ||
| if (n_num == 1) p <- p + ggplot2::aes(x = x, y = y) + ggplot2::geom_boxplot(fill = "steelblue", alpha = alpha) | ||
| if (n_num == 0) p <- p + ggplot2::aes(x = x, fill = y) + ggplot2::geom_bar(position = "dodge", alpha = alpha) + ggplot2::labs(fill = NULL) | ||
| p |
R/tm_g_scatterplotmatrix.R
Outdated
| cv <- if (!is.null(cor_mat) && is.numeric(xi) && is.numeric(xj)) cor_mat[col_names[i], col_names[j]] else NA_real_ | ||
| col <- if (is.na(cv)) "grey50" else if (cv > 0) "firebrick" else "steelblue" | ||
| ggplot2::ggplot() + | ||
| ggplot2::annotate("text", x = 0.5, y = 0.5, fontface = "bold", color = col, | ||
| label = if (!is.na(cv)) sprintf("%.2f", cv) else if (is.numeric(xi) && is.numeric(xj)) "NA" else "-", | ||
| size = if (!is.na(cv)) max(3, abs(cv) * 8 + 3) else if (is.numeric(xi) && is.numeric(xj)) 3 else 4) + | ||
| ggplot2::xlim(0, 1) + ggplot2::ylim(0, 1) + ggplot2::theme_void() |
Signed-off-by: Marcin <133694481+m7pr@users.noreply.github.com>
Signed-off-by: Marcin <133694481+m7pr@users.noreply.github.com>
| #' @note When *Add Correlation* is enabled, a simple **Omit NAs** checkbox | ||
| #' controls NA handling (checked = `"pairwise.complete.obs"`, matching the | ||
| #' historical default). Unchecking it reveals a dropdown with all five | ||
| #' `stats::cor()` `use` options for advanced control. |
There was a problem hiding this comment.
I think we can remove. Not really that important
|
Hey @llrs-roche and @gogonzo thanks for your time spent to review this. I revmoed labels from sub-plots so that we only see plots and legends dont take the whole space. I added also some tooltips to explain the Add Correlation / NA omit options - how those are handled in I tested this branch with below app devtools::load_all("teal")
devtools::load_all("teal.modules.general")
# ── Data ────────────────────────────────────────────────────────────────────
# 6 variables (4 numeric, 2 factor) covering every panel type:
# numeric × numeric → scatter (lower triangle)
# factor × numeric → boxplot
# numeric × factor → boxplot
# factor × factor → grouped bar
# diagonal numeric → density
# diagonal factor → bar
#
# Only 5 are pre-selected to demonstrate the max-5 validation limit —
# the 6th variable ("height") is available in the dropdown to swap in.
set.seed(42)
data <- teal_data()
data <- within(data, {
demo <- data.frame(
age = rnorm(200, mean = 50, sd = 10),
weight = rnorm(200, mean = 75, sd = 15),
bmi = rnorm(200, mean = 25, sd = 4),
height = rnorm(200, mean = 170, sd = 10),
sex = factor(sample(c("M", "F"), 200, replace = TRUE)),
region = factor(sample(c("North", "South", "East", "West"), 200, replace = TRUE)),
stringsAsFactors = FALSE
)
})
# ── App ─────────────────────────────────────────────────────────────────────
app <- init(
data = data,
modules = modules(
tm_g_scatterplotmatrix(
label = "Scatterplot Matrix",
variables = teal.transform::data_extract_spec(
dataname = "demo",
select = teal.transform::select_spec(
label = "Variables:",
choices = teal.transform::variable_choices("demo"),
selected = c("age", "weight", "bmi", "sex", "region"),
multiple = TRUE,
ordered = TRUE,
fixed = FALSE
)
)
)
)
)
shinyApp(app$ui, app$server)
|
|
Hey @gogonzo last touches on the labels. I think we should just show the labels at the outside of the matrix for the categorical variables (no labels where red lines, labels only where the blue line is). This saves a lot of space. Also the labels for the categorical variables are trimmed, so they don't take the whole space of the plot
I would appreciate your opinion in general on the PR. I think it's ready in it's current state. The code being long in show r code could be tackled as a separate extension to teal via styler pacakge. |







Part of
Satisfies last comment from 2025
Alternative to
This PR rewrite
scatterplotmatrixso it usespatchworkCoe do generate plots