Commit e42671e

committed
doc update
1 parent 779de2a commit e42671e

2 files changed: +29 -21 lines changed

NEWS.md

Lines changed: 9 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -5,17 +5,20 @@
55
when `NA`s are detected.
66
- In `tb()`
77
+ Fix for broken proportions in freq tables
8-
+ New parameters `fct.to.chr` and `recalculate` for freq tables
8+
+ New parameters `fct.to.chr` and `recalculate` for `freq()` tables
99
+ Parameter `na.rm` deprecated
1010
- In `dfSummary()`:
1111
+ New parameter `class` allows switching off class reporting in *Variable*
1212
column.
13-
- In `freq()` & `ctable()`:
14-
+ New parameter `na.val` allows specifying a value (factor level) that
13+
- In `freq()`, `ctable()` and `dfSummary()`:
14+
+ New parameter `na.val` allows specifying a value / factor level that
1515
is to be considered `NA`. In turn, the value "(Missing)" is no longer
16-
considered missing by default; using `na.val = "(Missing)"`
17-
will yield the same results.
16+
considered missing by default (using `na.val = "(Missing)"`
17+
will yield the same results).
1818
+ Fix for weights not being applied correctly in by-group processing.
19+
+ **Labelled vectors** ("labelled" / "haven_labelled") are treated like
20+
factors in `freq()`, and in `dfSummary()` when all values have a label.
21+
Future versions will extend support to `ctable()`.
1922
- In `descr()`:
2023
+ "n" (total number of observations, also displayed in heading) added to
2124
available statistics.
@@ -25,8 +28,7 @@
2528
excludes *Pct. Valid* from, *common* statistics.
2629
+ Fix for *N* in header showing 1st group's size rather than global size.
2730
+ Fix for weights not being applied correctly in by-group processing.
28-
- Optimized metadata extraction
29-
- Improved support for dplyr::group_by()
31+
- `define_keywords()` now uses RStudio's API for dialogs.
3032
- `llabel()` wrapper added for `label(x, all = TRUE)`
3133

3234
# summarytools 1.0.2 (2022-07-10)
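
Illustration (not part of this commit): the `na.val` entry above describes treating one specific value or factor level as missing. The sketch below mimics that idea in base R only, so it makes no claim about how **summarytools** implements it. The vector `answers` and its values are invented for the example.

```r
# Hypothetical survey answers where "(Missing)" is an ordinary string
answers <- c("yes", "no", "(Missing)", "yes", "(Missing)")

# Without recoding, "(Missing)" is counted as a regular category
counts_raw <- table(answers)

# Recode the special value to NA (the base-R analogue of na.val = "(Missing)")
recoded <- replace(answers, answers == "(Missing)", NA)

# Counts with the NA category shown explicitly
counts_na <- table(recoded, useNA = "ifany")

# Percentages computed over valid (non-missing) answers only
pct_valid <- prop.table(table(recoded)) * 100

print(counts_raw)
print(counts_na)
print(round(pct_valid, 1))
```

According to the NEWS entry, the equivalent in summarytools 1.1.0 is to pass `na.val = "(Missing)"` to `freq()`, `ctable()` or `dfSummary()`.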

vignettes/introduction.Rmd

Lines changed: 20 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -74,7 +74,7 @@ txt <- data.frame(
7474
)
7575
7676
kable(txt, format = "html", escape = FALSE, align = c('l', 'l')) |>
77-
kable_paper(full_width = FALSE, position = "left") |>
77+
kable_classic(full_width = FALSE, position = "left") |>
7878
column_spec(1, extra_css = "vertical-align:top") |>
7979
column_spec(2, extra_css = "vertical-align:top")
8080
```
@@ -115,11 +115,9 @@ Results can be
115115
weights
116116
- **Multilingual**:
117117
+ Built-in translations exist for French, Portuguese, Spanish, Russian, and
118-
Turkish. Users can easily add custom translations or modify existing ones
119-
as needed
118+
Turkish. Users can easily add custom translations or modify existing
119+
languages at will
120120
- **Flexible and extensible**:
121-
+ The built-in features used to support alternate languages provide a way to
122-
modify a great number of terms used in outputs (headings and tables)
123121
+ **Pipe operators** from
124122
[magrittr](https://cran.r-project.org/package=magrittr) (`%>%`, `%$%`) and
125123
[pipeR](https://cran.r-project.org/package=pipeR) (`%>>%`) are fully
@@ -130,6 +128,12 @@ Results can be
130128
+ **By-group processing** is easily achieved using the package's `stby()`
131129
function which is a slightly modified version of `base::by()`, but
132130
`dplyr::group_by()` is also supported
131+
+ Version 1.1 introduced support for **labelled vectors** (classes *labelled*
132+
/ *haven_labelled*), which are treated as factors in `freq()`, and
133+
in `dfSummary()` when all values are labelled. A future release will have
134+
`ctable()` behave similarly.
135+
+ Parameter `na.val` allows treating a special value as `NA` in `freq()`,
136+
`ctable()` and `dfSummary()` (feature introduced in version 1.1.0).
133137
+ [**Pander options**](http://rapporter.github.io/pander/) can be used to
134138
customize or enhance plain text and markdown tables
135139
+ Base R's `format()` arguments are also supported by **summarytools**'
@@ -567,8 +571,10 @@ dfs$Variable <- NULL # This deletes the Variable column
567571
# 6. Grouped Statistics: stby()
568572

569573
To produce optimal results, **summarytools** has its own version of
570-
the base `by()` function. It's called `stby()`, and we use it exactly as we
571-
would `by()`:
574+
the base `by()` function. It's called `stby()`, and we use it as we
575+
would `by()`, with a notable difference: set the `useNA` parameter to `TRUE`
576+
to create an additional group for observations containing `NA`s on the grouping variable(s) (see example in section 6.2).
577+
572578

573579
```{r}
574580
(iris_stats_by_species <- stby(data = iris,
@@ -578,6 +584,7 @@ would `by()`:
578584
transpose = TRUE))
579585
```
580586

587+
581588
## 6.1 Special Case of descr() with stby()
582589

583590
When used to produce split-group statistics for a single variable, `stby()`
@@ -589,7 +596,8 @@ with(tobacco,
589596
stby(data = BMI,
590597
INDICES = age.gr,
591598
FUN = descr,
592-
stats = c("mean", "sd", "min", "med", "max"))
599+
stats = c("mean", "sd", "min", "med", "max"),
600+
useNA = TRUE)
593601
)
594602
```
595603

@@ -623,10 +631,8 @@ with(tobacco,
623631

624632
To create grouped statistics with `freq()`, `descr()` or `dfSummary()`, it is
625633
possible to use **dplyr**'s `group_by()` as an alternative to `stby()`.
626-
Syntactic differences aside, one key distinction is that `group_by()` considers
627-
`NA` values on the grouping variable(s) as a valid category, albeit with a
628-
warning suggesting the use of `forcats::fct_na_value_to_level` to make
629-
`NA`'s explicit in factors. Following this advice, we get:
634+
Using `forcats::fct_na_value_to_level` to make `NA`'s explicit in factors is
635+
recommended:
630636

631637
```{r, eval=FALSE}
632638
library(dplyr)
@@ -1263,8 +1269,8 @@ The package comes with no guarantees. It is a work in progress and
12631269
feedback is welcome. Please open an [issue on GitHub](https://github.com/dcomtois/summarytools/issues) if you find a
12641270
bug or wish to submit a feature request.
12651271

1266-
**summarytools** is the result of **many** hours of work. If you find the
1267-
package brings value to your work, please take a moment to make a small
1272+
**summarytools** is the result of many hours of work. If it
1273+
brings value to your work, please consider making a small
12681274
donation using this [Paypal link](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=HMN3QJR7UMT7S&item_name=Help+scientists,+data+scientists+and+analysts+around+the+globe&currency_code=CAD&source=url).
12691275

12701276
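
Illustration (not part of this commit): the revised section 6 text says that `useNA = TRUE` makes `stby()` create an extra group for observations whose grouping value is `NA`. The base-R sketch below shows the behaviour this addresses: `base::by()` drops such rows unless `NA` is first turned into an explicit factor level. It uses only base R and a made-up data frame `df`, and it does not reproduce the internals of `stby()`.

```r
# Toy data: the last observation has a missing grouping value
df <- data.frame(
  grp = c("a", "a", "b", NA),
  x   = c(1, 2, 3, 10)
)

# Helper applied to each group's subset of rows
group_mean <- function(d) mean(d$x)

# Default behaviour: rows with NA in the grouping variable are dropped
res_default <- by(df, df$grp, group_mean)

# addNA() turns NA into a factor level, so those rows form their own group
res_with_na <- by(df, addNA(factor(df$grp)), group_mean)

print(res_default)
print(res_with_na)
```

The second result contains three groups instead of two, which is the effect the vignette text attributes to `useNA = TRUE` in `stby()`.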
