Commit e8df348

rdboyes, drizk1, and kdpsingh authored
basic logging for main verbs (#138)
* basic logging for select, mutate, and transmute
* unit testing for logs
* remove deepdiffs dependency
* adds tests for logs on the rest of the functions
* typo fix
* add mutate numbers to log
* adds join logging, fix cov x
* Fix edge case for logging with grouped data frames.
* :newsize mode logs correct type
* add detail for row_change and col_change
* add brief docs, bump v, up news
* fixes log when grouped mutate, adds fillmissing, dropmissing log support
* fixed fxn call
* fix join log if stmnt, bump cov attempt w 2tests
* add slice log support
* change slice_min_max to not use `@filter` bc of logging msg dupes
* adds unite, sep, sep_rows
* adds logging for nests
* minor docs edits for settings
* exclude log.jl from code coverage for now

---------

Co-authored-by: Daniel Rizk <[email protected]>
Co-authored-by: Karandeep Singh <[email protected]>
1 parent d72c425 commit e8df348

15 files changed: +1052 -628 lines changed

NEWS.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # TidierData.jl updates

+## v0.17.0 - 2025-03-24
+- Adds logging ability to track changes to data frames with `TidierData_set("log", true)`
+- Adds docs describing logging and code printing
+
 ## v0.16.5 - 2025-01-11
 - Bugfix: Corrected bug when using `Module.function()` syntax within expressions, which was previously causing errors due to the module being escaped.

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "TidierData"
 uuid = "fe2206b3-d496-4ee9-a338-6a095c4ece80"
 authors = ["Karandeep Singh"]
-version = "0.16.5"
+version = "0.17.0"

 [deps]
 Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+# TidierData.jl comes with two settings that make it easier to understand the transformations that are being applied to a data frame and to troubleshoot errors. These settings are `log` and `code`. The `log` setting outputs information about the data frame after each transformation, including the number of missing values and the number of unique values in each column. The `code` setting outputs the code that is being executed by the TidierData.jl macros. By default, both settings are set to `false`. This page will review the `log` and `code` settings using the movies dataset.
+#
+# We recommend setting the `log` setting to `true` in general, and especially when you are first learning TidierData.jl. This will help you understand how the data frame is being transformed at each step. The `code` setting is useful for debugging errors in TidierData.jl chains.
+
+using TidierData
+using RDatasets
+
+movies = dataset("ggplot2", "movies");
+
+# ## `log`
+# Logging is set to `false` by default but can be enabled as follows:
+
+TidierData_set("log", true)
+
+# When enabled, each macro called will show information about its transformation of the data. Logging can be especially useful to catch silent bugs (those that do not result in an error).
+#
+# When column values are changed, it will report the number of new missing values, the percentage of missing values, and the number of unique values.
+
+@chain movies begin
+    @filter(Year > 2000)
+    @mutate(Budget_cat = case_when(Budget > 18000 => "high",
+                                   Budget > 2000 => "medium",
+                                   Budget > 100 => "low",
+                                   true => missing))
+    @filter(!ismissing(Budget))
+    @group_by(Year, Budget_cat)
+    @summarize(Avg_Budget = mean(Budget), n = n())
+    @ungroup
+    @arrange(n)
+end
+
+TidierData_set("log", false) # disable logging
+
+# ## `code`
+# Code printing is set to `false` by default. Enabling this setting prints the underlying DataFrames.jl code created by TidierData.jl macros. It can be useful for debugging, especially for users who understand DataFrames.jl syntax, or for filing bug reports.
+
+TidierData_set("code", true) # enable macro code output
+
+@chain movies begin
+    @select(Title, Year, Budget)
+    @slice_sample(n = 10)
+end
+
+TidierData_set("code", false) # disable macro code output
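
The `log` section of the new settings page says that a changed column is reported with its missing-value count, missing-value percentage, and number of unique values. The sketch below shows one way such per-column figures could be computed in plain Julia. It is only an illustration: `summarize_column` and `budget_category` are hypothetical helpers that do not appear in this commit, the sample budgets are made up, and excluding `missing` from the unique count is an assumption rather than confirmed TidierData.jl behavior.

```julia
# Hypothetical helper for illustration only; not part of TidierData.jl.
# Returns the missing count, missing percentage, and number of unique
# non-missing values for one column.
function summarize_column(col::AbstractVector)
    n_missing = count(ismissing, col)
    pct_missing = isempty(col) ? 0.0 : 100 * n_missing / length(col)
    # Assumption: `missing` is not counted as a unique value.
    n_unique = length(unique(skipmissing(col)))
    return (n_missing = n_missing, pct_missing = pct_missing, n_unique = n_unique)
end

# Hypothetical stand-in for the thresholds used in the docs example.
function budget_category(budget)
    ismissing(budget) && return missing
    budget > 18000 && return "high"
    budget > 2000 && return "medium"
    budget > 100 && return "low"
    return missing
end

# Made-up sample values, not taken from the movies dataset.
budgets = [20000, 5000, 500, 50, missing]
budget_cat = map(budget_category, budgets)

stats = summarize_column(budget_cat)
println(stats)  # n_missing = 2, pct_missing = 40.0, n_unique = 3
```

Here the budget of 50 falls through every threshold and becomes `missing`, which is the kind of silent change that a per-column missing count makes visible.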

docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -117,6 +117,7 @@ plugins:
 nav:
   - "Home": "index.md"
   - "Movies dataset" : "examples/generated/UserGuide/dataset_movies.md"
+  - "Setting Options" : "examples/generated/UserGuide/settings.md"
   - "@select" : "examples/generated/UserGuide/select.md"
   - "@rename" : "examples/generated/UserGuide/rename.md"
   - "@mutate" : "examples/generated/UserGuide/mutate_transmute.md"
