-
Notifications
You must be signed in to change notification settings - Fork 24
Expand file tree
/
Copy path_quarto.yml
More file actions
227 lines (221 loc) · 8.31 KB
/
_quarto.yml
File metadata and controls
227 lines (221 loc) · 8.31 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
project:
type: website
post-render: scripts/post-render.py
format:
html:
theme: flatly
css:
- styles.css
- reference/_styles-quartodoc.css
toc: true
grid:
sidebar-width: 270px
body-width: 950px
margin-width: 225px
gutter-width: 1rem
filters:
- interlinks
interlinks:
fast: true
sources:
numpy:
url: https://numpy.org/doc/stable/
python:
url: https://docs.python.org/3/
website:
title: Pointblank
google-analytics: "G-XSFKYZM9GW"
search:
show-item-context: true
page-navigation: true
bread-crumbs: false
favicon: assets/fav-logo.png
site-url: https://posit-dev.github.io/pointblank/
description: "Find out if your data is what you think it is"
navbar:
left:
- text: User Guide
file: user-guide/index.qmd
- text: Examples
file: demos/index.qmd
- href: reference/index.qmd
text: API Reference
- href: blog/index.qmd
text: Pointblog
right:
- icon: discord
href: https://discord.com/invite/YH7CybCNCQ
- icon: github
href: https://github.com/posit-dev/pointblank
sidebar:
id: user-guide
contents:
- section: "Getting Started"
contents:
- user-guide/index.qmd
- user-guide/installation.qmd
- section: "Validation Plan"
contents:
- user-guide/validation-overview.qmd
- user-guide/validation-methods.qmd
- user-guide/column-selection-patterns.qmd
- user-guide/preprocessing.qmd
- user-guide/segmentation.qmd
- user-guide/thresholds.qmd
- user-guide/actions.qmd
- user-guide/briefs.qmd
- section: "Advanced Validation"
contents:
- user-guide/expressions.qmd
- user-guide/schema-validation.qmd
- user-guide/assertions.qmd
- user-guide/draft-validation.qmd
- section: "Post Interrogation"
contents:
- user-guide/step-reports.qmd
- user-guide/extracts.qmd
- user-guide/sundering.qmd
- section: "Data Inspection"
contents:
- user-guide/preview.qmd
- user-guide/col-summary-tbl.qmd
- user-guide/missing-vals-tbl.qmd
html-table-processing: none
quartodoc:
package: pointblank
dir: reference
title: API Reference
style: pkgdown
dynamic: true
render_interlinks: true
renderer:
style: markdown
table_style: description-list
sections:
- title: Validate
desc: >
When performing data validation, you'll need the `Validate` class to get the process started.
It's given the target table and you can optionally provide some metadata and/or failure
thresholds (using the `Thresholds` class or through shorthands for this task). The
`Validate` class has numerous methods for defining validation steps and for obtaining
post-interrogation metrics and data.
contents:
- name: Validate
members: []
- name: Thresholds
- name: Actions
members: []
- name: FinalActions
- name: Schema
members: []
- name: DraftValidation
members: []
- title: Validation Steps
desc: >
Validation steps can be thought of as sequential validations on the target data. We call
`Validate`'s validation methods to build up a validation plan: a collection of steps that,
in the aggregate, provides good validation coverage.
contents:
- name: Validate.col_vals_gt
- name: Validate.col_vals_lt
- name: Validate.col_vals_ge
- name: Validate.col_vals_le
- name: Validate.col_vals_eq
- name: Validate.col_vals_ne
- name: Validate.col_vals_between
- name: Validate.col_vals_outside
- name: Validate.col_vals_in_set
- name: Validate.col_vals_not_in_set
- name: Validate.col_vals_null
- name: Validate.col_vals_not_null
- name: Validate.col_vals_regex
- name: Validate.col_vals_expr
- name: Validate.rows_distinct
- name: Validate.rows_complete
- name: Validate.col_exists
- name: Validate.col_schema_match
- name: Validate.row_count_match
- name: Validate.col_count_match
- name: Validate.conjointly
- name: Validate.specially
- title: Column Selection
desc: >
A flexible way to select columns for validation is to use the `col()` function along with
column selection helper functions. A combination of `col()` + `starts_with()`, `matches()`,
etc., allows for the selection of multiple target columns (mapping a validation across many
steps). Furthermore, the `col()` function can be used to declare a comparison column (e.g.,
for the `value=` argument in many `col_vals_*()` methods) when you can't use a fixed value
for comparison.
contents:
- name: col
- name: starts_with
- name: ends_with
- name: contains
- name: matches
- name: everything
- name: first_n
- name: last_n
- name: expr_col
- title: Interrogation and Reporting
desc: >
The validation plan is put into action when `interrogate()` is called. The workflow for
performing a comprehensive validation is then: (1) `Validate()`, (2) adding validation
steps, (3) `interrogate()`. After interrogation of the data, we can view a validation report
table (by printing the object or using `get_tabular_report()`), extract key metrics, or we
can split the data based on the validation results (with `get_sundered_data()`).
contents:
- name: Validate.interrogate
- name: Validate.get_tabular_report
- name: Validate.get_step_report
- name: Validate.get_json_report
- name: Validate.get_sundered_data
- name: Validate.get_data_extracts
- name: Validate.all_passed
- name: Validate.assert_passing
- name: Validate.assert_below_threshold
- name: Validate.above_threshold
- name: Validate.n
- name: Validate.n_passed
- name: Validate.n_failed
- name: Validate.f_passed
- name: Validate.f_failed
- name: Validate.warning
- name: Validate.error
- name: Validate.critical
- title: Inspection and Assistance
desc: >
The *Inspection and Assistance* group contains functions that are helpful for getting to
grips on a new data table. Use the `DataScan` class to get a quick overview of the data,
`preview()` to see the first and last few rows of a table, `col_summary_tbl()` for a
column-level summary of a table, and `missing_vals_tbl()` to see where there are missing
values in a table. Several datasets included in the package can be accessed via the
`load_dataset()` function. On the assistance side, the `assistant()` function can be used to
get help with Pointblank.
contents:
- name: DataScan
- name: preview
- name: col_summary_tbl
- name: missing_vals_tbl
- name: assistant
- name: load_dataset
- title: Utility Functions
desc: >
The *Utility Functions* group contains functions that are useful accessing metadata about
the target data. Use `get_column_count()` or `get_row_count()` to get the number of columns
or rows in a table. The `get_action_metadata()` function is useful when building custom
actions since it returns metadata about the validation step that's triggering the action.
Lastly, the `config()` utility lets us set global configuration parameters.
contents:
- name: get_column_count
- name: get_row_count
- name: get_action_metadata
- name: get_validation_summary
- name: config
- title: Prebuilt Actions
desc: >
The *Prebuilt Actions* group contains a function that can be used to send a Slack
notification when validation steps exceed failure threshold levels or just to provide a
summary of the validation results, including the status, number of steps, passing and
failing steps, table information, and timing details.
contents:
- name: send_slack_notification