Commit 45eeb76

Revise docstring content on segmentation
1 parent d17a7ad commit 45eeb76

1 file changed

+144
-80
lines changed

pointblank/validate.py

Lines changed: 144 additions & 80 deletions
@@ -3427,12 +3427,16 @@ def col_vals_gt(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -3715,12 +3719,16 @@ def col_vals_lt(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -4002,12 +4010,16 @@ def col_vals_eq(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -4288,12 +4300,16 @@ def col_vals_ne(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -4572,12 +4588,16 @@ def col_vals_ge(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -4860,12 +4880,16 @@ def col_vals_le(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -5162,12 +5186,16 @@ def col_vals_between(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -5478,12 +5506,16 @@ def col_vals_outside(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -5750,12 +5782,16 @@ def col_vals_in_set(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -6003,12 +6039,16 @@ def col_vals_not_in_set(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -6247,12 +6287,16 @@ def col_vals_null(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -6486,12 +6530,16 @@ def col_vals_not_null(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -6733,12 +6781,16 @@ def col_vals_regex(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -6976,12 +7028,16 @@ def col_vals_expr(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -7367,12 +7423,16 @@ def rows_distinct(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
@@ -7604,12 +7664,16 @@ def rows_complete(
 (i.e., no validation steps will be created for them).
 
 A list with a combination of column names and tuples can be provided as well. This allows
-for more complex segmentation scenarios. The following inputs are all valid:
+for more complex segmentation scenarios. The following inputs are both valid:
 
-- `segments=["region", ("date", ["2023-01-01", "2023-01-02"])]`: segments on unique values
-in the `"region"` column and specific dates in the `"date"` column
-- `segments=["region", "date"]`: segments on unique values in the `"region"` and `"date"`
-columns
+```
+# Segments from all unique values in the `region` column
+# and specific dates in the `date` column
+segments=["region", ("date", ["2023-01-01", "2023-01-02"])]
+
+# Segments from all unique values in the `region` and `date` columns
+segments=["region", "date"]
+```
 
 The segmentation is performed during interrogation, and the resulting validation steps will
 be numbered sequentially. Each segment will have its own validation step, and the results
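
Note on the revised docstring text: the two `segments=` forms can be read as expanding into one (column, value) pair per segment. The sketch below illustrates that reading in plain Python. It is not Pointblank's implementation: `expand_segments` and the sample `data` are hypothetical names introduced here, the table is assumed to be a simple dict of column lists, and the ordering of the resulting segments is an assumption.

```python
from typing import Any


def expand_segments(data: dict[str, list[Any]], segments: list) -> list[tuple[str, Any]]:
    """Hypothetical helper: expand a `segments` spec into (column, value) pairs."""
    pairs: list[tuple[str, Any]] = []
    for spec in segments:
        if isinstance(spec, str):
            # A bare column name: one segment per unique value,
            # kept in order of first appearance (an assumption).
            unique_values = list(dict.fromkeys(data[spec]))
            pairs.extend((spec, value) for value in unique_values)
        else:
            # A (column, [values]) tuple: only the listed values become
            # segments; other values in that column are skipped.
            column, values = spec
            pairs.extend((column, value) for value in values)
    return pairs


# Made-up sample table for illustration only.
data = {
    "region": ["east", "west", "east"],
    "date": ["2023-01-01", "2023-01-02", "2023-01-03"],
}

mixed = expand_segments(data, ["region", ("date", ["2023-01-01", "2023-01-02"])])
both_columns = expand_segments(data, ["region", "date"])
```

With this sample data, `mixed` holds four pairs (two regions, two listed dates) and `both_columns` holds five (two regions, three dates), matching the docstring's statement that unlisted values get no validation step.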
