Commit 74220c6

fix segments
1 parent d0c1550 commit 74220c6

1 file changed: +21 -10 lines changed

md-docs/user_guide/segment.md

Lines changed: 21 additions & 10 deletions
@@ -1,11 +1,16 @@
 
 # Segment
 
-A Segment is a subset of the population, created according to a specific set of rules. A [Task] can include several Segments, each defined by its own rule and monitored in parallel alongside the whole population. The objective of a Segment is to allow the analysis of specific groups of data, whose variations might go unnoticed if only the whole population is monitored.
+A Segment is a subset of the data distribution that identifies a sub-domain inside the data.
+It is defined by a set of rules over data dimensions and metadata.
+A [Task] can include several Segments and there are no constraints on how they are specified.
+Indeed, two Segments can overlap in the data space.
 
+When Segments are specified for a [Task], monitoring is performed both on the whole data, called _all population_, and for each Segment.
+The objective of a Segment is to allow the analysis of specific groups of data, whose variations might go unnoticed if only the whole population is monitored.
 
-Segments, similarly to the [Data schema], must be defined before sending any data to the Platform. They must to be created all at once, as they can't be modified upon creation. Additionally, their definition needs to happen
-after the creation of the Data Schema, as the rules for the Segment are based on the columns defined there.
+Segments, similarly to the [Data schema], must be defined before sending any data to the Platform.
+They must be created all at once, as they can't be modified after creation. Additionally, their definition needs to happen after the creation of the Data Schema, as the rules for the Segment are based on the columns defined there.
 
 
 ## Segment Structure
@@ -21,14 +26,15 @@ Segments can be created both through the Web App and the SDK.
 
 ## Segment Rules
 
-A rule is a condition that a sample must satisfy to be part of a Segment. Each Segment can have multiple roles, which are applied in AND between them.
+A rule is a condition over a single data dimension that a specific sample must match to be considered part of the Segment.
+Each Segment has one or more rules, which are combined with AND.
 A rule is defined by the following fields:
 
 | Field | Description |
 | --------- | ------- |
 | Column name | The name of the column in the Data Schema that the rule is applied to. A rule can be applied only on columns of role INPUT, TARGET and METADATA|
 | Operator | The operator defining the rule. It can be either `IN` or `OUT` |
-| Values | This field can have 2 possible meaning, according to the data type of the column of the rule: <br><ul><li>The data type is float: values is a list of intervals that defines the ranges over which the operator is applied. The values defining the interval are always included in the interval.</li><li>The data type is categorical or string: values is a list containing the exact values over which the operator is applied</li></ul> |
+| Values | This field can have two possible meanings, according to the data type of the column specified in the rule: <br><ul><li>The data type is float: Values is a series of ranges [a, b] that define the numeric intervals over which the operator is applied. Each range is closed, meaning that its endpoints are always included. When the operator is `IN`, the ranges are combined with OR, whereas when the operator is `OUT` they are combined with AND.</li><li>The data type is categorical or string: Values is a list whose elements must match the content of the column. When the operator is `IN`, the column value must be one of the specified elements, while when the operator is `OUT` it must not be one of them.</li></ul> |
 
 ## Examples
 
@@ -76,7 +82,7 @@ This segment would include the samples with `Sample ID` equal to `id_0`, `id_1`
 )
 ```
 
-- A Segment that includes all samples where the value of the column `X_0` is between greater or equal than 13 and the value of the column `X_1` is strictly less than 24:
+- A Segment that includes all samples where the value of the column `X_0` is greater than or equal to 13 and the value of the column `X_1` is strictly less than 24:
 
 | Field | Value |
 | --------- | ------- |
@@ -103,7 +109,7 @@ This segment would include the sample with `Sample ID` equal to `id_3`.
         NumericSegmentRule(
             column_name='X_1',
             operator=SegmentOperator.IN,
-            values=[SegmentRuleNumericRange(end_value=22)]
+            values=[SegmentRuleNumericRange(end_value=23)]
         )
     ]
 )
@@ -137,7 +143,7 @@ This segment would include the samples with `Sample ID` equal to `id_0` and `id_
             operator=SegmentOperator.IN,
             values=[SegmentRuleNumericRange(end_value=10), SegmentRuleNumericRange(start_value=14)]
         ),
-        NumericSegmentRule(
+        CategoricalSegmentRule(
             column_name='Metadata_1',
             operator=SegmentOperator.IN,
             values=['A1', 'A3']
@@ -168,11 +174,16 @@ This segment would include the samples with `Sample ID` equal to `id_2` and `id_
 Segment(name=f'Segment 3',
     rules=[
         NumericSegmentRule(
+            column_name='X_1',
+            operator=SegmentOperator.OUT,
+            values=[SegmentRuleNumericRange(end_value=21), SegmentRuleNumericRange(start_value=23)]
+        ),
+        CategoricalSegmentRule(
             column_name='y_0',
             operator=SegmentOperator.IN,
-            values=[SegmentRuleNumericRange(end_value=10), SegmentRuleNumericRange(start_value=14)]
+            values=['class_0']
         ),
-        NumericSegmentRule(
+        CategoricalSegmentRule(
             column_name='Metadata_1',
             operator=SegmentOperator.IN,
             values=['A1']
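
Reviewer note: the rule semantics described in the updated `Values` row (closed ranges, `IN` ranges combined with OR, `OUT` ranges combined with AND, and all rules of a Segment combined with AND) can be sketched in plain Python. This is only an illustration of the documented behaviour, not the SDK's implementation: `NumericRange`, `NumericRule`, `CategoricalRule` and `in_segment` are hypothetical names, and the sample values are made up. It assumes that a missing range bound means "unbounded on that side", which is how the `end_value`-only and `start_value`-only ranges in the diff appear to be used.

```python
import math
from dataclasses import dataclass
from typing import Any, Mapping, Sequence, Union


@dataclass(frozen=True)
class NumericRange:
    """Closed interval [start, end]; a missing bound is assumed unbounded."""
    start: float = -math.inf
    end: float = math.inf

    def contains(self, x: float) -> bool:
        return self.start <= x <= self.end


@dataclass(frozen=True)
class NumericRule:
    column: str
    operator: str  # "IN" or "OUT"
    ranges: Sequence[NumericRange]

    def matches(self, sample: Mapping[str, Any]) -> bool:
        inside_any = any(r.contains(sample[self.column]) for r in self.ranges)
        # IN: inside at least one range (OR).
        # OUT: outside every range (AND), i.e. not inside any of them.
        return inside_any if self.operator == "IN" else not inside_any


@dataclass(frozen=True)
class CategoricalRule:
    column: str
    operator: str  # "IN" or "OUT"
    values: Sequence[str]

    def matches(self, sample: Mapping[str, Any]) -> bool:
        listed = sample[self.column] in self.values
        return listed if self.operator == "IN" else not listed


Rule = Union[NumericRule, CategoricalRule]


def in_segment(sample: Mapping[str, Any], rules: Sequence[Rule]) -> bool:
    """A sample belongs to a Segment only if it matches every rule (AND)."""
    return all(rule.matches(sample) for rule in rules)


# Mirrors the rules of 'Segment 3' from the last hunk of the diff.
segment_3 = [
    NumericRule("X_1", "OUT", [NumericRange(end=21), NumericRange(start=23)]),
    CategoricalRule("y_0", "IN", ["class_0"]),
    CategoricalRule("Metadata_1", "IN", ["A1"]),
]

# Hypothetical sample: X_1 = 22 lies outside both closed ranges.
sample = {"X_1": 22, "y_0": "class_0", "Metadata_1": "A1"}
print(in_segment(sample, segment_3))  # True
```

Under this reading, the `OUT` rule on `X_1` keeps only values strictly between 21 and 23, because both endpoints belong to the excluded closed ranges.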
