Commit 3260d52

committed
Writing a small description of each CS DSL function in the documentation
1 parent b7e9f59 commit 3260d52

1 file changed

+190
-1
lines changed

docs/StardustDocs/topics/ColumnSelectors.md

Lines changed: 190 additions & 1 deletion
@@ -2,7 +2,7 @@
22

33
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Access-->
44

5-
[`DataFrame`](DataFrame.md) provides a DSL for selecting an arbitrary set of columns.
5+
[`DataFrame`](DataFrame.md) provides a DSL for selecting an arbitrary set of columns: the Columns Selection DSL.
66

77
Column selectors are used in many operations:
88

@@ -39,6 +39,195 @@ df.move { name.firstName and name.lastName }.after { city }
3939
</tab>
4040
</tabs>
4141

42+
#### Functions Overview:
43+
44+
##### First (Col), Last (Col), Single (Col)
45+
`first {}`, `firstCol()`, `last {}`, `lastCol()`, `single {}`, `singleCol()`
46+
47+
Returns the first, last, or single column from the top-level, specified [column group](DataColumn.md#columngroup),
48+
or `ColumnSet` that adheres to the optional given condition. If no column adheres to the given condition,
49+
`NoSuchElementException` is thrown.
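
A minimal sketch of the three selectors described above, assuming the `org.jetbrains.kotlinx:dataframe` library is on the classpath. The sample data and the names `df`, `firstMatch`, `lastMatch`, and `singleMatch` are hypothetical and only serve the illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val df = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
)

// First and last top-level column whose name ends with "Name".
val firstMatch = df.select { first { it.name().endsWith("Name") } }
val lastMatch = df.select { last { it.name().endsWith("Name") } }

// `single` expects exactly one matching column and throws otherwise.
val singleMatch = df.select { single { it.name() == "age" } }

fun main() {
    println(firstMatch.columnNames())
    println(lastMatch.columnNames())
    println(singleMatch.columnNames())
}
```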
50+
51+
##### Col
52+
`col(name)`, `col(5)`, `this[5]`
53+
54+
Creates a [ColumnAccessor](DataColumn.md#column-accessors) (or `SingleColumn`) for a column with the given
55+
argument from the top-level or specified [column group](DataColumn.md#columngroup). The argument can be either an
56+
index (`Int`) or a reference to a column (`String`, `ColumnPath`, `KProperty`, or `ColumnAccessor`;
57+
any [AccessApi](apiLevels.md)).
58+
59+
##### Value Col, Frame Col, Col Group
60+
`valueCol(name)`, `valueCol(5)`, `frameCol(name)`, `frameCol(5)`, `colGroup(name)`, `colGroup(5)`
61+
62+
Creates a [ColumnAccessor](DataColumn.md#column-accessors) (or `SingleColumn`) for a
63+
[value column](DataColumn.md#valuecolumn) / [frame column](DataColumn.md#framecolumn) /
64+
[column group](DataColumn.md#columngroup) with the given argument from the top-level or
65+
specified [column group](DataColumn.md#columngroup). The argument can be either an index (`Int`) or a reference
66+
to a column (`String`, `ColumnPath`, `KProperty`, or `ColumnAccessor`; any [AccessApi](apiLevels.md)).
67+
The functions can be both typed and untyped (in case you're supplying a column name, column path, or index).
68+
These functions throw an `IllegalArgumentException` if the column found is not the right kind.
69+
70+
##### Cols
71+
`cols {}`, `cols()`, `cols(colA, colB)`, `cols(1, 5)`, `cols(1..5)`, `[{}]`
72+
73+
Creates a subset of columns (`ColumnSet`) from the top-level, specified [column group](DataColumn.md#columngroup),
74+
or `ColumnSet`.
75+
You can use either a `ColumnFilter`, or any of the `vararg` overloads for any [AccessApi](apiLevels.md).
76+
The function can be both typed and untyped (in case you're supplying a column name, column path, index, or index range).
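
As an illustration of the `cols` overloads listed above, here is a short sketch. It assumes the `org.jetbrains.kotlinx:dataframe` library is available; the sample data and the `by...` names are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val df = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
)

// Select by column names.
val byName = df.select { cols("firstName", "age") }

// Select by individual indices (0-based).
val byIndex = df.select { cols(0, 2) }

// Select by an inclusive index range.
val byRange = df.select { cols(1..2) }

// Select with a filter on the column itself.
val byFilter = df.select { cols { it.name().endsWith("Name") } }

fun main() {
    println(byName.columnNames())
    println(byIndex.columnNames())
    println(byRange.columnNames())
    println(byFilter.columnNames())
}
```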
77+
78+
##### Range of Columns
79+
`colA.."colB"`
80+
81+
Creates a `ColumnSet` containing all columns from `colA` to `colB` (inclusive) from the top-level.
82+
Columns inside [column groups](DataColumn.md#columngroup) are also supported
83+
(as long as they share the same direct parent), as well as any combination of [AccessApi](apiLevels.md).
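
The range operator can be sketched with the String API as follows, assuming the `org.jetbrains.kotlinx:dataframe` library is available. The sample data and the name `nameToAge` are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val df = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
)

// Every top-level column from "firstName" up to and including "age".
val nameToAge = df.select { "firstName".."age" }

fun main() {
    println(nameToAge.columnNames())
}
```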
84+
85+
##### Value Columns, Frame Columns, Column Groups
86+
`valueCols {}`, `valueCols()`, `frameCols {}`, `frameCols()`, `colGroups {}`, `colGroups()`
87+
88+
Creates a subset of columns (`ColumnSet`) from the top-level, specified [column group](DataColumn.md#columngroup),
89+
or `ColumnSet` containing only [value columns](DataColumn.md#valuecolumn) / [frame columns](DataColumn.md#framecolumn) /
90+
[column groups](DataColumn.md#columngroup) that adhere to the optional condition.
91+
92+
##### Cols of Kind
93+
`colsOfKind(Value, Frame) {}`, `colsOfKind(Group, Frame)`
94+
95+
Creates a subset of columns (`ColumnSet`) from the top-level, specified [column group](DataColumn.md#columngroup),
96+
or `ColumnSet` containing only columns of the specified kind(s) that adhere to the optional condition.
97+
98+
##### All (Cols)
99+
`all()`, `allCols()`
100+
101+
Creates a `ColumnSet` containing all columns from the top-level, specified [column group](DataColumn.md#columngroup),
102+
or `ColumnSet`. This is the opposite of `none()` and equivalent to `cols()` without filter.
103+
Note, on [column groups](DataColumn.md#columngroup), `all` is named `allCols` instead to avoid confusion.
104+
105+
##### All (Cols) After, -Before, -From, -Up To
106+
`allAfter(colA)`, `allBefore(colA)`, `allColsFrom(colA)`, `allColsUpTo(colA)`
107+
108+
Creates a `ColumnSet` containing a subset of columns from the top-level,
109+
specified [column group](DataColumn.md#columngroup), or `ColumnSet`.
110+
The subset includes:
111+
- `all(Cols)Before(colA)`: All columns before the specified column, excluding that column.
112+
- `all(Cols)After(colA)`: All columns after the specified column, excluding that column.
113+
- `all(Cols)From(colA)`: All columns from the specified column, including that column.
114+
- `all(Cols)UpTo(colA)`: All columns up to the specified column, including that column.
115+
116+
NOTE: In the Plain DSL and on [column groups](DataColumn.md#columngroup), the `{}` overloads of these functions
117+
take a `ColumnSelector` (relative to the receiver).
118+
On a `ColumnSet`, they take a `ColumnFilter` instead.
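
The difference between the exclusive and inclusive variants listed above can be sketched like this, using the top-level functions with column names. It assumes the `org.jetbrains.kotlinx:dataframe` library is available; the sample data and the variable names are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val df = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
)

// Exclusive variants: the named column itself is left out.
val beforeAge = df.select { allBefore("age") }
val afterLastName = df.select { allAfter("lastName") }

// Inclusive variants: the named column is part of the result.
val fromAge = df.select { allFrom("age") }
val upToLastName = df.select { allUpTo("lastName") }

fun main() {
    println(beforeAge.columnNames())
    println(afterLastName.columnNames())
    println(fromAge.columnNames())
    println(upToLastName.columnNames())
}
```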
119+
120+
##### Cols at any Depth
121+
`colsAtAnyDepth {}`, `colsAtAnyDepth()`
122+
123+
Creates a `ColumnSet` containing all columns from the top-level, specified [column group](DataColumn.md#columngroup),
124+
or `ColumnSet` at any depth if they satisfy the optional given predicate. This means that columns (of all three kinds!)
125+
nested inside [column groups](DataColumn.md#columngroup) are also included.
126+
This function can also be followed by another `ColumnSet` filter-function like `colsOf<>()`, `single()`,
127+
or `valueCols()`.
128+
129+
**For example:**
130+
131+
Depth-first search for a column containing the value "Alice":
132+
133+
`df.select { colsAtAnyDepth().first { "Alice" in it.values() } }`
134+
135+
The columns at any depth excluding the top-level:
136+
137+
`df.select { colGroups().colsAtAnyDepth() }`
138+
139+
All [value-](DataColumn.md#valuecolumn) and [frame columns](DataColumn.md#framecolumn) at any depth:
140+
141+
`df.select { colsAtAnyDepth { !it.isColumnGroup() } }`
142+
143+
All value columns at any depth nested under a column group named "myColGroup":
144+
145+
`df.select { myColGroup.colsAtAnyDepth().valueCols() }`
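
To make the one-line examples above concrete, here is a sketch on a small nested dataframe. It assumes the `org.jetbrains.kotlinx:dataframe` library in a release that already ships `colsAtAnyDepth`; the sample data and the names `nested`, `allValueCols`, and `allStringCols` are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data: "firstName" and "lastName" are moved into
// a column group called "name", so the dataframe has one nested level.
val nested = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
).group { "firstName" and "lastName" }.into("name")

// All value columns, regardless of how deeply they are nested.
val allValueCols = nested.select { colsAtAnyDepth().valueCols() }

// All String columns at any depth.
val allStringCols = nested.select { colsAtAnyDepth().colsOf<String>() }

fun main() {
    println(allValueCols.columnNames())
    println(allStringCols.columnNames())
}
```

Because the selected columns are referenced individually, the nested `firstName` and `lastName` columns appear next to the top-level ones in the result instead of staying inside `name`.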
146+
147+
148+
**Converting from deprecated syntax:**
149+
150+
`dfs { condition }` -> `colsAtAnyDepth { condition }`
151+
152+
`allDfs(includeGroups = false)` -> `colsAtAnyDepth { includeGroups || !it.isColumnGroup() }`
153+
154+
`dfsOf<Type> { condition }` -> `colsAtAnyDepth().colsOf<Type> { condition }`
155+
156+
`cols { condition }.recursively()` -> `colsAtAnyDepth { condition }`
157+
158+
`first { condition }.rec()` -> `colsAtAnyDepth { condition }.first()`
159+
160+
`all().recursively()` -> `colsAtAnyDepth()`
161+
162+
##### Cols in Groups
163+
`colsInGroups {}`, `colsInGroups()`
164+
165+
Creates a `ColumnSet` containing all columns that are nested in the [column groups](DataColumn.md#columngroup) at
166+
the top-level, specified [column group](DataColumn.md#columngroup), or `ColumnSet` adhering to an optional predicate.
167+
This is useful if you want to select all columns that are "one level down".
168+
169+
This function was previously called `children()`.
170+
171+
**For example:**
172+
173+
To get the columns inside all [column groups](DataColumn.md#columngroup) in a [dataframe](DataFrame.md),
174+
instead of having to write:
175+
176+
`df.select { colGroupA.cols() and colGroupB.cols() ... }`
177+
178+
you can use:
179+
180+
`df.select { colsInGroups() }`
181+
182+
or with filter:
183+
184+
`df.select { colsInGroups { "user" in it.name } }`
185+
186+
Similarly, you can take the columns inside all [column groups](DataColumn.md#columngroup) in a `ColumnSet`:
187+
188+
`df.select { colGroups { "my" in it.name }.colsInGroups() }`
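
A runnable variant of the snippets above might look as follows, assuming the `org.jetbrains.kotlinx:dataframe` library in a release that already ships `colsInGroups`. The sample data and the names `nested`, `oneLevelDown`, and `fromFilteredGroups` are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data with a single column group called "name".
val nested = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
).group { "firstName" and "lastName" }.into("name")

// Columns that sit directly inside any top-level column group.
val oneLevelDown = nested.select { colsInGroups() }

// The same, but only for column groups that pass a filter first.
val fromFilteredGroups = nested.select {
    colGroups { it.name().startsWith("na") }.colsInGroups()
}

fun main() {
    println(oneLevelDown.columnNames())
    println(fromFilteredGroups.columnNames())
}
```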
189+
190+
##### Take (Last) (Cols) (While)
191+
`take(5)`, `takeLastCols(2)`, `takeLastWhile {}`, `takeColsWhile {}`
192+
193+
Creates a `ColumnSet` containing the first / last `n` columns, or the leading / trailing columns that adhere to
194+
the given condition, from the top-level, specified [column group](DataColumn.md#columngroup), or `ColumnSet`.
195+
Note, to avoid ambiguity, `take` is called `takeCols` when called on a [column group](DataColumn.md#columngroup).
196+
197+
##### Drop (Last) (Cols) (While)
198+
`drop(5)`, `dropLastCols(2)`, `dropLastWhile {}`, `dropColsWhile {}`
199+
200+
Creates a `ColumnSet` that excludes the first / last `n` columns, or the leading / trailing columns that adhere to
200+
the given condition, from the top-level, specified [column group](DataColumn.md#columngroup), or `ColumnSet`.
202+
Note, to avoid ambiguity, `drop` is called `dropCols` when called on a [column group](DataColumn.md#columngroup).
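
The `take` and `drop` families mirror each other, which the following sketch tries to show for the count-based top-level variants. It assumes the `org.jetbrains.kotlinx:dataframe` library is available; the sample data and the variable names are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val df = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
)

// Keep only the first / last two top-level columns.
val firstTwo = df.select { take(2) }
val lastTwo = df.select { takeLast(2) }

// Keep everything except the first / last top-level column.
val withoutFirst = df.select { drop(1) }
val withoutLast = df.select { dropLast(1) }

fun main() {
    println(firstTwo.columnNames())
    println(lastTwo.columnNames())
    println(withoutFirst.columnNames())
    println(withoutLast.columnNames())
}
```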
203+
204+
##### Select from [Column Group](DataColumn.md#columngroup)
205+
`colGroupA.select {}`, `"colGroupA" {}`
206+
207+
Creates a `ColumnSet` containing the columns selected by the provided `ColumnsSelector` relative to the specified
208+
[column group](DataColumn.md#columngroup). In practice, this means you're opening a new selection scope inside a
209+
[column group](DataColumn.md#columngroup) and selecting columns from there.
210+
The selected columns are referenced individually and "unpacked" from their parent
211+
[column group](DataColumn.md#columngroup).
212+
213+
**For example:**
214+
215+
Select `myColGroup.someCol` and all `String` columns from `myColGroup`:
216+
217+
`df.select { myColGroup.select { someCol and colsOf<String>() } }`
218+
219+
220+
221+
`df.select { "myGroupCol" { "colA" and expr("newCol") { colB + 1 } } }`
222+
223+
`df.select { "pathTo"["myGroupCol"].select { "colA" and "colB" } }`
224+
225+
`df.select { it["myGroupCol"].asColumnGroup()() { "colA" and "colB" } }`
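
The "unpacking" behaviour described above can be sketched as follows. This assumes the `org.jetbrains.kotlinx:dataframe` library is available and that `select` can be called on the untyped accessor returned by `colGroup("name")`; the sample data and the names `nested` and `unpacked` are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data with a single column group called "name".
val nested = dataFrameOf("firstName", "lastName", "age", "city")(
    "Alice", "Cooper", 15, "London",
    "Bob", "Dylan", 45, "Dubai",
).group { "firstName" and "lastName" }.into("name")

// Open a new selection scope inside the "name" group. The inner
// column names are resolved relative to that group.
val unpacked = nested.select {
    colGroup("name").select { "firstName" and "lastName" }
}

fun main() {
    // The result contains the two columns themselves, not the group.
    println(unpacked.columnNames())
}
```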
226+
227+
TODO
228+
229+
#### Examples:
230+
42231
**Select columns by name:**
43232

44233
<!---FUN columnSelectors-->
