Commit eb2f5ee

committed
Fixed grammar in the first part
1 parent 89f3163 commit eb2f5ee

27 files changed

+113
-79
lines changed

docs/StardustDocs/topics/ColumnSelectors.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Access-->
44

5-
[`DataFrame`](DataFrame.md) provides DSL for selecting arbitrary set of columns.
5+
[`DataFrame`](DataFrame.md) provides a DSL for selecting an arbitrary set of columns.
66

77
Column selectors are used in many operations:
88

@@ -187,19 +187,19 @@ df.select {
187187
Person::name.single { it.name().startsWith("first") }
188188
}
189189

190-
// recursive traversal of all columns, excluding ColumnGroups from result
190+
// recursive traversal of all columns, excluding ColumnGroups from the result
191191
df.select { cols { !it.isColumnGroup() }.recursively() }
192192

193-
// depth-first-search traversal of all columns, including ColumnGroups in result
193+
// depth-first-search traversal of all columns, including ColumnGroups in the result
194194
df.select { all().recursively() }
195195

196196
// recursive traversal with condition
197197
df.select { cols { it.name().contains(":") }.recursively() }
198198

199-
// recursive traversal of columns of given type
199+
// recursive traversal of columns of a given type
200200
df.select { colsOf<String>().rec() }
201201

202-
// all columns except given column set
202+
// all columns except a given column set
203203
df.select { except { colsOf<String>() } }
204204

205205
// union of column sets

docs/StardustDocs/topics/DataColumn.md

Lines changed: 13 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -7,9 +7,9 @@ See [how to create columns](createColumn.md)
77

88
### Properties
99
* `name: String` — name of the column, should be unique within containing dataframe
10-
* `path: ColumnPath` — path to the column, depends on the way column was retrieved from dataframe
10+
* `path: ColumnPath` — path to the column, depends on the way column was retrieved from the dataframe
1111
* `type: KType` — type of elements in the column
12-
* `hasNulls: Boolean` — flag indicating whether column contains `null` values
12+
* `hasNulls: Boolean` — flag indicating whether the column contains `null` values
1313
* `values: Iterable<T>` — column data
1414
* `size: Int` — number of elements in the column
1515

@@ -24,7 +24,7 @@ It can store values of primitive (integers, strings, decimals etc.) or reference
2424

2525
#### ColumnGroup
2626

27-
Container for nested columns. Is used to create column hierarchy.
27+
Container for nested columns. It is used to create a column hierarchy.
2828

2929
#### FrameColumn
3030

@@ -36,7 +36,11 @@ Special case of [`ValueColumn`](#valuecolumn) that stores other [`DataFrames`](D
3636

3737
## Column accessors
3838

39-
`ColumnAccessors` are used for [typed data access](columnAccessorsApi.md) in [`DataFrame`](DataFrame.md). `ColumnAccessor` stores column [`name`](#properties) (for top-level columns) or column path (for nested columns), has type argument that corresponds to [`type`](#properties) of thep column, but it doesn't contain any actual data.
39+
`ColumnAccessors` are used for [typed data access](columnAccessorsApi.md) in [`DataFrame`](DataFrame.md).
40+
`ColumnAccessor`
41+
stores column [`name`](#properties) (for top-level columns) or column path (for nested columns),
42+
has type argument that corresponds to [`type`](#properties) of the column,
43+
but it doesn't contain any actual data.
4044

4145
<!---FUN columnAccessorsUsage-->
4246

@@ -45,11 +49,11 @@ val age by column<Int>()
4549

4650
// Access fourth cell in the "age" column of dataframe `df`.
4751
// This expression returns `Int` because variable `age` has `ColumnAccessor<Int>` type.
48-
// If dataframe `df` has no column "age" or column "age" has type which is incompatible with `Int`,
49-
// runtime exception will be thrown.
52+
// If dataframe `df` has no column "age" or column "age" has a type which is incompatible with `Int`,
53+
// a runtime exception will be thrown.
5054
df[age][3] + 5
5155

52-
// Access first cell in the "age" column of dataframe `df`.
56+
// Access the first cell in the "age" column of dataframe `df`.
5357
df[0][age] * 2
5458

5559
// Returns new dataframe sorted by age column (ascending)
@@ -74,7 +78,7 @@ val name by column<String>()
7478

7579
<!---END-->
7680

77-
To assign column name explicitly, pass it as an argument.
81+
To assign a column name explicitly, pass it as an argument.
7882

7983
<!---FUN createColumnAccessorRenamed-->
8084

@@ -106,7 +110,7 @@ val firstName by name.column<String>()
106110

107111
<!---END-->
108112

109-
You can also create virtual accessor that doesn't point to a real column but computes some expression on every data access:
113+
You can also create a virtual accessor that doesn't point to a real column but computes some expression on every data access:
110114

111115
<!---FUN columnAccessorComputed-->
112116
<tabs>

docs/StardustDocs/topics/DataRow.md

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -10,12 +10,12 @@
1010
* `next(): DataRow?` — next row (`null` for the last row)
1111
* `diff { rowExpression }: T` — difference between results of [row expression](#row-expressions) calculated for current and previous rows
1212
* `values(): List<Any?>` — list of all cell values from the current row
13-
* `valuesOf<T>(): List<T>` — list of values of given type
13+
* `valuesOf<T>(): List<T>` — list of values of a given type
1414
* `columnsCount(): Int` — number of columns
1515
* `columnNames(): List<String>` — list of all column names
1616
* `columnTypes(): List<KType>` — list of all column types
1717
* `namedValues(): List<NameValuePair<Any?>>` — list of name-value pairs where `name` is a column name and `value` is cell value
18-
* `namedValuesOf<T>(): List<NameValuePair<T>>` — list of name-value pairs where value has given type
18+
* `namedValuesOf<T>(): List<NameValuePair<T>>` — list of name-value pairs where value has given a type
1919
* `transpose(): DataFrame<NameValuePair<*>>` — dataframe of two columns: `name: String` is column names and `value: Any?` is cell values
2020
* `transposeTo<T>(): DataFrame<NameValuePair<T>>`— dataframe of two columns: `name: String` is column names and `value: T` is cell values
2121
* `getRow(Int): DataRow` — row from [`DataFrame`](DataFrame.md) by row index
@@ -54,7 +54,7 @@ Row condition is a special case of [row expression](#row-expressions) that retur
5454
// Row condition is used to filter rows by index
5555
df.filter { index() % 5 == 0 }
5656

57-
// Row condition is used to drop rows where `age` is the same as in previous row
57+
// Row condition is used to drop rows where `age` is the same as in the previous row
5858
df.drop { diff { age } == 0 }
5959

6060
// Row condition is used to filter rows for value update
@@ -76,10 +76,11 @@ The following [statistics](summaryStatistics.md) are available for `DataRow`:
7676
* `rowStd`
7777
* `rowMedian`
7878

79-
These statistics will be applied only to values of appropriate types and incompatible values will be ignored.
80-
For example, if [`DataFrame`](DataFrame.md) has columns of type `String` and `Int`, `rowSum()` will successfully compute sum of `Int` values in a row and ignore `String` values.
79+
These statistics will be applied only to values of appropriate types, and incompatible values will be ignored.
80+
For example, if [`DataFrame`](DataFrame.md) has columns of types `String` and `Int`,
81+
`rowSum()` will successfully compute the sum of `Int` values in a row and ignore `String` values.
8182

82-
To apply statistics only to values of particular type use `-Of` versions:
83+
To apply statistics only to values of a particular type, use `-Of` versions:
8384
* `rowMaxOf<T>`
8485
* `rowMinOf<T>`
8586
* `rowSumOf<T>`
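
The "ignore incompatible values" behaviour described above can be pictured with plain Kotlin collections. This is only a rough sketch using the standard library, not the Kotlin DataFrame implementation: `sumOfInts` is a hypothetical helper, and it handles `Int` cells only, whereas the library's row statistics may cover other numeric types.

```kotlin
// Hypothetical helper, not part of the Kotlin DataFrame API:
// sums only the Int cells of a row and skips every other value.
fun sumOfInts(row: List<Any?>): Int = row.filterIsInstance<Int>().sum()

fun main() {
    // A row with mixed cell types, similar to a frame with String and Int columns.
    val row: List<Any?> = listOf("Alice", 15, "London", 54, null)
    println(sumOfInts(row)) // 69
}
```

A row with no `Int` cells yields `0` in this sketch, because the sum of an empty list is zero.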

docs/StardustDocs/topics/access.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,5 +18,5 @@ df.values() // Sequence<Any?>
1818
**Learn how to:**
1919
* [Access data by index](indexing.md)
2020
* [Iterate over data](iterate.md)
21-
* [Get single row](getRow.md)
21+
* [Get a single row](getRow.md)
2222
* [Get single column](getColumns.md)

docs/StardustDocs/topics/add.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
Returns [`DataFrame`](DataFrame.md) which contains all columns from original [`DataFrame`](DataFrame.md) followed by newly added columns.
66
Original [`DataFrame`](DataFrame.md) is not modified.
77

8-
## Create new column and add it to [`DataFrame`](DataFrame.md)
8+
## Create a new column and add it to [`DataFrame`](DataFrame.md)
99

1010
```text
1111
add(columnName: String) { rowExpression }
@@ -44,7 +44,8 @@ df.add("year of birth") { 2021 - "age"<Int>() }
4444

4545
See [row expressions](DataRow.md#row-expressions)
4646

47-
You can use `newValue()` function to access value that was already calculated for preceding row. It is helpful for recurrent computations:
47+
You can use the `newValue()` function to access the value that was already calculated for the preceding row.
48+
It is helpful for recurrent computations:
4849

4950
<!---FUN addRecurrent-->
5051

@@ -223,7 +224,9 @@ df.add(df1, df2)
223224

224225
## addId
225226

226-
Adds column with sequential values 0, 1, 2,... New column will be added in the beginning of columns list and will become the first column in [`DataFrame`](DataFrame.md).
227+
Adds a column with sequential values 0, 1, 2,...
228+
The new column will be added at the beginning of the column list
229+
and will become the first column in [`DataFrame`](DataFrame.md).
227230

228231
```
229232
addId(name: String = "id")

docs/StardustDocs/topics/adjustSchema.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
[//]: # (title: Adjust schema)
22

3-
[`DataFrame`](DataFrame.md) interface has type argument `T` that doesn't affect contents of [`DataFrame`](DataFrame.md),
4-
but marks [`DataFrame`](DataFrame.md) with a type that represents data schema that this [`DataFrame`](DataFrame.md) is supposed to have.
3+
[`DataFrame`](DataFrame.md) interface has type argument `T` that doesn't affect the contents of [`DataFrame`](DataFrame.md),
4+
but marks [`DataFrame`](DataFrame.md) with a type that represents the data schema that this [`DataFrame`](DataFrame.md) is supposed to have.
55
This argument is used to generate [extension properties](extensionPropertiesApi.md) for typed data access.
66

77
Another place where this argument has a special role is in [interop with data classes](collectionsInterop.md#interop-with-data-classes):

docs/StardustDocs/topics/apiLevels.md

Lines changed: 16 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -2,16 +2,21 @@
22

33
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.ApiLevels-->
44

5-
By nature data frames are dynamic objects, column labels depend on the input source and also new columns could be added
6-
or deleted while wrangling. Kotlin, in contrast, is a statically typed language and all types are defined and verified
7-
ahead of execution. That's why creating a flexible, handy, and, at the same time, safe API to a data frame is tricky.
5+
By nature, data frames are dynamic objects:
6+
column labels depend on the input source, and also new columns could be added
7+
or deleted while wrangling.
8+
Kotlin, in contrast, is a statically typed language and all types are defined and verified
9+
ahead of execution.
810

9-
In the Kotlin DataFrame library we provide four different ways to access columns, and, while they are essentially different, they
11+
That's why creating a flexible, handy, and, at the same time, safe API to a data frame is tricky.
12+
13+
In the Kotlin DataFrame library, we provide four different ways to access columns,
14+
and, while they are essentially different, they
1015
look pretty similar in the data wrangling DSL.
1116

1217
## List of Access APIs
1318

14-
Here's a list of all APIs in order of increasing safety.
19+
Here's a list of all APIs in order of increasing safety.
1520

1621
* [**String API**](stringApi.md) <br/>
1722
Columns are accessed by `string` representing their name. Type-checking is done at runtime, name-checking too.
@@ -128,9 +133,9 @@ df.add("lastName") { name.split(",").last() }
128133

129134
</tabs>
130135

131-
The `titanic.csv` file could be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).
136+
The `titanic.csv` file can be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).
132137

133-
# Comparing the APIs
138+
# Comparing APIs
134139

135140
The [String API](stringApi.md) is the simplest and unsafest of them all. The main advantage of it is that it can be
136141
used at any time, including when accessing new columns in chain calls. So we can write something like:
@@ -144,17 +149,17 @@ We don't need to interrupt a function call chain and declare a column accessor o
144149

145150
In contrast, generated [extension properties](extensionPropertiesApi.md) are the most convenient and the safest API.
146151
Using it, you can always be sure that you work with correct data and types.
147-
But its bottleneck is the moment of generation.
148-
To get new extension properties you have to run a cell in a notebook,
152+
But its bottleneck is the moment of generation.
153+
To get new extension properties, you have to run a cell in a notebook,
149154
which could lead to unnecessary variable declarations.
150-
Currently, we are working on compiler a plugin that generates these properties on the fly while typing!
155+
Currently, we are working on a compiler plugin that generates these properties on the fly while typing!
151156

152157
The [Column Accessors API](columnAccessorsApi.md) is a kind of trade-off between safety and the need to write type declarations ahead of
153158
execution. It was designed to make it easier to write code in an IDE without a notebook experience.
154159
It provides type-safe access to columns but doesn't ensure that the columns really exist in a particular data frame.
155160

156161
The [KProperties API](KPropertiesApi.md) is useful when you already have declared classes in your application business
157-
logic with fields that correspond columns of a data frame.
162+
logic with fields that correspond to columns of a data frame.
158163

159164
<table>
160165
<tr>

docs/StardustDocs/topics/cast.md

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,9 +8,12 @@ cast<T>(verify = false)
88
```
99

1010
**Parameters:**
11-
* `verify: Boolean = false` — when `true`, throws exception if [`DataFrame`](DataFrame.md) doesn't match given schema. Otherwise, just changes format type without actual data check.
11+
* `verify: Boolean = false`
12+
when `true`, throws exception if [`DataFrame`](DataFrame.md) doesn't match the given schema.
13+
Otherwise, just changes the formal type without an actual data check.
1214

13-
Use this operation to change formal type of [`DataFrame`](DataFrame.md) to match expected schema and enable generated [extension properties](extensionPropertiesApi.md) for it.
15+
Use this operation to change the formal type of [`DataFrame`](DataFrame.md)
16+
to match the expected schema and enable generated [extension properties](extensionPropertiesApi.md) for it.
1417

1518
```kotlin
1619
@DataSchema

docs/StardustDocs/topics/collectionsInterop.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -7,16 +7,23 @@ _Kotlin DataFrame_ and _Kotlin Collection_ represent two different approaches to
77
* [`DataFrame`](DataFrame.md) stores data by fields/columns
88
* `Collection` stores data by records/rows
99

10-
Although [`DataFrame`](DataFrame.md) doesn't implement [`Collection`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-collection/#kotlin.collections.Collection) or [`Iterable`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-iterable/) interface, it has many similar operations,
11-
such as [`filter`](filter.md), [`take`](sliceRows.md#take), [`first`](first.md), [`map`](map.md), [`groupBy`](groupBy.md) etc.
10+
Although [`DataFrame`](DataFrame.md)
11+
doesn't implement [`Collection`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-collection/#kotlin.collections.Collection)
12+
or [`Iterable`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-iterable/)
13+
interface, it has many similar operations,
14+
such as [`filter`](filter.md), [`take`](sliceRows.md#take),
15+
[`first`](first.md), [`map`](map.md), [`groupBy`](groupBy.md) etc.
1216

1317
[`DataFrame`](DataFrame.md) has two-way compatibility with [`Map`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-map/) and [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/):
1418
* `List<T>` -> `DataFrame<T>`: [toDataFrame](createDataFrame.md#todataframe)
1519
* `DataFrame<T>` -> `List<T>`: [toList](toList.md)
1620
* `Map<String, List<*>>` -> `DataFrame<*>`: [toDataFrame](createDataFrame.md#todataframe)
1721
* `DataFrame<*>` -> `Map<String, List<*>>`: [toMap](toMap.md)
1822

19-
Columns, rows and values of [`DataFrame`](DataFrame.md) can be accessed as [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/), [`Iterable`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-iterable/) and [`Sequence`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/-sequence/) accordingly:
23+
Columns, rows, and values of [`DataFrame`](DataFrame.md)
24+
can be accessed as [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/),
25+
[`Iterable`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-iterable/)
26+
and [`Sequence`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/-sequence/) accordingly:
2027

2128
<!---FUN getRowsColumns-->
2229

@@ -54,7 +61,8 @@ val df = list.toDataFrame()
5461

5562
<!---END-->
5663

57-
Mark original data class with [`DataSchema`](schemas.md) annotation to get [extension properties](extensionPropertiesApi.md) and perform data transformations.
64+
Mark the original data class with [`DataSchema`](schemas.md)
65+
annotation to get [extension properties](extensionPropertiesApi.md) and perform data transformations.
5866

5967
<!---FUN listInterop3-->
6068

@@ -87,6 +95,6 @@ val result = df2.toListOf<Output>()
8795

8896
<!---END-->
8997

90-
### Converting columns with objects instances to ColumnGroup
98+
### Converting columns with object instances to ColumnGroup
9199

92100
[unfold](unfold.md) can be used as [`toDataFrame()`](createDataFrame.md#todataframe) analogue for specific columns inside existing dataframes

docs/StardustDocs/topics/columnAccessorsApi.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ For frequently accessed columns type casting can be reduced by [Column Accessors
77
<!---FUN accessors1-->
88

99
```kotlin
10-
val survived by column<Boolean>() // accessor for Boolean column with name 'survived'
10+
val survived by column<Boolean>() // accessor for Boolean column with the name 'survived'
1111
val home by column<String>()
1212
val age by column<Int?>()
1313
val name by column<String>()
@@ -29,7 +29,7 @@ DataFrame.read("titanic.csv")
2929

3030
<!---END-->
3131

32-
The `titanic.csv` file could be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).
32+
The `titanic.csv` file can be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).
3333

3434
<warning>
3535
Note that it still doesn’t solve the problem of whether the column actually exists in a data frame, but reduces type casting.
