Commit a9702d7

committed
Update README.md [ci skip]
1 parent da44852 commit a9702d7

1 file changed

+23
-30
lines changed

README.md

Lines changed: 23 additions & 30 deletions
@@ -82,10 +82,10 @@ Keep in mind that both `add:atKey:` and `atKey:put:` messages don't create a new
8282

8383
### Creating DataFrame
8484
There are four ways of creating a data frame:
85-
[1. from an array of rows or columns](#1-creating-dataframe-from-an-array-of-rows-or-columns)
86-
[2. from matrix](#2-creating-dataframe-from-a-matrix)
87-
[3. from file](#3-reading-data-from-file)
88-
[4. loading a built-in dataset](#4-loading-the-built-in-datasets)
85+
1. [from an array of rows or columns](#1-creating-dataframe-from-an-array-of-rows-or-columns)
86+
2. [from matrix](#2-creating-dataframe-from-a-matrix)
87+
3. [from file](#3-reading-data-from-file)
88+
4. [loading a built-in dataset](#4-loading-the-built-in-datasets)
8989

9090
#### 1. Creating DataFrame from an array of rows or columns
9191
The most straightforward way of creating a DataFrame is to pass all the data as an array of arrays to the `fromRows:` or `fromColumns:` message. Here is an example of initializing a DataFrame with rows:
@@ -207,36 +207,14 @@ df at: 3 at: 2.
207207
df at: 3 at: 2 put: true.
208208
```
209209

210-
### Adding new rows and columns to DataFrame
211-
New rows and columns can be appended to the data frame using messages `addRow:named` and `addColumn:named`. Like in the case of DataSeries, you must provide a name for these new elements, since it can not continue the existing sequence of names.
212-
213-
```smalltalk
214-
df addRow: #('Lviv' 0.724 true) named: #D.
215-
df addColumn: #(4 3 4) named: #Rating.
216-
```
217-
218-
The same can be done using messages `row:put:` and `column:put:` with non-existing keys. DataFrame will append the new key and associate it with a given row or column
219-
220-
```smalltalk
221-
df at: #D put: #('Lviv' 0.724 true).
222-
df at: #Rating put: #(4 3 4).
223-
```
224-
225210
#### Head & tail
226-
Now let's take a look at some bigger dataset, for example, Boston Housing Data
211+
When working with bigger datasets, it's often useful to access only the first or the last 5 rows. This can be done using the `head` and `tail` messages. To see how they work, let's load the Housing dataset.
227212

228213
```smalltalk
229214
df := DataFrame loadHousing.
230215
```
231216

232-
This dataset has 489 entries. Printing this many rows is unnecessary. On larger datasets it can also be time consuming. So in order to make sure that the data was loaded and to take a quick look on it, we can print its head (first 5 rows) or tail (last 5 rows)
233-
234-
```smalltalk
235-
df head.
236-
df tail.
237-
```
238-
239-
Data frame responds to these messages with another `DataFrame` object containing the requested rows. Here is the example output of the `df head` message
217+
This dataset has 489 entries. Printing all of these rows just to understand what the data looks like is unnecessary, and on larger datasets it can also be time-consuming. To take a quick look at your data, use `df head` or `df tail`.
240218

241219
```
242220
| RM LSTAT PTRATIO MDEV
@@ -248,14 +226,14 @@ Data frame responds to these messages with another `DataFrame` object containing
248226
5 | 7.147 5.33 18.7 760200.0
249227
```
250228

251-
It is also possible to specify the number of rows that must be printed
229+
The result will be another data frame. The `head` and `tail` messages are just shortcuts for `df rowsFrom: 1 to: 5` and `df rowsFrom: (df size - 4) to: df size`. But what if you want a different number of rows? You can get that using the parameterized messages `head:` and `tail:` with a given number of rows.
252230

253231
```smalltalk
254232
df head: 10.
255233
df tail: 3.
256234
```
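
The index arithmetic behind these shortcuts can be checked on an ordinary collection. The following is only a sketch: it uses a plain `Array` and the standard `copyFrom:to:` message in place of a data frame, and the variable names are made up for illustration. It assumes the row range is inclusive on both ends, as `rowsFrom: 1 to: 5` yielding five rows suggests, so the last five rows start at `size - 4`.

```smalltalk
"A plain Array stands in for the 489 rows of the Housing dataset (illustration only)."
| rows firstFive lastFive |
rows := (1 to: 489) asArray.

"The inclusive range 1..5 holds five elements."
firstFive := rows copyFrom: 1 to: 5.

"An inclusive range of five elements ending at the last index starts at size - 4."
lastFive := rows copyFrom: rows size - 4 to: rows size.
```

Starting the second range at `rows size - 5` would select six elements instead of five.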
257235

258-
The same messages are also supported by the objects of `DataSeries` class. This means that we can also look at a head or tail of a specific column
236+
You can also look at the head or tail of a specific column, since all these messages are also supported by `DataSeries`.
259237

260238
```smalltalk
261239
(df column: #LSTAT) head: 2.
@@ -269,3 +247,18 @@ The result will be another series
269247
1 | 4.98
270248
2 | 9.14
271249
```
250+
251+
### Adding new rows and columns to DataFrame
252+
New rows and columns can be appended to the data frame using the messages `addRow:named:` and `addColumn:named:`. As with DataSeries, you must provide a name for each new element, since the data frame cannot continue the existing sequence of names on its own.
253+
254+
```smalltalk
255+
df addRow: #('Lviv' 0.724 true) named: #D.
256+
df addColumn: #(4 3 4) named: #Rating.
257+
```
258+
259+
The same can be done using the messages `row:put:` and `column:put:` with keys that do not exist yet. DataFrame will append the new key and associate it with the given row or column.
260+
261+
```smalltalk
262+
df at: #D put: #('Lviv' 0.724 true).
263+
df at: #Rating put: #(4 3 4).
264+
```
