4. [loading a built-in dataset](#4-loading-the-built-in-datasets)
89
89
90
90
#### 1. Creating DataFrame from an array of rows or columns
91
91
The easiest and most straightforward way of creating a DataFrame is by passing all data in an array of arrays to the `fromRows:` or `fromColumns:` message. Here is an example of initializing a DataFrame with rows:
@@ -207,36 +207,14 @@ df at: 3 at: 2.
207
207
df at: 3 at: 2 put: true.
208
208
```
209
209
210
-
### Adding new rows and columns to DataFrame
211
-
New rows and columns can be appended to the data frame using the messages `addRow:named:` and `addColumn:named:`. As with DataSeries, you must provide a name for each new element, since DataFrame cannot continue the existing sequence of names.
212
-
213
-
```smalltalk
214
-
df addRow: #('Lviv' 0.724 true) named: #D.
215
-
df addColumn: #(4 3 4) named: #Rating.
216
-
```
217
-
218
-
The same can be done using the messages `row:put:` and `column:put:` with non-existing keys. DataFrame will append the new key and associate it with the given row or column.
219
-
220
-
```smalltalk
221
-
df row: #D put: #('Lviv' 0.724 true).
222
-
df column: #Rating put: #(4 3 4).
223
-
```
224
-
225
210
#### Head & tail
226
-
Now let's take a look at a bigger dataset, for example, the Boston Housing Data.
211
+
When working with bigger datasets it's often useful to access only the first or the last 5 rows. This can be done using `head` and `tail` messages. To see how they work let's load the Housing dataset.
227
212
228
213
```smalltalk
229
214
df := DataFrame loadHousing.
230
215
```
231
216
232
-
This dataset has 489 entries. Printing this many rows is unnecessary, and on larger datasets it can also be time-consuming. To make sure that the data was loaded and to take a quick look at it, we can print its head (first 5 rows) or tail (last 5 rows):
233
-
234
-
```smalltalk
235
-
df head.
236
-
df tail.
237
-
```
238
-
239
-
The data frame responds to these messages with another `DataFrame` object containing the requested rows. Here is example output of the `df head` message:
217
+
This dataset has 489 entries. Printing all of these rows just to understand what the data looks like is unnecessary, and on larger datasets it can also be time-consuming. To take a quick look at your data, use `df head` or `df tail`:
240
218
241
219
```
242
220
| RM LSTAT PTRATIO MDEV
@@ -248,14 +226,14 @@ Data frame responds to these messages with another `DataFrame` object containing
248
226
5 | 7.147 5.33 18.7 760200.0
249
227
```
250
228
251
-
It is also possible to specify the number of rows that must be printed
229
+
The result will be another data frame. The `head` and `tail` messages are just shortcuts for `df rowsFrom: 1 to: 5` and `df rowsFrom: (df size - 4) to: df size`. But what if you want a different number of rows? You can use the parameterized messages `head:` and `tail:` with a given number of rows.
252
230
253
231
```smalltalk
254
232
df head: 10.
255
233
df tail: 3.
256
234
```
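
As a rough sketch of how the parameterized messages relate to row ranges, the snippet below rewrites `head: 10` and `tail: 3` in terms of `rowsFrom:to:`. It assumes that `rowsFrom:to:` includes both endpoints and that `df size` answers the number of rows. Neither is confirmed here, so treat it as an illustration rather than the library's definition. The variable names `firstTen` and `lastThree` are hypothetical.

```smalltalk
"Load the built-in Housing dataset used in this section"
df := DataFrame loadHousing.

"First 10 rows; assumed to select the same rows as df head: 10"
firstTen := df rowsFrom: 1 to: 10.

"Last 3 rows; assumed to select the same rows as df tail: 3.
 With inclusive bounds the range starts at size - 3 + 1 = size - 2."
lastThree := df rowsFrom: (df size - 2) to: df size.
```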
257
235
258
-
The same messages are also supported by the objects of `DataSeries` class. This means that we can also look at a head or tail of a specific column
236
+
You can also look at the head or tail of a specific column, since all these messages are also supported by `DataSeries`:
259
237
260
238
```smalltalk
261
239
(df column: #LSTAT) head: 2.
@@ -269,3 +247,18 @@ The result will be another series
269
247
1 | 4.98
270
248
2 | 9.14
271
249
```
250
+
251
+
### Adding new rows and columns to DataFrame
252
+
New rows and columns can be appended to the data frame using the messages `addRow:named:` and `addColumn:named:`. As with DataSeries, you must provide a name for each new element, since DataFrame cannot continue the existing sequence of names.
253
+
254
+
```smalltalk
255
+
df addRow: #('Lviv' 0.724 true) named: #D.
256
+
df addColumn: #(4 3 4) named: #Rating.
257
+
```
258
+
259
+
The same can be done using the messages `row:put:` and `column:put:` with non-existing keys. DataFrame will append the new key and associate it with the given row or column.