README.md
+28 -33 (28 additions & 33 deletions)
@@ -158,22 +158,42 @@ DataFrame loadHousing.
158
158
DataFrame loadTips.
159
159
```
160
160
161
-
### Exploring the created DataFrame
162
-
To get the dimensions of a data frame, its rows, and columns, we can say
161
+
### Accessing rows and columns
162
+
Rows and columns of a data frame can be accessed either by their names or by their numeric indexes. You can access row _'C'_ and column _'Population'_ of the data frame created in the previous sections by writing:
163
+
164
+
```smalltalk
165
+
df row: 'C'.
166
+
df column: 'Population'.
167
+
```
168
+
169
+
Alternatively, you can use numeric indexes. Here is how you can ask a data frame for its third row or its second column:
163
170
164
171
```smalltalk
165
-
df dimensions.
166
-
df dimensions rows.
167
-
df dimensions columns.
172
+
df rowAt: 3.
173
+
df columnAt: 2.
168
174
```
169
175
170
-
The first line will return an object of `DataDimensions` class. It is just a specialized `Point` which responds to `rows` and `columns` messages instead of `x` and `y`. It also reimplements the `printOn:` message, so if you press `Ctrl+P` on `df dimensions`, you will see something like this
176
+
An important feature of a `DataFrame` is that when asked for a specific row or column, it responds with a `DataSeries` object that preserves the same indexing. This way, if you extract row _'B'_ from a data frame, it will still remember that _'Dubai'_ is a city with a population of 2.789 million people:
171
177
172
178
```
173
-
3 rows
174
-
3 columns
179
+
| B
180
+
------------+-------
181
+
City | Dubai
182
+
Population | 2.789
183
+
BeenThere | true
184
+
```
185
+
186
+
You can access multiple rows or columns at the same time by providing an array of names or indexes, or by specifying a numeric range. For this purpose, `DataFrame` provides the messages `rows:`, `columns:`, `rowsAt:`, `columnsAt:`, `rowsFrom:to:`, and `columnsFrom:to:`:
187
+
188
+
```smalltalk
189
+
df columns: #(City BeenThere).
190
+
df rowsAt: #(3 1).
191
+
df columnsFrom: 2 to: 3.
192
+
df rowsFrom: 3 to: 1.
175
193
```
176
194
195
+
The result will be a data frame with the requested rows and columns in the given order. For example, the last line will give you a data frame "flipped upside-down" (with row indexes in descending order).
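
The selections can also be combined. The following is a minimal sketch, assuming `df` is the three-row data frame from the previous sections (rows _'A'_ to _'C'_, columns _'City'_, _'Population'_, _'BeenThere'_). Since each of these messages is described as answering a data frame, the sketch assumes the result accepts further selection messages. The variable names `reversed` and `subset` are illustrative only.

```smalltalk
"Assumption: df is the data frame built in the previous sections."
| reversed subset |

"Rows 3, 2, 1: the data frame flipped upside-down."
reversed := df rowsFrom: 3 to: 1.

"Assumption: the result is itself a data frame, so columns can be selected from it."
subset := reversed columns: #(Population City).
```

Under these assumptions, `subset` would hold the rows in descending order with only the two requested columns, _'Population'_ first.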
196
+
177
197
#### Head & tail
178
198
Now let's take a look at a bigger dataset, for example, the Boston Housing Data
179
199
@@ -221,28 +241,3 @@ The result will be another series
221
241
1 | 4.98
222
242
2 | 9.14
223
243
```
224
-
225
-
### Accessing rows and columns
226
-
Rows and columns of a data frame can be accessed either by their names or their numeric indexes. After changing the names of rows and columns to `#(A B C)` and `#(City Population SomeBool)`, as shown above, we can now access row _'C'_ and the column _'Population'_ of a data frame
227
-
228
-
```smalltalk
229
-
df row: 'C'.
230
-
df column: 'Population'.
231
-
```
232
-
233
-
We can also access them by their numeric indexes
234
-
235
-
```smalltalk
236
-
df rowAt: 3.
237
-
df columnAt: 2.
238
-
```
239
-
240
-
The important feature of a `DataFrame` is that whenever we ask for a specific row or column, it responds with a `DataSeries` object that preserves the same indexing. So, for example, if you take row _'B'_ of a data frame described above, you will get a series named _'B'_ with keys _'City'_, _'Population'_, and _'SomeBool'_.