Commit dc2ecac

Update README.md [ci skip]
1 parent 91fc400 commit dc2ecac

1 file changed

+28
-33
lines changed

README.md

Lines changed: 28 additions & 33 deletions
@@ -158,22 +158,42 @@ DataFrame loadHousing.
 DataFrame loadTips.
 ```
 
-### Exploring the created DataFrame
-To get the dimensions of a data frame, its rows, and columns, we can say
+### Accessing rows and columns
+Rows and columns of a data frame can be accessed either by their names or their numeric indexes. You can access row _'C'_ and the column _'Population'_ of a data frame created in the previous sections by writing
+
+```smalltalk
+df row: 'C'.
+df column: 'Population'.
+```
+
+Alternatively, you can use numeric indexes. Here is how you can ask a data frame for a third row or a second column:
 
 ```smalltalk
-df dimensions.
-df dimensions rows.
-df dimensions columns.
+df rowAt: 3.
+df columnAt: 2.
 ```
 
-The first line will return an object of `DataDimensions` class. It is just a specialized `Point` which responds to `rows` and `columns` messages instead of `x` and `y`. It also reimplements the `printOn:` message, so if you press `Ctrl+P` on `df dimensions`, you will see something like this
+The important feature of a `DataFrame` is that when asked for a specific row or column, it responds with a `DataSeries` object that preserves the same indexing. This way, if you extract row _'B'_ from a data frame, it will still remember that _'Dubai'_ is a city with a population of 2.789 million people
 
 ```
-3 rows
-3 columns
+            | B
+------------+-------
+City        | Dubai
+Population  | 2.789
+BeenThere   | true
+```
+
+You can access multiple columns at a same time by providing an array of column names or indexes, or by specifying the numeric range. For this purpose DataFrame provides messages `rows:`, `columns:`, `rowsAt:`, `columnsAt:`, `rowsFrom:to:`, and `columnsFrom:to:`
+
+```smalltalk
+df columns: #(City BeenThere).
+df rowsAt: #(3 1).
+df columnsFrom: 2 to: 3.
+df rowsFrom: 3 to: 1.
 ```
 
+The result will be a data frame with requested rows and columns in a given order. For example, the last line will give you a data frame "flipped upside-down" (with row indexes going in the descending order).
+
 #### Head & tail
 Now let's take a look at some bigger dataset, for example, Boston Housing Data
 
@@ -221,28 +241,3 @@ The result will be another series
 1 | 4.98
 2 | 9.14
 ```
-
-### Accessing rows and columns
-Rows and columns of a data frame can be accessed either by their names or their numeric indexes. Afrer changing the names of rows and columns to `#(A B C)` and `#(City Population SomeBool)`, as shown above, how we can now access row _'C'_ and the column _'Population'_ of a data frame
-
-```smalltalk
-df row: 'C'.
-df column: 'Population'.
-```
-
-We can also access them by their numeric indexes
-
-```smalltalk
-df rowAt: 3.
-df columnAt: 2.
-```
-
-The important feature of a `DataFrame` is that whenever we ask for a specific row or column, it responds with a `DataSeries` object that preserves the same indexing. So, for example, if you take row _'B'_ of a data frame described above, you will get a series named _'B'_ with keys _'City'_, _'Population'_, and _'SomeBool'_.
-
-```
-            | B
-------------+-------
-City        | Dubai
-Population  | 2.789
-SomeBool    | true
-```
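
The text added by this commit says that multi-row selection returns the requested rows "in a given order", so `rowsAt: #(3 1)` yields row 3 before row 1 and `rowsFrom: 3 to: 1` yields a flipped result. The sketch below models only that ordering rule with a plain `Array` of row names standing in for a data frame. It does not use or describe the `DataFrame` implementation; the variable names `rowNames`, `picked`, and `flipped` are hypothetical.

```smalltalk
"A plain-Array stand-in for the row names A, B, C (not a DataFrame)."
| rowNames picked flipped |
rowNames := #('A' 'B' 'C').

"Selecting by an index array keeps the order of the indexes: 3 first, then 1."
picked := #(3 1) collect: [ :i | rowNames at: i ].
"picked = #('C' 'A')"

"A descending range visits 3, 2, 1, which flips the rows upside-down.
 A plain Pharo Interval needs an explicit step of -1 to count down."
flipped := (3 to: 1 by: -1) collect: [ :i | rowNames at: i ].
"flipped = #('C' 'B' 'A')"
```

Note that in plain Pharo `3 to: 1` without `by: -1` is an empty interval, so the flipping behaviour the README describes for `rowsFrom: 3 to: 1` is specific to the library and is not reproduced by this stand-in.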
