Commit 91fc400

Update README.md [ci skip]
1 parent ba6fd4e commit 91fc400

1 file changed

+24
-5
lines changed

README.md

Lines changed: 24 additions & 5 deletions
@@ -87,7 +87,7 @@ There are four ways of creating a data frame:
8787
3. from file
8888
4. loading a built-in dataset
8989

90-
#### 1. Creating a DataFrame from an array of rows or columns
90+
#### 1. Creating DataFrame from an array of rows or columns
9191
The easiest and most straightforward way of creating a DataFrame is by passing all data in an array of arrays to `fromRows:` or `fromColumns:` message. Here is an example of initializing a DataFrame with rows:
9292

9393
```smalltalk
@@ -123,15 +123,34 @@ B | Dubai 2.789 true
123123
C | London 8.788 false
124124
```
125125

126+
#### 2. Creating DataFrame from a Matrix
127+
By its nature, DataFrame is similar to a matrix. It works like a table of values, supports matrix accessors such as `at:at:` and `at:at:put:`, and in some cases can be treated like a matrix. Some classes provide tabular data in matrix format, for example the TabularWorksheet class of the [Tabular]() package, which is used for reading XLSX files. To initialize a DataFrame from a matrix of values, use the `fromMatrix:` method:
128+
129+
```smalltalk
130+
matrix := Matrix
131+
rows: 3 columns: 3
132+
contents:
133+
#('Barcelona' 1.609 true
134+
'Dubai' 2.789 true
135+
'London' 8.788 false).
136+
137+
df := DataFrame fromMatrix: matrix.
138+
```
139+
140+
Once again, the names of rows and columns are set to their default values.
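
The matrix accessors mentioned above can be tried on the resulting data frame. This is a minimal sketch, assuming `df` is the DataFrame created with `fromMatrix:` in the previous snippet and that the first index is the row and the second is the column, as with Matrix:

```smalltalk
"Assumes df was created with fromMatrix: as shown above"

"Read a single cell: row 2, column 1 (assumed row-then-column order)"
city := df at: 2 at: 1.

"Overwrite a single cell: row 3, column 2"
df at: 3 at: 2 put: 8.9.
```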
141+
126142
#### 3. Reading data from file
127-
This is the most common way of creating a data frame. You have some dataset in a file (CSV, Excel etc.) - just ask a DataFrame to read it. At this point only CSV files are supported, but very soon you will also be able to read the data from other formats.
143+
In most real-world scenarios the data is located in a file or a database. Support for database connections will be added in future releases. Right now DataFrame provides methods for loading data from the two most common file formats, CSV and XLSX:
128144

129145
```smalltalk
130-
df := DataFrame fromCSV: 'path/to/your/file.csv'.
146+
DataFrame fromCSV: 'path/to/your/file.csv'.
147+
DataFrame fromXLSX: 'path/to/your/file.xlsx'.
131148
```
132149

133-
### 4. Loading the built-in datasets
134-
DataFrame provides several famous datasets for you to play with. They are compact and can be loaded with a simple message. An this point there are three datasets that can be loaded in this way - [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), a simplified [Boston Housing dataset](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data), and a tipping dataset.
150+
Since JSON does not store data as a table, it is not possible to read such a file directly into a DataFrame. However, you can parse JSON using [NeoJSON](https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/NeoJSON/NeoJSON.html) or any other library, construct an array of rows, and pass it to the `fromRows:` message, as described in previous sections.
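
The JSON route could look roughly like the sketch below. The JSON string and its keys (`city`, `population`, `beach`) are hypothetical, chosen to match the earlier examples. The sketch assumes the input is an array of flat objects, which NeoJSON's default mapping reads as an Array of Dictionaries:

```smalltalk
"Hypothetical JSON input: an array of objects, one per row"
json := '[{"city":"Barcelona","population":1.609,"beach":true},
          {"city":"Dubai","population":2.789,"beach":true},
          {"city":"London","population":8.788,"beach":false}]'.

"Parse the string; with the default mapping each object becomes a Dictionary"
records := NeoJSONReader fromString: json.

"Pick the fields in a fixed order so that every row has the same layout"
rows := records collect: [ :each |
	{ each at: 'city'. each at: 'population'. each at: 'beach' } ].

"Build the data frame from the array of rows"
df := DataFrame fromRows: rows.
```

Extracting the values by key, rather than relying on the order of keys in each dictionary, keeps the columns aligned even if the JSON objects list their fields in different orders.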
151+
152+
#### 4. Loading the built-in datasets
153+
DataFrame provides several famous datasets for you to play with. They are compact and can be loaded with a simple message. At this point there are three datasets that can be loaded in this way: the [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), a simplified [Boston Housing dataset](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data), and the [Restaurant tipping dataset](https://vincentarelbundock.github.io/Rdatasets/doc/reshape2/tips.html).
135154

136155
```smalltalk
137156
DataFrame loadIris.
