docs/index.md
Lines changed: 2 additions & 2 deletions
@@ -3,12 +3,12 @@
 ## Introduction

 The handling of data is a recurring task for data analysts. Reading in experimental data, checking its properties,
-and creating visualisations may become tedious tasks. Hence, increasing the efficiency in this process is beneficial for many professionals
+and creating visualisations are crucial steps in the research process. Hence, increasing the efficiency in this process is beneficial for professionals
 handling data. Spreadsheet-based software lacks the ability to properly support this process, due to the lack of automation and repeatability.
 The usage of a high-level scripting language such as Python is ideal for these tasks.

 This course trains participants to use Python effectively to do these tasks. The course focuses on data manipulation and cleaning of tabular data,
-explorative analysis and visualisation using important packages such as Pandas, Numpy, Matplotlib and Seaborn.
+explorative analysis and visualisation using important packages such as Pandas, Matplotlib and Seaborn.

 The course does not cover statistics, data mining, machine learning, or predictive modelling. It aims to provide participants the means to effectively
 tackle commonly encountered data handling tasks in order to increase the overall efficiency. These skills are both useful for data cleaning as well as
To get started, you should have the following three elements setup:
+To get started, you should have the following elements set up:

-1. Install Python and the required Python packages
-2. Download the course material to your computer
+1. Download the course material to your computer
+2. Install Python and the required Python packages using `conda`
 3. Test your configuration and installation
 4. Start Jupyter lab

 In the following sections, more details are provided for each of these steps. When all of them are done, you are ready to start coding!

-## 1. (_before the course_) Install Python and the required Python packages
+## 1. Getting the course materials

-For scientific and data analysis, we recommend to use Anaconda (or Miniconda) (<https://www.anaconda.com/download/>), which provides a Python
-distribution that includes the scientific libraries (this recommendation applies to all platforms, so for both Windows, Linux and Mac),
-instead of installing Python as such. After installation, proceed with the setup.
+### Option 1: You are already a git user

-### Install Anaconda
+As the course has been set up as a [git](https://git-scm.com/) repository managed on [GitHub](https://github.com/jorisvandenbossche/DS-python-data-analysis),
+you can clone the entire course to your local machine. Use the command line to clone the repository and go into the course folder:
After the download, unzip on the location you prefer within your user account (e.g. `My Documents`, not `C:\`). Watch out for a nested 'DS-python-data-analysis/DS-python-data-analysis' folder structure after unzipping and move the inner DS-python-data-analysis folder to your preferred location.

-When you already have an installation of Anaconda, you have to make sure you are working with the most recent versions. As the course is
-developed for Python 3, make sure you have Anaconda3 (on Windows, check Start > Programs > Anaconda3). If not, reinstall Anaconda according
-to the previous section.
+__Note:__ Make sure you know where you stored the course material, e.g. `C:/Users/yourusername/Documents/DS-python-data-analysis`.

-Start the Anaconda Navigator program (for Windows users: Start > Anaconda Navigator) and go to the Environments tab. You should see
-the *base (root) environment*, click the arrow next to it and click `Open terminal`, as shown in the following figure:
+## 2. Install Python and the required Python packages using `conda`
For scientific and data analysis, we recommend to use `conda`, a command line tool for package and environment management (<https://docs.conda.io/projects/conda/>).
+`conda` allows us to install a Python distribution with the scientific libraries we will use in this course (this recommendation applies to all platforms: Windows, Linux and Mac).

-Type following command + ENTER-button (make sure you have an internet connection):
+### 2.1 Install `conda`

-```
-conda update -n base conda
-```
-and respond with *Yes* by typing `y`. Packages should be updated after the completion of the command.
+#### Option 1: I do not have `conda` installed

-### Setup after Anaconda installation
+We recommend using the installer provided by the conda-forge community: <https://conda-forge.org/download/>.

-As not all packages we will use in the course are provided by default as part of Anaconda, we have to add the package to Anaconda to get started.
-As a good practice, we will create a new _conda environment_ to work with. This environment will contain the required packages on which this
-course depends.
+Follow the instructions on that page, i.e. first download the appropriate installer (depending on your operating system), and then run that installer.
+
+On Windows, this will mean double-clicking the downloaded `.exe` file, and following the instructions. During installation, choose the options (click checkbox):
+
+- '_Register Miniforge3 as my default Python 3.12_' (in case this returns an error about an existing Python 3.12 installation, remove the existing Python installation using [Windows Control Panel](https://support.microsoft.com/en-us/windows/uninstall-or-remove-apps-and-programs-in-windows-4b55f974-2cc6-2d2b-d092-5905080eaf98)).
+- '_Clear the package cache upon completion_'.
+
+On macOS or Linux, you have to open a terminal, and run `bash Miniforge3-$(uname)-$(uname -m).sh`
+
+#### Option 2: I already have `conda`, Anaconda or Miniconda installed
+
+When you already have an installation of `conda` or Anaconda, you have to make sure you are working with a recent version. If you installed it only a
+few months ago, this step is probably not needed, otherwise follow the next steps:
+
+1. Open a terminal window (on Windows, use the dedicated "Anaconda Prompt" or "Miniforge Prompt", via Start Menu)
+2. Run `conda update conda`, by typing that command and hitting the ENTER-button
+(make sure you have an internet connection), and respond with *Yes* by typing `y`.
+3. Run `conda config --add channels conda-forge`, by typing that command and hitting the ENTER-button
+4. Run `conda config --set channel_priority strict`, by typing that command and hitting the ENTER-button
+
+If you are using Anaconda on Windows, replace "Miniforge Prompt" with "Anaconda Prompt" each time in the following sections.
+
+### 2.2 Setup after `conda` installation
+
+Now we will use `conda` to install the Python packages we are going to use
+throughout this course.
+As a good practice, we will create a new _conda environment_ to work with.

 The packages used in the course are listed in
 an [`environment.yml` file](https://raw.githubusercontent.com/jorisvandenbossche/DS-python-data-analysis/main/environment.yml). The file looks as follows:

 ```
 name: DS-python
 channels:
-- defaults
 - conda-forge
 dependencies:
-- python=3.11
-- ipython
-- jupyter
+- python=3.12
+- geopandas
 - ...
 ```

@@ -78,45 +96,32 @@ The file contains information on:
 - `channels` to define where to download the packages from
 - `dependencies` contains each of the packages
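
As a rough illustration of these fields, the sketch below reads `name`, `channels` and `dependencies` from a flat environment file using only the Python standard library. The function name `read_environment_summary` and the sample text are assumptions made for this example. It does not handle nested entries, and a real YAML parser would be the more robust choice.

```python
def read_environment_summary(text):
    """Collect top-level keys and simple '- item' lists from a flat environment file.

    Illustrative only: nested structures are not supported.
    """
    summary = {}
    current_key = None  # key whose list items we are currently collecting
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("- "):
            if current_key is not None:
                summary[current_key].append(line[2:].strip())
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            summary[key] = value  # scalar entry such as "name: DS-python"
            current_key = None
        else:
            summary[key] = []  # start of a list such as "channels:"
            current_key = key
    return summary


# Sample text limited to the entries visible in the file excerpt above.
SAMPLE = """\
name: DS-python
channels:
- conda-forge
dependencies:
- python=3.12
- geopandas
"""

if __name__ == "__main__":
    info = read_environment_summary(SAMPLE)
    print("environment name:", info["name"])
    print("channels:", info["channels"])
    print("dependencies:", info["dependencies"])
```

In practice `conda` reads the file itself, so a helper like this is only useful for a quick look at what an environment file declares.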

-To download the environment file, click to go to
-the [environment.yml](https://raw.githubusercontent.com/jorisvandenbossche/DS-python-data-analysis/main/environment.yml) online. Once opened in the
-browser, right-click and save the file/page on your computer. The specific text depends on your browser (`Save page as...`, `Save as...`).
+The environment.yml file for this course is included in the course material you
+downloaded.

-__WARNING !__ Make sure you save the file as `environment.yml` instead of `environment.yml.txt` which, specifically on Windows operating system,
-might be the default option. To do so, choose for 'save as type' _All Files_ instead of 'Text Document'.
+Now we can create the environment:

-
+1. Open the terminal window (on Windows use "Miniforge Prompt", open it via Start Menu > 'Miniforge Prompt')
+2. Navigate to the directory where you downloaded the course materials (that directory should contain an `environment.yml` file, double check in your file explorer):

-You will need the folder/directory containing the `environment.yml` file in the next step. Make sure you know where you stored the file on
-your computer, e.g. when stored in the folder `C:/Users/yourusername/Documents` you should see the file `environment.yml` in File Explorer
-in that directory.
+```
+cd FOLDER_PATH_TO_COURSE_MATERIAL
+```
+(Make sure to hit the ENTER-button to run the command)

-Next, start the Anaconda Navigator program (for windows users: Start > Anaconda Navigator) and go to the Environments tab. You should see
-the *base (root) environment*, click the arrow next to it and click `Open terminal`, as shown in the following figure:
+3. Create the environment by typing the following commands line by line + hitting the ENTER-button (make sure you have an internet connection):
Type following commands line by line + ENTER-button (make sure you have an internet connection):
-
-```
-conda install -n base conda-libmamba-solver
-conda config --set solver libmamba
-conda config --add channels conda-forge
-conda config --set channel_priority strict
-cd FOLDER_PATH_TO_ENVIRONMENT_FILE
-conda env create -f environment.yml
-```
+__!__ `FOLDER_PATH_TO_COURSE_MATERIAL` should be replaced by the path to the folder containing the downloaded course materials (e.g. in the example it is `C:/Users/yourusername/Documents/DS-python-data-analysis`)
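
To double check the folder before creating the environment, a minimal sketch in Python could look like this. The helper name `has_environment_file` is hypothetical, and the check only looks for a file called `environment.yml` directly inside the given folder.

```python
from pathlib import Path


def has_environment_file(folder):
    """Return True if `folder` directly contains a file named environment.yml."""
    return (Path(folder) / "environment.yml").is_file()


if __name__ == "__main__":
    # Assumption: the script is run from the folder you navigated to with `cd`.
    folder = Path.cwd()
    if has_environment_file(folder):
        print(f"Found environment.yml in {folder}")
    else:
        print(f"No environment.yml in {folder}, check the folder path")
```

If the file is reported as missing, the most likely cause is that the terminal is not in the course folder yet.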

-__!__`FOLDER_PATH_TO_ENVIRONMENT_FILE` should be replaced by the path to the folder containing the downloaded environment file. In the
-example earlier, this was `C:/Users/yourusername/Documents`, but make sure you use your specific folder (as seen in File Explorer).
+__!__ You can safely ignore the warning `FutureWarning: 'remote_definition'...`.

-Respond with *Yes* by typing `y` when asked. Output will be printed and if no error occurs, you should have the environment configured
-with all packages installed.
+Respond with *Yes* by typing `y` when asked. Output will be printed and if no error occurs, you should have the environment configured with all packages installed.

-**Note:** If you did use Miniconda instead, create the environment using the same commands/instructions in the terminal (make sure to
-do the `conda config ...` steps.).
-
-When finished, keep the terminal window open (or reopen it). Execute the following commands to check your installation:
+When finished, keep the terminal window (or "Miniforge Prompt") open (or reopen it). Execute the following commands to check your installation:

 ```
 conda activate DS-python
@@ -132,34 +137,13 @@ import matplotlib

 If no message is returned, you're all set! If a message (probably an error) is returned, contact the instructors and copy-paste the message returned.

-
-## 2. (_first day of the course_) Getting the course materials
-
-### Option 1: You are a git user?
-
-As the course has been setup as a [git](https://git-scm.com/) repository managed on [Github](https://github.com/jorisvandenbossche/DS-python-data-analysis),
-you can clone the entire course to your local machine. Use the command line to clone the repository and go into the course folder:
After the download, unzip on the location you prefer within your user account (e.g. `My Documents`, not `C:\`).
-
-__Note:__ Make sure you know where you stored the course material, e.g. `C:/Users/yourusername/Documents/DS-python-data-analysis`
-
-## 3. (_first day of the course_) Test your configuration
+## 3. Test your configuration

 To check if your packages are properly installed, open the Conda Terminal again (see above) and navigate to the course directory:

@@ -187,12 +171,14 @@ When all checkmarks are ok, you're ready to go!

 ## 4. (_start of day during course_) Starting Jupyter Notebook with Jupyter Lab

-Each of the course modules is set up as a [Jupyter notebook](http://jupyter.org/), an interactive environment to write and run code. It is
-no problem if you never used jupyter notebooks before as an introduction to notebooks is part of the course.
+Each of the course modules is set up as a [Jupyter notebook](http://jupyter.org/), an interactive environment to write and run code. It is no problem if you have never used Jupyter notebooks before, as an introduction to notebooks is part of the course.

-### Option 1: Using the command line

-* In the terminal, navigate to the `DS-python-data-analysis` directory (downloaded or cloned in the previous section)
+* In the terminal (or "Miniforge Prompt"), navigate to the `DS-python-data-analysis` directory (downloaded or cloned in the previous section)
+
+```
+cd FOLDER_PATH_TO_COURSE_MATERIAL
+```

 * Ensure that the correct environment is activated.

@@ -206,14 +192,6 @@ no problem if you never used jupyter notebooks before as an introduction to note
 jupyter lab
 ```

-### Option 2: Using Anaconda Navigator
-
-In the Anaconda Navigator *Home* tab, first switch to the course environment, called `DS-python` in the selection bar. Next,
-select the Launch button under the Jupyter Lab icon:
This will open a browser window automatically. Navigate to the course directory (if not already there) and choose the `notebooks` folder to
-access the individual notebooks containing the course material.
+This will open a browser window automatically. Navigate to the course directory (if not already there) and choose the `notebooks` folder to access the individual notebooks containing the course material.
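
If Jupyter Lab does not start as expected, a small pre-flight check can help narrow down the cause. The following is a minimal sketch, not part of the course material: the helper names are hypothetical, the package list is taken from the packages named in the introduction, and it assumes that `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable to the name of the active environment.

```python
import importlib.util
import os

COURSE_ENV = "DS-python"  # environment name used in this course
PACKAGES = ("pandas", "matplotlib", "seaborn")  # packages named in the introduction


def environment_is_active(expected=COURSE_ENV, environ=None):
    """Return True if the active conda environment has the expected name."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") == expected


def missing_packages(names=PACKAGES):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not environment_is_active():
        print(f"The {COURSE_ENV} environment does not seem to be active")
    missing = missing_packages()
    if missing:
        print("Packages not found:", ", ".join(missing))
    else:
        print("All listed packages were found")
```

Running this from the same terminal in which you intend to type `jupyter lab` shows whether the environment was activated there.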