| 1 | +# Data manipulation, analysis and visualisation in Python |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +Handling data is a recurring task for data analysts. Reading in experimental data, checking its properties, |
| 6 | +and creating visualisations can become tedious. Making this process more efficient therefore benefits many professionals |
| 7 | +who handle data. Spreadsheet-based software supports this process poorly because it lacks automation and repeatability. |
| 8 | +A high-level scripting language such as Python is well suited to these tasks. |
| 9 | + |
| 10 | +This course trains participants to use Python effectively for these tasks. It focuses on manipulating and cleaning tabular data, |
| 11 | +exploratory analysis and visualisation using important packages such as Pandas, NumPy, Matplotlib and Seaborn. |
| 12 | + |
| 13 | +The course does not cover statistics, data mining, machine learning, or predictive modelling. It aims to give participants the means to tackle |
| 14 | +commonly encountered data handling tasks effectively and so work more efficiently overall. These skills are useful for both data cleaning and |
| 15 | +feature engineering. |
| 16 | + |
| 17 | +The course was developed as a specialist course for the Doctoral Schools of Ghent University, but can be taught to others upon request. |
| 18 | + |
| 19 | +## Course info |
| 20 | + |
| 21 | +### Aim & scope |
| 22 | + |
| 23 | +This course is intended for researchers who have at least basic programming skills. A basic (scientific) programming course that is part of |
| 24 | +the regular curriculum should suffice. Those with experience in another programming language (e.g. MATLAB, R, ...) are advised to follow a Python |
| 25 | +tutorial before the course. |
| 26 | + |
| 27 | +The course is intended for professionals who wish to enhance their general data manipulation and visualisation skills in Python, with a specific |
| 28 | +focus on tabular data. The course is NOT intended to be a course on statistics or machine learning. |
| 29 | + |
| 30 | +### Program |
| 31 | + |
| 32 | +After setting up the programming environment with the required packages using the conda package manager and an introduction to the Jupyter |
| 33 | +notebook environment, the data analysis package Pandas and the plotting packages Matplotlib and Seaborn are introduced. Advanced usage of Pandas |
| 34 | +for different data cleaning and manipulation tasks is then taught, and the acquired skills are immediately put into practice on real-world |
| 35 | +data sets. Applications include time series handling, categorical data, merging data, tidy data, ... |
| 36 | + |
| 37 | +The course closes with a discussion of the scientific Python ecosystem and the visualisation landscape, teaching |
| 38 | +participants to create interactive charts. |
| 39 | + |
| 40 | +## Getting started |
| 41 | + |
| 42 | +The course uses Python 3 and data analysis packages such as Pandas, Seaborn, NumPy and Matplotlib. To install the required libraries, |
| 43 | +we recommend Anaconda or Miniconda ([https://www.anaconda.com/download/](https://www.anaconda.com/download/)) or another Python distribution that |
| 44 | +includes the scientific libraries (this recommendation applies to all platforms: Windows, Linux and Mac). |
| 45 | + |
| 46 | +For detailed instructions to get started on your local machine, see the [setup instructions](./setup.html). |
| 47 | + |
| 48 | +If you do not want to install everything and just want to try out the course material, use the environment set up by |
| 49 | +[Binder](https://mybinder.org/v2/gh/jorisvandenbossche/DS-python-data-analysis/HEAD) and open the notebooks |
| 50 | +right away (inside the `notebooks` directory). |
| 51 | + |
| 52 | +## Slides |
| 53 | + |
| 54 | +For the course slides, click [here](https://jorisvandenbossche.github.io/DS-python-data-analysis/slides.html). |
| 55 | + |
| 56 | +## Contributing |
| 57 | + |
| 58 | +Found a typo or have a suggestion? See [how to contribute](./contributing.html). |
| 59 | + |
| 60 | +## Meta |
| 61 | +Authors: Joris Van den Bossche, Stijn Van Hoey |
| 62 | + |
| 63 | +<img src="./static/img/logo_flanders+richtingmorgen.png" width="79%"> |
| 64 | +<img src="./static/img/doctoralschoolsprofiel_hq_rgb_web.png" width="20%"> |
| 65 | + |