Final report: report.pdf
The main goal of this project was to create a comprehensive report explaining concepts of statistical data analysis applied to an existing dataset.
- The choice of statistical methods was flexible, as long as they were relevant and covered in the course curriculum.
- The report included test cases, either from the recommended list provided by faculty or created by team members.
- The R language was used for data analysis and report generation.
Project evaluation was based on:
- Report quality
- Oral examination testing knowledge of theoretical concepts (e.g., when to use a test, test assumptions, and method details).
The dataset consists of survey responses and student grades in mathematics and Portuguese from two high schools.
Collecting this type of data is essential for analyzing and improving the quality of the education system.
More details: pdfs/dataset_documentation.pdf
Directory | Description |
---|---|
auditorne | Reference to existing problems and their solutions |
cheatsheets | Tidyverse cheat sheets in PDF format |
pdfs | Dataset and project descriptions |
src | R Markdown source files and dataset |
- Install RStudio
- Install R and tidyverse dependencies:
sudo apt-get install r-cran-curl r-cran-openssl r-cran-xml2 libxml2-dev
- Open RStudio → Open Project → `student-success-analysis.Rproj`
- The file `student-success-analysis.Rmd` should open automatically
  - If not, navigate to it in the Files panel and double-click it
- Run the first code chunk (`Ctrl + Shift + Enter`), which contains the `library` calls
  - A popup will prompt you to install the required packages → click Yes
  - Installation may take ~20 minutes
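
If the install popup does not appear, the missing packages can also be checked from the R console. The snippet below is only a sketch: `missing_packages` is a hypothetical helper, not part of the project, and the `required` vector is an assumption (the authoritative list is whatever the `library` calls in the first chunk load).

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumption: tidyverse is the main dependency; extend this vector as needed.
required <- c("tidyverse")
to_install <- missing_packages(required)

# Install only in an interactive session, so scripted runs never hit the network.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

`requireNamespace` only checks whether a package can be loaded, so nothing is attached to the search path as a side effect.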
KS Test:
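- If p is close to 1 → there is no evidence against the null hypothesis that the data come from the same distribution (this does not prove that they do)
- If p is close to 0 → there is strong evidence that the data come from different distributions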
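
A minimal sketch of how the two-sample KS test behaves, using simulated data rather than the project dataset. The sample sizes, means, and variable names are illustrative assumptions. A small p-value is evidence against the null hypothesis of a common distribution, while a large one only means the test found no evidence against it.

```r
# Two-sample Kolmogorov-Smirnov test on simulated (not project) data.
set.seed(42)

same_a  <- rnorm(200, mean = 0, sd = 1)  # sample from N(0, 1)
same_b  <- rnorm(200, mean = 0, sd = 1)  # second sample from N(0, 1)
shifted <- rnorm(200, mean = 1, sd = 1)  # sample from N(1, 1)

res_same <- ks.test(same_a, same_b)   # both samples share a distribution
res_diff <- ks.test(same_a, shifted)  # distributions differ by a mean shift

# Compare the two p-values; the second should be far smaller.
print(res_same$p.value)
print(res_diff$p.value)
```

`ks.test` comes from the `stats` package, which ships with base R, so no extra installation is needed to try this.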
- Spellcheck the report
- Write introduction to the problem