Davide Gerosa (module lead) - davide.gerosa@unimib.it
Giulia Fumagalli (teaching assistant) - g.fumagalli47@campus.unimib.it
University of Milano-Bicocca, 2024. BSc in Artificial Intelligence.
Machine learning and data mining are quickly becoming essential techniques in (astro)physics. These powerful tools provide valuable insights into the laws governing natural processes and shed light on the information contained in experimental datasets. This lab provides a quick introduction to such topics, equipping students with the essential background to apply their data-science knowledge to core physical problems.
- Probability: Probability theory. Bayes' theorem. Descriptive statistics. (E: Monty Hall)
- Sampling: Bayesian vs frequentist statistics. From the pdf to the samples: inverse transform, acceptance/rejection. (E: Black holes)
- Density estimation: From the samples to the pdf: histograms, Kernel Density Estimation. (E: Exoplanets)
- Markov chains: Monte Carlo integration. Markov chains. (E: Weather forecast, N-dimensional balls)
- MCMC: Metropolis-Hastings. MCMC diagnostics. Modern samplers. (E: Time transients)
- Model selection: Bayesian model selection. Savage-Dickey density ratio. (E: Higgs Boson)
- Nested sampling: Computing the evidence. Nested sampling. Modern samplers. (E: Time transients & Higgs Boson)
- Project (E: The expansion of the Universe)
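To give a flavour of the "from the pdf to the samples" topic listed above, here is a minimal sketch of inverse transform sampling. It assumes an exponential target distribution, chosen purely for illustration (it is not one of the course exercises), and the function name `sample_exponential` and its parameters are hypothetical. It uses only the standard library so that it runs on any Python installation.

```python
import math
import random


def sample_exponential(rate, size, seed=None):
    """Draw `size` samples from p(x) = rate * exp(-rate * x), x >= 0.

    Inverse transform: the CDF is F(x) = 1 - exp(-rate * x), so solving
    u = F(x) for x gives x = -ln(1 - u) / rate with u uniform in [0, 1).
    """
    rng = random.Random(seed)  # seeded generator for reproducibility
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the log is finite
    return [-math.log(1.0 - rng.random()) / rate for _ in range(size)]


# Illustrative values only: rate and sample size are arbitrary choices
samples = sample_exponential(rate=2.0, size=100_000, seed=42)
sample_mean = sum(samples) / len(samples)
print(f"sample mean = {sample_mean:.3f} (expected about 1/rate = 0.5)")
```

The same recipe works for any distribution whose cumulative distribution function can be inverted; when it cannot, acceptance/rejection is the usual fallback.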
Here are some useful textbooks. The first one is easier, the second one is amazing, the third is short and sweet, the fourth and fifth ones are for the true Bayesians...
- "Machine Learning for Physics and Astronomy", Acquaviva, Princeton University Press, 2023.
- "Statistics, Data Mining, and Machine Learning in Astronomy", Ivezić, Connolly, VanderPlas, and Gray. Princeton University Press, 2012.
- "Statistical Data Analysis", Cowan. Oxford Science Publications, 1997.
- "Data Analysis: A Bayesian Tutorial", Sivia and Skilling. Oxford Science Publications, 2006.
- "Bayesian Data Analysis", Gelman, Carlin, Stern, Dunson, Vehtari, and Rubin. Chapman & Hall, 2013. Free!
We will make heavy use of the Python programming language. If you need to refresh your Python skills, here are some catch-up resources and online tutorials.
- "Python for Scientific Computing", M. Zingale
- "Lectures on scientific computing with Python", R. Johansson et al.
- "Python Programming for Scientists", T. Robitaille et al.
- "Learning Scientific Programming with Python", Hill, Cambridge University Press, 2020. Supporting code: scipython.com.
Classes are on Wednesdays from 8.30am to 12.30pm for a total of 36 hours. Here is our timetable:
- 06-03-24. U7-Lab712.
- 13-03-24. U7-Lab712.
- 20-03-24. U7-Lab712.
- 27-03-24. U7-Lab712.
- 03-04-24. No lecture, Davide and Giulia are away for research.
- 10-04-24. U7-Lab712.
- 17-04-24. U7-Lab712.
- 24-04-24. U7-Lab712.
- 01-05-24. Holiday.
- 08-05-24. U9a-Lab909.
- 15-05-24. U7-Lab714.
- 22-05-24. U7-Lab714. Question time.
- 29-05-24. U7-Lab714. [Extra slot in case we skip one lecture...].
- 05-06-24. U7-Lab714. [Extra slot in case we skip one lecture...].
- 12-06-24. U7-Lab714. [Extra slot in case we skip one lecture...].
Each class will consist of about 1h of lecture and about 3h of coding. You will need to run Python; see here for instructions.
Each lecture has an exercise at the end (actually, most of the time in class is dedicated to completing these problems!). At the end of the class, you will have to submit your code showcasing what you've done on these problems. There will be no oral exams. The outcome will be provided as a passed / not passed statement (no numerical grade).
To submit your code, register on github.com, create a new private repository called machinelearning4physics_bicocca_2024_solutions, upload your files, and share it with the two of us (usernames dgerosa and gfumagalli). See here for detailed instructions.
Exams will be cleared according to the nominal calendar available on the student service website.
Credit: xkcd 1831.
