Course aim
The course introduces a variety of central algorithms and methods essential for performing scientific data analysis using statistical inference and machine learning. Much emphasis is put on practical applications of Bayesian inference in the natural and engineering sciences, i.e. the ability to quantify the strength of inductive inference from facts (such as experimental data) to propositions such as scientific hypotheses and models.
The course is project-based, and the students will be exposed to fundamental research problems through the various projects, with the aim of reproducing state-of-the-art scientific results. The students will use the Python programming language, with relevant open-source libraries, and will learn to develop and structure computer code for scientific data analysis projects.
Course design
- Lectures with computer demonstrations: background and theory, discussion problems, computer demonstrations using Jupyter notebooks.
- Supervised computational exercises: group work on exercises, problem sets, and numerical projects in the computer lab with supervision.
- Exercise- and project-based learning through work on analytical and numerical exercises and problem sets plus computational projects with written reports.
Computer lab sessions
The computer lab sessions should be used for working on projects, exercises, and problem sets. You will have the opportunity to discuss your work with the supervisors and with your fellow students. The main computer lab session on Thursdays will typically start with a demonstration or feedback discussion led by one of the teaching assistants.
Concerning which computer to use, you have three options:
- Use one of the Linux computers, e.g., in rooms F-T7203, F-T7204, FT4011. The first two of these rooms are reserved for us on Thursdays from 8-12 but all computers can be used when available.
- Log in remotely to one of the Chalmers StuDAT Linux computers. See instructions at https://chalmers.topdesk.net/. Depending on your platform, find the relevant set of instructions for "How do I remotely access a StuDAT Linux computer from my [select platform]". You might want to use the /bin/bash shell rather than the default /bin/sh.
- Use your personal computer.
General recommendations
- See detailed Getting started instructions.
- Weekly reading assignments and exercises provide hints to solve the projects and problem sets.
- Try to establish a practice where you log your work with the exercises and projects. You may find such a log book very handy at later stages in your work, especially when you don't properly remember what a previous test version of your program did. Here you could also record the time spent on solving the exercise, various algorithms you may have tested, or questions that you would like to discuss further with your lab partner or the supervisor.
- The course assumes a solid background in undergraduate mathematics (multivariable analysis, linear algebra, mathematical statistics). You might have to refresh your knowledge in these subjects by referring to undergraduate mathematics textbooks.
- We will use the Python programming language. In particular, we will use Jupyter notebooks when working on numerical exercises, problems, and projects. The exercises are constructed to help you get started with Jupyter notebooks, and we provide suggestions for useful electronic resources. You might have to use these references throughout the course, and you're also encouraged to discuss with the teaching assistants.
- We will briefly discuss version control approaches for code development during this course. The use of such an approach is, however, not an official learning outcome in this course and will not be required practice for your work on numerical projects. Still, you are encouraged to try this approach, and the teachers will be happy to discuss implementations and recommended workflows. Version control is the most ethical approach to computational research. All files that are associated with this course are available via a public repository on GitHub that you are welcome to clone.
Changes
Course changes since last year:
- Revised learning objectives and course content.
- Updated problem sets.
- Preparatory material for refreshing prerequisite knowledge.
Learning objectives
After completion of the course the student should be able to:
- plan and perform scientific data analysis with methods from Bayesian statistics.
- simulate multivariate probability distributions with MCMC methods.
- quantify and critically assess uncertainties of model parameters via statistical inference.
- understand and numerically implement several probabilistic algorithms used in data analysis and machine learning.
- address open questions in scientific data analysis and perform numerical studies using Python as a programming language.
- write well-structured technical reports where results and conclusions from a scientific data analysis are communicated in a clear way.
- maintain a scientific and ethical conduct in the process of modeling, analyzing data and writing computer programs.
Examination
The final grade is based on the performance on problem sets (performed individually) and the graded reports on projects (performed in groups of two students). There is no written exam. Ethical aspects will be explicitly examined.
General examination rules for both problem sets and projects
- Students are allowed to discuss together and help each other when solving the problems and working on projects. However, every student must understand their submitted solution in the sense that they should be able to explain and discuss it in detail with a peer or with a teacher.
- While discussions with your peers are allowed (even encouraged), direct plagiarism is not. Every student must reach their own understanding of submitted solutions according to the definition in the previous point.
- The use of coding assistance from code generating artificial intelligence tools is allowed. However, every student must reach their own understanding of submitted solutions (including employed algorithms) according to the definition in the first point.
Problem sets
There are three problem sets. These are strongly connected to the exercises that you will be working on in the computer labs.
- Each set contains a number of basic questions and a number of extra (often more advanced) ones. Most tasks are computational and it is recommended to work on them during the computer lab sessions.
- The problem sets are examined individually via:
  - Hand-in of solution code (Jupyter notebook) via Canvas. The code and results in your submitted notebook are considered to be your hand-in solution. See specific deadlines on the corresponding Canvas Assignments. Note that there are separate Canvas Assignments for basic and extra problems.
  - Code tests for certain (marked) tasks via Yata Questions. You need to have a green check mark on Yata to get the corresponding points. See specific deadlines on the corresponding Yata Questions assignment.
  - Face-to-face discussions with a randomly selected subset of students on the first Monday after the problem set deadline. Selected students will meet with one of the teachers and are expected to answer questions on their code implementation and results. No extra preparation is needed for these discussions apart from familiarity with your own solution. A list of randomly selected students will be published on the course web page around Monday noon. During the afternoon session that same day, students will be called in the numbered order until the end of the list (or the end of the exercise session). If you cannot be physically present at the exercise session, you must inform the responsible teacher as soon as possible after the student list is published (in which case we will have the discussion on Zoom).
- An oral examination (on all aspects of the course) will be arranged during the exam week for students that do not show up for their discussion slot, or that fail to demonstrate familiarity with their hand-in solutions.
- Problem sets that are handed in after the deadline will only be eligible for points corresponding to the minimum required for a passing grade (for Chalmers PhD students this corresponds to grade 4). Late hand-ins will be graded in December/January (if handed in before Nov. 30th) or in August/September (if handed in before July 31st).
Projects
There are two computational projects. Each project contains a basic and an extra (optional) task.
- The projects are performed in groups of two students. Join a project group via Canvas.
- Each group has to hand in their solution code and a written report (pdf format) that will be graded. See detailed instructions in the problem formulations. See specific deadlines on the corresponding Canvas Assignments.
- Projects that are handed in after the deadline will only be eligible for points corresponding to the minimum required for a passing grade (for Chalmers PhD students this corresponds to grade 4). Late hand-ins will be graded in December/January (if handed in before Nov. 30th) or in August/September (if handed in before July 31st).
Grading system
- Pass (grade 3 / G / E)
  - In order to pass the course you need to have a minimum number of points per task (4 points / problem set and 6 points / project), and you need ≥30 points in total.
- Pass with distinction
  - In order to pass the course with distinction (high grade) you need a minimum number of points per task (6 points / problem set and 9 points / project) and a certain number of points in total (see table below).
The final grade, given that the above minimum-point requirements per task are fulfilled, is determined according to:
Grade table
| Total points (max 100) | Minimum points per set | Minimum points per project | Grade (Chalmers) | Grade (GU) | Grade (ECTS) |
|---|---|---|---|---|---|
| ≥ 80 | 6 | 9 | 5 | VG | A |
| 70 - 79 | 6 | 9 | 4 | VG | B |
| 60 - 69 | 6 | 9 | 4 | G | C |
| 50 - 59 | 4 | 6 | 3 | G | D |
| 30 - 49 | 4 | 6 | 3 | G | E |
| < 30 | or < 4 | or < 6 | U | U | F |
Note that ECTS grades are not implemented at Chalmers. The above table just gives an indication of the approximate correspondence.
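As an illustration only, the Chalmers column of the grade table can be read as the following Python sketch. The function name `chalmers_grade` and the example point values are hypothetical. The fallback to grade 3 when the total is 60 or more but the per-task minima for distinction are not met is an assumption, since the table does not state that case explicitly.

```python
def chalmers_grade(set_points, project_points):
    """Return the Chalmers grade ("U", "3", "4" or "5") as a string.

    set_points: points awarded for each problem set.
    project_points: points awarded for each project.
    """
    total = sum(set_points) + sum(project_points)
    min_set = min(set_points)
    min_project = min(project_points)

    # Passing requires at least 4 points per problem set,
    # 6 points per project, and 30 points in total.
    if total < 30 or min_set < 4 or min_project < 6:
        return "U"

    # Grades 4 and 5 also require 6 points per set and 9 per project.
    if min_set >= 6 and min_project >= 9:
        if total >= 80:
            return "5"
        if total >= 60:
            return "4"

    # Assumption: all remaining passing cases give grade 3.
    return "3"


# Hypothetical point values for three problem sets and two projects.
print(chalmers_grade([10, 10, 10], [20, 20]))
```

Late hand-ins are not modelled here, because they only affect how many points a task can earn, not how the points map to a grade.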
Chalmers and GU student portals
Links to the course syllabus at the Chalmers and Gothenburg University student portals: