INTRODUCTION TO STATISTICAL LEARNING

Academic year
2025/2026
Course title in English
INTRODUCTION TO STATISTICAL LEARNING
Course code
LT9054 (AF:576177 AR:323369)
Language of instruction
English
Mode of delivery
In person
University credits (CFU)
6
Degree level
Bachelor's degree
Scientific-disciplinary sector
SECS-S/01
Term
4th term
Year of study
3
This course introduces statistical methods for data analysis and prediction, which are categorized into supervised and unsupervised learning. Supervised statistical learning involves the development of a model to predict or estimate an output based on one or more input variables. This approach is widely applicable across various domains, including business, medicine, astrophysics, and public policy. In contrast, unsupervised statistical learning deals with data that consists solely of input variables, without a predefined output, enabling the identification of underlying patterns and relationships within the data.
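As a minimal illustration of the supervised setting described above, the sketch below fits a straight line by least squares to predict an output from a single input. It is written in Python using only the standard library, purely for illustration (the course itself uses R). The toy data and the function names `fit_line` and `predict` are hypothetical and are not part of the course material.

```python
from statistics import mean


def fit_line(x, y):
    """Return the least-squares intercept and slope for y ≈ b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def predict(b0, b1, x_new):
    """Predict the output for a new input value using the fitted line."""
    return b0 + b1 * x_new


# Hypothetical toy data: one input variable and one output variable
budget = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(budget, sales)
prediction = predict(b0, b1, 6.0)
print(f"intercept={b0:.2f}, slope={b1:.2f}, prediction at 6.0={prediction:.2f}")
```

An unsupervised method would instead receive only the input values, with no output to predict, and look for structure among them.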
The course aims to introduce students to statistical learning methods through practical applications in marketing, finance, biology, and other fields. The objective is to equip students with the skills to effectively analyze data by implementing statistical learning methods using the statistical software R.
Although the course draws on algebra, mathematics, probability, statistics, and programming, it is introductory and accessible to students from diverse backgrounds. A basic mathematical foundation is sufficient; no advanced mathematics is required. An elementary knowledge of statistics (e.g., Introduction to Probability for Economics) is recommended but not required. Familiarity with linear regression and prior exposure to a programming language such as R or Python are helpful but not necessary. No detailed knowledge of matrix operations is expected.
1. Introduction to statistical learning and R programming.
2. Supervised learning: the bias-variance trade-off, linear regression, tree-based methods.
3. Unsupervised learning: dimensionality reduction, principal components analysis and matrix completion.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. 2nd ed. New York: Springer.
The final assessment consists of two components. The first component is a data analysis project that evaluates students’ ability to develop predictive models. Students must submit their predictions, the code used to reproduce them, and a report of no more than four pages describing the analysis. The project must be uploaded to the course Moodle page before the exam date, and only one submission per academic year is allowed. Examples of submissions are available on the Ca’ Foscari Moodle e-learning platform. The second component is an oral examination. This may include a discussion of the submitted project as well as questions on topics covered in class, including exercises and programming. Students are required to bring their own laptop for the oral exam.
Exam format: written and oral

The instructor has a duty to ensure that the rules on the authenticity and originality of examination work are respected. Consequently, where irregular conduct is suspected, the exam may include a further assessment, held at the same time as the exam, which may be conducted in a format different from those described above.

The exam is graded on a scale from 0 to 30, with a minimum passing score of 18. The final grade is based on the evaluation of the following elements, as demonstrated by the student in both the data analysis project and the oral examination: understanding of the fundamental concepts of the subject; ability to solve both conceptual and applied exercises; and problem-solving and programming skills. The grading scale is defined as follows: barely sufficient performance, with several gaps, corresponds to grades of 18–20; fully sufficient performance corresponds to grades of 21–23; good performance corresponds to grades of 24–26; excellent performance corresponds to grades of 27–30; and outstanding performance is awarded 30 cum laude.
The course will be delivered in a lecture-style format, with select sessions dedicated to programming. Students must bring their own laptop to these sessions and have R and RStudio installed.
Students are encouraged to register for the course on the Moodle platform (moodle.unive.it), where they can find supplementary materials.
This syllabus is still provisional and may be subject to change.
Syllabus last modified: 07/04/2026