Academic year 2020/2021
Course code ET7008 (AF:304951 AR:170894)
Modality For teaching methods (in presence/online) please check the timetable
ECTS credits 6
Degree level Bachelor's Degree Programme
Educational sector code INF/01
Period 4th Term
Course year 2
Contribution of the course to the overall degree programme goals
One of the main roles of a Digital Manager is to exploit available resources in order to manage digitalization projects and to implement digital solutions in the field of information technology.
In particular, students should be able to exploit modern AI approaches to extract meaningful information from raw data of various kinds.
Such competences require solid theoretical and practical knowledge of data analysis.

The goal of this course is to teach students methods and technologies for effective data analysis, discussing the fundamental techniques for predictive and descriptive analysis of data.
During the lectures, several tools and techniques will be presented from both a theoretical and a practical perspective, so that students will be able to compare such tools and extract knowledge from the presented datasets.
The results of these analyses then serve as a starting point for further decisions and considerations.
Expected learning outcomes
At the end of the course, students should be able to apply the appropriate learning and descriptive techniques to data and to manage the tools presented during lectures.
Students should also be able to produce a comparative analysis report, including data representation.

Students will achieve the following learning outcomes, divided into three main areas:

1. Knowledge and understanding:
- understanding the theoretical bases of the main algorithms presented during lectures;
- understanding principles and differences of unsupervised learning algorithms;
- understanding principles and differences of supervised learning algorithms.

2. Applying knowledge and understanding in practical situations:
- being able to apply proper supervised and unsupervised analysis techniques to data;
- being able to use data analysis software tools used during lectures (e.g., scikit-learn);
- being able to compare and correctly interpret the results produced by different algorithms.

3. Communication:
- reporting a comprehensive comparative analysis of different data analysis methods;
- being able to present results with appropriate figures and diagrams.
Students should have achieved the learning outcomes of courses "Introduction to Coding and Data Management" and "Probability and Statistics".
1. Introduction to Data Science
2. Similarity Search in Text
- Text representation; Tokenization, Stemming, Lemmatization; Vector space; Similarity measures;
3. Collaborative Filtering (content-based, item-based)
4. Clustering:
- Centroid-based clustering; Hierarchical clustering; Agglomerative clustering; Density-based clustering; Quality evaluation;
5. Supervised Learning
- Model training, validation and tuning; Classification; Regression; Feature Engineering; Decision Trees;
6. Ensemble methods
- Bagging and Boosting; Bias vs. Variance trade-off; Over-fitting and Under-fitting; Random Forest
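As a rough illustration of how topics 5 and 6 fit together, the following sketch compares a single decision tree with a random forest (a bagging ensemble of trees) using 5-fold cross-validation in scikit-learn. The dataset (`load_breast_cancer`) and all hyperparameters are assumptions chosen for the example, not material taken from the course.

```python
# Illustrative sketch only: dataset and hyperparameters are assumptions,
# not taken from the course material.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Small built-in binary classification dataset.
X, y = load_breast_cancer(return_X_y=True)

models = {
    # A single fully grown tree: low bias, high variance.
    "decision tree": DecisionTreeClassifier(random_state=0),
    # Bagging of many randomized trees, which typically reduces variance.
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

mean_accuracy = {}
for name, model in models.items():
    # 5-fold cross-validation; the default score for classifiers is accuracy.
    scores = cross_val_score(model, X, y, cv=5)
    mean_accuracy[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Comparing the mean and the spread of the fold scores is one simple way to discuss the bias vs. variance trade-off between a single model and an ensemble.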
Reference texts
- J. VanderPlas. Python Data Science Handbook. O’Reilly, 2016.
- Lecture notes. Selected readings provided during the course.
Assessment methods
Learning outcomes are verified by a set of exercises/reports and by a written exam.

The exercises require students to apply data analysis methods to a given dataset of limited complexity.

The project requires students to conduct a comparative analysis of different tools applied to a specific dataset or problem.
The student must choose and justify the most appropriate solution and deliver a report discussing a comparative analysis of the chosen methods.
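
The sketch below is one possible shape for such a comparative analysis and is not an official template. It compares three clustering approaches from the course contents (centroid-based, agglomerative, density-based) on synthetic data, using the silhouette coefficient as the quality measure. The data, the parameter values, and the choice of metric are all assumptions made for illustration.

```python
# Illustrative sketch only: synthetic data and parameters are assumptions.
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic clusters in two dimensions.
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.8,
    random_state=0,
)

methods = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=3),
    "DBSCAN": DBSCAN(eps=1.0, min_samples=5),
}

silhouette = {}
for name, method in methods.items():
    labels = method.fit_predict(X)
    # DBSCAN labels noise points as -1; leave them out of the evaluation.
    mask = labels != -1
    if len(set(labels[mask])) >= 2:
        silhouette[name] = silhouette_score(X[mask], labels[mask])
    else:
        # The silhouette coefficient is undefined for fewer than two clusters.
        silhouette[name] = float("nan")
    print(f"{name}: silhouette = {silhouette[name]:.3f}")
```

A report would normally go beyond a single number, for example by justifying the parameter choices and plotting the resulting clusters.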
Teaching methods
Lectures and hands-on sessions.

The following software tools will be used during the course: Jupyter, scikit-learn.
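
To give a sense of what a hands-on session with these tools might look like, here is a minimal sketch of similarity search in text with scikit-learn, runnable in a Jupyter notebook cell. It represents documents as TF-IDF vectors and ranks them by cosine similarity to a query. The example documents and the query are invented for illustration, and no stemming or lemmatization is applied.

```python
# Illustrative sketch only: the documents and the query are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "The cat sat on the mat.",
    "Dogs chase cats in the garden.",
    "Stock markets fell sharply today.",
    "Investors worry about falling stock prices.",
]
query = "falling stock prices"

# Tokenize, drop English stop words, and build TF-IDF vectors.
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)

# Project the query into the same vector space as the documents.
query_vector = vectorizer.transform([query])

# Cosine similarity between the query and every document.
similarities = cosine_similarity(query_vector, doc_vectors).ravel()

# Document indices sorted from most to least similar.
ranked = similarities.argsort()[::-1]
for idx in ranked:
    print(f"{similarities[idx]:.3f}  {documents[idx]}")
```

Because the vocabulary is built from exact tokens, "cat" and "cats" count as different terms here; stemming or lemmatization would be a natural next step.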
Teaching language
Type of exam
written and oral
Definitive programme.
Last update of the programme