LAB OF INFORMATION SYSTEMS AND ANALYTICS
|Academic year||2020/2021|
|Official course title||LAB OF INFORMATION SYSTEMS AND ANALYTICS|
|Course code||ET7008 (AF:304951 AR:170894)|
|Modality||On campus classes|
|Degree level||Bachelor's Degree Programme|
|Educational sector code||INF/01|
Students will achieve the following learning outcomes:
Knowledge and understanding: i) understanding the principles of unsupervised learning; ii) understanding the principles of supervised learning.
Applying knowledge and understanding: i) being able to apply supervised and unsupervised analysis techniques; ii) being able to use data analysis software tools (e.g., scikit-learn).
Communication: i) reporting a comprehensive comparative analysis of different data analysis methods.
2. Similarity Search in Text
- Text representation; Tokenization, Stemming, Lemmatization; Vector space; Similarity measures;
3. Collaborative Filtering
- Content-based, item-based collaborative filtering
- Centroid-based clustering; Hierarchical clustering; Agglomerative clustering; Density-based clustering; Quality evaluation;
5. Supervised Learning
- Model training, validation and tuning; Classification; Regression; Feature Engineering; Decision Trees;
6. Ensemble methods
- Bagging and Boosting; Bias vs. Variance trade-off; Over-fitting and Under-fitting; Random Forest
- Lecture notes. Selected readings provided during the course.
The exercises require students to apply data analysis methods to a given dataset of limited complexity.
The project requires students to conduct a comparative analysis of different tools applied to a specific dataset or problem.
The student must choose and justify the most appropriate solution and deliver a report discussing a comparative analysis of the chosen methods.