IMAGE AND VIDEO UNDERSTANDING

Academic year
2025/2026
Official course title
IMAGE AND VIDEO UNDERSTANDING
Course code
CM0524 (AF:576777 AR:323783)
Teaching language
English
Modality
On campus classes
ECTS credits
6
Degree level
Master's Degree Programme (DM270)
Academic Discipline
INF/01
Period
2nd Semester
Course year
1
Where
VENEZIA
The course introduces students to the principles, algorithms, and main applications of image and video understanding.
1. Knowledge and understanding
1.1. acquire the main models and algorithms of image and video understanding

2. Ability to apply knowledge and understanding
2.1. acquire the ability to apply the studied models to real problems
2.2. acquire the ability to critically assess the performance and the behavior of a model applied to a concrete problem

3. Judgement
3.1. ability to understand which characteristics of the various models of artificial intelligence are best suited to a given problem
3.2. ability to critically evaluate the theoretical characteristics of the proposed models
Students are expected to be familiar with the basic concepts of calculus, linear algebra, and statistics. Knowledge of Python and PyTorch is recommended.
Neural Network Models for Images and Video:
- Artificial Neural Networks (training, tricks, optimizers)
- Convolutional Neural Networks
- Transformer Architectures
- Graph Neural Networks

Image Analysis:
- Classification
- Segmentation
- Object Detection

Video Understanding:
- Video Object Segmentation
- Object Tracking

Human-Centered Computer Vision:
- Person detection
- Face detection
- Pose Estimation
- Person Re-Identification
- Trajectory Forecasting
- Action Recognition
- Group Detection

Generative AI:
- Autoencoders & Variational Autoencoders
- GANs
- Diffusion Models

Advanced Topics (tentative):
- Active Learning
- Anomaly Detection
- Multimodal Deep Learning
- Implicit Representation
- Scene Understanding
- R. Szeliski. Computer Vision: Algorithms and Applications. Springer.

- D. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Pearson.

- I. Goodfellow, Y. Bengio and A. Courville. Deep Learning. MIT Press.
The exam consists of an oral test (70% of the final mark) together with a discussion of a project (30% of the final mark) agreed upon with the teacher.
oral
A. Scores in the 18-22 range will be awarded in the presence of:
- Sufficient knowledge and ability to structure the project;
- Limited ability to justify implementation choices;
- Sufficient communication skills, especially in relation to the use of course-specific language.

B. Scores in the 23-26 range will be awarded in the presence of:
- Fair knowledge and ability to structure the project;
- Fair ability to collect and/or interpret data, proposing effective implementation solutions;
- Fair communication skills, especially in relation to the use of course-specific language.

C. Scores in the 27-30 range will be awarded in the presence of:
- Good or excellent knowledge and ability to structure the project;
- Good or excellent ability to collect and/or interpret data, proposing innovative implementation solutions;
- Fully appropriate communication skills, especially in relation to the use of course-specific language.

D. Honors (lode) will be awarded in the presence of excellent knowledge and applied understanding of the program, judgment skills, and communication abilities.
PowerPoint presentations and chalk talks.
To favor an "active" approach to studying the topics covered in class, students will be asked to develop a simple project, which will be discussed during the oral examination.
Definitive programme.
Last update of the programme: 08/06/2025