NATURAL LANGUAGE PROCESSING

Academic year
2025/2026
Official course title
NATURAL LANGUAGE PROCESSING
Course code
PHD223 (AF:588778 AR:333353)
Teaching language
English
Modality
On campus classes
ECTS credits
2
Degree level
Corso di Dottorato (D.M.226/2021)
Academic Discipline
INF/01
Period
Annual
Course year
1
Where
VENEZIA
The course develops gradually, following the levels of language analysis from morphology and syntax to semantics and pragmatics. Each level is used to solve specific problems in Natural Language Processing (NLP), such as syntactic parsing, word embedding, semantic parsing, question answering, and the use of generative language models to build chatbots such as ChatGPT.

The approaches presented are based on neural architectures, but important alternative approaches will also be covered to put the state of the art in context.

The training objective is to provide broad knowledge of modern techniques for natural language analysis and to indicate the fields in which they are applied.
At the end of the course, the student will be able to:
- Know and use the fundamental algorithms for natural language analysis
- Implement and train models for automatic text analysis
- Choose the most suitable models for specific applications
Basic knowledge of linear algebra and statistics is recommended. Knowledge of Python is also required for practical work. Familiarity with the PyTorch and Transformers libraries is a plus.
Introduction
- The NLP pipeline
- Morphology
- Syntax
- Semantics
- Pragmatics
- Tokenization
- Lemmatization and stemming 
- Word-based analysis
- Sentence-based analysis

(Large) Language Models
- Encoder models
- Decoder models
- Encoder-Decoder models
- Masked Language Modeling
- Autoregressive Models
- Fine-tuning
- Post-training
- Alignment
All study materials will be provided through Moodle.
Assessment is based on an individual Python project, on a topic agreed with the teacher, that puts into practice the knowledge acquired during the course and addresses a specific NLP problem. The evaluation will be based on three main aspects:

1. Design ability: The project should reflect a clear understanding of the theoretical concepts and methodologies learned. The student should demonstrate a structured plan and a critical approach in carrying out the work.
2. Work organization: The ability to manage the various phases of the project, from ideation to implementation, will be evaluated. This includes time management, task division, and collaboration (if applicable).
3. Mastery of tools: During the presentation, the student must demonstrate full mastery of the tools and technologies used and a thorough knowledge of the concepts introduced during the course.
oral
The evaluation criteria are as follows:

A. Scores in the 18-22 range will be awarded in the presence of:
- Sufficient knowledge and ability to structure the project;
- Limited ability to justify implementation choices;
- Sufficient communication skills, especially in relation to the use of course-specific language.

B. Scores in the 23-26 range will be awarded in the presence of:
- Fair knowledge and ability to structure the project;
- Fair ability to collect and/or interpret data, proposing effective implementation solutions;
- Fair communication skills, especially in relation to the use of course-specific language.

C. Scores in the 27-30 range will be awarded in the presence of:
- Good or excellent knowledge and ability to structure the project;
- Good or excellent ability to collect and/or interpret data, proposing innovative implementation solutions;
- Fully appropriate communication skills, especially in relation to the use of course-specific language.

D. Honors (lode) will be awarded in the presence of excellent knowledge and applied understanding of the program, judgment skills, and communication abilities.
The course consists of lectures and practical classroom activities to consolidate the concepts learned. Slides and scientific articles will be provided as study material.
Definitive programme.
Last update of the programme: 23/05/2025