DEEP LEARNING FOR NATURAL LANGUAGE PROCESSING

Academic year
2023/2024
Official course title
DEEP LEARNING FOR NATURAL LANGUAGE PROCESSING
Course code
CM0624 (AF:398333 AR:215032)
Modality
On campus classes
ECTS credits
6
Degree level
Master's Degree Programme (DM270)
Educational sector code
INF/01
Period
1st Semester
Course year
2
Where
VENEZIA
The course is part of the Computer Science curriculum and focuses on natural language analysis techniques, using approaches based on both deep neural architectures and more traditional models. The course develops gradually, starting from the foundations of automatic text analysis and progressing to state-of-the-art generative models.

The course also covers modern models that combine vision and language or audio and language, as well as special topics such as graph neural networks and diffusion processes.

The goal of the course is to provide broad knowledge of modern natural language analysis techniques and of the fields in which they are applied.
At the end of the course, the student will be able to:
- Understand and use the fundamental algorithms for natural language analysis
- Implement and train models for automatic text analysis
- Choose the most suitable models for specific applications
Basic knowledge of linear algebra and statistics is recommended. Knowledge of Python is required for the practical work.
Introduction
- Syntactic and semantic analysis
- Regular Expressions
- Tokenization
- Lemmatization and stemming 
- Part-Of-Speech tagging
- Dependency and Constituency parsing
- Word Sense Disambiguation
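
As an illustration of how the first topics above fit together, regular expressions can drive a simple tokenizer. The following is a minimal sketch, not course material: the pattern and the function name `tokenize` are assumptions chosen for this example, and practical tokenizers handle many more cases.

```python
import re

# Hypothetical pattern (an assumption for this sketch): a word with optional
# internal apostrophes, or any single character that is neither a word
# character nor whitespace (i.e. punctuation).
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)*|[^\w\s]")


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text.lower())


tokens = tokenize("Don't panic: it's only NLP!")
print(tokens)  # ["don't", 'panic', ':', "it's", 'only', 'nlp', '!']
```

Keeping punctuation as separate tokens is one possible design choice; later steps such as Part-Of-Speech tagging typically expect it.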

Word Embedding Models
- Bag of Words
- CBOW
- Skipgram
- Word2Vec
- GloVe
- ELMo
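
To make the simplest representation in this list concrete, a bag-of-words model maps each document to a vector of word counts over a shared vocabulary. The sketch below uses only the Python standard library; the function name `bag_of_words` and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in the corpus.
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # One count per vocabulary entry; words absent from the document get 0.
        vectors.append([counts[tok] for tok in vocabulary])
    return vocabulary, vectors


vocab, vecs = bag_of_words(["the cat sat", "the cat saw the dog"])
print(vocab)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(vecs)   # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Word order is discarded in this representation, which is one motivation for the context-based models that follow it in the list.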

Deep Learning for Sequences
- Recurrent networks and language models
- Backpropagation through time
- LSTM
- GRU

Attention Mechanisms
- Self-Attention
- Transformers
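
As a rough illustration of self-attention, the sketch below computes scaled dot-product attention in plain Python. It assumes identity projections (queries, keys, and values are all the input vectors themselves), whereas Transformer layers use learned projection matrices; the function names are likewise chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (an assumption)."""
    d = len(x[0])
    outputs = []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of all value vectors.
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return outputs


out = self_attention([[1.0, 0.0], [0.0, 1.0]])
print(out)
```

With this input, each output vector is a mixture of the two rows that gives the larger weight to the row matching the query.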

(Large) Language Models
- BERT
- Generative Pre-trained Transformer (GPT), GPT-3, GPT-4, ChatGPT

Applications
- Text classification (sentiment analysis, language classification, intent classification)
- Named Entity Recognition
- Machine Translation: seq2seq
- Question Answering
- Text Summarization
- Topic Modeling (LDA, BERTopic)

Graphs and NLP
- Graph Neural Networks
- Knowledge Graphs

Vision and Language (tentative)
- Image Captioning
- CLIP
- Generative Models

Audio and Language (tentative)
- Speech-to-Text

Diffusion Processes (tentative)
- DDPM
- Stable Diffusion
All study materials will be provided through Moodle.
The learning assessment consists of an oral exam and the development of a project in Python, followed by an oral discussion of the project.
The course consists of lectures and practical classroom activities to consolidate the concepts learned. Slides and scientific articles will be provided as study material.
Teaching language
English
Type of exam
oral
Definitive programme.
Last update of the programme: 14/01/2024