KNOWLEDGE, INTERACTION AND INTELLIGENT SYSTEMS-1

Academic year: 2021/2022

Official course title: KNOWLEDGE, INTERACTION AND INTELLIGENT SYSTEMS-1

Course code: PHD157-1 (AF:364606 AR:193149)

Teaching language: English

Modality: On campus classes

ECTS credits: 2

Degree level: PhD programme (D.M.45)

Academic Discipline: INF/01

Period: Annual

Course year: 1

Where: VENEZIA

Contribution of the course to the overall degree programme goals

The term "big data" refers to datasets so large that traditional algorithmic techniques can neither store nor analyze them. Very often, however, the sheer size of such datasets is out of proportion to their actual information content. Consider DNA sequencing: one human genome requires approximately 1 GB of storage, and advanced techniques make it possible to sequence dozens of genomes in a few days. Clearly, classic algorithmic techniques do not scale: for instance, the genomes of all Italians (roughly 60 million people, at 1 GB each) would require approximately 60 petabytes just to be stored. Any two human genomes, however, are 99.99% similar, so the key is compression. The real question is: can we develop algorithms that operate directly on compressed data, without first decompressing it?

The course tackles the problem of representing and manipulating big data through compressed data structures. This research direction combines techniques from algorithms, data structures, and information theory to obtain structures that speed up typical information retrieval operations while occupying space proportional to the size of the compressed data (often thousands of times smaller than the original dataset).
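
As a toy illustration of operating on compressed data without decompressing it, the sketch below stores a string as a list of runs and answers random-access queries by binary search over the run boundaries. It is only a minimal example under simple assumptions: the class name `RunLengthString` and its methods are hypothetical, and run-length encoding is used here merely because it is the simplest compressor to reason about, not because it is the technique the course focuses on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration: a run-length compressed string that answers
// access queries directly on the compressed representation.
class RunLengthString {
public:
    explicit RunLengthString(const std::string& text) : length_(text.size()) {
        for (std::size_t i = 0; i < text.size(); ++i) {
            if (i == 0 || text[i] != text[i - 1]) {
                symbols_.push_back(text[i]);  // a new run starts here
                starts_.push_back(i);         // position of its first character
            }
        }
    }

    std::size_t size() const { return length_; }          // uncompressed length
    std::size_t runs() const { return symbols_.size(); }  // number of stored runs

    // Character at position i, found in O(log runs) time without
    // rebuilding the original text.
    char at(std::size_t i) const {
        assert(i < length_);
        // First run that starts strictly after i; the run before it contains i.
        auto it = std::upper_bound(starts_.begin(), starts_.end(), i);
        return symbols_[static_cast<std::size_t>(it - starts_.begin()) - 1];
    }

private:
    std::size_t length_;
    std::vector<char> symbols_;        // one symbol per run
    std::vector<std::size_t> starts_;  // starting position of each run
};
```

For the text `AAAACCCGT`, for example, the structure stores 4 runs instead of 9 characters, and `at(5)` locates the run `CCC` by binary search. The space used is proportional to the number of runs, not to the length of the text.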

Expected learning outcomes

At the end of the course, the student will be able to:
1) understand the main lossless compression techniques used to represent unstructured (plain text) and structured (e.g. trees, graphs) data.
2) understand the relation between compression and computation, and how this can be exploited to accelerate operations on compressed data.
3) implement basic compressed data structures in C++ using the sdsl library.
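
To give a feeling for the kind of structure outcome 3 refers to, here is a minimal sketch of a bitvector supporting access and rank queries, written with the C++ standard library only. It is not the sdsl implementation and does not use the sdsl API: the class name `RankBitvector` and its methods are hypothetical, and the layout (one cumulative counter per 64-bit word) is chosen for simplicity, so its space overhead is larger than that of a genuinely succinct structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: a plain bitvector with constant-time rank,
// using one cumulative counter per 64-bit word (simple, but not succinct).
class RankBitvector {
public:
    explicit RankBitvector(const std::vector<bool>& bits) : n_(bits.size()) {
        words_.assign((n_ + 63) / 64, 0);
        for (std::size_t i = 0; i < n_; ++i)
            if (bits[i]) words_[i / 64] |= (std::uint64_t{1} << (i % 64));
        // prefix_[w] = number of 1s in words_[0 .. w-1]
        prefix_.assign(words_.size() + 1, 0);
        for (std::size_t w = 0; w < words_.size(); ++w)
            prefix_[w + 1] = prefix_[w] + popcount(words_[w]);
    }

    std::size_t size() const { return n_; }

    // Value of the bit at position i.
    bool access(std::size_t i) const {
        assert(i < n_);
        return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
    }

    // Number of 1s in positions [0, i).
    std::size_t rank1(std::size_t i) const {
        assert(i <= n_);
        const std::size_t w = i / 64, r = i % 64;
        std::size_t count = prefix_[w];
        if (r != 0)  // add the 1s among the first r bits of the current word
            count += popcount(words_[w] & ((std::uint64_t{1} << r) - 1));
        return count;
    }

private:
    // Portable population count (clears the lowest set bit each iteration).
    static std::size_t popcount(std::uint64_t x) {
        std::size_t c = 0;
        while (x != 0) { x &= x - 1; ++c; }
        return c;
    }

    std::size_t n_;
    std::vector<std::uint64_t> words_;
    std::vector<std::size_t> prefix_;
};
```

On the bit sequence 1 0 1 1 0 0 1 0, `rank1(4)` counts the 1s in the first four positions and returns 3.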

Prerequisites

Basic knowledge of:
- algorithms and data structures (e.g. sorting, hashing, binary search, big-O notation)
- probability theory
- C/C++ programming

Contents

1) Entropy, Shannon's theorem, prefix-free codes, compression
2) Bitvectors with rank/select/access, wavelet trees, geometric data structures
3) Compressed suffix arrays, FM-index, Burrows-Wheeler transform
4) Indexes based on Lempel-Ziv compressors
5) Practical compressed data structures: the sdsl library
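
As a rough sketch of how the Burrows-Wheeler transform supports pattern counting (the idea behind the FM-index in topic 3), the code below builds the BWT naively by sorting rotations and then counts pattern occurrences with backward search. Everything here is an assumption made for illustration: the function names `build_bwt` and `count_occurrences` are hypothetical, `$` is assumed not to occur in the text or pattern, and the occurrence counts are computed by a linear scan, whereas a real FM-index would answer them with rank queries on a wavelet tree or similar structure.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Builds the BWT of text + '$' by sorting all rotations.
// Naive construction, for illustration only; assumes '$' is not in text.
std::string build_bwt(const std::string& text) {
    assert(text.find('$') == std::string::npos);
    const std::string t = text + '$';
    const std::size_t n = t.size();
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), std::size_t{0});
    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            const unsigned char ca = static_cast<unsigned char>(t[(a + k) % n]);
            const unsigned char cb = static_cast<unsigned char>(t[(b + k) % n]);
            if (ca != cb) return ca < cb;
        }
        return false;
    });
    std::string bwt(n, '\0');
    // The BWT is the last column: the character preceding each sorted rotation.
    for (std::size_t i = 0; i < n; ++i) bwt[i] = t[(rot[i] + n - 1) % n];
    return bwt;
}

// Counts the occurrences of pattern in the original text using backward
// search on the BWT. Assumes '$' is not in pattern.
std::size_t count_occurrences(const std::string& bwt, const std::string& pattern) {
    // C[c] = number of characters in bwt strictly smaller than c.
    std::array<std::size_t, 257> C{};
    for (unsigned char ch : bwt) ++C[ch + 1];
    for (std::size_t c = 1; c < C.size(); ++c) C[c] += C[c - 1];

    // occ(c, i) = number of occurrences of c in bwt[0, i), by linear scan.
    auto occ = [&](char c, std::size_t i) {
        return static_cast<std::size_t>(
            std::count(bwt.begin(), bwt.begin() + static_cast<std::ptrdiff_t>(i), c));
    };

    std::size_t lo = 0, hi = bwt.size();  // half-open range of sorted rotations
    for (std::size_t k = pattern.size(); k-- > 0 && lo < hi;) {
        const char c = pattern[k];
        const std::size_t base = C[static_cast<unsigned char>(c)];
        lo = base + occ(c, lo);  // extend the match by one character to the left
        hi = base + occ(c, hi);
    }
    return hi > lo ? hi - lo : 0;
}
```

For the text `abracadabra`, `build_bwt` returns `ard$rcaaaabb`, and counting `abra` on that BWT yields 2. Note that the search reads the pattern from right to left and never reconstructs the text.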

Reference texts

- Navarro, Gonzalo. Compact data structures: A practical approach. Cambridge University Press, 2016.
- Research articles

Assessment methods

One of the following:
- discussion of an existing article in the field
- implementation of a compressed index in C++
- original research (proposal/implementation of a new technique that may lead to an original publication)

Type of exam

Oral

Teaching methods

Lectures with slides and blackboard.

Definitive programme.

Last update of the programme: 31/05/2021
