AUTONOMOUS, DISTRIBUTED AND PERVASIVE SYSTEMS-1

Academic year: 2023/2024

Official course title: AUTONOMOUS, DISTRIBUTED AND PERVASIVE SYSTEMS-1

Course code: PHD156-1 (AF:471254 AR:258248)

Modality: On campus classes

ECTS credits: 2

Degree level: Corso di Dottorato (D.M.45)

Educational sector code: INF/01

Period: Annual

Course year: 1

Where: VENEZIA

Contribution of the course to the overall degree programme goals

The term "big data" refers to datasets so large that traditional algorithmic techniques can cope with neither storing nor analyzing them. Very often, however, the sheer size of such datasets is not proportional to their effective information content. Consider DNA sequencing: one human genome requires approximately 1 GB of storage, and advanced techniques make it possible to sequence dozens of genomes in a few days. Clearly, classic algorithmic techniques do not scale: for instance, storing the genomes of all Italians (roughly 60 million people) would require approximately 60 petabytes. Any two human genomes, however, are 99.99% similar: the key is compression. The real question is: can we develop algorithms that operate directly on compressed data, without first decompressing it?

The course tackles the problem of representing and manipulating big data through the use of compressed data structures. This research direction merges techniques from algorithms, data structures, and information theory to obtain structures that both accelerate typical information-retrieval operations and occupy space proportional to the size of the compressed data (often thousands of times smaller than the original datasets).
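
As a toy illustration of operating on compressed data without decompressing it, the sketch below stores a string as run-length pairs and answers random-access queries by binary search over the cumulative run lengths. It is only a minimal example under simplifying assumptions: the class name `RunLengthString` and its methods are hypothetical, and run-length encoding stands in for the far more powerful compressors covered in the course.

```python
from bisect import bisect_right
from itertools import accumulate, groupby


def rle_encode(text):
    """Collapse each maximal run of equal symbols into a (symbol, length) pair."""
    return [(sym, sum(1 for _ in run)) for sym, run in groupby(text)]


class RunLengthString:
    """Toy compressed string: supports access(i) without decompressing."""

    def __init__(self, text):
        self.runs = rle_encode(text)
        # ends[k] = total number of characters covered by runs 0..k
        self.ends = list(accumulate(length for _, length in self.runs))

    def __len__(self):
        return self.ends[-1] if self.ends else 0

    def access(self, i):
        """Return the i-th character (0-based) in O(log r) time, r = number of runs."""
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Index of the first run whose end position is strictly greater than i.
        k = bisect_right(self.ends, i)
        return self.runs[k][0]


if __name__ == "__main__":
    s = RunLengthString("AAAACCGGGGGT")
    print(s.runs)       # the compressed representation
    print(s.access(4))  # character at position 4, read from the runs
```

The space used is proportional to the number of runs rather than to the length of the text, which is the kind of trade-off the course studies with stronger compression schemes.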

Expected learning outcomes

At the end of the course, the student will be able to:
1) understand the main lossless compression techniques used to represent unstructured (plain text) and structured (e.g. trees, graphs) data.
2) understand the relation between compression and computation, and how this can be exploited to accelerate operations on compressed data.
3) implement basic compressed data structures.

Prerequisites

Basic knowledge of:
- algorithms and data structures (e.g. sorting, hashing, binary search, big-O notation)
- probability theory

Contents

1) Entropy, Shannon's theorem, prefix-free codes, compression
2) Bitvectors with rank/select/access, wavelet trees, geometric data structures
3) Compressed suffix arrays, FM-index, Burrows-Wheeler transform
4) Indexes based on Lempel-Ziv compressors
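
To give a flavour of topics 2 and 3, the sketch below builds the Burrows-Wheeler transform of a short text and counts pattern occurrences with backward search, the core operation of the FM-index. It is a simplified illustration, not a practical index: the names `bwt` and `ToyFMIndex` are hypothetical, the transform is built by sorting all rotations, rank queries are answered by a linear scan (a wavelet tree would replace this in a real implementation), and the text is assumed to contain only characters larger than the terminator `$`.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (for illustration only)."""
    assert "$" not in text, "'$' is reserved as the terminator"
    s = text + "$"  # assumed smaller than every character of the text
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


class ToyFMIndex:
    """Counts occurrences of a non-empty pattern using only the BWT."""

    def __init__(self, text):
        self.L = bwt(text)
        counts = Counter(self.L)
        # C[c] = number of characters in the text strictly smaller than c
        self.C = {}
        total = 0
        for c in sorted(counts):
            self.C[c] = total
            total += counts[c]

    def rank(self, c, i):
        """Number of occurrences of c in L[0:i] (linear scan in this sketch)."""
        return self.L.count(c, 0, i)

    def count(self, pattern):
        """Number of (possibly overlapping) occurrences of pattern in the text."""
        lo, hi = 0, len(self.L)  # half-open range of sorted rotations
        for c in reversed(pattern):
            if c not in self.C:
                return 0
            lo = self.C[c] + self.rank(c, lo)
            hi = self.C[c] + self.rank(c, hi)
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    fm = ToyFMIndex("mississippi")
    print(fm.L)
    print(fm.count("ssi"))
```

For the text `mississippi`, the transform is `ipssm$pissii` and `count("ssi")` returns 2. Each pattern character costs two rank queries, so with constant-time rank the search time depends on the pattern length rather than on the text length.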

Reference texts

- Navarro, Gonzalo. Compact data structures: A practical approach. Cambridge University Press, 2016.
- original research articles

Assessment methods

One of the following:
- discussion of an existing article in the field
- implementation of a compressed index in C++
- original research (proposal/implementation of a new technique that may lead to an original publication)

Teaching methods

Lectures with slides and blackboard.

Teaching language

English

Further information

Course contents may still vary.

Type of exam

Oral

Definitive programme.

Last update of the programme: 25/02/2023
