AUTONOMOUS, DISTRIBUTED AND PERVASIVE SYSTEMS-2

Academic year
2023/2024 Syllabus of previous years
Official course title
AUTONOMOUS, DISTRIBUTED AND PERVASIVE SYSTEMS-2
Course code
PHD156-2 (AF:471255 AR:258249)
Modality
On campus classes
ECTS credits
2
Degree level
Corso di Dottorato (D.M.45)
Educational sector code
INF/01
Period
Annual
Course year
1
Where
VENEZIA
Moodle
Go to Moodle page
The need for storing data in compressed form is becoming more and more important due to the ever-growing rate of data produced on a daily basis. To keep up with this data deluge, data compression is a mandatory step to deliver good quality of service in concrete applications.

In this introductory course, you will learn about fundamental data compression algorithms that are all widely adopted by tools that we use every day, like filesystems, computer networks, search engines, and databases. These algorithms have now become indispensable knowledge across many fields in computing, including Information Retrieval, Machine Learning, Natural Language Processing, Applied Physics, and Bioinformatics.
At the end of the course, the student will be able to:
- Understand and analyze the main lossless compression algorithms for integers and sequences of integers.
- Consciously choose a suitable compression technique for the specific practical application.
Basic knowledge about big-O notation, computer architecture, and probability theory.
1. Introduction
- What is and Why Data Compression?
- Basic Model and Limit
- Space vs. Time Trade-offs
- Fundamental Question(s), Undecidability
- Data, Information, Redundancy, Technological Limitations

2. The Static Integer Coding Problem
- Unique Decodability
- Unary, Binary, Minimal Binary, Gamma, Delta, Golomb, Rice, Exponential Golomb, Fibonacci, Variable-Byte
- Effectiveness
- Information Content, Entropy, Distributions
- Zero- Minimum-Redundancy Codes, Kraft-McMillan Inequality

3. List Compressors
- The Sorted List Coding Problem
- Combinatorial Lower Bound
- Blocking, PForDelta, Simple, Elias-Fano, partitioned Elias-Fano, Rank & Select on Bit-Vectors, Interpolative, Directly-Addressable, Hybrid Approaches
- Performance

4. Statistical Compressors
- The Statistical and Minimum-Redundancy Coding Problem
- Assigning Canonical Prefix-Free Codewords
- Shannon-Fano
- Huffman, Canonical Huffman
- Arithmetic Coding
- Asymmetric Numeral Systems

5. Dictionary-based Compressors
- The Dictionary-based Coding Problem
- LZ77, LZSS, gzip, LZ78, LZW
- Variants, Overview of Compressors
- Robert Sedgewick and Kevin Wayne. 2011. Algorithms. Four-th edition. Addison-Wesley Professional, ISBN 0-321-57351-X.
- David Salomon. 2007. Variable-Length Codes for Data Compression. Springer Science & Business Media, ISBN 978-1-84628-959-0.
- Alistair Moffat and Andrew Turpin. 2002. Compression and Coding Algorithms. Springer Science & Business Media, ISBN 978-1-4615-0935-6.
- Gonzalo Navarro. 2016. Compact Data Structures. Cambridge University Press, ISBN 978-1-107-15238-0.
- Research papers indicated in the teaching material
One of the following choices:
1. Implement (correctly!) a data compression algorithm and discuss it.

2. Study a research paper and discuss it.

3. Contribute to the course: prepare some slides on a new topic. (The teacher will give you a list of topics.)
4. Study a research problem together (may lead to a publication).
Frontal lectures, slides, blackboard for additional proofs.
English
oral
Definitive programme.
Last update of the programme: 16/12/2023