DATA WRANGLING AND VISUALISATION

Academic year
2026/2027
Official course title
DATA WRANGLING AND VISUALISATION
Course code
CT0661 (AF:622606 AR:412781)
Teaching language
English
Modality
On campus classes
ECTS credits
6
Degree level
Bachelor's Degree Programme
Academic Discipline
SECS-S/01
Period
1st Semester
Course year
3
Where
VENEZIA
Educational goals
This elective course, part of the 'Data Science' curriculum in Computer Science, provides the fundamental tools for the management, manipulation, visualization, and communication of data with varying degrees of complexity. The course aims to develop practical skills to address analytical challenges arising in technological, scientific, biomedical, and economic fields. Through this program, students will acquire the methodological and operational foundations necessary to master advanced Data Science tools.
Expected learning outcomes
Attendance and participation in the course activities, along with individual study, will enable students to:
1. (knowledge and understanding)
-- know and understand the main methods for data management, manipulation, visualization, and communication, with a specific focus on the concept of cognitive load and the principles of visual perception;
2. (applying knowledge and understanding)
-- describe and visualize data of varying degrees of complexity, choosing the most appropriate methodologies to transform data into structured narratives (storytelling);
-- use statistical software for data manipulation, synthesis, and graphical representation, managing the entire pipeline from raw data to the final output;
3. (making judgements)
-- critically interpret analyses and visualizations, assessing their coherence, ethics, and communicative effectiveness, while justifying the chosen methodological and design strategies.
Prerequisites
Students are expected to have a basic knowledge of probability theory, consistent with the "Probability and Statistics" course (https://www.unive.it/data/course/608540), and of the fundamental concepts of structured programming.
Contents
1) Data Wrangling & Transformation
- The Tidy Data Paradigm: Principles of data structure and organization.
- Data Transformation: Relational operations, joining datasets, and data pipelines.
- Data Cleaning & Quality: Assessing consistency and handling missing information.
- String Processing: Text manipulation and regular expressions.
- Temporal Data Handling: Working with dates and times.
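
As an illustration of the tidy data idea listed above, the following is a minimal sketch of reshaping a "wide" table into one row per observation and then dropping missing values. The course itself uses R; this example is written in Python with the standard library only, purely for illustration. The table, its column names, and the helper `to_long` are hypothetical and not part of the course materials.

```python
# Hypothetical "wide" table: one row per country, one column per year.
wide = [
    {"country": "A", "2020": 10, "2021": 12},
    {"country": "B", "2020": 7, "2021": None},  # None marks a missing value
]


def to_long(rows, id_col, names_to, values_to):
    """Reshape wide rows into tidy (long) rows: one observation per row."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue  # the identifier is repeated on every output row
            long_rows.append(
                {id_col: row[id_col], names_to: col, values_to: value}
            )
    return long_rows


# Each (country, year) pair becomes its own row.
tidy = to_long(wide, id_col="country", names_to="year", values_to="cases")

# A simple data-quality step: keep only rows with an observed value.
complete = [r for r in tidy if r["cases"] is not None]

for r in complete:
    print(r)
```

In the long form every variable (country, year, cases) has its own column, which is the layout that grouping, joining, and plotting tools generally expect.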

2) Data Visualization
- Principles of Data Visualization: Theoretical foundations and cognitive load.
- Preattentive Attributes: Effective use of shapes, colors, and spatial positioning.
- The Grammar of Graphics: A formal framework for building visualizations.
- Graph & Annotation Design: Design patterns for clarity and focus (data-ink ratio).
- Visual Integrity: Identifying and avoiding misleading graphs.

3) Data Storytelling
- Structuring a Narrative with Data: From raw analysis to a cohesive story.
- Communicating Numbers & Statistics: Making data accessible to different audiences.
- Risk Communication: Explaining absolute vs. relative risks.
- Data Journalism: Best practices, ethics, and case studies in public communication.
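
To make the distinction between absolute and relative risk concrete, here is a small worked sketch. The counts are invented for illustration only, and the function name `risk_summary` is hypothetical; the course itself uses R, while this example is written in Python. It assumes the control-group risk is greater than zero and that the two risks differ.

```python
from fractions import Fraction


def risk_summary(events_control, n_control, events_treated, n_treated):
    """Compare two groups using exact fractions to avoid rounding noise.

    Assumes the control risk is non-zero and the two risks are different.
    """
    risk_c = Fraction(events_control, n_control)
    risk_t = Fraction(events_treated, n_treated)
    return {
        "relative_risk": risk_t / risk_c,
        "relative_risk_reduction": 1 - risk_t / risk_c,
        "absolute_risk_reduction": risk_c - risk_t,
        "number_needed_to_treat": 1 / (risk_c - risk_t),
    }


# Invented example: 2 events per 1000 untreated, 1 event per 1000 treated.
summary = risk_summary(2, 1000, 1, 1000)

print(f"Relative risk reduction: {float(summary['relative_risk_reduction']):.0%}")
print(f"Absolute risk reduction: {float(summary['absolute_risk_reduction']):.1%}")
print(f"Number needed to treat:  {summary['number_needed_to_treat']}")
```

With these invented counts, the same effect can be reported as "risk halved" (relative) or as "one fewer event per 1000 people" (absolute), which is why the choice of framing matters when communicating to a general audience.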
Reference texts
K. Healy (2026). Data Visualization: A Practical Introduction, 2nd edition. Princeton University Press. https://socviz.co/
R. A. Irizarry (2025). Introduction to Data Science: Data Wrangling and Visualization with R, 2nd edition. Chapman & Hall. https://rafalab.dfci.harvard.edu/dsbook-part-1/
E. R. Tufte (2001). The Visual Display of Quantitative Information. Graphics Press.
H. Wickham, M. Çetinkaya-Rundel, and G. Grolemund (2023). R for Data Science, 2nd edition. O’Reilly Media. https://r4ds.hadley.nz
Assessment methods
The final assessment consists of two parts:
1) Computer-based practical test: students will be provided with a dataset to analyze using R. The test covers the complete data pipeline: from data wrangling and the creation of effective visualizations to synthesizing findings into a concise data-driven narrative.
2) Oral interview: students who pass the practical test will take part in an oral discussion. This session is designed to verify the originality of the work and the student’s critical command of the methodological and narrative choices made during the analysis.

The final grade will reflect code accuracy, the quality of the visualizations, and the candidate's ability to independently explain and justify their analytical process.
Type of exam
written and oral

The lecturer has a duty to ensure that the rules regarding the authenticity and originality of exam tests and papers are respected. Therefore, if there is suspicion of irregular conduct, an additional assessment may be conducted, which could differ from the original exam description.

Grading scale
The exam results are graded as follows:
- sufficient (18-22 points): the student demonstrates a sufficient knowledge and understanding of the course methods, is able to apply and interpret them adequately, and uses technical terminology correctly;
- fair (23-25 points): the student shows a good knowledge and understanding of the course methods, applies and interprets them convincingly, and uses technical terminology with fair accuracy;
- good (26-28 points): the student possesses a solid knowledge and understanding of the course methods, applies and interprets them in a fully convincing manner, and employs technical terminology accurately;
- excellent (29-30 points): the student demonstrates an excellent knowledge and understanding of the course methods, applies and interprets them brilliantly, and uses technical terminology with extreme accuracy.

Honors (lode) is reserved for students who, in addition to achieving an excellent result, demonstrate exceptional commitment throughout the course assessments by providing original contributions or insights.
Teaching methods
The course consists of interactive lectures, complemented by practical exercises, case study discussions, and computer-based labs. Teaching materials will be provided by the instructor via the Moodle platform. The statistical software used throughout the course is R (www.r-project.org).
Definitive programme.
Last update of the programme: 09/04/2026