DATA MANAGEMENT

Academic year: 2025/2026

Official course title: GESTIONE DEI DATI DIGITALI

Course code: NS001B (AF:582240 AR:328523)

Teaching language: Italian

Modality: On campus classes

ECTS credits: 6

Degree level: Minor

Academic Discipline: INF/01

Period: 2nd Semester

Course year: 1

Where: VENEZIA

Contribution of the course to the overall degree programme goals

Digital data management is today a fundamental cross-cutting competence for understanding, interpreting and addressing complex problems across different disciplinary fields. The quality of individual, professional and organizational decisions increasingly depends on the ability to acquire, organize, process, visualize and interpret digital data, transforming them into useful information and actionable insights.
The course aims to provide students with an introductory, methodological and applied preparation on the main processes of digital data management, with particular attention to the integration of visual data analysis tools, elementary Machine Learning techniques and applications of Generative Artificial Intelligence.
The course is designed for students from different disciplinary backgrounds and does not require prior programming skills. Through guided laboratory activities, participants will learn to build simple data analysis workflows with Orange Data Mining and to design a virtual tutor or conversational assistant with Dify, possibly integrating documents, knowledge bases and RAG architectures.
By the end of the course, students will be able to develop a digital data management project aimed at solving a practical or knowledge-based problem, documenting the phases of data acquisition, preparation, analysis, visualization and valorization through specialist and generative AI tools.

Expected learning outcomes

Knowledge and understanding
By the end of the course, students will be able to:
• describe the fundamental characteristics of digital data;
• distinguish between structured, semi-structured and unstructured data;
• illustrate the main stages of the digital data management process;
• understand the role of data in analysis, forecasting and decision-making processes;
• describe, at an introductory level, the main data analysis techniques: classification, regression, clustering, exploratory analysis and visualization;
• understand the meaning of vector representation of texts, images and documents;
• distinguish between specialist Artificial Intelligence and Generative Artificial Intelligence;
• describe the general logical functioning of a Large Language Model;
• understand the concepts of prompt, embedding, vector store, retrieval and RAG architecture;
• recognize the potential, limitations and risks of AI tools applied to digital data management.

Ability to apply knowledge and understanding
Students will be able to:
• acquire and organize digital datasets according to an analysis question;
• use visual tools to explore, clean and transform digital data;
• build a data analysis workflow using Orange Data Mining;
• apply simple classification, regression, clustering or visualization techniques;
• interpret the outputs produced by an analysis workflow;
• represent results through appropriate charts, diagrams and visualizations;
• use generative AI tools to summarize, classify, reorganize and query digital content;
• design effective prompts to guide the processing of texts, data and documents;
• configure a simple virtual tutor or conversational assistant using Dify;
• design a simple RAG architecture for querying a document base.

Judgement skills
Students will be able to:
• select appropriate tools, techniques and workflows in relation to the problem addressed;
• assess the quality of the available data and recognize their limitations, incompleteness or distortions;
• critically interpret the results produced by analysis models and AI tools;
• distinguish between correlation, forecasting, classification and explanation;
• recognize possible errors, biases, hallucinations or unverifiable responses produced by generative systems;
• assess the appropriateness of a digital solution in relation to the context of use, the target users and the decision-making or knowledge-related objectives.

Communication skills
Students will be able to:
• clearly present the stages of a digital data management project;
• describe a data analysis workflow using correct basic technical terminology;
• communicate analysis results through charts, visualizations and short reports;
• document the functioning of a virtual tutor or conversational assistant;
• explain the methodological and operational choices adopted in the two assignments;
• collaborate with peers and the instructor during exercises, reviews and project activities.

Pre-requirements

The course is aimed at anyone who wishes to strengthen the skills of their own discipline by integrating elementary techniques for searching, organizing, interpreting and visualizing digital data, in order to improve the quality of their forecasts and decisions. Apart from basic computer skills, no specific technical knowledge of programming or of particular information-processing software is required, nor are mathematical skills beyond those normally covered in upper secondary school curricula (high schools, technical institutes and vocational schools).

The course is divided into two parts and eight teaching units.

PART ONE
The digital data management process

Software tool: Orange Data Mining

1. The new digital intelligence and the data life cycle
• Data, information and knowledge
• Digital data as a cognitive and decision-making resource
• The stages of the digital data management process
• Acquisition, storage, processing, representation, activation and adaptation
• Relationship between data, models, forecasts and decisions
• Human intelligence, artificial intelligence and hybrid intelligence

2. Data acquisition, storage and quality
• Sources of digital data
• Structured, semi-structured and unstructured data
• Datasets, observations, variables and metadata
• Common formats: CSV, Excel, texts, images, documents
• Data quality criteria
• Errors, missing values, duplicates, anomalies and inconsistencies
• Logical organization of data according to the analysis
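
The data quality checks listed above can be illustrated with a minimal Python sketch. It is not part of the course tools, which are visual and require no programming; the sample data, the function name quality_report and the report keys are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical sample in CSV format: one row is repeated and two
# temperature values are missing.
RAW = """id,city,temperature
1,Venezia,21.5
2,Padova,
2,Padova,
3,Verona,19.0
"""

def quality_report(csv_text):
    """Count rows, exact duplicate rows and empty cells in a CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    seen = set()
    duplicates = 0
    missing = 0
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            duplicates += 1  # the same observation was already recorded
        seen.add(key)
        # A cell counts as missing if it is absent or contains only spaces.
        missing += sum(1 for v in row.values() if v is None or v.strip() == "")
    return {"rows": len(rows), "duplicates": duplicates, "missing": missing}

print(quality_report(RAW))
```

A report of this kind only flags potential problems; deciding whether to remove duplicates or fill in missing values depends on the analysis question.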

3. Exploratory analysis, pre-processing and data representation
• Initial exploration of a dataset
• Distributions, relationships, comparisons and trends
• Data cleaning and transformation
• Normalization, encoding and variable selection
• Vector representation of digital texts and images
• Introductory concepts of features, embeddings and similarity
• Use of Orange Data Mining to build visual workflows
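
As a rough illustration of normalization, encoding and similarity, the following Python sketch shows one common way to compute each. The function names are hypothetical, and min-max scaling, one-hot encoding and cosine similarity are only one possible choice among the techniques this unit may cover.

```python
import math

def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(category, categories):
    """Encode a category as a vector with a single 1.0."""
    return [1.0 if category == c else 0.0 for c in categories]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; use 0.0 here
    return dot / (norm_a * norm_b)

print(min_max_normalize([10, 20, 30]))
print(one_hot("b", ["a", "b", "c"]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Once texts or images are represented as vectors, the same similarity measure can be used to compare them, which is the idea behind embeddings.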

4. Models for analysis, forecasting and visualization
• Difference between descriptive, predictive and decision-oriented analysis
• Classification: predicting categories
• Regression: predicting numerical values
• Clustering: identifying groups within data
• Time series and autoregressive models: trends, variations and forecasting
• Basic evaluation of results
• Data visualization and infographics
• Charts, dashboards and communication of results
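
Regression and the basic evaluation of results can be sketched in a few lines of Python. This is an illustrative example only: the function names and the data are hypothetical, and ordinary least squares with mean absolute error is assumed here as one simple pairing of model and metric.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, mean_absolute_error(ys, predictions))
```

In practice the error would be measured on data not used for fitting, so that the evaluation reflects how well the model predicts new cases.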

PART TWO
Generative AI and digital data management

Software tools: Mistral AI Studio, Dify

5. From specialist AI to generative AI
• Difference between specialist AI and generative AI
• Machine Learning, Deep Learning and Large Language Models
• Applications of generative AI to digital data management
• Processing of texts, tables, documents and semi-structured content
• Opportunities, limitations and risks of generative automation
• Critical verification of model-generated outputs

6. Logical-functional architecture of an LLM and prompting techniques
• Tokens, context and language generation
• Embeddings and semantic representation
• Training, inference and alignment
• Hallucinations, bias and output control
• Descriptive, instructional and constrained prompts
• Role prompting, few-shot prompting and structured prompts
• Prompts for analysis, summarization, classification, extraction and data transformation

7. Data processing through generative AI
• Analysis and summarization of documents
• Extraction of information from texts and digital content
• Qualitative classification of content
• Generation of tables, diagrams, reports and operational summaries
• Design of generative workflows to support knowledge-based and decision-making activities
• Use of generative AI platforms, with particular reference to Mistral AI Studio

8. Design of agents, virtual tutors and RAG architectures
• From chatbot to generative AI agent
• System instructions, objectives, constraints and conversational behavior
• Memory, tools and workflows
• Retrieval-Augmented Generation
• Indexing of documents and knowledge bases
• Vector stores and semantic search
• Design of a virtual tutor with Dify
• Overview of multi-agent systems and orchestrators
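
The retrieval step of a RAG architecture can be sketched with a toy example in Python. Everything here is a simplifying assumption: word counts stand in for learned embeddings, a plain list stands in for a vector store, and no language model is called. The names embed, retrieve and build_prompt are hypothetical and do not correspond to the API of Dify or any other platform.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': word counts. Real systems use learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Combine the retrieved context and the question into a single prompt."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical document base.
DOCS = [
    "orange data mining builds visual workflows",
    "a vector store indexes document embeddings",
    "regression predicts numerical values",
]

print(build_prompt("what does regression predict", DOCS))
```

The resulting prompt would then be sent to a language model, which generates an answer grounded in the retrieved passage rather than in its training data alone.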

Reference texts

Main texts

1. G.B. Ronsivalle, La nuova intelligenza digitale. Come trasformare i dati in decisioni per progettare il futuro, Maggioli Editore, Apogeo Education Series, 2022.
2. G.B. Ronsivalle, I. Baccan, A. Bersan, The Orange Box. Il nuovo laboratorio di Machine Learning, Edizioni Wemole, 2024.

Supplementary teaching materials
The instructor will also provide:
• course slides;
• operational handouts;
• practice datasets;
• tutorials for Orange Data Mining;
• tutorials for Mistral AI Studio and Dify;
• workflow examples;
• examples of prompts and virtual tutor configurations;
• documents or text corpora for RAG exercises.

Assessment methods

Assessment is divided into two practical assignments, aimed at verifying the ability to apply the concepts and tools presented during the course.

Step 1 — Development of a workflow with Orange Data Mining
Maximum score: 15 points
Minimum passing threshold: 9 points
The assignment consists of creating a data management and analysis workflow using Orange Data Mining.
Students will be required to:
• select or use a dataset provided by the instructor;
• formulate an analysis question;
• import and organize the dataset in Orange;
• carry out exploratory and pre-processing operations;
• apply at least one analysis, modelling or visualization technique;
• interpret the results obtained;
• produce a short descriptive report on the workflow.

Required submission
• Orange workflow file.
• Dataset used or indication of the source.
• Short descriptive report.
• Any screenshots of the main visualizations.
• Interpretative comment on the results.

Step 2 — Development of a virtual tutor with Dify
Maximum score: 15 points
Minimum passing threshold: 9 points
The assignment consists of designing and implementing a virtual tutor or conversational assistant using Dify.
Students will be required to:
• define an application domain;
• identify the target users and purpose of the tutor;
• design the conversational behavior of the agent;
• configure prompts, instructions and constraints;
• optionally integrate a document base through a RAG architecture;
• test the system through examples of interaction;
• document limitations, risks and possible improvements.

Required submission
• Export or documentation of the Dify workflow.
• Description of the virtual tutor developed.
• System prompt or main instructions.
• Any document corpus used.
• Examples of questions and answers.
• Short critical report on the functioning of the system.

Type of exam

written and oral

The instructor is responsible for ensuring the authenticity and originality of all examinations and coursework. In cases of suspected academic misconduct, an additional on-site assessment may be required during the exams, which may differ from the standard format.

Grading scale

The final exam consists of the assessment of the two practical assignments:

Step 1 — Workflow with Orange Data Mining | Maximum score: 15 points | Minimum threshold: 9 points
Step 2 — Virtual tutor with Dify | Maximum score: 15 points | Minimum threshold: 9 points
Total: Maximum score: 30 points | Minimum threshold: 18 points

The final grade is expressed out of thirty.
Honours may be awarded in the case of particularly original, complete and rigorous assignments, capable of effectively integrating data analysis, visualization, agent design and critical evaluation of results.
The instructor may require a short supplementary interview, possibly in conjunction with the presentation of the assignments, in cases where it is necessary to further assess the authenticity, originality or actual understanding of the work carried out.
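
As a worked illustration of the scoring rule above, the following Python sketch adds the two step scores and checks the minimum of 9 points for each step. The function name and the returned tuple are hypothetical, and treating the per-step thresholds as the pass condition is an assumption; the instructor's actual grading procedure may differ.

```python
STEP_MAX = 15  # maximum score for each assignment
STEP_MIN = 9   # minimum passing threshold for each assignment

def final_grade(step1, step2):
    """Return (total out of 30, whether both steps reach the threshold)."""
    for score in (step1, step2):
        if not 0 <= score <= STEP_MAX:
            raise ValueError(f"score must be between 0 and {STEP_MAX}")
    total = step1 + step2
    passed = step1 >= STEP_MIN and step2 >= STEP_MIN
    return total, passed

print(final_grade(12, 14))
print(final_grade(8, 15))
```

In the second call the total is higher than many passing totals, but the first assignment falls below its threshold, so under this assumption the exam is not passed.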

Teaching methods

Lessons alternate between:
• theoretical presentation sessions supported by multimedia slides;
• moments of interaction and guided discussion;
• analysis of use cases;
• operational demonstrations of the software tools;
• individual and group exercises;
• guided simulations;
• laboratory activities with Orange Data Mining;
• laboratory activities with Mistral AI Studio and Dify;
• progressive review of the final assignments.

The course adopts a learning-by-doing approach. Each theoretical concept is connected to a practical activity, an applied example or a phase of the final project.
Asynchronous materials, video tutorials and operational worksheets may be made available to further explore the use of the software tools employed during the course.

2030 Agenda for Sustainable Development Goals

This subject deals with topics related to the macro-area "Human capital, health, education" and contributes to the achievement of one or more goals of the U.N. 2030 Agenda for Sustainable Development.

Definitive programme.

Last update of the programme: 06/06/2026
