DATA ANALYSIS

Academic year 2018/2019 ANALISI DEI DATI CT0427 (AF:248800 AR:136461) Frontal Lesson 6 Bachelor's Degree Programme SECS-S/01 2nd Semester 2 VENEZIA
Contribution of the course to the overall degree programme goals
This course is one of the interdisciplinary educational activities of the Data Science curriculum of the Bachelor's degree in Informatics. It aims to provide students with the basic tools of statistical inference and data analysis, and to develop the skills needed to solve statistical questions arising in technology, science, medicine, economics and business. Special attention is paid to integrating methodology with computational tools through the R language. Achieving the educational objectives of the course gives students the foundation for learning more advanced data science methods.
Expected learning outcomes
Regular and active participation in the teaching activities offered by the course and in independent research activities will enable students to:
1. (knowledge and understanding)
-- acquire knowledge and understanding of basic inferential and data analysis methods
2. (applying knowledge and understanding)
-- synthesize and model phenomena characterized by variability and uncertainty
-- identify and fit statistical models for prediction
-- use statistical software for manipulation, synthesis and analysis of data
3. (making judgements)
-- correctly evaluate the outcomes of statistical software analyses
Prerequisites
Students are assumed to have reached the learning objectives of the course Probability and Statistics (www.unive.it/data/course/230177), although passing its examination is not a formal requirement. Students should have a solid familiarity with the main properties of, and operations on, discrete and continuous random variables.
Contents
The course program includes presentation and discussion of the following subjects:
1. Population and sample
2. Estimation:
- method-of-moments and maximum likelihood
- point and interval estimation
- hypothesis testing
- non-parametric methods
3. Bootstrap simulation
4. Linear regression:
- ordinary least squares
- analysis of variance
- multiple linear regression
Methods will be illustrated with simulated and real data using the R language (www.r-project.org). Special attention will be paid to applications regarding various aspects of sustainability.
Reference texts
- Baron M (2014). Probability and Statistics for Computer Scientists. Second Edition. CRC Press. Chapters 9, 10 and 11
- Additional readings and materials distributed during the course through the Moodle platform
Assessment methods
Achievement of the course objectives is assessed through a written exam. The exam consists of four exercises designed to measure
1. the theoretical knowledge of the course topics,
2. the ability to apply them for solving real data problems.
The maximum score for each exercise is 8 points, so the maximum total is 32. The final score is the sum of the scores of the four exercises. A total score exceeding 30 corresponds to 30 with honors. During the written exam the use of books, notes, or electronic media is *not* allowed.
Teaching methods
Conventional theoretical lectures, complemented by exercise classes and discussion of case studies. Teaching material prepared by the lecturer will be distributed during the course through the Moodle platform. The statistical software used in the course is R (www.r-project.org).
Teaching language
Italian
Type of exam
written
Sustainability
• This subject deals with topics related to the macro-area "Climate change and energy" and contributes to the achievement of one or more goals of the UN Agenda for Sustainable Development.
Definitive programme.
Last update of the programme
25/06/2018