DATA ANALYSIS WITH STATISTICAL PROGRAMMING

Academic year
2021/2022
Official course title
DATA ANALYSIS WITH STATISTICAL PROGRAMMING
Course code
PHD154 (AF:364598 AR:193142)
Modality
On campus classes
ECTS credits
6
Degree level
PhD programme (D.M.45)
Educational sector code
SECS-S/01
Period
1st Semester
Course year
1
Where
VENEZIA
This course is one of the PhD's educational activities that give students tools for data manipulation and analysis. We will learn the basics of statistical inference while analyzing data. We will also acquire the minimal programming skills necessary for dealing with environmental data. Using R, we will also learn the basics of reproducible research.

The course is part of a three-course statistics path offered to doctoral students of the Department. The path consists of a first introductory course (PHD154, Data Analysis with Statistical Programming), which precedes two more advanced, non-sequential courses (PHD140, Statistics, and PHD124, Applied Time Series). Students who are interested in gaining a more solid background in statistical sciences are highly encouraged to follow all three courses and to discuss with the course instructor the best course of action for their PhD program.
Regular and active participation in the teaching activities offered by the course will enable students to:

1. Knowledge and understanding:
1.1 To know the most important statistical methods for data analysis, with focus on environmental data.

2. Ability to apply knowledge and understanding:
2.1 To know how to apply statistical methods.
2.2 To know how to apply autonomously the basic computational tools of the R environment.
2.3 To know how to use autonomously the R programming language to modify existing code or to write new code for data analysis.


3. Ability to judge:
3.1 To be able to select the most suitable statistical methods for the problem at hand.

4. Communication skills:
4.1 To be able to communicate the results to the various stakeholders.
4.2 To be able to interact with the lecturer and the other students during the theoretical lessons and practical applications.

5. Learning skills:
5.1 To be able to take lecture notes to integrate and clarify the content of the reference teaching material.
5.2 To be able to self-evaluate by answering the lecturer’s questions and solving exercises.
Basic knowledge of statistics, mathematics, and coding.
RStudio. R Markdown for reproducible research.

Basic R programming. Logical expressions. Vectors, matrices, data frames, lists. Reading, writing, editing data. Reading data downloaded from the internet. Conditional execution. Loops. Recursion. Fast R code and vectorization.

Descriptive statistics. Plotting.

Statistical inference.

Nonparametric hypothesis testing. Rank tests.

Analysis of tabular data.

Analysis of variance. Parametric and nonparametric one-way ANOVA. Post-hoc analysis and multiple comparisons. Two-way ANOVA.

Correlation and concordance.

Linear regression. Estimation and hypothesis testing. Goodness of fit.

Case studies.
Open source books on R
Scientific papers
Lecture notes given by the lecturer
Paper to be written according to a template given by the lecturer.
The course comprises:
a) theoretical lessons describing the various concepts and methods;
b) practical sessions involving data analysis, programming, and the discussion and communication of results.
English
None.
Written
Definitive programme.
Last update of the programme: 10/09/2021