07 May 2024 12:00

Retrieving and managing air quality data at the European level

Aula Delta 2C, Edificio DELTA - Campus Scientifico via Torino

Title: Retrieving and managing air quality data at the European level: the EEAaq package and the need to manage missing data

Speaker: Paolo Maranzano, Università Milano-Bicocca - Fondazione Eni Enrico Mattei

Zoom link:

In this talk we discuss EEAaq, an R package developed to download, manage and analyze air quality data at the European level from the European Environment Agency (EEA) dataflows. The software (release 0.0.3) has been freely available on CRAN since August 2023. EEAaq responds to several issues: (1) the EEA air quality download system and its metadata retrieval lack practicality and flexibility for non-professional users; (2) collecting data directly from the agency's portal requires heavy data manipulation; (3) air quality conditions in Europe continue to attract considerable interest from researchers and technicians involved in policy evaluation. The EEAaq package provides users with a set of functions that can be grouped into three categories according to their goal: (1) download data, (2) summarize and aggregate data, and (3) build static and dynamic maps. The download functions allow users to specify LAU or NUTS-level zone information, a specific shapefile, or a list of coordinates identifying the area for which air quality data should be retrieved. The summary functions compute descriptive statistics, data information, and time aggregations. The mapping functions represent the monitoring stations and build spatial interpolation maps.

Data provided by the EEA suffer from poor comparability due to the heterogeneity of national and regional agencies. Depending on the country, pollutants may be measured at different frequencies or not measured at all. Another serious problem is the large number of missing values and gaps in the collected time series. To address this issue, a variety of algorithms are being developed in the EEAaq package that estimate and impute missing values by exploiting the multiple seasonality (intraday, weekly, and annual) of the pollutants. In particular, the Site-Dependent Effect method (Plaia & Bondì, 2006) takes into account only the temporal dynamics of the data; variants of this method are proposed that (1) explicitly model the spatial correlation between monitoring stations; (2) model the potential spatial heterogeneity between time series; and (3) model the positive asymmetry that typically characterizes pollutant concentrations. The imputation algorithms are evaluated through a simulation study based on the actual atmospheric monitoring network installed in Europe in 2023.
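To give a rough idea of what imputation based on seasonality means, here is a minimal Python sketch under strong simplifying assumptions. It is not the Site-Dependent Effect method and not part of the EEAaq API: it simply fills each gap in an hourly series with the mean of the observed values that share the same weekday and hour. The function name `impute_seasonal_mean` and the synthetic series are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def impute_seasonal_mean(timestamps, values):
    """Fill None entries with the mean of the observed values that fall in
    the same (weekday, hour) slot; fall back to the overall mean when a
    slot has no observations. Illustrative only (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    overall = mean(observed)

    # Group observed values by their weekly/intraday slot.
    slots = defaultdict(list)
    for ts, v in zip(timestamps, values):
        if v is not None:
            slots[(ts.weekday(), ts.hour)].append(v)

    filled = []
    for ts, v in zip(timestamps, values):
        if v is None:
            same_slot = slots.get((ts.weekday(), ts.hour))
            filled.append(mean(same_slot) if same_slot else overall)
        else:
            filled.append(v)
    return filled


# Synthetic two-week hourly series (assumed data, not EEA measurements):
# an intraday pattern plus a weekend effect.
start = datetime(2023, 1, 2)  # a Monday
timestamps = [start + timedelta(hours=h) for h in range(24 * 14)]
values = [10.0 + ts.hour + 2.0 * (ts.weekday() >= 5) for ts in timestamps]
values[30] = None  # remove the Tuesday 06:00 reading of the first week

filled = impute_seasonal_mean(timestamps, values)
```

In this toy series the gap is recovered from the same weekday and hour in the second week. The variants described in the abstract go further by also using information from neighbouring stations and by accounting for the skewness of concentrations, which this sketch ignores.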

Joint work with: Riccardo Borgoni, Agostino Tassan Mazzocco (University of Milano-Bicocca)

Bio Sketch:
Paolo Maranzano is a Researcher at the University of Milano-Bicocca, where he also obtained a PhD in Statistics in 2021. His research interests are in the areas of environmental, socio-economic and energy macroeconomics and the analysis of policy impacts.


The event will be held in Italian

Organized by

Dipartimento di Scienze Ambientali, Informatica e Statistica - Gruppo Statistica (Girardi/Prosdocimi)
