Academic year
2022/2023
Official course title
Course code
CM0527 (AF:398832 AR:214613)
On campus classes
ECTS credits
Degree level
Master's Degree Programme (DM270)
Educational sector code
1st Semester
Course year
This course is part of the educational activities of the Master in Computer Science that allow students to acquire advanced tools for data analysis. The objective of the course is to provide an introduction to R for the application of computational statistics and simulation methods.
Regular and active participation in the teaching activities offered by the course will enable students to:

1. Knowledge and understanding:
1.1 To know the nonparametric approach to statistical inference.
1.2 To know the basic statistical methods for workload modeling and computer systems performance evaluation.
1.3 To know the basic statistical methods for hardware and software scalability analysis.
1.4 To know how copulas are used to simulate high-dimensional distributions with complex dependence structures.
1.5 To know the basic concepts of statistical quality control of products, both tangible (such as semiconductors) and intangible (such as software process improvement and the software failure process), and of services.

2. Ability to apply knowledge and understanding:
2.1 To know how to apply nonparametric statistical methods.
2.2 To know how to apply the basic statistical methods for workload modeling and computer systems performance evaluation.
2.3 To know how to apply copulas to simulate multivariate distributions.
2.4 To know how to design and apply quality control charts in computer science.
2.5 To know how to use the basic computational and programming tools of the R environment autonomously.

3. Ability to judge:
3.1 To be able to select the most suitable nonparametric statistical methods for the problem at hand.
3.2 To be able to select the most suitable statistical method for workload modeling and computer systems performance evaluation.
3.3 To be able to select and parametrize the most suitable copula to simulate the multivariate distribution of interest.
3.4 To be able to design the most suitable quality control chart for the problem at hand.

4. Communication skills:
4.1 To be able to communicate the results to the various stakeholders.
4.2 To be able to interact with the lecturer and the other students during the theoretical lessons and practical applications.

5. Learning skills:
5.1 To be able to take lecture notes that integrate and clarify the content of the reference teaching material.
5.2 To be able to self-evaluate by answering the lecturer’s questions and solving exercises.
Basic knowledge of descriptive statistics, probability, parametric inference, and coding.
1) Introduction: Computational statistics. Statistical simulation. Statistical hypothesis testing. Parametric and nonparametric tests.

2) Comparing central tendency. Comparing variability. Jointly comparing central tendency and variability. Comparing distributions. Testing for correlation and concordance.

3) Quality 4.0: Statistical quality control in the Industry 4.0 era. Shewhart, CUSUM, and EWMA charts for products and services. Computation of the lower and upper control limits (LCL, UCL) and the average run length (ARL).

4) Application of control charts to electronics manufacturing processes. Managing Software Process Improvement through statistical process control. Monitoring software failure process.

5) Workload modeling for computer systems performance evaluation. Simulating supercomputer workload. Workload evaluation of a stochastic model of a supercomputer based on a modified Kiefer–Wolfowitz recursion. Workload import/export using the Standard Workload Format. Computing a distributional measure of correlation to analyze datasets from supercomputer logs.

6) Statistical methods for software and hardware scalability analysis. Application to benchmark measurements from ray-tracing software. Scalability analysis of a SPEC benchmark.

7) Simulating multivariate distributions with complex dependence structures using elliptical and Archimedean copulas.

8) Bootstrap: variance estimation, tests, confidence intervals.
Open source books on R
Scientific papers
Lecture notes
The achievement of the course objectives is assessed through a written exam. The exam includes questions on computational statistics and its application to computer science, including nonparametric statistical methods, statistical simulation, and quality control charts. The key points for passing the exam are the ability to select the most suitable method for the problem at hand, to apply it correctly, and to interpret the results. At least one mock exam will be held at the end of the course.
The course consists of:
a) theoretical lessons describing the various concepts and methods;
b) practical sessions with data analysis and the discussion and communication of results.
Definitive programme.
Last update of the programme: 08/04/2022