Statistical Methods for Data Science A.Y. 2019/20



Dates are preliminary.

Day of Week Hour Room
Tuesday 16:00 - 18:00 Fib-L1 (Distance Learning from 10.03)
Wednesday 9:00 - 11:00 Fib-A1 (Distance Learning from 11.03)


Students should be comfortable with most of the calculus topics covered in:

  • [P] J. Ward, J. Abdey. Mathematics and Statistics. University of London, 2013. Chapters 1-8 of Part 1.

Extra lessons to refresh these notions may be scheduled in the first part of the course.

Mandatory Teaching Material

The following textbooks are mandatory:

  • [T] F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. A Modern Introduction to Probability and Statistics. Springer, 2005.
  • [R] P. Dalgaard. Introductory Statistics with R. 2nd edition, Springer, 2008.


Preliminary program and calendar

Student project

Written exam

There are no mid-terms. The exam consists of a written part and an oral part. The written part consists of open questions and exercises on the topics of the course. Each question is assigned a number of points, and the points sum to 30. Students are admitted to the oral part if they score at least 18 points on the written part. Example written texts: sample1, sample2. The oral part consists of a critical discussion of the written part, plus open questions and problem solving on the topics of the course.
Online exams: during the COVID-19 restrictions, both the written part and the oral part will be held online. For the written part, students will connect to Google Meet (room code: 500PP) and turn on both microphone and webcam. Each sheet must include the student's name, surname, and student ID, and must be signed. A picture of the sheets must be sent to ruggieri [at] di [dot] unipi [dot] it.

Registration for exams is mandatory (the deadline is 5 days before the exam): register here

Date Hour Room
24/7/2020 9:00 - 11:00 Online exam
7/9/2020 11:00 - 13:00 Online exam

Class calendar

Distance-learning lessons: see instructions for Google Meet and use the room code: 500PP.

# Date Hour Room Topic Learning material
1 25.02 16:00-18:00 L1 Introduction. Probability and independence. [T] Chpts. 1-3
2 26.02 9:00-11:00 A1 R basics. [R] Chpts. 1,2.1,2.2 slides script1.R
3 03.03 16:00-18:00 L1 Discrete random variables. [T] Chpt. 4 [R] Chpt. 3 script2.R
4 04.03 9:00-11:00 A1 Continuous random variables. Simulation. [T] Chpts. 5, 6.1-6.2 [R] Chpt. 3 script3.R
5 10.03 16:00-18:00 Distance-learning Refresher: derivatives and integrals. rec01 audio-video (.flv) [P] Chpts. 1-8 scriptMath.R
6 11.03 9:00-11:00 Distance-learning Expectation and variance. R data access. rec02 audio-video (.flv) [T] Chpt. 7 [R] Chpt. 2.4 script4.R
7 17.03 16:00-18:00 Distance-learning R programming. Project presentation. rec03 audio-video (.flv) and project info audio-video (.flv) [R] Chpt. 2.3 exercise.R
8 18.03 9:00-11:00 Distance-learning Project presentation. Power laws and Zipf laws. rec04 audio-video (.flv) Newman's paper Sect I, II, III(A,B,E,F) script6.R
9 24.03 16:00-18:00 Distance-learning Computations with random variables. Joint distributions. rec05 audio-video (.flv) [T] Chpts. 8-9
10 25.03 9:00-11:00 Distance-learning Covariance. Sum of random variables. rec06 audio-video (.flv) [T] Chpts. 10-11 script8.R
11 31.03 16:00-18:00 Distance-learning Law of large numbers. The central limit theorem. rec07 audio-video (.flv) [T] Chpts. 13-14 script9.R
12 01.04 9:00-11:00 Distance-learning Graphical summaries. rec08 audio-video (.flv) [T] Chpt. 15 script10.R
13 07.04 16:00-18:00 Distance-learning Numerical summaries. Data preprocessing in R. Q&A on the project. rec09 audio-video (.flv), project data audio-video (.flv) [T] Chpt. 16, [R] Chpts. 4,10 script11.R, dataprep.R
14 08.04 9:00-11:00 Distance-learning Unbiased estimators. Efficiency and MSE. rec10 audio-video (.flv) [T] Chpts. 17.1-17.3, 19, 20 script12.R
XX 15.04 9:00-11:00 No lesson on this date. Students work on the project on their own.
15 21.04 16:00-18:00 Distance-learning Maximum likelihood. Fisher information. rec11 audio-video (.flv) [T] Chpt. 21 notes1.pdf script13.R
16 22.04 9:00-11:00 Distance-learning Simple linear and polynomial regression. Least squares. rec12 audio-video (.flv) [T] Chpts. 17.4,22 [R] Chpts. 6,12.1 script14.R
17 28.04 16:00-18:00 Distance-learning Multiple, non-linear, and logistic regression. rec13 audio-video (.flv) [R] Chpts. 13, 16.1-16.2 notes2.pdf script15.R
18 29.04 9:00-11:00 Distance-learning Confidence intervals: Gaussian, Student's t, large sample method. rec14 audio-video (.flv) [T] Chpts. 23.1,23.2,23.4, 24.3,24.4 script16.R
19 05.05 16:00-18:00 Distance-learning Confidence intervals in linear regression. Empirical bootstrap. Application to confidence intervals. rec15 audio-video (.flv) [T] Chpts. 18.1,18.2,23.3 notes2.pdf script17.R
20 06.05 9:00-11:00 Distance-learning Parametric bootstrap. Hypothesis testing. rec16 audio-video (.flv) [T] Chpts. 18.3,25 script18.R
21 12.05 16:00-18:00 Distance-learning One-sample t-test and application to linear regression. rec17 audio-video (.flv) [T] Chpts. 26-27, [R] Chpts. 5.1,5.2 notes2.pdf script19.R
22 13.05 9:00-11:00 Distance-learning Goodness of fit: chi-square, K-S. Fitting power laws. rec18 audio-video (.flv) K-S script20.R
XX 19.05 16:00-18:00 No lesson on this date. Students work on the project on their own.
23 20.05 9:00-11:00 Distance-learning Hypothesis testing: F-test, comparing two samples. rec19 audio-video (.flv) [T] Chpt. 28, [R] Chpts. 5.3-5.7 script21.R
XX 26.05 16:00-18:00 No lesson on this date. Students work on the project on their own.
24 27.05 9:00-11:00 Distance-learning Project tutoring. rec20 audio-video (.flv)

Previous years

mds/smd/start.txt · Last modified: 10/07/2020 at 10:25 (3 days ago) by Salvatore Ruggieri