Statistics for Data Science (628PP) A.Y. 2021/22

Instructor

Classes

Lessons will also be live-streamed on the Teams space.

Day of Week | Hour | Room
Tuesday | 16:00 - 18:00 | Fib A1
Thursday | 16:00 - 18:00 | Fib C1
Friday | 14:00 - 16:00 | Fib A1

Pre-requisites

Students should be comfortable with most of the calculus topics covered in:

  • [P] J. Ward, J. Abdey. Mathematics and Statistics. University of London, 2013. Chapters 1-8 of Part 1.

Extra lessons reviewing these topics may be scheduled in the first part of the course.

Mandatory Teaching Material

The following textbooks are mandatory:

  • [T] F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. A Modern Introduction to Probability and Statistics. Springer, 2005.
  • [R] P. Dalgaard. Introductory Statistics with R. 2nd edition, Springer, 2008.
  • Selected chapters of other books for advanced topics.

Software

Preliminary program and calendar

Student project

  • The project can be done in groups of at most 3 students.
  • The project (report + code) must be delivered by the end of July.
  • The oral discussion must take place by the September session, and it covers both the project and all topics of the course.
  • The project replaces the written exam, but students must still register for a written exam date in order to fill in the student questionnaire.
  • Groups that are ready for the discussion should send the project to the teacher, together with their available time slots for the oral discussion.

Written exam

There are no mid-terms. The exam consists of a written part and an oral part. The written part consists of open questions and exercises on the topics of the course. Each question is assigned a number of points, for a total of 30. Students who score at least 18 points are admitted to the oral part. Example written exams will be added here. The oral part consists of a critical discussion of the written part, plus open questions and problem solving on the topics of the course.

Registration for exams is mandatory (beware of the registration deadline!): register here

Date | Hour | Room | Notes
9/9/2022 | 9:00 - 11:00 | C1 |

Class calendar

Lessons will be live-streamed on the Teams space and recorded.

To watch the recordings online, you must be connected to the unipi.it VPN. Alternatively, right-click on the link, download the whole file, and watch it locally on your computer.

Slides and R scripts might be updated after the classes to align with the actual content of the lessons and to correct typos. Be sure to download the updated versions.

# | Date | Room | Topic | Teaching material
01 | 15/02 16-18 | A1+Teams | Introduction. Probability and independence. rec01 (.mp4) | [T] Chpts. 1-3; slides01 (.pdf)
02 | 17/02 16-18 | C1+Teams | R basics. rec02 (.mp4) | [R] Chpts. 1, 2.1-2.3; slides02 (.pdf), script02 (.R)
03 | 18/02 14-16 | A1+Teams | Bayes' rule and applications. rec03 (.mp4) | [T] Chpt. 3; slides03 (.pdf), script03 (.R)
04 | 22/02 16-18 | A1+Teams | Discrete random variables. rec04 (.mp4) | [T] Chpts. 4, 9.1, 9.2, 9.4; [R] Chpt. 3; slides04 (.pdf), script04 (.R)
05 | 24/02 16-18 | C1+Teams | Discrete random variables (continued). rec05 (.mp4) |
06 | 25/02 14-16 | A1+Teams | Review: derivatives and integrals. rec06 (.mp4) | [P] Chpts. 1-8; slides06 (.pdf), script06 (.R)
07 | 01/03 16-18 | A1+Teams | R data access and programming. rec07 (.mp4) | [R] Chpts. 2.3, 2.4; script07 (.zip)
08 | 03/03 16-18 | C1+Teams | Continuous random variables. rec08 (.mp4) | [T] Chpts. 5, 9.2-9.4; [R] Chpt. 3; slides08 (.pdf), script08 (.R)
09 | 04/03 14-16 | A1+Teams | Expectation and variance. Computations with random variables. rec09 (.mp4) | [T] Chpts. 7, 8; slides09 (.pdf), script09 (.R)
10 | 08/03 16-18 | A1+Teams | Expectation and variance. Computations with random variables (continued). rec10 (.mp4) |
11 | 10/03 16-18 | C1+Teams | Moments. Functions of random variables. rec11 (.mp4) | [T] Chpts. 9-11; slides11 (.pdf), script11 (.zip)
12 | 11/03 14-16 | A1+Teams | Simulation. rec12 (.mp4) | [T] Chpts. 6.1-6.2; slides12 (.pdf), script12 (.R), script12_sol07 (.R)
13 | 15/03 16-18 | A1+Teams | Power laws and Zipf's law. rec13 (.mp4) | Newman's paper, Sect. I, II, III(A,B,E,F); slides13 (.pdf), script13 (.R)
14 | 17/03 16-18 | C1+Teams | Law of large numbers. The central limit theorem. rec14 (.mp4) | [T] Chpts. 13-14; slides14 (.pdf), script14 (.R)
   | 18/03 14-15 | A1+Teams | Office hours (open Q&A) |
15 | 22/03 16-18 | A1+Teams | Graphical summaries. rec15 (.mp4) | [T] Chpt. 15, [R] Chpt. 4; slides15 (.pdf), script15 (.R)
16 | 24/03 16-18 | C1+Teams | Numerical summaries. rec16 (.mp4) | [T] Chpt. 16, [R] Chpt. 4; slides16 (.pdf), script16 (.R)
17 | 25/03 14-16 | A1+Teams | Data preprocessing in R. Estimators. rec17 (.mp4) | [R] Chpt. 10, [T] Chpts. 17.1-17.3; script17 (.R), dataprep.R
18 | 29/03 16-18 | A1+Teams | Unbiased estimators. Efficiency and MSE. rec18 (.mp4) | [T] Chpts. 19, 20; slides18 (.pdf), script18 (.R)
19 | 31/03 16-18 | Teams | Maximum likelihood estimation. rec19 (.mp4) | [T] Chpt. 21; sdsln.pdf Chpt. 1; slides19 (.pdf), script19 (.R)
20 | 05/04 16-18 | Teams | Linear regression. Least squares estimation. rec20 (.mp4) | [T] Chpts. 17.4, 22; [R] Chpt. 6; sdsln.pdf Chpt. 2; slides20 (.pdf), script20 (.R)
21 | 07/04 16-18 | C1+Teams | Multiple, non-linear, and logistic regression. rec21 (.mp4) | [R] Chpts. 12.1, 13, 16.1-16.2; sdsln.pdf Chpt. 2; slides21 (.pdf), script21 (.R)
22 | 08/04 14-16 | Teams | Multiple, non-linear, and logistic regression (continued). rec22 (.mp4) | [R] Chpts. 12.1, 13, 16.1-16.2; slides22 (.pdf), script22 (.zip)
23 | 12/04 16-18 | Teams | Statistical decision theory. rec23 (.mp4) | sdsln.pdf Chpt. 4; slides23 (.pdf), script23 (.R)
24 | 14/04 16-18 | Teams | Project presentation + Office hours. rec24 (.mp4) | See student project
25 | 21/04 16-18 | Teams | Statistical decision theory (continued). rec25 (.mp4) |
26 | 22/04 14-16 | Teams | Confidence intervals: mean, proportion, linear regression. rec26 (.mp4) | [T] Chpts. 23.1, 23.2, 23.4, 24.3, 24.4; sdsln.pdf Chpt. 3; slides26 (.pdf), script26 (.R)
27 | 26/04 16-18 | Teams | Bootstrap and resampling methods. rec27 (.mp4) | [T] Chpts. 18.1-18.3, 23.3; slides27 (.pdf), script27 (.R)
28 | 28/04 16-18 | C1+Teams | Bootstrap and resampling methods (continued). rec28 (.mp4) |
29 | 29/04 14-16 | A1+Teams | Hypothesis testing. One-sample tests of the mean and application to linear regression. rec29 (.mp4) | [T] Chpts. 25, 26, 27, [R] Chpts. 5.1, 5.2; sdsln.pdf Chpt. 3.3; slides29 (.pdf), script29 (.R)
30 | 04/05 9-11 | Gerace+Teams | Bias in statistics and causal reasoning. Speaker: Prof. Fabrizia Mealli. rec30 (.mp4) | slides30 (.pdf); Optional reading
31 | 04/05 11-13 | Gerace+Teams | Bias in statistics and causal reasoning (continued). Speaker: Prof. Fabrizia Mealli. rec31 (.mp4) |
32 | 10/05 16-18 | A1+Teams | One-sample tests of the mean and application to linear regression (continued). Project tutoring. rec32 (.mp4) |
33 | 12/05 16-18 | C1+Teams | Multiple comparisons. Fitting distributions. rec33 (.mp4) | K-S; slides33 (.pdf), script33 (.R)
34 | 13/05 14-16 | A1+Teams | Two-sample tests of the mean, and F-test. rec34 (.mp4) | [T] Chpt. 28, [R] Chpts. 5.3-5.7; slides34 (.pdf), script34 (.R)
35 | 17/05 16-18 | A1+Teams | Testing correlation/independence. Multiple-sample tests of the mean. rec35 (.mp4) | [R] Chpts. 7, 8; slides35 (.pdf), script35 (.R)
36 | 19/05 16-18 | C1+Teams | Multiple-sample tests of the mean (continued). Project tutoring. rec36 (.mp4) |

Past years

This 9 ECTS course replaces an older 6 ECTS version.

The 6 ECTS version is discontinued. Students who have the 6 ECTS version in their study plan can still take its exam in A.Y. 2021/22, 2022/23, and 2023/24. However, there will be no specific project for the 6 ECTS version.

mds/sds/start.txt · Last modified: 09/09/2022 at 10:33 (3 weeks ago) by Salvatore Ruggieri