
Big Data Analytics A.A. 2019/20

Instructors:

NOTICE: ON NOVEMBER 18TH UNIVERSITY IS CLOSED AND ALL LESSONS ARE SUSPENDED DUE TO WEATHER ALERT

Registration for the course: form teams of 3 or 4 students and register by September 29th here: http://bit.ly/bda_19_20_registration

Preferences for datasets: fill in this form: http://bit.ly/bda_19_20_bidding_datasets. Use the same email address you used for registration. The form contains five choices, and each choice is associated with five datasets. Your preferred dataset is the one selected under “Prima scelta” (first choice), your second preference is the one selected under “Seconda scelta” (second choice), and so on.

Assignment of datasets:

First midterm presentation: The first midterm presentation (data understanding and project proposal) will be on October 24th (Soccer Match Events, Credit Risk, Ted Talks) and October 31st (Injury Forecasting, Car Crash, Reddit).

Second midterm presentation: November 18th (Soccer Match Events, Credit Risk, Ted Talks) and November 25th (Injury Forecasting, Car Crash)

Paper presentation:

Third midterm presentation: December 9th (Soccer Match Events, Credit Risk, Ted Talks) and December 12th (Injury Forecasting, Car Crash)

Learning goals

In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer logs, and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. The analysis of big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:

Module 1: Big Data Analytics and Social Mining

In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:

Module 2: Big Data Analytics Technologies

This module provides students with the technologies to collect, manipulate, and process big data. In particular, the following tools will be presented:

Module 3: Laboratory for Interactive Project Development

During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Discussions and presentations will be held in class at different stages of the project.

Calendar

16/09 (Mod. 1) Introduction to the course, the Big Data scenario: lesson1_introduction_to_the_course.pdf

20/09 NO LESSON

23/09 (Mod. 2) Python for Data Science and the Jupyter Notebook: developing open-source and reproducible data science

27/09 (Mod. 3) Presentation of datasets for projects: http://bit.ly/bda_19_20_datasets

30/09 (Mod. 2) Scikit-learn: programming tools for data mining: http://bit.ly/bda_notebooks_2

03/10 (Mod. 2) Geopandas and scikit-mobility: analyze trajectory data in Python: geopandas.zip

07/10 (Mod. 2) PyMongo and MongoDB: fast querying and aggregation in NoSQL databases: mongodb.zip

10/10 NO LESSON

14/10 (Mod. 1) Soccer data landscape and injury prediction: bda_1920_sports_analytics.pdf

17/10 (Mod. 1) Performance evaluation: from human evaluations to data-driven algorithms: bda_performance_evaluation.pdf

21/10 (Mod. 1) Nowcasting well-being with Big Data: bda_wellbeing.pdf

24/10 (Mod. 3) Project presentation - first group of teams

28/10 NO LESSON

31/10 (Mod. 3) Project presentation - second group of teams

07/11 (Mod. 3) Discussion and group work on projects

11/11 (Mod. 3) Discussion and group work on projects

18/11 (Mod. 3) All lessons suspended due to weather alert

21/11 (Mod. 1) Forecasting influenza with retail market data

25/11 (Mod. 3) Project advancements - first and second group of teams

28/11 (Mod. 1) Explainability and interpretation of machine learning models

02/12 (Mod. 3) Paper presentations

05/12 (Mod. 3) Paper presentations

09/12 (Mod. 3) Project advancements - first group of teams

12/12 (Mod. 3) Project advancements - second group of teams

Final exams (appelli) are scheduled on:

Exam

The following table describes the expected content of a project:

Previous Big Data Analytics websites

Big Data Analytics A.A. 2018/19

Big Data Analytics A.A. 2017/18

Big Data Analytics A.A. 2016/17

Big Data Analytics A.A. 2015/16