Instructors:
Notice: the list of papers to read is available at http://bit.ly/bda_papers. Email Luca Pappalardo your three preferred papers; we will then assign you one of them.
In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. Analyzing big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:
In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:
This module will introduce students to the technologies used to collect, manipulate and process big data. In particular, the following tools will be presented:
During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Teams will discuss and present their work in class at different stages of the project.
17/09 (Mod. 1) Introduction to the course; the Big Data scenario (mod1.introduction_bigdatalandscape_newquestions_.pdf)
21/09 (Mod. 1) Big Data Analytics: new questions to be solved + Presentation of datasets
24/09 (Mod. 2) Python for Data Science: The Jupyter Notebook: developing open-source and reproducible data science
28/09 (Mod. 1) Soccer data landscape and players’ injury prediction
01/10 (Mod. 2) Scikit-learn: programming tools for data mining and analysis.
05/10 (Mod. 1) Analysis and evolution of sports performance
08/10 (Mod. 1) The mobility data landscape and mobility data mining methods
12/10 (Mod. 1) Soccer Data Challenge
15/10 (Mod. 1) Understanding Human Mobility with GPS
19/10 (Mod. 3) Data Understanding and Project Formulation
22/10 (Mod. 2) MongoDB: fast querying and aggregation in NoSQL databases
05/11 (Mod. 2) GeoPandas: analyze geo-spatial data with Python
09/11 (Mod. 1) Predicting well-being from human mobility patterns
12/11 (Mod. 1) Nowcasting influenza with retail market data
16/11 (Mod. 1) Paper presentations
19/11 (Mod. 1) Paper presentations
23/11 (Mod. 3) Mid Term Project Results
26/11 (Mod. 1) The social media data landscape and social media mining methods
30/11 No lessons
03/12 (Mod. 1) Sentiment analysis: examples from Human Migration studies
07/12 (Mod. 1) Discussion on Ethical issues in Big Data Analytics
10/12 (Mod. 3) Final Project results
14/12 (Mod. 3) Final Project results
12/01 14:00 @ CNR (Entrance 20 - Room C36b) - Exam
The two mid-terms account for 40% of the final grade; the remaining 60% comes from the evaluation of the project and its discussion (prepare slides to present your project). If the mid-term results are not sufficient, a final test on the technologies is available.
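As a rough illustration of the 40/60 weighting, the final grade could be computed as in the sketch below. It rests on assumptions the page does not state: that the two mid-terms count equally, and that all marks are on the same scale. The function name `final_grade` and its parameters are hypothetical.

```python
def final_grade(midterm_1, midterm_2, project,
                midterm_weight=0.4, project_weight=0.6):
    """Combine mid-term and project marks with the 40/60 weighting.

    Assumption (not stated in the course page): the two mid-terms
    count equally, and all marks are on the same scale.
    """
    midterm_average = (midterm_1 + midterm_2) / 2
    return midterm_weight * midterm_average + project_weight * project


if __name__ == "__main__":
    # Hypothetical marks, for illustration only.
    print(round(final_grade(24, 28, 30), 2))
```

With these illustrative marks the mid-term average is 26, so the mid-terms contribute 10.4 points and the project contributes 18.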