Big Data Analytics A.A. 2018/19

Instructors:

Learning goals

In our digital society, nearly every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer-logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. Analyzing big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:

  • to introduce the emerging field of big data analytics and social mining;
  • to introduce the technological landscape of big data, including programming tools to analyze big data, query NoSQL databases, and perform predictive modeling;
  • to guide students in developing an open-source and reproducible big data analytics project, based on the analysis of real-world datasets.

Module 1: Big Data Analytics Technologies

This module will introduce students to the technologies used to collect, manipulate and process big data. In particular, the following tools will be presented:

  • Python for Data Science
  • The Jupyter Notebook: developing open-source and reproducible data science
  • MongoDB: fast querying and aggregation in NoSQL databases
  • GeoPandas: analyze geo-spatial data with Python
  • Scikit-learn: programming tools for data mining and analysis

Module 2: Big Data Analytics and Social Mining

In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:

  • Challenges in Big Data Analytics: opportunities, risks, and technological challenges in big data analytics.
  • Soccer Analytics: analysis of massive “soccer-logs”, records that detail all the events occurring during soccer games. Example of using MongoDB to store soccer-logs and find relevant information. Presentation of data-driven studies and algorithms in soccer analytics.
  • Mobility Analytics: analysis of call detail records from mobile phones and GPS traces from private cars. Example of using GeoPandas to clean and analyze mobility data. Presentation of data-driven studies and algorithms in mobility analytics.
  • Network Analytics: analysis of networked data describing social interactions or collaborations. Example of using NetworkX to analyze network data. Presentation of data-driven studies and algorithms in network analytics.

Module 3: Developing a Big Data analytics project

During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering three thematic areas: sports analytics, mobility analytics and network analytics. Class discussions and presentations will take place at different stages of each project.

Previous Big Data Analytics websites

bigdataanalytics/bda/start.txt · Last modified: 12/08/2018 at 08:15 (8 days ago) by Luca Pappalardo