# DidaWiki

### Strumenti Sito

magistraleinformatica:alg2:algo2_13:start

Questa è una vecchia versione del documento!

# Year 2013-2014

Prof. Roberto Grossi

### Announcements

• Calendario orali: martedì 11 febbraio ore 9:00, presso lo studio del docente. Nota: in questa riunione, si decidono le date degli orali per i giorni successivi fino alla fine di febbraio.
• The final list of problems has been updated (Dec.19).
• The class lectures will be (mainly) given in English. Tue 11-13 (aula A1), Wed 16-18 (aula C1), Thu 9-11 (aula N1).
• Office hours: Thu 11-14 (Dipartimento di Informatica)
• Examination dates: Dec. 19, 9:00-11:00, Aula B (compitino).

### Overview

We will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. This course deepens and extends the algorithmic notions of students.

The syllabus is structured to highlight the applicative scenarios in which the studied algorithms and data structures can be successfully applied. The level of detail with which each argument will be dealt with can change year-by-year, and will be decided according to requests coming from other courses and/or specific issues arising in, possibly novel, applicative scenarios.

The advanced nature of this course focuses on developing algorithmic design skills. The students are exposed to complex problems that require a significant effort in problem solving. One “brainstorming” lecture per week is devoted to a full immersion in this activity, and it is highly recommend to attend it. The purpose is not that of teaching/learning further N algorithms, but to refine students' skills. The final written exam will be based on the topics discussed during the “brainstorming” lectures.

We like students that make mistakes in a constructive way, since this is the starting point of our brainstorming to drive their reasoning paths, thus learning by mistakes. We are interested in the route that leads to the algorithmic solution, rather than the solution itself. Regarding mistakes and intelligence, Alan Turing wrote in 1947 that “… if a machine is expected to be infallible, it cannot also be intelligent”. This is a motto for this class, and an invitation to express your own ideas even if they could be apparently wrong.

### Topics

Please note that several topics are the outcome of recent advancement in algorithms and data structures, and thus most of the course material consists in research papers or book chapters.

Date Topics References and notes
Sept. 24 Introduction to the course. Problem solving: ideas and examples. -

#### Enumeration, hardness and approximation of some combinatorial problems

NP-hard problems are important but difficult to solve and no deterministic polynomial algorithms are currently known for them. Even basic problems, such as finding a simple path of vertices in a graph, are NP-hard. We discuss how to attack these problems (a) by counting and listing their solutions where the cost is proportional to the output (it can be exponential in the worst case but we only pay for what we get) and (b) by approximating their solutions with polynomial-time algorithms when the problem requires to minimize a cost or maximize a benefit. Most of the addressed problems are on graphs, which are popular representations for modern networked data.

Date Topics References and notes
Sept. 25 Counting and listing all triangles in a graph. Section 10.6
Sept. 26 Listing all maximal cliques in a graph. wikipedia paper,sect.3
Oct. 1 All-pairs shortest paths. Listing all paths. chap.25, par.25.1, 25.3
Oct. 2 Problem solving. Listing all complete graphs K_4 and s-t paths. list of problems
Oct. 3 Feedback vertex/edge set: connection with spanning trees, vertex covers and independent sets; listing all the minimal feedback vertex sets in directed graphs. sect. 1, 2
Oct. 8-10 Class canceled. Homework: Review notions of polynomial certificate, polynomial reduction/transformation and NP-completeness. chapt.9: par. 9.1-9.2 (Italian)
Oct. 15 Reviewed notions of NP-hardness, Eulerian vs Hamiltonian paths in graphs. Approximation algorithms. Difficulty of approximation for the general Travelling salesman Problem (TSP). chapt.9: par. 9.3-9.4 (Italian)
Oct. 16 Problem solving. Listing all vertex-to-vertex shortest paths. Proving that the metagraph of all the the minimal feedback vertex sets is strongly connected. list of problems
Oct. 17 Polinomial algorithms for the 2-approximation of the metric TSP and of the Minimum Vertex Cover. chapt.8: par. 8.10-8.11 (Italian)
Oct. 22 Knapsack problem: bad example for greedy and its refinement for 2-approximation. Approximation for Min Bin Packing using Next Fit and First Fit Decreasing strategies. chapt.2: par. 2.1.1, par. 2.2.2
Oct. 23 Problem solving. Linear-time algorithm for finding an Eulerian cycle. 2-approximation for MAX-SAT. list of problems
Oct. 24 Density-approximation for Maximum Independent Set (MIS). chapt.2: par.2.1.2
Oct. 29 Review: polynomial reductions among problems, NP vs #P. Proof of Cook-Levin theorem that SAT is NP-complete. wikipedia Th.7.2
Oct. 30 Polynomial-time reductions: SAT to Clique; Clique to Vertex Cover; Partition to Knapsack (easy). Class by prof. Pagli). SAT2Clique Clique2VC
Oct. 31 Problem solving. Polynomial-time 2-approximation for MAX-CUT e MAX-SAT. list of problems

#### Randomization, hashing and data streaming

Randomization is now a common powerful tool to solve large-scale problems. After introducing the concept of randomized algorithms and hashing, we apply them to solve data streaming problems, a field emerged in the last decade. Here data flow as a stream and one-pass algorithms with limited memory can process it. We focus on the count-min sketch paradigm and its applications.

Date Topics References and notes
Nov. 12 Randomized quicksort and two analyses for its expected cost. one-hour class, par.5.1par. 7.3
Nov. 13 Basic randomized techniques: Indicator random variables. Markov's inequality. K-wise indepent hash functions. Notes
Nov. 14 Data Streaming Model. Motivations and examples. Count-Min Sketches. sect.1 Site Notes
Nov. 19 Class canceled (teacher abroad)
Nov. 20 Count-Min Sketches (continued). sect.2,3 Site
Nov. 21 Chernoff bounds, indicator variables, and applications of count-min sketches. par.4.1,Th.4.1 (proof is optional) sect.4 Notes

#### Algorithms for massive data sets and hierarchical memories

Massive data sets are one of the key elements in our so-called big data society. In this scenario the design and analysis of algorithms and data structures adopt the external and cache-oblivious models, which take into account the memory hierarchy that provides performance to modern computers. We study basic problems on sorting and searching in these models, comparing the results with the classical RAM model of computation.

Date Topics References and notes
Nov. 26 External memory model (EMM). Case study for sequential access versus random access. Description of the model. CPU complexity and I/O complexity. Cost of scanning and sorting. External binary search and binary mergesort. Introduction, chapt.2 and 3 (set D=1 disks)
Nov. 27 Problem solving. Pairwise independent hash functions. list of problems
Nov. 28 K-way distribution sort. Scan-and-sort paradigm and mapReduce/Hadoop. Introduction, chapt.2 and 3 (set D=1 disks), chapt.5 (set D=1 disks, no randomization, till Sect.5.1.1 included) Slides Slides (thanks to M. Idini)
Dec. 3 Lower bounds for sorting and permuting in the external memory model. Introduction, chapt.2 and 3 (set D=1 disks), chapt.5 (set D=1 disks, no randomization, till Sect.5.1.1 included NotesNotes
Dec. 4 Problem solving. K-way merging, permuting. list of problems
Dec. 5 Suffix sorting. DC-3 algorithm for RAM model and EMM. DC3 algorithm (up to Section3)
Dec. 10 Cache-oblivious (CO) model. Ideal cache. Motivations and simple examples: multiple scanning, inverting an array. van Emde Boas (vEB) layout of complete binary (search) trees. Sect.4.1 Notes
Dec. 11 Problem solving. Filling the details of the DC3 suffix sorting. Suffix array searching. Implicit van Emde Boas layout of binary complete search trees. list of problems
Dec. 12 Text indexing and searching: suffix trees and inverted lists. wikipedia wikipedia pp.124-128,131-135,139-143
Dec. 17 Text compression: inverted lists and Lempel-Ziv algorithms. sect.3.2-3.3 (up to p.119) sect.2.6 (up to p.81)
##### Excercises discussed in class

This is the final list of problems.

##### Examination outcomes (in Italian)

Risultati del 19.12.2013

   438014    24
452465    29
465077    27
505530    28
505534    28
508454    29
509023    30

Risultati del 20.01.2014

   438014   29
438905   27
441808   29
443831   29
445942   28
464665   29
499487   26
505530   28
508463   20

Risultati del 07.02.2014

   451325   ins
472005   30
490483   28
508463   30
magistraleinformatica/alg2/algo2_13/start.1391867647.txt.gz · Ultima modifica: 08/02/2014 alle 13:54 (8 anni fa) da Roberto Grossi