Strumenti Utente

Strumenti Sito


magistraleinformaticanetworking:ae:ae2014:start

Algorithm Engineering A.A. 2014-2015

Thanks to P. Sanders Teacher: Paolo Ferragina

CFU: 9 (second semester).

Course ID: 531AA.

Language: English.

Degree: Master degreein CS&Networking in collaboration with the Scuola Normale Superiore “S. Anna”.

Lectures schedule: 11:00-13:00 at Monday (N1), Tuesday (C), Wednesday (N1).

Question time: After lectures and by appointment.

News: about this course will be distributed via a Tweeter-channel.

Official lectures schedule: The schedule and content of the lectures is available below and with the official registro.

Goals

In this course we will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. These algorithmic tools will be designed and analyzed in several models of computation— such as RAM, 2-level memory, cache-oblivious, streaming— in order to take into account the architectural features and the memory hierarchy of modern PCs. Moreover, we will add to such theoretical analysis several other engineering considerations spurring from the implementation of the proposed algorithms and from experiments published in the literature.

Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces algorithmic solutions aimed at minimizing the use of some computational resources like time, space, communication, I/O, energy, etc. Some of these solutions will be discussed also at an experimental level, in order to introduce proper engineering and tuning tools for algorithmic development.

Exam

Dates Room Testo
05-06-2015 L1 no participants
29-06-2015 L1 text
20-07-2015 L1 text
10-09-2015 L1 text
11-01-2016 L1 text
01-02-2016 L1 text

Background

If you wish to refresh your mind on basic Algorithms and Data Structures, I suggest you to look at the well-known book Introduction to Algorithms by Cormen-Leiserson-Rivest-Stein, third edition. There you'd look at the chapters 2, 3, 4, 6, 7, 8, 10, 11 (no perfect hash), 12 (no randomly built), 15 (no optimal BST), 18, 22 (no strongly connected components). Also, you could look at the Video Lectures by Erik Demaine and Charles Leiserson, specifically Lectures 1-7, 9-10 and 15-17.

Books, Notes, etc.

I'll use just the old-fashioned blackboard.

Most of the content of the course will be covered by some notes I wrote in these years; for some topics I'll use parts of papers/books.

Lectures

Date Lecture Biblio
23/02/15 Introduction to the course. Models of computation: RAM, 2-level memory. The role of the Virtual Memory system. An example of algorithm analysis: the sum of n numbers. Chap. 1 of the notes.
24/02/15 Maximum sub-array sum in 1d and its variations. Chap. 2 of the notes, about sect 2.5 read only the first page and half up to the algorithmic solution of the density-based problem.
25/02/15 Random sampling: disk model, streaming model and known length, streaming model and unknown length. Chap. 3 of the notes.
Students are warmly invited to refresh their know-how about: Divide-and-conquer technique for algorithm design and Master Theorem for solving recurrent relations; and Binary Search Trees Lecture 2, 9 and 10 of Demaine-Leiserson's course at MIT
02/03/15 List Ranking: definition of the problem, RAM solution, difficulties on disk, pointer-jumping technique, I/O-efficient simulation.
03/03/15 List Ranking: divide-and-conquer approach with its analysis, Master Theorem (recap), randomised-algorithm for generating an independent set. Sorting atomic items: sorting vs permuting, comments on the time and I/O bounds, binary merge-sort and its bounds. Chap. 4 of the notes.
04/03/15 Snow Plow and compression. Multi-way mergesort. Chap. 5 of the notes
09/03/15 Lower bounds for sorting and permuting on disk. The case of D>1 disks: non-optimality of multi-way mergesort (with comments and analysis), the disk-striping technique.
12/03/15 Quicksort: recap on best-case, worst-case, average-case with analysis. Various partitioning strategies: two-way or three-way. Selection of k-tk ranked item in linear average time (with proof).
16/03/2015 Quicksort: Multi-way for disk, multi-pivot selection (with proof). Bounded-space quicksort. 3-way partition for better in-memory quicksort.
17/03/2015 String sorting: comments on the difficulty of the problem on disk, lower bound. LSD-radix sort with proof of time complexity and correctness. Chap. 6 of the notes (updated 20/3/15).
18/03/2015 MSD-radix sort, trie data structure, Multi-key Quicksort.
19/03/2015 Ternary search tree. Interpolation search.
23/03/2015 Prefix search: definition of the problem, solution based on arrays, Front-coding, two-level indexing. Chap. 7 of the notes
24/03/2015 More on two-level indexing of strings: review of in-memory level based on array of string pointers and uncompacted tries. Solution based on compacted trie and Patricia trie, with analysis of space, I/Os and time of the prefix search. This year we don't study locality preserving FC.
25/03/2015 Randomised data structures: Treaps and Skip lists (with proofs and comments on string items). Notes by others. Study also Theorems and Lemmas. see Demaine's lecture num. 12 on skip lists.
Students are warmly invited to refresh their know-how about: hash functions and their properties; hashing with chaining. Lectures 7 of Demaine-Leiserson's course at MIT
26/03/2015 Hashing and dictionary problem: direct addessing, simple hash functions, hashing with chaining, uniform hashing and its computing/storage cost, universal hashing (definition and properties). Chap. 13 of the notes. Do not study the proof of Theorem 13.3 (study the statement), and the whole theorem 13.4.
30/03/2015 Two examples of Universal Hash functions: one with correctness proof, the other without. Minimal ordered perfect hashing: definition, properties, construction, space and time complexity. With exercises on previous and today topics. Theo 13.5 without proof, but with statement.
01/04/2015 As a consequence of the email of Prof. Danelutto of yesterday, students are invited to read the attached exercises, with my solutions, and contact me via skype from 11 to 13 [nick: paolo.ferragina]. Exercises and their solutions.
15/04/2015 Perfect hash table (with proof) and Cuckoo hashing (with proof). Note that Chap. 13 has been changed.
16/04/2015 Bloom Filter: properties, construction, query operation (with proofs). Spectral Bloom Filter. Lower bound and comparison with BF. No Compressed BF.
17/04/2015 Exercises
20/04/2015 Integer coding: the problem and some considerations. Gamma, delta, golomb/rice, PForDelta. Chap. 9 of the notes
21/04/2015 More on integer encoders: (s,c)-codes, variable-byte, Interpolative, Elias-Fano.
22/04/2015 Prefix-free codes, notion of entropy, optimal codes. Huffman, with optimality (proof). Canonical Huffman: construction, properties, decompression algorithm. Chap. 10 of the notes.
23/04/2015 Exercises
27/04/2015 Arithmetic coding: properties, algorithm and proofs; with examples. No PPM and Range coding in Chap 10 of the notes
28/04/2015 Dictionary-based compressors: properties and algorithmic structure. LZ77, LZSS and gzip. LZ8 and LZW Chap 11, no par 11.4.
29/04/2015 Burrows-Wheeler Transform (forward and backward), Move-To-Front, Run-Length-Encoding, bzip2. Chap. 12 of the notes. Study also Theorem 12.5, but no sections 12.4 and 12.5.
29/04/2015 Exercises
06/05/2015 Substring search: definition, properties, reduction to prefix search. The Suffix Array. Binary searching the Suffix Array: p log n (no pages 8.4-8.5). How to compute the lcp-array. Suffix array construction (theo 8.1, but no Skew algorithm, and no Scan-based algorithm). Text mining use of suffix arrays. Chap. 8 of the notes.
07/05/2015 Suffix Trees: properties, structure, pattern search, space occupancy. Construction of Suffix Trees from Suffix Arrays and LCP arrays, and vice versa. LZ77-parsing via suffix trees. NO McCreight's construction
08/05/2015 Exercise
11/05/2015 Range-minimum-query over arrays and Lowest common ancestor over trees, reductions and various solutions. Application to k-mismatch problem. Read 8.4 of the notes and these slides
12/05/2015 Recap: graphs and their representations, BFS and DFS visits, computing shortest-path over unary-length edges, topological sort. Chap 23.1-23.4 of CLR
18/05/2015 Minimum Spanning Tree problem: definition, Greedy approach, Kruskal's and Prim's algorithms. Chap 23 of CLR
19/05/2015 Algorithms for external and semi-external computation of MST. Also use of MST for clustering and for the bottleneck shortest path problem (no proof). A part of the Mehlhorn-Sander's book, look at Sect 11.5.
25/05/2015 Shortest Path problem: Dijkstra's algorithm. Steiner Tree problem: definition and a 2-approximate solution. Traveling Salesman Tour problem: definition and a 2-approximate solution. Part of the chap on Shortest Path of CLR. Also sect 11.5 and 11.6 of Sanders-Mehlhorn's book.
26/05/2015 Exercises on Graph Problems
magistraleinformaticanetworking/ae/ae2014/start.txt · Ultima modifica: 01/02/2016 alle 08:56 (8 anni fa) da Paolo Ferragina