magistraleinformatica:ir:ir15:start

Differences

These are the differences between the selected revision and the current version of the page.

magistraleinformatica:ir:ir15:start [25/11/2015 at 14:29 (10 years ago)] – [Content of the Lectures] Paolo Ferragina
magistraleinformatica:ir:ir15:start [02/11/2016 at 09:15 (9 years ago)] (current version) – [Exam] Paolo Ferragina
Line 28: Line 28:
  
 ^ Date         ^ Room ^ Text ^ ^ Date         ^ Room ^ Text ^
-| 11/01/2016 |  L1 (9:00)  |  | +| 11/01/2016 |  L1 (9:00)  | {{:magistraleinformatica:ir:ir15:ir160111.docx|text}} |
-| 01/02/2016 |  L1 (9:00)  |  | +| 01/02/2016 |  L1 (9:00)  | {{:magistraleinformatica:ir:ir15:ir160201.docx|text}} |
 +| 27/06/2016 |  L1 (9:00)  | {{:magistraleinformatica:ir:ir15:ir160627.docx|text}} | 
 +| 19/07/2016 |  L1 (9:00)  | no participants | 
 +| 02/09/2016 |  L1 (9:30)  | {{:magistraleinformatica:ir:ir15:ir160902.docx|text}} |
  
 =====  Books ===== =====  Books =====
Line 40: Line 43:
 ^ Date         ^ Argument ^ Refs ^  ^ Date         ^ Argument ^ Refs ^ 
 | 22/09/2015 | Introduction to the course: modern IR, not just search engines! Boolean retrieval model. Document-term matrix. Inverted list: dictionary + postings. How to implement AND, OR and NOT queries, and their time complexities. The structure of a search engine. | {{:magistraleinformatica:ir:ir15:lect_01-intro_new.ppt|Slides}}\\ Chap 1 of [MRS] | | 22/09/2015 | Introduction to the course: modern IR, not just search engines! Boolean retrieval model. Document-term matrix. Inverted list: dictionary + postings. How to implement AND, OR and NOT queries, and their time complexities. The structure of a search engine. | {{:magistraleinformatica:ir:ir15:lect_01-intro_new.ppt|Slides}}\\ Chap 1 of [MRS] |
-| 24/09/2015 | Web search engines: difficulties in their design and their epochs. The Web graph: some useful structural properties (such as Boow Tie). Crawling: problems and algorithmic structure. An example: Mercator.  | {{:magistraleinformatica:ir:ir15:lect_02-crawling_and_storage_part_a_.ppt|Slides}},\\ Sections 19.1, 19.2, 19.4, 20.1, 20.2 of [MRS]. | +| 24/09/2015 | Web search engines: difficulties in their design and their epochs. The Web graph: some useful structural properties (such as Bow Tie). Crawling: problems and algorithmic structure. An example: Mercator.  | {{:magistraleinformatica:ir:ir15:lect_02-crawling_and_storage_part_a_.ppt|Slides}},\\ Sections 19.1, 19.2, 19.4, 20.1, 20.2 of [MRS]. |
 | 29/09/2015 | A few useful algorithmic techniques for crawling the Web (and not only that!): Bloom Filter and Consistent Hashing. | {{:magistraleinformatica:ir:ir15:lect_02-crawling_and_storage_part_b_.ppt|Slides}}.\\ Sect 20.3 and 20.4 of [MRS]. For questions on Bloom Filters, see {{:magistraleinformatica:ir:ir12:reading-bloomfilter.pdf|paper}}. | | 29/09/2015 | A few useful algorithmic techniques for crawling the Web (and not only that!): Bloom Filter and Consistent Hashing. | {{:magistraleinformatica:ir:ir15:lect_02-crawling_and_storage_part_b_.ppt|Slides}}.\\ Sect 20.3 and 20.4 of [MRS]. For questions on Bloom Filters, see {{:magistraleinformatica:ir:ir12:reading-bloomfilter.pdf|paper}}. |
 | 01/10/2015 | Compressed storage of the Web graph. Compressed storage of documents: LZ-based compression. | {{:magistraleinformatica:ir:ir15:lect_03-compression_docs_and_graph_new_.ppt|Slides}},\\ Sect 19.1 and 19.2 of [MRS], and Sect 1.1 and 2.2 of {{:magistraleinformatica:ir:ir15:lz-bwt.pdf|Ferragina's notes}}. |  | 01/10/2015 | Compressed storage of the Web graph. Compressed storage of documents: LZ-based compression. | {{:magistraleinformatica:ir:ir15:lect_03-compression_docs_and_graph_new_.ppt|Slides}},\\ Sect 19.1 and 19.2 of [MRS], and Sect 1.1 and 2.2 of {{:magistraleinformatica:ir:ir15:lz-bwt.pdf|Ferragina's notes}}. | 
Line 57: Line 60:
 | 19 and 20\\ 11/2015 | Lab on Lucene.\\ You need to configure your laptop as follows: Linux system (may be a virtual machine) with debian-like OS (e.g. ''Ubuntu 15.10''), working Internet connection from the Polo's room, at least 5GB of free disk and 2GB RAM,  ''httrack'' and ''pylucene'' installed (that can be done with ''sudo apt-get update'' and ''sudo apt-get install python-lucene httrack'').\\ In collaboration with Marco Cornolti (cornolti@di.unipi.it).  | [[https://docs.google.com/presentation/d/1iXjtu_AduB-_CqsV2ye8M0q9_BHiCosKv9XcucZU-No/edit?usp=sharing|Slides (crawling)]] [[https://docs.google.com/presentation/d/1JlZKfWW85Q5atTLPRieWWOmpEKZBWc1Zx6PL2ReywWY/edit?usp=sharing|Slides (Lucene)]] |  | 19 and 20\\ 11/2015 | Lab on Lucene.\\ You need to configure your laptop as follows: Linux system (may be a virtual machine) with debian-like OS (e.g. ''Ubuntu 15.10''), working Internet connection from the Polo's room, at least 5GB of free disk and 2GB RAM,  ''httrack'' and ''pylucene'' installed (that can be done with ''sudo apt-get update'' and ''sudo apt-get install python-lucene httrack'').\\ In collaboration with Marco Cornolti (cornolti@di.unipi.it).  | [[https://docs.google.com/presentation/d/1iXjtu_AduB-_CqsV2ye8M0q9_BHiCosKv9XcucZU-No/edit?usp=sharing|Slides (crawling)]] [[https://docs.google.com/presentation/d/1JlZKfWW85Q5atTLPRieWWOmpEKZBWc1Zx6PL2ReywWY/edit?usp=sharing|Slides (Lucene)]] | 
 | 24/11/2015 | Performance measures: precision, recall, F1 and user happiness. Random Walks. Link-based ranking: pagerank and personalized pagerank. | {{:magistraleinformatica:ir:ir15:lect_11-web_ranking.ppt|Slides}}.\\ Chap 8 and 21 from [MRS].  | | 24/11/2015 | Performance measures: precision, recall, F1 and user happiness. Random Walks. Link-based ranking: pagerank and personalized pagerank. | {{:magistraleinformatica:ir:ir15:lect_11-web_ranking.ppt|Slides}}.\\ Chap 8 and 21 from [MRS].  |
-| 26/11/2015 | CoSim Rank and HITS. Projections to smaller spaces: Latent Semantic Indexing (LSI). | {{:magistraleinformatica:ir:ir15:lect_12-lsi_and_random_proj.ppt|Slides}}.\\ Chap 18 from [MRS]. |  +| 26/11/2015 | CoSim Rank and HITS. Recommendation systems and Web advertising. | {{:magistraleinformatica:ir:ir15:lect_13-applications.ppt|Slides}} only. |   
-| 27/11/2015 | Random Projections: Johnson-Lindenstrauss Lemma and its applications. Recommendation systems and Web advertising. | Slides only. | +| 27/11/2015 | Projections to smaller spaces: Latent Semantic Indexing (LSI). Random Projections: Johnson-Lindenstrauss Lemma and its applications.  | {{:magistraleinformatica:ir:ir15:lect_12-lsi_and_random_proj.ppt|Slides}}.\\ Chap 18 from [MRS]. |
-| 01/12/2015 |  |  | +| 01/12/2015 | Semantic-annotation tools: basics, Wikipedia structure, TAGME and other annotators. How to evaluate those systems. Various approaches to text representation. | {{:magistraleinformatica:ir:ir15:lect_14-topic_annotators.pptx|Slides}}. |
-| 03/12/2015 |  |  | +| 03/12/2015 | More on topic annotators and their applications. Clustering: flat, hierarchical, soft, hard. K-means, optimal bisect, hierarchical - max, min, avg, centroid. | {{:magistraleinformatica:ir:ir15:lect_15-clustering.ppt|Slides}}.\\ Chap 16 and 17 of [MRS].  |
-| 10/12/2015 |  |  | +| 10/12/2015 | Locality-sensitive hashing: basics, Hamming distance, Jaccard similarity, sketch of the main theorem. | {{:magistraleinformatica:ir:ir15:lect_16-lsh.ppt|Slides}}.\\ Sect 19.6 of [MRS] |
-| 11/12/2015 | Extra lecture (11:00-13:00, room C) |  |  +| 11/12/2015 | Exercise |  |  
-| 15/12/2015 |  |  | +| 15/12/2015 | Exercise |  | 
  
-=====  Last year lectures (just for backup!!!) ===== 
  
-^ Date         ^ Argument ^ Refs ^  
-|  | More on Rank and Select on binary arrays. Rank and Select on general arrays: the Wavelet Tree. Binary tree encoding and navigation. |  |  
-|  | Suffix arrays: data structure and search operations. Text mining over suffix arrays. |  |  
-|  | How to compute the SCC I/O-efficiently, the size of the web and the estimation of the relative sizes of SE. | Chap. 8, 19.1 and 19.2, 19.5 from [MRS].  | 
-|  | Semantic-annotation tools: basics and TAGME. |  | 
-|  | Semantic-annotation tools: advanced and some applications. |  |
-|  | Extra lecture: 9-11, M1: Clustering: flat, hierarchical, soft, hard. K-means, optimal bisect, hierarchical - max, min, avg, centroid. | Chap 16 and 17 of [MRS], |
-|  | Exercises |  |
magistraleinformatica/ir/ir15/start.1448461766.txt.gz · Last modified: 25/11/2015 at 14:29 (10 years ago) by Paolo Ferragina
