This is an old revision of the document!


Natural Language Processing (Elaborazione del Linguaggio Naturale)

Master's degree (Laurea Magistrale): Computer Science.

Instructor: Giuseppe Attardi. Office hours: Wednesday, 11:00

Teaching assistant: Stefano Dei Rossi

Schedule

Day      Hour   Room
Tuesday  16-18  G1, Polo Fibonacci
Friday   11-13  I1, Polo Fibonacci

Prerequisites

  1. Probability and statistics
  2. Programming

Syllabus

  1. Introduction
    1. History
    2. Present and Future
    3. NLP and the Web
  2. Mathematical Background
    1. Probability and Statistics
    2. Language Model
    3. Hidden Markov Model
    4. Viterbi Algorithm
    5. Generative vs Discriminative Models
  3. Linguistic Essentials
    1. Part of Speech and Morphology
    2. Phrase structure
    3. Collocations
    4. n-gram Models
    5. Word Sense Disambiguation
  4. Preprocessing
    1. Encoding
    2. Regular Expressions
    3. Segmentation
    4. Tokenization
    5. Normalization
  5. NLTK
    1. Introduction to Python
    2. Overview of NLTK libraries
  6. Classification
    1. Machine Learning
    2. Statistical classifiers
      1. Bayesian Network
      2. Perceptron
      3. Maximum Entropy
      4. Support Vector Machines
      5. Hidden Variable Models
  7. Clustering
    1. K-means
    2. Factored Models
      1. Singular Value Decomposition
      2. Latent Semantic Indexing
  8. Tagging
    1. Part of Speech
    2. Named Entity
    3. Super Senses
  9. Sentence Structure
    1. Constituency Parsing
    2. Dependency Parsing
  10. Semantic Analysis
    1. Semantic Role Labeling
    2. Coreference resolution
  11. Statistical Machine Translation
    1. Word-Based Models
    2. Phrase-Based Models
    3. Decoding
    4. Syntax-Based SMT
    5. Evaluation metrics
  12. Processing Pipelines
    1. Integrated toolkit
    2. Frameworks
      1. GATE
      2. UIMA
    3. Data Pipeline
      1. Tanl
  13. Applications
    1. Information Extraction
    2. Information Filtering
    3. Recommender System
    4. Opinion Mining
    5. Semantic Search
    6. Question Answering
      1. Text Entailment

Lecture Notes

Date Lecture Notes
8/03/2011 Introduction
11/3/2011 Introduction to probability (slides)
15/03/2011 Python Tutorial (slides) Homework 1, NPS chat collection
18/03/2011 Homework Solution
22/03/2011 Text Classification (slides)
25/03/2011 Naive Bayes Classifier
29/03/2011 Introduction to NLTK (slides)
1/04/2011 Segmentation and Tokenization (slides)
5/04/2011 Hidden Markov Model (slides)
8/04/2011 Named Entity Recognition (slides)
12/04/2011 Maximum Entropy Models (slides)
29/04/2011 Perceptron, SVM
03/05/2011 Dependency Parsing (Dependency Formalism, Dependency Parsing)
13/05/2011 MEMM Homework 2
24/05/2011 Machine Translation (MT)
27/05/2011 Phrase Based Statistical Machine Translation (PBMT)
31/05/2011 Conditional Random Fields (D. Marcheggiani), Recommender Systems (I. Sansovino)
7/06/2011 Sentiment Analysis (G. Righetti)

Topics for Further Study

Reference Texts

  1. C. Manning, H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 2000.
  2. D. Jurafsky, J.H. Martin, Speech and Language Processing. 2nd edition, Prentice-Hall, 2008.
  3. S. Kübler, R. McDonald, J. Nivre. Dependency Parsing. 2010.
  4. P. Koehn. Statistical Machine Translation. Cambridge University Press, 2010.
  5. S. Bird, E. Klein, E. Loper. Natural Language Processing with Python.

Exam Format

Project and oral exam.

Related Courses

  1. Machine Learning: Fundamentals (Apprendimento Automatico: Fondamenti)
  2. Data Mining: Fundamentals (Data Mining: fondamenti)
  3. Information Retrieval
  4. Knowledge-Based Systems (Sistemi Basati sulla Conoscenza)
Last modified: 08/06/2011 at 16:17 by Giuseppe Attardi