
This is an old revision of the document!


Natural Language Processing (Elaborazione del Linguaggio Naturale)

Master's degree (Laurea Magistrale): Computer Science.

Instructor: Giuseppe Attardi. Office hours: Wednesday, 11:00

Teaching assistant: Stefano Dei Rossi

Schedule
Day      Hour   Room
Monday   14-16  L1, Polo Fibonacci
Tuesday  11-13  I1, Polo Fibonacci

Prerequisites

  1. Probability and statistics
  2. Programming

Syllabus

  1. Introduction
    1. History
    2. Present and Future
    3. NLP and the Web
  2. Mathematical Background
    1. Probability and Statistics
    2. Language Model
    3. Hidden Markov Model
    4. Viterbi Algorithm
    5. Generative vs Discriminative Models
  3. Linguistic Essentials
    1. Part of Speech and Morphology
    2. Phrase structure
    3. Collocations
    4. n-gram Models
    5. Word Sense Disambiguation
  4. Preprocessing
    1. Encoding
    2. Regular Expressions
    3. Segmentation
    4. Tokenization
    5. Normalization
  5. NLTK
    1. Introduction to Python
    2. Overview of NLTK libraries
  6. Classification
    1. Machine Learning
    2. Statistical classifiers
      1. Bayesian Network
      2. Perceptron
      3. Maximum Entropy
      4. Support Vector Machines
      5. Hidden Variable Models
  7. Clustering
    1. K-means
    2. Factored Models
      1. Singular Value Decomposition
      2. Latent Semantic Indexing
  8. Tagging
    1. Part of Speech
    2. Named Entity
    3. Super Senses
  9. Sentence Structure
    1. Constituency Parsing
    2. Dependency Parsing
  10. Semantic Analysis
    1. Semantic Role Labeling
    2. Coreference resolution
  11. Statistical Machine Translation
    1. Word-Based Models
    2. Phrase-Based Models
    3. Decoding
    4. Syntax-Based SMT
    5. Evaluation metrics
  12. Processing Pipelines
    1. Integrated toolkit
    2. Frameworks
      1. GATE
      2. UIMA
    3. Data Pipeline
      1. Tanl
  13. Applications
    1. Information Extraction
    2. Information Filtering
    3. Recommender System
    4. Opinion Mining
    5. Semantic Search
    6. Question Answering
      1. Text Entailment

Lecture Notes

Date Lecture Notes
20/2/2012 Introduction
21/2/2012 Introduction to probability (slides)
27/2/2012 Python Tutorial (slides) Python: Functionals and Generators
28/2/2012 Text Classification (slides)
5/3/2012 Naive Bayes Classifier
Introduction to NLTK (slides)
6/3/2012 Segmentation and Tokenization (slides) Homework 1
27/3/2012 Maximum Entropy Models (slides) Homework 2
Hidden Markov Model (slides)
Named Entity Recognition (slides)
Perceptron, SVM
17/4/2012 Dependency Formalism (slides)
23/4/2012 Dependency Parsing (Graph Based, Transition Based)
MEMM
Machine Translation (MT)
Phrase Based Statistical Machine Translation (PBMT)
Conditional Random Fields
Recommender Systems
Sentiment Analysis

Topics for Further Study

Reference Texts

  1. C. Manning, H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 2000.
  2. D. Jurafsky, J.H. Martin. Speech and Language Processing. 2nd edition, Prentice-Hall, 2008.
  3. S. Kübler, R. McDonald, J. Nivre. Dependency Parsing. 2010.
  4. P. Koehn. Statistical Machine Translation. Cambridge University Press, 2010.
  5. S. Bird, E. Klein, E. Loper. Natural Language Processing with Python.

Exam Format

Project and oral exam.

Related Courses

  1. Machine Learning: Fundamentals
  2. Data Mining: Fundamentals
  3. Information Retrieval
  4. Knowledge-Based Systems

Previous Editions

magistraleinformatica/eln/start.1335197471.txt.gz · Last modified: 23/04/2012 at 16:11 by Giuseppe Attardi