Credits: 6
Type: Elective
Requirements: This subject has no formal requirements, but it does assume prior skills.
Department: CS
Weekly hours
Theory: 3
Problems: 0
Laboratory: 0
Guided learning: 0.6
Autonomous learning: 6.4
Objectives
1. Justify the appropriateness of specific statistical techniques for specific NLP tasks.
Related competences: CG1, CG3, CB6, CB7, CEC1, CEC2, CEC3, CTR4, CTR6
2. Evaluate the usefulness of statistical components for inclusion in NLP applications that carry out NLP tasks.
Related competences: CG3, CEC3
3. Search for and select statistical NLP resources and tools that can be used in NLP tasks and applications.
Related competences: CG3, CB7, CB9, CEC1, CEC2, CEC3, CTR4
4. Design and implement new NLP components, tune existing components, and integrate them into an NLP application.
Related competences: CG3, CB6, CB9, CEC2, CEC3, CTR4
Contents
1. Introduction & basics
   - NLP vs Computational Linguistics vs HLT
   - Knowledge-based vs Empirical methods
   - Resources
     - Lexical resources
     - Corpora
     - Grammars
     - Ontologies
2. Language Models
   - Basics
   - {word, class, phrase}-based models
   - Information content
     - Entropy
     - Mutual information
     - Joint and conditional entropy
     - Pointwise mutual information
     - Kullback-Leibler divergence (KL)
   - Application to NLP tasks
   - Noisy channel models
   - Alignment models
   - Application to NLP tasks
3. Finite State Models
   - Finite State Automata (FSA) and Regular grammars
   - Finite State Transducers (FST)
   - Finite State Probabilistic models
   - Application to NLP tasks
4. Log linear & Maximum Entropy Models
   - Classification problems: MLE vs MEM
   - Generative and conditional (discriminative) models
   - Markov Models (MM) and Hidden Markov Models (HMM)
   - Conditional Random Fields (CRF)
   - Building ME models
   - Maximum Entropy Markov Models (MEMM)
   - Applications to NLP
5. Models for parsing
   - Constituent parsing
   - Stochastic Context Free Grammars (SCFG)
   - Richer probabilistic models
   - Applications to NLP
   - Syntactic parsing
   - Semantic parsing
   - Dependency parsing
6. Supervised Machine Learning for NLP
   - Classification problems
   - Margin-based classifiers: Perceptron, SVM, AdaBoost
   - Kernel-based methods
7. Semi-supervised Learning
   - Bootstrapping
8. Unsupervised Learning (Clustering)
   - Similarity
   - Hierarchical clustering
   - Non-hierarchical clustering
   - Clustering evaluation
9. Using statistical techniques for NLP applications
   - Machine Translation (MT) in detail
   - Other NLP tasks (Part of Speech (POS) tagging, Named Entity Recognition and Classification (NERC), Mention detection & tracking, Coreference resolution, Text Alignment, Lexical Acquisition, Relation Extraction, Semantic Role Labeling (SRL), Word Sense Disambiguation (WSD)) and applications (Information Extraction (IE), Information Retrieval (IR), Question Answering (Q&A), Automatic Summarization, Sentiment Analysis, and Text Classification), only sketched
Activities
Introduction & basics
Introduction & basics: attending the theory class, homework discussion, and tutoring.
Objectives: 2
Contents:
Theory: 3h
Problems: 0h
Laboratory: 0h
Guided learning: 1h
Autonomous learning: 4h
Homework assignments
Students will complete the 5 homework assignments at home, although they can get advice from the teachers. Each assignment is due two weeks after it is set. The evaluation will include comments on each student's work.
Objectives: 4
Contents:
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 30h
Teaching methodology
The teaching methodology is as follows: each of the 9 topics is covered in one theory class (the most frequent case) or more. The material (slides, readings, etc.) is announced in advance.
Additionally, a set of homework assignments tied directly to the individual topics is set during the course (usually 5 assignments). Some can be solved by hand; others require writing a short program.
Evaluation methodology
The evaluation is based on two components:
1) The final exam
2) The grades of the 5 homework assignments
The final grade is a weighted combination of these two components.
Both components have the same weight (50% of the final grade each).
The five homework assignments have the same weight (20% of the homework component each).
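The weighting can be sketched as a short calculation. This is only an illustration: the function name `final_grade` and the 0-10 grade scale in the usage note are assumptions, not part of the official course rules.

```python
def final_grade(exam, homeworks):
    """Combine the exam grade and the homework grades with the stated weights.

    Hypothetical helper for illustration; all grades are assumed to be
    on the same scale (for example 0-10).
    """
    if len(homeworks) != 5:
        raise ValueError("expected exactly 5 homework grades")
    # Each homework counts for 20% of the homework component.
    homework_component = sum(0.20 * grade for grade in homeworks)
    # The exam and the homework component each count for 50%.
    return 0.50 * exam + 0.50 * homework_component
```

For example, under the assumed 0-10 scale, `final_grade(8.0, [10, 10, 10, 10, 10])` combines an exam grade of 8 with a homework component of 10. With these weights, each individual homework contributes 10% of the final grade.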
Bibliography
Basic
- Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition and Computational Linguistics
  Jurafsky, Daniel & Martin, James H.
  ISBN: 0131873210
  http://www.cs.colorado.edu/~martin/slp.html
- Handbook of natural language processing
  Somers, Harold L; Dale, Robert. Marcel Dekker, cop. 2000.
  ISBN: 0824790006
  http://cataleg.upc.edu/record=b1172244~S1*cat
- Foundations of statistical natural language processing
  Manning, Christopher D; Schütze, Hinrich. MIT Press, 1999.
  ISBN: 0262133601
  http://cataleg.upc.edu/record=b1165147~S1*cat
- The Oxford handbook of Computational Linguistics
  Mitkov, Ruslan. 2004.
  ISBN: 019927634X
  http://www.chatbots.org/book/the_oxford_handbook_of_computational_linguistics/