Credits
5
Types
Compulsory
Requirements
This subject has no formal requirements, but some prior knowledge is expected.
Department
CS
Web
www.cs.upc.edu/~turmo/ihlt/plan32js/IHLT.html
IHLT provides the basic NLP knowledge needed to take AHLT and HLE. While AHLT covers statistical NLP techniques in depth, HLE reviews the state of the art in real applications that involve NLP technology.
Teachers
Person in charge
- Jordi Turmo Borrás ( turmo@cs.upc.edu )
Others
- Salvador Medina Herrera ( salvador.medina.herrera@upc.edu )
Weekly hours
Theory
2
Problems
0
Laboratory
1
Guided learning
0
Autonomous learning
5.93
Competences
Generic
Academic
Professional
Teamwork
Information literacy
Reasoning
Objectives
- Understand the fundamental concepts of Natural Language Processing, the best-known techniques and theories, and the most relevant existing resources.
Related competences: CEA5, CG1, CG3, CEP6, CT4, CT6
- Understand the most relevant applications of NLP and the theories, techniques and resources they use.
Related competences: CEA5, CG1, CG3, CEP6, CT4, CT6
- Design and develop programs that solve specific problems in the NLP context, selecting the most appropriate techniques and making use of existing resources. One larger program will be developed in groups of two students.
Related competences: CEA5, CG1, CG3, CEP4, CEP6, CEP7, CT3, CT4, CT6
- Reason (occasionally, in groups) about problems in the NLP context that require considering different techniques and resources.
Related competences: CEA5, CG1, CG3, CEP7, CT3, CT4, CT6
Contents
- Document Structure and Language
Text selection, Tokenization, Sentence splitting, Language Identifiers.
- Words
Morphology, Finite-State Automata, Finite-State Transducers.
PoS tagging, Hidden Markov Models.
Lexical semantics, Semantic resources.
Word Sense Disambiguation.
- Word sequences
Recognition and classification of word sequences with meaning.
BIO discriminative models. Conditional Random Fields (CRF).
Named Entity Recognition and Classification (NERC).
Noun-phrase Chunking.
- Sentences
Syntactic grammars, typology. Context-free grammars. Probabilistic context-free grammars. Chomsky normal form grammars.
Syntactic parsers, properties and strategies. CKY and probabilistic CKY parsers.
- Sentence sequences
Coreference resolution. Mention detection. Types of techniques for generating coreference chains. Mention-pair model. Entity-mention model. Ranking models.
Activities
Activity Evaluation act
Project presentation
Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
40h
Final exam
Week: 15 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
Teaching methodology
There are two types of sessions: theory/exercise and laboratory. Each theory/exercise session introduces new concepts, the challenges they present and the approaches used to address them. We will also solve exercises to consolidate the concepts, techniques and algorithms introduced in the session.
In the laboratory sessions, students will carry out short practical exercises with the appropriate NLP tools to practise and reinforce what was covered in the theory classes.
Evaluation methodology
There will be a single exam at the end of the course, one project, and one deliverable for each lab session. The exam will cover all the course contents. The marks for the project and the deliverables will be based on the documents submitted by the students.
The final mark of the course will be calculated as follows:
Course mark = final exam mark * 0.5 + lab mark * 0.5
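As a quick illustration of the weighting, the formula can be sketched in Python. The function name and the sample marks below are hypothetical. How the project and the lab deliverables combine into the lab mark is not specified here, so the lab mark is taken as a single input.

```python
def course_mark(final_exam_mark: float, lab_mark: float) -> float:
    """Weighted course mark: 50% final exam, 50% lab (as in the formula above)."""
    return final_exam_mark * 0.5 + lab_mark * 0.5


# Hypothetical marks, for illustration only.
print(course_mark(6.0, 8.0))  # prints 7.0
```

Because both weights are 0.5, the course mark is simply the average of the exam mark and the lab mark.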
Bibliography
Basic
- Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition
Jurafsky, D.; Martin, J.H.
Prentice-Hall, Inc., 2024.
https://web.stanford.edu/~jurafsky/slp3/
- The Oxford handbook of computational linguistics
Mitkov, R. (ed.)
Oxford University Press, 2003.
ISBN: 0198238827
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002689009706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
- Foundations of statistical natural language processing
Manning, C.D.; Schütze, H.
MIT Press, 1999.
ISBN: 0262133601
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001994779706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
- Handbook of natural language processing
Dale, R.; Moisl, H.; Somers, H.
Marcel Dekker, 2000.
ISBN: 0824790006
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002071619706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
- The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics)
Clark, Alexander; Fox, Chris; Lappin, Shalom
Wiley-Blackwell, 2010.
ISBN: 9781444324044
https://onlinelibrary-wiley-com.recursos.biblioteca.upc.edu/doi/book/10.1002/9781444324044
Web links
- Course timetable, adjusted for holidays: http://www.cs.upc.edu/~turmo/IHLT.html