ECTS credits: 6
ECTS hours (rules/reports): Tutorials: 1, Expository classes: 21, Interactive classes: 21, Total: 43
Language of instruction: English
Type: Ordinary subject Master’s Degree RD 1393/2007 - 822/2021
Departments: Electronics and Computing, External department linked to the degrees
Areas: Computer Science and Artificial Intelligence, External area of the Master's Degree in Artificial Intelligence
Center: Higher Technical Engineering School
Call: First Semester
Teaching: With teaching
Enrolment: Enrollable | 1st year (Yes)
The course introduces the basic concepts and techniques of natural language processing, the starting point for designing environments for information exploitation and dialogue based on human language, at the lexical, syntactic, semantic and pragmatic levels.
The objective is to introduce the student to the complexity inherent in the analysis of human natural language, which stems fundamentally from its ambiguity and contextual dependencies, and to the design of data structures and algorithms that make its practical treatment possible.
Analysis levels. Ambiguity and contextual dependencies.
Lexical analysis: segmentation, dictionaries and thesauri, part-of-speech tagging.
Syntactic parsing: algebraic (context-free) grammars, mildly context-sensitive grammars, dependency grammars, probabilistic grammars.
Semantic analysis: lexical semantics, semantic dependencies and semantic graphs.
Basic bibliography:
Manning, C., & Schütze, H. (1999). Foundations of statistical natural language processing. MIT Press
Goldberg, Y. (2017). Neural network methods for natural language processing. Synthesis lectures on human language technologies. Morgan Claypool
Eisenstein, J. (2019). Introduction to Natural Language Processing. MIT Press
Jurafsky, D., & Martin, J. H. (2022). Speech and Language Processing (3rd ed. draft). Available at: https://web.stanford.edu/~jurafsky/slp3/
Complementary:
Chollet, F. (2018). Keras: The python deep learning library. Astrophysics Source Code Library
Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach, 4th Edition. Pearson
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press
Kübler, S., McDonald, R., & Nivre, J. (2009). Dependency Parsing. Synthesis lectures on human language technologies. Morgan Claypool
Additionally, scientific texts available in digital libraries in the research field, such as ACL Anthology or ACM.
Basic and general skills:
CG1 - Maintain and extend grounded theoretical approaches to allow the introduction and exploitation of new and advanced technologies in the field of Artificial Intelligence.
CG3 - Search and select the useful information necessary to solve complex problems, handling the bibliographic sources of the field with ease.
CG4 - Prepare adequately and with some originality written compositions or motivated arguments, write plans, work projects, scientific articles and formulate reasonable hypotheses in the field.
CB6 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.
CB7 - Be able to apply the knowledge acquired, and the ability to solve problems, in new or little-known environments within broader (or multidisciplinary) contexts related to the area of study.
CB10 - Have the learning skills needed to continue studying in a largely self-directed or autonomous way.
Transversal skills:
CT2 - Master the oral and written expression and comprehension of a foreign language.
CT3 - Use the basic tools of information and communication technologies (ICT) necessary for the exercise of their profession and for lifelong learning.
CT7 - Develop the ability to work in interdisciplinary teams, to offer proposals that contribute to sustainable environmental, economic, political and social development.
CT8 - Assess the importance of research, innovation and technological development in the socioeconomic and cultural progress of society.
Specific skills:
CE1.- Understanding and command of techniques for text processing in natural language.
CE2.- Understanding and command of the fundamentals and semantic processing techniques of linked, structured and unstructured documents, and of the representation of their content.
CE3.- Understanding and knowledge of knowledge representation techniques and reasoning through ontologies, knowledge graphs and data models (such as RDF), as well as the tools associated with them.
Learning outcomes:
− Know, understand and analyze the formal representation of various lexical, syntactic and semantic phenomena of natural language
− Know, understand and know how to use the technologies, frameworks and libraries for the construction of natural language processing systems
− Design, implement and know how to use algorithms and data structures to process and support the various phenomena characteristic of natural language
− Know, understand and analyze natural language processing techniques for processing and disambiguation at a lexical, syntactic and semantic level.
− Know and understand the problems posed by ambiguity and inaccuracy in natural language data sources and techniques to solve them.
− Know how to use the techniques and methods of natural language processing to solve real problems of analysis of texts in natural language.
− Know, understand and analyze techniques based on ontologies applied to natural language processing
− Know, understand and analyze deep learning techniques applied to natural language processing
− Know how to use deep learning techniques and methods to solve practical natural language processing problems
− Know and understand the environmental problems posed by the computational cost of deep learning techniques when applied to text analysis.
− Know, understand and analyze current search and mining techniques on the web.
− Know, understand and analyze the current techniques of semantic technologies.
− Know how to use the techniques and methods of representing knowledge and reasoning through ontologies and knowledge graphs to solve real problems.
− Know techniques, methods and good practices for the representation and publication of data and their subsequent consultation, using semantic technologies.
− Design, implement and know how to use algorithms and data structures for recommendation systems.
− Know how to apply different models of information retrieval and extraction, sentiment analysis and other possible applications of text mining.
Lectures, laboratory practices, tutorials, autonomous work, case studies, project-based learning.
These methods are combined as follows:
Lectures, in which the content of each topic is presented and discussed. Students will have copies of the slides in advance, and the lecturer will encourage active participation by asking questions that clarify specific aspects and by leaving questions open for the students' reflection.
Reading and study of diverse material provided by the lecturer, in the form of books, articles and scientific journals.
Practical classes with computers, which will allow students to familiarize themselves from a practical and hands-on point of view with the issues raised in the lectures.
E1: Final exam 50%
E2: Evaluation of practical works 50%
Students must achieve at least 40% of the maximum grade for each part (E1, E2) and in any case the sum of both parts must exceed 5 to pass the subject. If any of the above requirements is not met, the grade will be established according to the lowest grade obtained.
A student who does not reach the minimum in one of the parts will have a second opportunity, in which only that part must be submitted.
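The grading rule for E1 and E2 can be illustrated with a short sketch. It is not an official calculator and rests on several assumptions that the guide does not state explicitly: each part is graded on a 0-10 scale, the "sum of both parts" is the 50/50 weighted total, and a total of exactly 5.0 counts as a pass. The function name `final_grade` and the constants are hypothetical.

```python
def final_grade(exam: float, practicals: float) -> tuple[float, bool]:
    """Return (grade, passed) for the E1/E2 rule, under the stated assumptions."""
    MIN_PART = 4.0   # assumption: 40% of a maximum of 10 points per part
    PASS_MARK = 5.0  # assumption: a weighted total of 5.0 or more passes

    # E1 (final exam) and E2 (practical work) each weigh 50%.
    total = 0.5 * exam + 0.5 * practicals

    passed = exam >= MIN_PART and practicals >= MIN_PART and total >= PASS_MARK

    # If any requirement fails, the grade is the lowest of the two parts.
    grade = total if passed else min(exam, practicals)
    return grade, passed


print(final_grade(7.0, 6.0))  # both minimums met, weighted total applies
print(final_grade(3.0, 9.0))  # exam below the 40% minimum, lowest part applies
```

Under these assumptions, a student with 3.0 in the exam and 9.0 in the practical work does not pass even though the weighted total is 6.0, because the exam is below the 40% minimum.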
The practices must be handed in within the period established in the virtual campus and must follow the specifications indicated in the rules for each exercise in terms of their presentation and defense.
Students who submit all the mandatory practices or sit the objective test in the official evaluation period will have the status of "Presented".
In the case of fraudulent completion of exercises or tests, the Regulations for evaluating the academic performance of students and reviewing qualifications will be applied. Under the corresponding regulations on plagiarism, the total or partial copying of any practice or theory exercise will result in a failing grade in both opportunities of the course, with a grade of 0.0 in both cases.
Study time and personal work comprises a total of 150 hours, divided into the following training activities:
A1: Theory classes, 21 h of in-person lectures + 42 h of self-study.
A2: Practical laboratory classes, 14 h of in-person interactive sessions + 62 h of autonomous work.
A3: Problem-based learning, seminars, case studies and projects, 7 h in person + 46 h of autonomous work.
Recommended prerequisites: Basic knowledge of Automata Theory and Formal Languages.
The virtual campus is used.
Alejandro Catala Bolos (Coordinator)
- Department: Electronics and Computing
- Area: Computer Science and Artificial Intelligence
- alejandro.catala [at] usc.es
- Category: Professor: LOU (Organic Law for Universities) PhD Assistant Professor
Nikolay Babakov
- Department: Electronics and Computing
- Area: Computer Science and Artificial Intelligence
- nikolay.babakov [at] usc.es
- Category: Marie Curie predoctoral researcher
Wednesday | | | |
---|---|---|---|
17:00-18:30 | Group /CLE_01 | English | IA.02 |
18:30-20:00 | Group /CLIL_01 | English | IA.02 |

01.20.2025 16:00-20:00 | Group /CLE_01 | IA.02 |
01.20.2025 16:00-20:00 | Group /CLIL_01 | IA.02 |
06.20.2025 16:00-20:00 | Group /CLIL_01 | IA.02 |
06.20.2025 16:00-20:00 | Group /CLE_01 | IA.02 |