Course, academic year 2023/2024
Language Technologies in Practice - NPFL128
Title: Jazykové technologie v praxi
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Actual: from 2021
Semester: summer
E-Credits: 4
Hours per week, examination: summer s.:2/1, MC [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Additional information: https://ufal.mff.cuni.cz/courses/npfl128
Guarantor: RNDr. Jiří Hana, Ph.D.
Incompatibility: NPFL096
Interchangeability: NPFL096
Is incompatible with: NPFL096
Is interchangeable with: NPFL096
Annotation -
Last update: doc. Mgr. Barbora Vidová Hladká, Ph.D. (31.01.2019)
The course surveys solutions to common NLP tasks ranging from entity recognition to text generation. It evaluates various approaches (machine learning, rules, larger resources, ...) and their combinations. Part of the course consists of students presenting and discussing papers relevant to a given topic. Each student implements a prototype system solving a particular task.
Course completion requirements -
Last update: RNDr. Jiří Hana, Ph.D. (10.06.2019)
  • leading discussion on selected papers (max 2 papers per person)
  • programming project

Literature -
Last update: doc. Mgr. Barbora Vidová Hladká, Ph.D. (31.01.2019)
  • Koskenniemi, Kimmo. 1983. Two-level Morphology: A General Computational Model for Word-Form Recognition and Production. University of Helsinki, Department of General Linguistics.
  • Goldsmith, John. 2001. Unsupervised Acquisition of the Morphology of a Natural Language.
  • Yarowsky, David and Richard Wicentowski. 2000. Minimally supervised morphological analysis by multimodal alignment. Proceedings of ACL-2000, Hong Kong, pages 207-216.
  • Schone, Patrick and Daniel Jurafsky. 2001. Knowledge-Free Induction of Inflectional Morphologies. Proceedings of the North American Chapter of the Association for Computational Linguistics.
  • Cucerzan, Silviu. 2007. Large-Scale Named Entity Disambiguation Based on Wikipedia Data.
  • Daiber, Joachim, Max Jakob, Chris Hokamp and Pablo N. Mendes. 2013. Improving Efficiency and Accuracy in Multilingual Entity Extraction. Proceedings of the 9th International Conference on Semantic Systems (I-Semantics).
  • Surdeanu, Mihai, David McClosky, Mason R. Smith, Andrey Gusev, and Christopher D. Manning. 2011. Customizing an Information Extraction System to a New Domain. In Proceedings of the ACL 2011 Workshop on Relational Models of Semantics.
  • Reiter, Ehud and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press.

Syllabus -
Last update: doc. Mgr. Barbora Vidová Hladká, Ph.D. (31.01.2019)
  • processing morphology
    • engineering approach to morphology, lemmatization
    • unsupervised and lightly-supervised morphology
    • Linguistica, Yarowsky & Wicentowski 2000, Schone & Jurafsky 2001, Morfessor
  • sentiment analysis
  • entities
    • named, unnamed and structured entities
    • recognition, normalization, standardization
    • linking, knowledge graphs
  • intent detection
  • relation extraction
  • Natural Language Generation (NLG)
    • generation of documents vs. short texts/phrases
    • classical NLG vs neural NLG
    • document planning, microplanning, lexicalization, realization