Thesis (Selection of subject) (version: 385)
Thesis details
Morfosyntaktická anotace torwalštiny
Thesis title in Czech: Morfosyntaktická anotace torwalštiny
Thesis title in English: Morphosyntactic Annotation of Torwali
Key words: treebank|universal dependencies|low-resource languages
English key words: treebank|universal dependencies|low-resource languages
Academic year of topic announcement: 2023/2024
Thesis type: diploma thesis
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: doc. RNDr. Daniel Zeman, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 20.05.2024
Date of assignment: 20.05.2024
Confirmed by Study dept. on: 21.05.2024
Guidelines
The goal of the thesis is to create a small dependency treebank of Torwali, an Indo-Aryan language with very few pre-existing digital resources. The treebank will follow the Universal Dependencies standard; however, some language-specific guidelines for Torwali will have to be defined. For the morphological part of the annotation, a finite-state morphological analyzer and lexicon will be prepared. Annotation of the sample can be supported by established techniques for low-resource languages, such as parser bootstrapping and multilingual or cross-lingual parsing.
References
Marie-Catherine de Marneffe, Christopher Manning, Joakim Nivre, Daniel Zeman (2021): Universal Dependencies. In: Computational Linguistics, ISSN 1530-9312, vol. 47, no. 2, pp. 255-308

Kenneth R. Beesley, Lauri Karttunen (2003): Finite State Morphology. CSLI Publications

Daniel Zeman, Philip Resnik (2008): Cross-Language Parser Adaptation between Related Languages. In: IJCNLP 2008 Workshop on NLP for Less Privileged Languages, pp. 35-42, International Institute of Information Technology, Hyderabad, India

Željko Agić, Dirk Hovy, Anders Søgaard (2015): If all you have is a bit of the Bible: Learning POS taggers for truly low-resource languages. In: The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015)

Universal Dependencies v2 guidelines (2014-2018): http://universaldependencies.org/
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html