Thesis (Selection of subject)
Thesis details
Tackling misinformation in Serbo-Croatian: corpora and experiments
Thesis title in Czech: Řešení dezinformací v srbochorvatštině: korpusy a experimenty
Thesis title in English: Tackling misinformation in Serbo-Croatian: corpora and experiments
Key words: NLP|dezinformace|srbochorvatština|korpus|klasifikace
English key words: NLP|misinformation|fake news|Serbo-Croatian|corpora|classification
Academic year of topic announcement: 2022/2023
Thesis type: diploma thesis
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: RNDr. Jiří Hana, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 29.03.2023
Date of assignment: 30.03.2023
Confirmed by Study dept. on: 08.03.2024
Guidelines
Explore the area of misinformation in the news written in Serbo-Croatian (Serbian, Croatian, Bosnian, Montenegrin; closely related South Slavic languages).

- Create a news corpus with metadata describing whether each article is trustworthy and, if not, in which respect
- Evaluate the possibilities of automatic processing, for example:
  - Classification of news articles: binary (truthful vs. misinformation) or multi-label (fake news, pseudoscience, conspiracy theory, etc.)
  - Claim detection: extraction of claims from articles
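
As a rough illustration of the corpus metadata and the two classification setups mentioned above, the following minimal Python sketch shows one possible record layout. The field names, the label inventory, and the JSON Lines storage format are all assumptions made for this example; the guidelines do not prescribe any of them.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical label inventory, based only on the examples named in the guidelines.
MISINFORMATION_LABELS = {"fake_news", "pseudoscience", "conspiracy_theory"}
LABEL_ORDER = sorted(MISINFORMATION_LABELS)  # fixed order for indicator vectors


@dataclass
class Article:
    """One corpus entry; every field name here is an illustrative assumption."""
    url: str
    language: str  # e.g. "sr" or "hr"
    text: str
    labels: List[str] = field(default_factory=list)  # empty list = trustworthy

    def __post_init__(self):
        # Reject labels outside the assumed inventory.
        unknown = set(self.labels) - MISINFORMATION_LABELS
        if unknown:
            raise ValueError(f"unknown labels: {sorted(unknown)}")

    @property
    def is_trustworthy(self) -> bool:
        return not self.labels


def to_jsonl_line(article: Article) -> str:
    """Serialize one article as a single JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(article), ensure_ascii=False)


def from_jsonl_line(line: str) -> Article:
    """Parse one JSON line back into an Article."""
    return Article(**json.loads(line))


def binary_target(article: Article) -> int:
    """Binary setup: 0 = truthful, 1 = misinformation of any kind."""
    return 0 if article.is_trustworthy else 1


def multilabel_target(article: Article) -> List[int]:
    """Multi-label setup: one 0/1 indicator per label, in LABEL_ORDER."""
    return [1 if label in article.labels else 0 for label in LABEL_ORDER]
```

Storing the "in which respect" information as a list of labels lets the same corpus serve both setups: the binary target is derived from whether the list is empty, while the multi-label target uses the full list.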
References
- Max Glockner, Yufang Hou, and Iryna Gurevych. 2022. Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5916–5936, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Zhijiang Guo, Michael Schlichtkrull, and Andreas Vlachos. 2022. A Survey on Automated Fact-Checking. Transactions of the Association for Computational Linguistics, 10:178–206.
- Isabelle Augenstein. 2021. Towards Explainable Fact Checking. ArXiv, abs/2108.10274.
- Nikola Ljubešić and Davor Lauc. 2021. BERTić - The Transformer Language Model for Bosnian, Croatian, Montenegrin and Serbian. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, pages 37–42, Kyiv, Ukraine. Association for Computational Linguistics.
- James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal. 2018. FEVER: a Large-scale Dataset for Fact Extraction and VERification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 809–819, New Orleans, Louisiana. Association for Computational Linguistics.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html