Last update: RNDr. Jiří Mírovský, Ph.D. (16.05.2022)
Credit is awarded for active participation in class, submission of all homework assignments, and earning more than 70% of the points available for those assignments.
Presentations from the past: http://ufal.mff.cuni.cz/courses/NPFL131
Learning Perl, 8th Edition (use at least the 5th Edition)
Pro Git
Learning the bash Shell
Linux Pocket Guide
Working with large texts, we will learn the basic text-processing methods needed to extract non-trivial information. For Czech we will use the works of Karel Čapek; for Classical Chinese, selected texts from https://github.com/kanripo; for other languages, works chosen according to the students' focus.
importance and statistical properties of Big Data
Unix shell: the most basic commands
more Unix commands and basic Perl for manipulating texts
text editors
quantitative analysis of text
comparing texts and visualizing differences
searching with regular expressions
using regular expressions to batch-edit text
diacritic removal, sentence segmentation, tokenization
getting information on Chinese characters from the Unihan database
rule-based automatic part-of-speech identification
creating your own corpus
"NLP workflow engines": GATE, OpenNLP, Treex
calling REST APIs
using UDPipe and selecting the appropriate model when more than one is available for the language
visualization of analysis and results
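
To give a feel for three of the topics above (diacritic removal, tokenization, and quantitative analysis of text), here is a minimal sketch. The course itself works with the Unix shell and Perl; Python is used here purely as an illustration. The function names and the sample sentence are hypothetical and not taken from the course materials. The tokenizer is deliberately naive (runs of word characters) and is not meant as a proper tokenizer for any particular language.

```python
import re
import unicodedata
from collections import Counter


def strip_diacritics(text):
    """Remove diacritics by decomposing characters (NFD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Naive tokenizer: lowercase the text and return runs of word characters."""
    # Compose to NFC first so accented letters are single code points matched by \w.
    composed = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", composed.lower())


def word_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


# Hypothetical sample sentence, used only for illustration.
sample = "Válka s mloky. Válka s mloky je román Karla Čapka."

if __name__ == "__main__":
    print(strip_diacritics(sample))
    for word, count in word_frequencies(sample).most_common(3):
        print(word, count)
```

On a real corpus the same counting step would typically be run over a whole file rather than a single string, and the resulting frequency list is the starting point for the quantitative comparisons mentioned in the syllabus.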