Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)
Annotation:
The course introduces the operation of large neural language models: their basic typology, how they are trained, how they are used, their application potential, and their societal implications. Graduates of the course should be able to use large-scale language models in problem solving and to make informed judgements about the risks associated with the use of this technology.
Course completion requirements: active participation, final test
Literature:
VASWANI, Ashish, et al. Attention is all you need. Advances in neural information processing systems, 2017, 30.
DEVLIN, Jacob; CHANG, Ming-Wei; LEE, Kenton; TOUTANOVA, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of NAACL-HLT. 2019. p. 4171-4186.
RADFORD, Alec, et al. Language models are unsupervised multitask learners. OpenAI blog, 2019, 1.8: 9.
BROWN, Tom, et al. Language models are few-shot learners. Advances in neural information processing systems, 2020, 33: 1877-1901.
RAFFEL, Colin, et al. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 2020, 21.1: 5485-5551.
ROGERS, Anna; KOVALEVA, Olga; RUMSHISKY, Anna. A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 2020, 8: 842-866.
CONNEAU, Alexis, et al. Unsupervised Cross-lingual Representation Learning at Scale. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020. p. 8440-8451.
XUE, Linting, et al. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021. p. 483-498.
RADFORD, Alec, et al. Learning transferable visual models from natural language supervision. In: International conference on machine learning. PMLR, 2021. p. 8748-8763.
OUYANG, Long, et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 2022, 35: 27730-27744.
TOUVRON, Hugo, et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
Syllabus:
- Language model typology [2]
- Data acquisition and curation, downstream tasks
- Training (self-supervised learning, reinforcement learning with human feedback)
- Finetuning & inference
- Multilinguality and cross-lingual transfer
- Large language model applications (e.g., conversational systems, robotics, code generation) [2-3]
- Multimodality (CLIP, diffusion models)
- Societal impacts
- Interpretability