Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)

The course is devoted to large neural language models. It covers the related theoretical concepts, the technical foundations of how language models operate, and their use.

The aim is to introduce how large neural language models work: their basic typology, how they are trained, how they are used, their application potential, and their societal implications. Graduates of the course should be able to use large language models to solve tasks and to make informed judgements about the risks associated with the use of this technology.

Active participation, final test.

VASWANI, Ashish, et al. Attention is all you need. Advances in Neural
Information Processing Systems, 2017, 30.

DEVLIN, Jacob; CHANG, Ming-Wei; LEE, Kenton; TOUTANOVA, Kristina.
BERT: Pre-training of Deep Bidirectional Transformers for Language
Understanding. In: Proceedings of NAACL-HLT. 2019. p. 4171-4186.

RADFORD, Alec, et al. Language models are unsupervised multitask
learners. OpenAI blog, 2019, 1.8: 9.

BROWN, Tom, et al. Language models are few-shot learners. Advances in
Neural Information Processing Systems, 2020, 33: 1877-1901.

RAFFEL, Colin, et al. Exploring the limits of transfer learning with a
unified text-to-text transformer. The Journal of Machine Learning
Research, 2020, 21.1: 5485-5551.

ROGERS, Anna; KOVALEVA, Olga; RUMSHISKY, Anna. A primer in BERTology:
What we know about how BERT works. Transactions of the Association for
Computational Linguistics, 2021, 8: 842-866.

CONNEAU, Alexis, et al. Unsupervised Cross-lingual Representation
Learning at Scale. In: Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics. 2020. p. 8440-8451.

XUE, Linting, et al. mT5: A Massively Multilingual Pre-trained
Text-to-Text Transformer. In: Proceedings of the 2021 Conference of the
North American Chapter of the Association for Computational Linguistics:
Human Language Technologies. 2021. p. 483-498.

RADFORD, Alec, et al. Learning transferable visual models from natural
language supervision. In: International Conference on Machine Learning.
PMLR, 2021. p. 8748-8763.

OUYANG, Long, et al. Training language models to follow instructions
with human feedback. Advances in Neural Information Processing Systems,
2022, 35: 27730-27744.

TOUVRON, Hugo, et al. Llama: Open and efficient foundation language
models. arXiv preprint arXiv:2302.13971, 2023.

Last update: Mgr. Jindřich Libovický, Ph.D. (29.04.2024)

Basics of neural networks for language modeling
Language model typology [2]
Data acquisition and curation, downstream tasks
Training (self-supervised learning, reinforcement learning from human feedback)
Fine-tuning and inference
Multilinguality and cross-lingual transfer
Applications of large language models (e.g., conversational systems, robotics, code generation) [2-3]
Multimodality (CLIP, diffusion models)
Societal impacts
Interpretability