Course, academic year 2023/2024
Large Language Models - NPFL140
Title: Velké jazykové modely (Large Language Models)
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Actual: from 2023
Semester: summer
E-Credits: 3
Hours per week, examination: summer s.:0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Guarantor: Mgr. Jindřich Helcl, Ph.D.
Mgr. Jindřich Libovický, Ph.D.
Annotation -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)
The course is devoted to large neural language models. It covers the related theoretical concepts, the technical foundations of their operation, and the use of language models.
Aim of the course -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)

Introduce the operation of large neural language models: their basic typology, how they are trained, how they are used, their application potential, and their societal implications. Graduates of the course should be able to use large-scale language models in problem solving and make informed judgements about the risks associated with the use of this technology.
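
As a concrete illustration of what "using a large-scale language model" can mean in practice, the following minimal sketch generates text with a pretrained causal language model. It is not part of the official course materials; it assumes the Hugging Face transformers library and the publicly available gpt2 checkpoint, and the course itself may work with different models and tools.

# Minimal sketch (illustrative only, not official course code):
# greedy text generation with a pretrained causal language model,
# assuming the Hugging Face `transformers` library and the public
# `gpt2` checkpoint are installed and available.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Encode a prompt and let the model continue it token by token.
inputs = tokenizer("Large language models are", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))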

Course completion requirements -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)

Active participation and a final test.

Literature -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)
VASWANI, Ashish, et al. Attention is all you need. Advances in neural
information processing systems, 2017, 30.

DEVLIN, Jacob; CHANG, Ming-Wei; LEE, Kenton; TOUTANOVA, Kristina. BERT:
Pre-training of Deep Bidirectional Transformers for Language
Understanding. In: Proceedings of NAACL-HLT. 2019. p. 4171-4186.

RADFORD, Alec, et al. Language models are unsupervised multitask
learners. OpenAI blog, 2019, 1.8: 9.

BROWN, Tom, et al. Language models are few-shot learners. Advances in
neural information processing systems, 2020, 33: 1877-1901.

RAFFEL, Colin, et al. Exploring the limits of transfer learning with a
unified text-to-text transformer. The Journal of Machine Learning
Research, 2020, 21.1: 5485-5551.

ROGERS, Anna; KOVALEVA, Olga; RUMSHISKY, Anna. A primer in BERTology:
What we know about how BERT works. Transactions of the Association for
Computational Linguistics, 2020, 8: 842-866.

CONNEAU, Alexis, et al. Unsupervised Cross-lingual Representation
Learning at Scale. In: Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics. 2020. p. 8440-8451.

XUE, Linting, et al. mT5: A Massively Multilingual Pre-trained
Text-to-Text Transformer. In: Proceedings of the 2021 Conference of the
North American Chapter of the Association for Computational Linguistics:
Human Language Technologies. 2021. p. 483-498.

RADFORD, Alec, et al. Learning transferable visual models from natural
language supervision. In: International conference on machine learning.
PMLR, 2021. p. 8748-8763.

OUYANG, Long, et al. Training language models to follow instructions
with human feedback. Advances in Neural Information Processing Systems,
2022, 35: 27730-27744.

TOUVRON, Hugo, et al. Llama: Open and efficient foundation language
models. arXiv preprint arXiv:2302.13971, 2023.

Syllabus -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2023)
Basics of neural networks for language modeling
Language model typology [2]
Data acquisition and curation, downstream tasks
Training (self-supervised learning, reinforcement learning from human feedback); a sketch of the self-supervised objective follows this list
Finetuning & Inference
Multilinguality and cross-lingual transfer
Large Language Model Applications (e.g., conversational systems, robotics, code generation) [2-3]
Multimodality (CLIP, diffusion models)
Societal impacts
Interpretability
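
As a rough illustration of the self-supervised objective mentioned in the syllabus (again, not official course code), the sketch below computes a next-token prediction loss with a toy embedding-plus-projection model in PyTorch; the model, its sizes, and the random token batch are placeholders, and a real large language model would place Transformer layers between the embedding and the output projection.

# Toy sketch (illustrative only): the self-supervised next-token
# prediction loss used to pre-train causal language models. All sizes
# and the random "corpus" below are placeholders, not course code.
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
embed = nn.Embedding(vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size)

# A toy batch of token ids; real training uses tokens from large text corpora.
tokens = torch.randint(0, vocab_size, (4, 16))

# Predict token t+1 from token t; a real LLM inserts Transformer layers here.
logits = lm_head(embed(tokens[:, :-1]))   # (batch, seq-1, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size),       # flatten to (N, vocab)
    tokens[:, 1:].reshape(-1),            # shifted targets, shape (N,)
)
loss.backward()
print(float(loss))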

 