Fine-tuning Code Generation Models with Compiler Feedback
Title in Czech: Ladění modelů pro generování kódu pomocí zpětné vazby od kompilátoru
Title in English: Fine-tuning Code Generation Models with Compiler Feedback
Department: Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Supervisor / advisor: Mgr. Martin Pilát, Ph.D.
Opponents: Mgr. Gabriela Kadlecová
State-of-the-art large language models (LLMs) trained on code are powerful, but they still make mistakes that a compiler can detect. LLM-generated code may, for example, contain non-terminating loops, use uninitialized variables, or use incorrect types. This problem can be framed as a misalignment between the model's output and the user's desired outcome of compilable code. The goal of the thesis is to use compiler feedback to improve code generation models.
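
As an illustration of what "compiler feedback" could look like in practice, the following minimal sketch turns a compilation check into a training signal. It is not the method of the thesis or of the cited papers. It assumes Python's built-in `compile()` as a stand-in for a real compiler, which only catches syntax-level errors (not type errors or uninitialized variables). The names `Feedback`, `compiler_feedback`, `reward`, and `preference_pairs` are hypothetical and introduced here only for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Feedback(NamedTuple):
    ok: bool                 # True if the candidate compiled
    message: Optional[str]   # compiler diagnostic, or None on success


def compiler_feedback(source: str) -> Feedback:
    """Compile a generated snippet without executing it.

    Assumption: Python's built-in compile() plays the role of the compiler.
    """
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as err:
        # Covers IndentationError as well, which subclasses SyntaxError.
        return Feedback(False, f"line {err.lineno}: {err.msg}")
    except ValueError as err:
        # Some Python versions raise ValueError for sources with null bytes.
        return Feedback(False, str(err))
    return Feedback(True, None)


def reward(source: str) -> float:
    """Hypothetical binary reward: 1.0 if the candidate compiles, else 0.0."""
    return 1.0 if compiler_feedback(source).ok else 0.0


def preference_pairs(candidates: List[str]) -> List[Tuple[str, str]]:
    """Pair each compiling candidate (chosen) with each failing one (rejected)."""
    good = [c for c in candidates if compiler_feedback(c).ok]
    bad = [c for c in candidates if not compiler_feedback(c).ok]
    return [(g, b) for g in good for b in bad]
```

A scalar reward of this kind could feed a reinforcement-learning style update, while (chosen, rejected) pairs are the input shape expected by preference-based objectives. Which of the two the thesis adopts is left open by the assignment.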

The student will study the relevant literature on code generation with compiler feedback. Based on this, she will create a new model or improve an existing model for code generation, and compare the results with those of existing models.
[1] Jain, Abhinav, Chima Adiole, Swarat Chaudhuri, Thomas Reps, and Chris Jermaine. "Tuning Models of Code with Compiler-Generated Reinforcement Learning Feedback." arXiv preprint arXiv:2305.18341 (2023).
[2] Rafailov, Rafael, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, and Chelsea Finn. "Direct Preference Optimization: Your Language Model is Secretly a Reward Model." arXiv preprint arXiv:2305.18290 (2023).
[3] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. "Deep Learning." MIT Press, 2016. ISBN: 978-0262035613. Online: http://www.deeplearningbook.org
