- Handa, D., Chirmule, A., Gajera, B., & Baral, C. (2024). Jailbreaking proprietary large language models using word substitution cipher. https://arxiv.org/abs/2402.10601
- Janus. (2022). Simulators [Accessed: 2023-07-07]. https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators
- Liu, Y., Deng, G., Xu, Z., Li, Y., Zheng, Y., Zhang, Y., Zhao, L., Zhang, T., Wang, K., & Liu, Y. (2024). Jailbreaking ChatGPT via prompt engineering: An empirical study. https://arxiv.org/abs/2305.13860
- Milička, J. (2024). Theoretical and methodological framework for studying texts produced by large language models. https://arxiv.org/abs/2408.16740
- Nardo, C. (2023). The Waluigi effect: Mega post [Accessed: 2024-08-18]. https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7
- Reynolds, L., & McDonell, K. (2021). Multiversal views on language models. https://arxiv.org/abs/2102.06391
- Shanahan, M., McDonell, K., & Reynolds, L. (2023). Role play with large language models. Nature, 623(7987), 493–498.