Trustworthiness of LLMs

김아다만티움 2024. 7. 30. 09:48

Liu, Y., Yao, Y., Ton, J. F., Zhang, X., Cheng, R. G. H., Klochkov, Y., ... & Li, H. (2023). Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment. arXiv preprint arXiv:2308.05374.

Defines trustworthiness and sets out its detailed capability categories.

Sun, L., Huang, Y., Wang, H., Wu, S., Zhang, Q., Gao, C., ... & Zhao, Y. (2024). TrustLLM: Trustworthiness in large language models. arXiv preprint arXiv:2401.05561.

The later sections of the second paper (Sun et al., 2024) present the benchmark methods and performance results for each category.