Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Generative AI/benchmarks 2023. 11. 4. 14:07
Abstract: From pre-trained language models (PLMs) to large language models (LLMs), the field of natural language processing (NLP) has witnessed steep performance gains and wide practical use. The evaluation of a research field guides its direction of improvement. However, LLMs are extremely hard to evaluate thoroughly, for two reasons. First, traditional NLP tasks become inadequate due to the e..