Towards standardized benchmarks of LLMs in software modeling tasks: a conceptual framework
Publisher
Springer
Abstract
The integration of Large Language Models (LLMs) in software modeling tasks presents both opportunities and challenges. This Expert Voice addresses a significant gap in the evaluation of these models, advocating for standardized benchmarking frameworks. Recognizing the potential variability in prompt strategies, LLM outputs, and the solution space, we propose a conceptual framework to assess the quality of LLMs in software model generation. This framework aims to pave the way for standardization of the benchmarking process, ensuring consistent and objective evaluation of LLMs in software modeling. Our conceptual framework is illustrated using UML class diagrams as a running example.
Bibliographic citation
Cámara, J., Burgueño, L. & Troya, J. Towards standardized benchmarks of LLMs in software modeling tasks: a conceptual framework. Softw Syst Model (2024). https://doi.org/10.1007/s10270-024-01206-9
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution 4.0 International