Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework

Cámara-Moreno, Javier; Burgueño-Caballero, Lola; Troya-Castilla, Javier

doi:10.1007/s10270-024-01206-9

Files

s10270-024-01206-9 (1).pdf (1.41 MB)

Identifiers

URI: https://hdl.handle.net/10630/32630

DOI: 10.1007/s10270-024-01206-9

Publication date

2024-09-03

Authors

Cámara-Moreno, Javier

Burgueño-Caballero, Lola

Troya-Castilla, Javier

Publisher

Springer

Center

E.T.S.I. Informática

Department/Institute

Instituto de Tecnología e Ingeniería del Software de la Universidad de Málaga

Keywords

Business - Management

Abstract

The integration of Large Language Models (LLMs) in software modeling tasks presents both opportunities and challenges. This Expert Voice addresses a significant gap in the evaluation of these models, advocating for the need for standardized benchmarking frameworks. Recognizing the potential variability in prompt strategies, LLM outputs, and solution space, we propose a conceptual framework to assess their quality in software model generation. This framework aims to pave the way for standardization of the benchmarking process, ensuring consistent and objective evaluation of LLMs in software modeling. Our conceptual framework is illustrated using UML class diagrams as a running example.

Bibliographic citation

Cámara, J., Burgueño, L. & Troya, J. Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework. Softw Syst Model (2024). https://doi.org/10.1007/s10270-024-01206-9

Collections

Articles

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International
