Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework

Cámara-Moreno, Javier; Burgueño-Caballero, Lola; Troya-Castilla, Javier

doi:10.1007/s10270-024-01206-9

dc.centro	E.T.S.I. Informática	es_ES
dc.contributor.author	Cámara-Moreno, Javier
dc.contributor.author	Burgueño-Caballero, Lola
dc.contributor.author	Troya-Castilla, Javier
dc.date.accessioned	2024-09-18T12:15:35Z
dc.date.available	2024-09-18T12:15:35Z
dc.date.issued	2024-09-03
dc.departamento	Instituto de Tecnología e Ingeniería del Software de la Universidad de Málaga
dc.description.abstract	The integration of Large Language Models (LLMs) in software modeling tasks presents both opportunities and challenges. This Expert Voice addresses a significant gap in the evaluation of these models, advocating for the need for standardized benchmarking frameworks. Recognizing the potential variability in prompt strategies, LLM outputs, and solution space, we propose a conceptual framework to assess their quality in software model generation. This framework aims to pave the way for standardization of the benchmarking process, ensuring consistent and objective evaluation of LLMs in software modeling. Our conceptual framework is illustrated using UML class diagrams as a running example.	es_ES
dc.description.sponsorship	Funding for open access charge: Universidad de Málaga / CBUA	es_ES
dc.identifier.citation	Cámara, J., Burgueño, L. & Troya, J. Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework. Softw Syst Model (2024). https://doi.org/10.1007/s10270-024-01206-9	es_ES
dc.identifier.doi	10.1007/s10270-024-01206-9
dc.identifier.uri	https://hdl.handle.net/10630/32630
dc.language.iso	eng	es_ES
dc.publisher	Springer	es_ES
dc.rights	Attribution 4.0 International	*
dc.rights.accessRights	open access	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	Business - Management	es_ES
dc.subject.other	Modeling	es_ES
dc.subject.other	LLMs	es_ES
dc.subject.other	Benchmarking	es_ES
dc.title	Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework	es_ES
dc.type	journal article	es_ES
dc.type.hasVersion	VoR	es_ES
dspace.entity.type	Publication
relation.isAuthorOfPublication	20052283-aeaf-42b8-85ee-52d9589e5797
relation.isAuthorOfPublication	31808e70-d2ec-4318-8ead-dded38954d40
relation.isAuthorOfPublication	3ea98dd7-8c4e-4639-9c87-2228ad0f56be
relation.isAuthorOfPublication.latestForDiscovery	20052283-aeaf-42b8-85ee-52d9589e5797

Files

Original bundle

Name: s10270-024-01206-9 (1).pdf
Size: 1.41 MB
Format: Adobe Portable Document Format

Collections

Articles