A framework for assessing the capabilities of code generation of constraint domain-specific languages with large language models
Publisher
Elsevier
Abstract
Large language models (LLMs) can be used to support software development tasks, e.g., through code completion
or code generation. However, their effectiveness drops significantly when considering less popular programming
languages such as domain-specific languages (DSLs). In this paper, we propose a generic framework for evaluating
the capabilities of LLMs at generating DSL code from textual specifications. The generated code is assessed from the
perspectives of well-formedness and correctness. This framework is applied to a particular type of DSL, constraint
languages, focusing our experiments on OCL and Alloy and comparing their results to those achieved for Python,
a popular general-purpose programming language. Experimental results show that, in general, LLMs have better
performance for Python than for OCL and Alloy. LLMs with smaller context windows such as open-source LLMs
may be unable to generate constraint-related code, as this requires managing both the constraint and the domain
model where it is defined. Moreover, some improvements to the code generation process such as code repair
(asking an LLM to fix incorrect code) or multiple attempts (generating several candidates for each coding task)
can improve the quality of the generated code. Meanwhile, other decisions like the choice of a prompt template
have less impact. All these dimensions can be systematically analyzed using our evaluation framework, making
it possible to decide the most effective way to set up code generation for a particular type of task.
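To make the evaluation dimensions in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' actual implementation) of such a pipeline for the Python baseline: each generated candidate is checked for well-formedness (syntactic validity) and correctness (task assertions), with an optional repair round and multiple attempts. The `generate` and `repair` callables stand in for LLM calls and are assumptions of this sketch.

```python
import ast

def well_formed(code: str) -> bool:
    """Well-formedness: for Python, syntactic validity via the ast parser."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def correct(code: str, checks) -> bool:
    """Correctness: run the candidate and evaluate the task's assertions."""
    env = {}
    try:
        exec(code, env)
        return all(check(env) for check in checks)
    except Exception:
        return False

def evaluate(generate, repair, checks, attempts=3):
    """Multiple attempts plus one repair round per candidate, as in the abstract."""
    for _ in range(attempts):
        candidate = generate()
        if not well_formed(candidate):
            # Code repair: ask the LLM to fix the incorrect candidate.
            candidate = repair(candidate)
        if well_formed(candidate) and correct(candidate, checks):
            return "pass"
    return "fail"
```

For DSLs such as OCL or Alloy, `well_formed` would instead invoke the language's own parser or analyzer, and `correct` would check the constraint against the domain model it is defined on.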
Bibliographic citation
David Delgado, Lola Burgueño, Robert Clarisó, A framework for assessing the capabilities of code generation of constraint domain-specific languages with large language models, Journal of Systems and Software, Volume 238, 2026, 112871, ISSN 0164-1212, https://doi.org/10.1016/j.jss.2026.112871.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International