Decision making for lunar landing applications using AI agents and reinforcement learning.
| dc.centro | E.T.S.I. Informática | |
| dc.contributor.author | Navarro, Tomás | |
| dc.contributor.author | Stroescu, Ana | |
| dc.contributor.author | Izzo, Darío | |
| dc.contributor.author | Gálvez-Rojas, Sergio | |
| dc.contributor.author | López-Valverde, Francisco | |
| dc.date.accessioned | 2026-04-15T06:42:46Z | |
| dc.date.issued | 2026-04-11 | |
| dc.departamento | Lenguajes y Ciencias de la Computación | |
| dc.description | https://openpolicyfinder.jisc.ac.uk/publication/41443?from=single_hit | |
| dc.description.abstract | This study explores the decision-making capabilities of large language model (LLM) artificial intelligence (AI) agents to automate learning in lunar landing missions. In particular, the work investigates the use of AI agents to minimise human intervention in training a lunar lander by providing high-level strategic guidance to a reinforcement learning (RL) agent within the complex simulation environment of Kerbal Space Program (KSP). To that end, LLM AI agents are utilised to interpret a lander manual, extract key information to construct the reward function of the RL algorithm, and dynamically refine it based on training performance. A comparative case study evaluates the effectiveness of GPT-3.5-Turbo, GPT-4, and Meta-Llama-3-70B in optimising the RL training process for lunar landings. Extending this approach, the study further explores AI-assisted hyperparameter optimisation (HPO) to streamline RL training. Instead of relying on traditional, computationally expensive methods like grid search and Bayesian optimisation, a zero-shot tuning approach is introduced, where an AI agent configures RL hyperparameters from a single instruction prompt without iterative refinements. Using GPT-4o, this method is applied to four RL algorithms (DQN, PPO, A2C, and SAC Discrete), demonstrating significant improvements in training efficiency and convergence speed, and reduced human effort, particularly for SAC Discrete. These findings highlight the potential of LLMs to automate both reward function design and hyperparameter tuning, thus advancing AI capabilities in space exploration and autonomous navigation tasks. | |
| dc.identifier.citation | Navarro, T., Stroescu, A., Izzo, D. et al. Decision making for lunar landing applications using AI agents and reinforcement learning. Astrodyn (2026). https://doi.org/10.1007/s42064-025-0292-2 | |
| dc.identifier.doi | 10.1007/s42064-025-0292-2 | |
| dc.identifier.uri | https://hdl.handle.net/10630/46380 | |
| dc.language.iso | eng | |
| dc.publisher | Springer Nature | |
| dc.rights | Attribution 4.0 International | en |
| dc.rights.accessRights | embargoed access | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Machine learning (Artificial intelligence) | |
| dc.subject | Moon - Exploration | |
| dc.subject.other | AI agents | |
| dc.subject.other | Lunar landing | |
| dc.subject.other | Reinforcement learning (RL) | |
| dc.subject.other | Large language model (LLM) | |
| dc.subject.other | Kerbal Space Program (KSP) | |
| dc.title | Decision making for lunar landing applications using AI agents and reinforcement learning. | |
| dc.type | journal article | |
| dc.type.hasVersion | AM | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | d978d7e6-74cb-4240-bb3a-5693f84d80ca | |
| relation.isAuthorOfPublication | 02fc094f-5f93-4ee1-9f93-c717c528c11b | |
| relation.isAuthorOfPublication.latestForDiscovery | d978d7e6-74cb-4240-bb3a-5693f84d80ca |
Files
Original bundle
- Name:
- Journal_Paper_Lander___post_review (1).pdf
- Size:
- 9.02 MB
- Format:
- Adobe Portable Document Format
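The zero-shot tuning approach described in the abstract — an AI agent proposing RL hyperparameters from a single instruction prompt, with no iterative refinement — could be sketched as follows. This is a minimal illustration, not the paper's implementation: the JSON reply stands in for an actual GPT-4o response, and the bounds, keys, and function name are assumptions made for the example.

```python
import json

# Stand-in for a single-shot LLM reply suggesting hyperparameters for an
# RL algorithm (here SAC Discrete, one of the four the study evaluates).
# A real pipeline would obtain this string from one prompt to the model.
LLM_REPLY = """{
    "algorithm": "SAC-Discrete",
    "learning_rate": 3e-4,
    "gamma": 0.99,
    "batch_size": 256,
    "buffer_size": 100000
}"""

# Illustrative safe ranges: whatever the model suggests is clamped here,
# since an LLM reply is untrusted input to the training loop.
BOUNDS = {
    "learning_rate": (1e-5, 1e-2),
    "gamma": (0.9, 0.9999),
    "batch_size": (32, 1024),
    "buffer_size": (10_000, 1_000_000),
}

def parse_zero_shot_config(reply: str) -> dict:
    """Parse the LLM's one-shot reply and clamp each value to its bounds."""
    raw = json.loads(reply)
    config = {"algorithm": raw.get("algorithm", "unknown")}
    for key, (lo, hi) in BOUNDS.items():
        value = float(raw.get(key, lo))       # fall back to the lower bound
        config[key] = min(max(value, lo), hi)  # clamp out-of-range suggestions
    return config

config = parse_zero_shot_config(LLM_REPLY)
print(config)
```

The parsed dictionary would then be passed to the chosen RL trainer; the key point of the zero-shot scheme is that this parse-and-clamp step runs once, replacing the repeated evaluations that grid search or Bayesian optimisation would require.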

