Decision making for lunar landing applications using AI agents and reinforcement learning.

Navarro, Tomás; Stroescu, Ana; Izzo, Darío; Gálvez-Rojas, Sergio; López-Valverde, Francisco

doi:10.1007/s42064-025-0292-2

Files

Journal_Paper_Lander___post_review (1).pdf (9.02 MB)

Identifiers

URI: https://hdl.handle.net/10630/46380

DOI: 10.1007/s42064-025-0292-2

Publication date

2026-04-11

Authors

López-Valverde, Francisco

Publisher

Springer Nature

Center

E.T.S.I. Informática

Department/Institute

Lenguajes y Ciencias de la Computación

Keywords

Machine learning (Artificial intelligence)
Moon - Exploration

Abstract

This study explores the decision-making capabilities of large language model (LLM) artificial intelligence (AI) agents to automate learning in lunar landing missions. In particular, the work investigates the use of AI agents to minimise human intervention in training a lunar lander by providing high-level strategic guidance to a reinforcement learning (RL) agent within the complex simulation environment of Kerbal Space Program (KSP). To that end, LLM AI agents are utilised to interpret a lander manual, extract key information to construct the reward function of the RL algorithm, and dynamically refine it based on training performance. A comparative case study evaluates the effectiveness of GPT-3.5-Turbo, GPT-4, and Meta-Llama-3-70B in optimising the RL training process for lunar landings. Extending this approach, the study further explores AI-assisted hyperparameter optimisation (HPO) to streamline RL training. Instead of relying on traditional, computationally expensive methods such as grid search and Bayesian optimisation, a zero-shot tuning approach is introduced, where an AI agent configures RL hyperparameters from a single instruction prompt without iterative refinements. Using GPT-4o, this method is applied to four RL algorithms (DQN, PPO, A2C, and SAC Discrete), demonstrating significant improvements in training efficiency and convergence speed, along with reduced human effort, particularly for SAC Discrete. These findings highlight the potential of LLMs to automate both reward function design and hyperparameter tuning, thus advancing AI capabilities in space exploration and autonomous navigation tasks.

Description

https://openpolicyfinder.jisc.ac.uk/publication/41443?from=single_hit

Bibliographic citation

Navarro, T., Stroescu, A., Izzo, D. et al. Decision making for lunar landing applications using AI agents and reinforcement learning. Astrodyn (2026). https://doi.org/10.1007/s42064-025-0292-2

Collections

Articles

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International
