
LLMs as NAO Robot 3D Motion Planners / Catalini, Riccardo; Salici, Giacomo; Biagi, Federico; Borghi, Guido; Biagiotti, Luigi; Vezzani, Roberto. - (2025). (2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Honolulu (United States), 19/10/2025).

LLMs as NAO Robot 3D Motion Planners

Riccardo Catalini; Giacomo Salici; Federico Biagi; Guido Borghi; Luigi Biagiotti; Roberto Vezzani
2025

Abstract

In this study, we demonstrate the capabilities of state-of-the-art Large Language Models (LLMs) in teaching social robots to perform specific actions within a 3D environment. Specifically, we introduce the use of LLMs to generate sequences of 3D joint angles (in both zero-shot and one-shot prompting) that a humanoid robot must follow to perform a given action. This work is driven by the growing demand for intuitive interaction with social robots: LLMs could empower non-expert users to operate and benefit from robotic systems effectively. Additionally, this method makes it possible to generate synthetic data effortlessly, enabling privacy-focused use cases. To evaluate the output quality of seven different LLMs, we conducted a blind user study comparing the generated pose sequences. Participants were shown videos of the well-known NAO robot performing the generated actions and were asked to identify the intended action and to choose, from a set of candidates produced by the different LLMs, the best match with the original instruction. The results highlight that the majority of LLMs are indeed capable of planning correct, complete, and recognizable actions, offering a novel perspective on how AI can be applied to social robotics.
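To illustrate the pipeline the abstract describes, the sketch below shows one plausible way to turn an LLM's textual reply into per-joint trajectories for a NAO robot. The JSON response format, the `parse_llm_pose_sequence` helper, and the joint-limit values are illustrative assumptions, not the paper's actual protocol; the output triple is shaped to match the `(names, angleLists, timeLists)` arguments of `ALMotion.angleInterpolation` in the NAOqi SDK, but no robot call is made here.

```python
import json

# Illustrative subset of NAO joint limits in radians (assumed values;
# consult the official NAO documentation for authoritative ranges).
JOINT_LIMITS = {
    "HeadYaw": (-2.0857, 2.0857),
    "HeadPitch": (-0.6720, 0.5149),
    "LShoulderPitch": (-2.0857, 2.0857),
    "RShoulderPitch": (-2.0857, 2.0857),
}

def parse_llm_pose_sequence(llm_output):
    """Parse a JSON pose sequence emitted by an LLM into per-joint
    trajectories, clamping each angle to the joint's limits.

    Returns (names, angle_lists, time_lists), shaped like the arguments
    of NAOqi's ALMotion.angleInterpolation.
    """
    frames = json.loads(llm_output)  # list of {"time": t, "angles": {...}}
    names, angle_lists, time_lists = [], [], []
    for joint, (lo, hi) in JOINT_LIMITS.items():
        angles, times = [], []
        for frame in frames:
            if joint in frame["angles"]:
                # Clamp out-of-range angles rather than rejecting the frame.
                angles.append(max(lo, min(hi, frame["angles"][joint])))
                times.append(frame["time"])
        if angles:
            names.append(joint)
            angle_lists.append(angles)
            time_lists.append(times)
    return names, angle_lists, time_lists

# Hypothetical LLM reply for a "wave" action:
reply = ('[{"time": 1.0, "angles": {"RShoulderPitch": -1.0}},'
         ' {"time": 2.0, "angles": {"RShoulderPitch": -1.5, "HeadYaw": 0.3}}]')
names, angles, times = parse_llm_pose_sequence(reply)
# On a real robot one would then call, e.g.:
#   motion_proxy.angleInterpolation(names, angles, times, True)
```

Constraining the LLM to a fixed JSON schema and clamping angles to joint limits are two simple safeguards that keep free-form model output physically executable on the robot.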
2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Honolulu (United States)
19/10/2025
Files in this item:

File: 2025247674.pdf
Access: Open access
Type: AAM - Author's accepted manuscript (peer-reviewed version accepted for publication)
Size: 3.01 MB
Format: Adobe PDF

Creative Commons license
Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise stated.
In case of copyright infringement, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1384296
Citations
  • Scopus: not available