This repository is a two-in-one codebase for two papers on using Large Language Models (LLMs) for autonomous driving on a small-scale robotic platform.
In our RSS 2025 paper, Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models, we demonstrate how LLMs can be used for autonomous driving. This repository provides the codebase for the MPCxLLM and DecisionxLLM modules, alongside tools for training, testing, and deployment.
Small LLMs can adapt driving behavior through MPC and perform decision making:
Watch an explanatory YouTube video accompanying the paper here.
In our CoRL 2025 paper, RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning, we build upon the previous robotic system and introduce closed-loop RL training of LLMs to foster a learning-by-doing paradigm for embodied robotic intelligence, showing that small-scale LLMs can learn to drive a robot car through RL from their own experience.
+20.2 percentage points improvement over the SFT baseline with Qwen2.5-1.5B via RL.
63.3% control adaptability with Qwen2.5-3B, surpassing GPT-4o (58.5%) on this robotic task:
Check out the Wandb report of the training runs here.
The YouTube video accompanying the paper can be found here.
- Build the Docker container (adapt `CUDA_ARCH` accordingly: `86` for RTX 30xx, `89` for RTX 40xx; see the compute-capability check below this list): `docker build --build-arg CUDA_ARCH=<your_compute_capability> -t embodiedai -f .docker_utils/Dockerfile.cuda .`
- Mount the container to the project directory: `./.docker_utils/main_dock.sh cuda`
- Attach to the container: `docker exec -it embodiedai_dock /bin/bash` or use VS Code Remote Containers.
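If you are unsure which `CUDA_ARCH` value applies to your GPU, you can query its compute capability from Python (a quick check assuming PyTorch is installed on the host; this helper is not part of the repository):

```python
# Query the GPU's CUDA compute capability to pick the CUDA_ARCH build argument.
# Assumes PyTorch is available on the host; this snippet is not part of the repo.
import torch

major, minor = torch.cuda.get_device_capability(0)  # e.g. (8, 6) on an RTX 30xx card
print(f"CUDA_ARCH={major}{minor}")                   # value to pass to docker build
```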
- Build the ARM-compatible Docker image: `docker build -t embodiedai -f .docker_utils/Dockerfile.jetson .` Note that unsloth cannot be installed on the Jetson (as of 07.05.2025), so only inference via quantized models is possible!
- Mount and launch the container: `./.docker_utils/main_dock.sh jetson`
- Attach via terminal or VS Code.
Create a `.env` file in the root directory with the following content:
HUGGINGFACEHUB_API_TOKEN="<your_huggingface_token>"
OPENAI_API_TOKEN="<your_openai_token>"
WANDB_API_KEY="<your_wandb_api_key>" # Optional
These tokens are needed for downloading models and for the OpenAI API, which is required if you want to use `gpt-4o` or to run the modules with their RAG embeddings. Make sure to keep this file private!
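For reference, here is a minimal sketch of how such a `.env` file is typically consumed from Python (assuming the `python-dotenv` package; the repository's own loading code may differ):

```python
# Minimal sketch: read the API tokens from .env into the process environment.
# Assumes the python-dotenv package; the repository's own loading code may differ.
import os
from dotenv import load_dotenv

load_dotenv()                                         # reads .env from the working directory
hf_token = os.environ["HUGGINGFACEHUB_API_TOKEN"]     # model downloads
openai_token = os.environ.get("OPENAI_API_TOKEN")     # only needed for gpt-4o / RAG embeddings
wandb_key = os.environ.get("WANDB_API_KEY")           # optional, for experiment logging
```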
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models:
You can use the LoRA + RAG SFT-trained FP16 model `nibauman/RobotxLLM_Qwen7B_SFT` directly from HuggingFace without having to download it locally. If you want to use the quantized model, you can download it with the following command:
huggingface-cli download nibauman/race_llm-Q5_K_M-GGUF --local-dir models/race_llm_q5
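As a rough illustration, a quantized GGUF checkpoint like the one above can be loaded with `llama-cpp-python` (a sketch under that assumption; the exact `.gguf` filename inside `models/race_llm_q5` depends on the download, and `llm_mpc.py --quant` handles the loading for you):

```python
# Sketch: load the downloaded quantized GGUF model with llama-cpp-python.
# The exact .gguf filename depends on the download; llm_mpc.py --quant wraps this for you.
from glob import glob
from llama_cpp import Llama

gguf_path = glob("models/race_llm_q5/*.gguf")[0]
llm = Llama(model_path=gguf_path, n_ctx=4096, n_gpu_layers=-1)  # -1: offload all layers to GPU
out = llm("Summarize the vehicle's current driving task in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```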
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning:
The SFT- and GRPO-trained models for DecisionxLLM and MPCxLLM are available on HuggingFace:
- DecisionxLLM Qwen-1.5B: nibauman/DecisionxR1_Qwen1.5B_SFT_GRPO
- DecisionxLLM Qwen-3B: nibauman/DecisionxR1_Qwen3B_SFT_GRPO
- MPCxLLM Qwen-1.5B: nibauman/MPCxR1_Qwen1.5B_SFT_GRPO
- MPCxLLM Qwen-3B: nibauman/MPCxR1_Qwen3B_SFT_GRPO
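These checkpoints are standard HuggingFace repositories, so they can be pulled with `transformers` as a quick smoke test (a sketch that assumes the checkpoints are merged, standalone models; the repository's own scripts, e.g. `llm_mpc.py --model_dir ...`, load them for you):

```python
# Sketch: load one of the released checkpoints directly from HuggingFace.
# Assumes merged, standalone weights; the repo's scripts normally handle loading.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nibauman/MPCxR1_Qwen3B_SFT_GRPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Keep a larger distance to the walls."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=64)[0], skip_special_tokens=True))
```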
This repo integrates with the ForzaETH Race Stack. Follow their installation instructions and ensure your `ROS_MASTER_URI` is correctly configured (see the example line); in this README we use `192.168.192.75` as an example!
Run each command in a separate terminal. Use the `f` map for evaluation and the `circle` map (`map_name:=circle`) for RobotxR1 RL training.
roscore
roslaunch stack_master base_system.launch map_name:=f racecar_version:=NUC2 sim:=true
roslaunch stack_master time_trials.launch ctrl_algo:=KMPC
roslaunch rosbridge_server rosbridge_websocket.launch address:=192.168.192.75
python3 llm_mpc.py --model custom --model_dir nibauman/RobotxLLM_Qwen7B_SFT --hostip 192.168.192.75 --prompt "Drive in Reverse!"
Key Options:
- `--model`: `custom` or `gpt-4o`
- `--model_dir`: HuggingFace or local path (used for `custom`)
- `--hostip`: ROS master IP
- `--prompt`: Natural language instruction
- `--quant`: Use quantized `GGUF` model
- `--mpconly`: Skip DecisionxLLM
On the Jetson, for example, you can only run the quantized models, downloaded to the `models` folder as explained above. Run the following command to test the quantized model:
python3 llm_mpc.py --model custom --model_dir models/race_llm_q5 --hostip 192.168.192.75 --prompt "Drive in Reverse!" --quant
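The `rosbridge_websocket` launch above exposes ROS over a WebSocket (port 9090 by default), which is how an off-board LLM process can reach the car. Below is a minimal connectivity check, assuming the `roslibpy` client (not necessarily the transport used inside `llm_mpc.py`) and with the odometry topic name chosen purely for illustration:

```python
# Sketch: verify that the rosbridge WebSocket started above is reachable.
# Assumes the roslibpy client; llm_mpc.py may use a different transport internally.
import roslibpy

ros = roslibpy.Ros(host="192.168.192.75", port=9090)  # default rosbridge_websocket port
ros.run()
print("Connected to rosbridge:", ros.is_connected)

# Peek at the car's odometry (the topic name here is an assumption, not from the repo).
odom = roslibpy.Topic(ros, "/car_state/odom", "nav_msgs/Odometry")
odom.subscribe(lambda msg: print("longitudinal speed:", msg["twist"]["twist"]["linear"]["x"]))
```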
To train a new LoRA adapter on synthetic data with Supervised Fine-Tuning (SFT), as in Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models (here), use the following command:
python3 -m train.sft_train --config train/config/sft_train.yaml
You can modify `sft_train.yaml` to change training parameters. Default setup:
- Base Model: `unsloth/Qwen2.5-7B-Instruct`
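Under the hood, the SFT stage attaches a LoRA adapter to the base model. A condensed sketch of that setup with `unsloth` (the parameter values here are illustrative; the authoritative hyperparameters live in `train/config/sft_train.yaml`):

```python
# Condensed sketch of the LoRA setup used for SFT; values are illustrative only,
# the authoritative hyperparameters live in train/config/sft_train.yaml.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,           # QLoRA-style 4-bit base weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                        # LoRA rank (illustrative)
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# Training then runs on the synthetic GPT-4o distillation data as configured in
# sft_train.yaml (see the train module for the full pipeline).
```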
To train the DecisionxLLM through static Reinforcement Learning, as in RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning (here), use the following command:
python3 -m train.rl_decision_train --config train/config/rl_decision_train.yaml
Modify `rl_decision_train.yaml` to change training parameters. Default setup:
- Base Model: `Qwen/Qwen2.5-3B-Instruct`
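The static RL stage uses GRPO (Group Relative Policy Optimization). A minimal sketch of the trainer wiring with TRL, where the dataset path and the reward function are purely hypothetical; the actual rewards and hyperparameters are defined by the repository's training script and its config:

```python
# Minimal GRPO wiring sketch with TRL; the dataset path and reward function below are
# hypothetical, the real ones live in train/rl_decision_train.py and its YAML config.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def decision_reward(completions, **kwargs):
    # Hypothetical reward: +1 if the completion commits to a clear decision keyword.
    return [1.0 if any(k in c.lower() for k in ("stop", "reverse", "follow")) else 0.0
            for c in completions]

# GRPO expects a dataset with a "prompt" column (plain-text prompts assumed here).
dataset = load_dataset("json", data_files="data/decision_prompts.json", split="train")
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-3B-Instruct",
    reward_funcs=decision_reward,
    args=GRPOConfig(output_dir="outputs/decision_grpo", num_generations=8),
    train_dataset=dataset,
)
trainer.train()
```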
To train the MPCxLLM through feedback-driven Reinforcement Learning, as in RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning (here), use the following command:
python3 -m train.rl_mpc_train --config train/config/rl_mpc_train.yaml
Modify `rl_mpc_train.yaml` to change training parameters. Default setup:
- Base Model: `Qwen/Qwen2.5-3B-Instruct`
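The feedback-driven stage differs from the static one in where the reward comes from: each LLM-proposed MPC parameterization is rolled out in the closed-loop simulation and scored against the instruction. A purely hypothetical illustration of such a reward shape (the actual reward terms are defined in the repository's training code):

```python
# Hypothetical illustration of a closed-loop reward for MPCxLLM training: the LLM's
# proposed MPC parameters are rolled out in simulation and the rollout is scored.
# The actual reward terms live in train/rl_mpc_train.py and its YAML config.
def closed_loop_reward(target_speed: float, measured_speeds: list[float],
                       constraint_violations: int) -> float:
    # Penalize deviation from the instructed speed, averaged over the rollout...
    tracking_error = sum(abs(v - target_speed) for v in measured_speeds) / len(measured_speeds)
    # ...and penalize constraint violations (e.g. leaving the track) much more harshly.
    return -tracking_error - 10.0 * constraint_violations

# Example: an instruction "drive at 2 m/s" rolled out for a few control steps.
print(closed_loop_reward(2.0, [1.8, 2.1, 2.0, 1.9], constraint_violations=0))
```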
Note: Train the model on the `circle` map, then evaluate it on the `f` map. Command to launch the robot stack on the `circle` map:
roslaunch stack_master base_system.launch map_name:=circle racecar_version:=NUC2 sim:=true
python3 -m tests.decision_tester.decision_tester --model nibauman/RobotxLLM_Qwen7B_SFT --dataset all --mini --rag
python3 -m tests.mpc_tester.mpc_tester --model custom --model_dir nibauman/RobotxLLM_Qwen7B_SFT --host_ip 192.168.192.75
Evaluation Options:
- `--dataset`: e.g., `all`, `stop`, `reverse`, etc.
- `--mini`: Run a small evaluation subset
- `--rag`: Enable retrieval-augmented decision prompts
- `--quant`: Use quantized model
SFT training was performed through the distillation of OpenAI GPT-4o queries. This work would not have been possible without the great work of other repositories such as:
If this repository is useful for your research, please consider citing our work:
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models:
@article{baumann2025enhancing,
title={Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models},
author={Baumann, Nicolas and Hu, Cheng and Sivasothilingam, Paviththiren and Qin, Haotong and Xie, Lei and Magno, Michele and Benini, Luca},
journal={arXiv preprint arXiv:2504.11514},
year={2025}
}
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning:
@misc{boyle2025robotxr1enablingembodiedrobotic,
title={RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning},
author={Liam Boyle and Nicolas Baumann and Paviththiren Sivasothilingam and Michele Magno and Luca Benini},
year={2025},
eprint={2505.03238},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2505.03238},
}