Complete Guide: Training and Inference with π₀.₅ (pi05) on Custom Datasets
This guide walks you through the complete process of training and running inference with the π₀.₅ (pi05) policy on your own dataset, assuming it contains at least 600 rows (frames).
Prerequisites
1. Install LeRobot with pi05 Dependencies
# Install LeRobot
pip install lerobot
# Install pi05-specific dependencies (run from a source checkout of the LeRobot repository)
pip install -e ".[pi]"
Note: For lerobot 0.4.0, use:
pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"
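To confirm the installation succeeded, you can print the installed version from Python using the standard library:

from importlib.metadata import version

# Print the installed LeRobot version
print(version("lerobot"))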
2. Set Up Hugging Face Hub Access
# Login to Hugging Face Hub (required for dataset/model uploads)
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
# Get your username
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
3. Verify Your Dataset Format
Your dataset should be in LeRobot v3.0 format and include:
- Observations: State vectors and/or camera images
- Actions: Robot action vectors
- Task descriptions: Text descriptions for each episode
- Metadata: Episode boundaries and statistics
Minimum requirements:
- At least 600 rows (frames) of data
- Proper episode segmentation
- Consistent feature shapes across episodes
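You can verify these requirements programmatically before training. Here is a minimal sketch (replace your_username/your_dataset_name with your repository ID; num_frames, num_episodes, and meta.features follow the LeRobotDataset API):

from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("your_username/your_dataset_name")

# Minimum size: at least 600 frames
assert dataset.num_frames >= 600, f"only {dataset.num_frames} frames, need >= 600"

# Episode segmentation and declared features
print(f"episodes: {dataset.num_episodes}")
print(f"features: {list(dataset.meta.features)}")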
Dataset Preparation
Step 1: Add Quantile Statistics (Required for pi05)
π₀.₅ uses QUANTILES normalization for state and action features. If your dataset doesn't have quantile statistics (q01, q99), you must add them:
python src/lerobot/datasets/v30/augment_dataset_quantile_stats.py \
--repo-id=${HF_USER}/your_dataset_name
This script will:
- Load your dataset
- Compute quantile statistics (q01, q10, q50, q90, q99) for all features
- Update the dataset metadata with these statistics
- Save the updated dataset
Alternative: If you prefer MEAN_STD normalization instead, you can override this during training (see Training section).
Step 2: Verify Dataset Statistics
Check that your dataset has the required statistics:
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("your_username/your_dataset_name")
print(dataset.meta.stats)  # Should include q01, q99 for state and action features
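To check specifically for the quantile statistics pi05 needs, you can loop over the normalized features. The key names observation.state and action below follow common LeRobot conventions; adjust them to your dataset's feature names:

# Verify q01/q99 exist for the features pi05 normalizes with QUANTILES
for key in ("observation.state", "action"):
    stats = dataset.meta.stats.get(key, {})
    missing = [q for q in ("q01", "q99") if q not in stats]
    print(f"{key}: {'ok' if not missing else f'missing {missing}'}")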
Training the Policy
Step 1: Basic Training Command
Here's the complete training command for finetuning π₀.₅ on your custom dataset:
lerobot-train \
--dataset.repo_id=${HF_USER}/your_dataset_name \
--policy.type=pi05 \
--output_dir=./outputs/pi05_training \
--job_name=pi05_training \
--policy.repo_id=${HF_USER}/my_pi05_policy \
--policy.pretrained_path=lerobot/pi05_base \
--policy.compile_model=true \
--policy.gradient_checkpointing=true \
--wandb.enable=true \
--policy.dtype=bfloat16 \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
Step 2: Key Training Parameters Explained
| Parameter | Description | Recommended Value |
|---|---|---|
| `--dataset.repo_id` | Your dataset repository ID | `${HF_USER}/your_dataset_name` |
| `--policy.type` | Policy type (must be `pi05`) | `pi05` |
| `--policy.pretrained_path` | Base model to finetune | `lerobot/pi05_base` or `lerobot/pi05_libero` |
| `--policy.compile_model` | Enable torch.compile for faster training | `true` |
| `--policy.gradient_checkpointing` | Reduce memory usage (important for large models) | `true` |
| `--policy.dtype` | Mixed precision training | `bfloat16` (or `float32` if no bfloat16 support) |
| `--batch_size` | Training batch size | `32` (adjust based on GPU memory) |
| `--steps` | Number of training steps | `3000` (adjust based on dataset size) |
| `--policy.device` | Training device | `cuda` (or `mps` for Apple Silicon, `cpu` for CPU) |
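A quick way to sanity-check --steps against your dataset size is to compute how many passes over the data the run makes:

# Rough epoch count for the default settings on a minimal 600-frame dataset
steps, batch_size, num_frames = 3000, 32, 600
passes = steps * batch_size / num_frames
print(f"~{passes:.0f} passes over the data")  # ~160 passes

With a minimal 600-row dataset, the defaults make roughly 160 passes over the data, which can overfit; watch the loss curve and reduce --steps if evaluation performance degrades.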
Step 3: Alternative Normalization (If No Quantiles)
If your dataset doesn't have quantile statistics and you don't want to add them, you can use MEAN_STD normalization instead:
lerobot-train \
--dataset.repo_id=${HF_USER}/your_dataset_name \
--policy.type=pi05 \
--policy.normalization_mapping='{"ACTION": "MEAN_STD", "STATE": "MEAN_STD", "VISUAL": "IDENTITY"}' \
--output_dir=./outputs/pi05_training \
--job_name=pi05_training \
--policy.repo_id=${HF_USER}/my_pi05_policy \
--policy.pretrained_path=lerobot/pi05_base \
--policy.compile_model=true \
--policy.gradient_checkpointing=true \
--policy.dtype=bfloat16 \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
Step 4: Monitor Training
Training progress will be logged to:
- Weights & Biases (if --wandb.enable=true): Visit your W&B dashboard
- Checkpoints: Saved in ./outputs/pi05_training/checkpoints/
- Console: Training loss and metrics printed to the terminal
Step 5: Resume Training (Optional)
To resume from a checkpoint:
lerobot-train \
--config_path=./outputs/pi05_training/checkpoints/last/pretrained_model/train_config.json \
--resume=true
Step 6: Upload Trained Model (Optional)
After training completes, upload your model to the Hugging Face Hub:
# Upload latest checkpoint
huggingface-cli upload ${HF_USER}/my_pi05_policy \
./outputs/pi05_training/checkpoints/last/pretrained_model
# Or upload a specific checkpoint (e.g. the final step of a 3000-step run)
CKPT=003000
huggingface-cli upload ${HF_USER}/my_pi05_policy_${CKPT} \
./outputs/pi05_training/checkpoints/${CKPT}/pretrained_model
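Alternatively, you can upload from Python with the huggingface_hub client (a sketch; assumes you are already logged in and substitutes your own username):

from huggingface_hub import HfApi

api = HfApi()
# Create the model repo if it doesn't exist yet, then upload the checkpoint folder
api.create_repo("your_username/my_pi05_policy", exist_ok=True)
api.upload_folder(
    folder_path="./outputs/pi05_training/checkpoints/last/pretrained_model",
    repo_id="your_username/my_pi05_policy",
    repo_type="model",
)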
Running Inference/Evaluation
Step 1: Inference on Real Robot
To run inference with your trained policy on a real robot:
lerobot-record \
--robot.type=your_robot_type \
--robot.port=/dev/ttyACM1 \
--robot.cameras="{front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--robot.id=my_robot_id \
--display_data=false \
--dataset.repo_id=${HF_USER}/eval_pi05 \
--dataset.single_task="Your task description" \
--dataset.num_episodes=10 \
--policy.path=${HF_USER}/my_pi05_policy
Key parameters:
- --policy.path: Path to your trained model (local path or Hugging Face repo ID)
- --dataset.repo_id: Where to save evaluation episodes
- --dataset.single_task: Task description (must match training task format)
- --dataset.num_episodes: Number of evaluation episodes to run
Step 2: Inference with Python API
Here's how to run inference programmatically:
from lerobot.cameras.opencv.configuration_opencv import OpenCVCameraConfig
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.utils import hw_to_dataset_features
from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.policies.factory import make_pre_post_processors
from lerobot.robots.your_robot import YourRobot, YourRobotConfig
from lerobot.scripts.lerobot_record import record_loop
from lerobot.utils.control_utils import init_keyboard_listener
from lerobot.utils.utils import log_say
from lerobot.utils.visualization_utils import init_rerun
# Configuration (replace your_username with your Hugging Face username)
HF_MODEL_ID = "your_username/my_pi05_policy"
HF_DATASET_ID = "your_username/eval_pi05"
FPS = 30
EPISODE_TIME_SEC = 60
NUM_EPISODES = 10
TASK_DESCRIPTION = "Your task description"
# Create robot configuration
camera_config = {"front": OpenCVCameraConfig(index_or_path=0, width=640, height=480, fps=FPS)}
robot_config = YourRobotConfig(
    port="/dev/ttyACM1",
    id="my_robot_id",
    cameras=camera_config,
)
# Initialize robot
robot = YourRobot(robot_config)
# Load trained policy
policy = PI05Policy.from_pretrained(HF_MODEL_ID)
# Configure dataset features
action_features = hw_to_dataset_features(robot.action_features, "action")
obs_features = hw_to_dataset_features(robot.observation_features, "observation")
dataset_features = {**action_features, **obs_features}
# Create evaluation dataset
dataset = LeRobotDataset.create(
    repo_id=HF_DATASET_ID,
    fps=FPS,
    features=dataset_features,
    robot_type=robot.name,
    use_videos=True,
    image_writer_threads=4,
)
# Initialize keyboard listener and visualization
_, events = init_keyboard_listener()
init_rerun(session_name="evaluation")
# Connect robot
robot.connect()
# Create pre/post processors
preprocessor, postprocessor = make_pre_post_processors(
    policy_cfg=policy.config,
    pretrained_path=HF_MODEL_ID,
    dataset_stats=dataset.meta.stats,
)
# Run evaluation episodes
for episode_idx in range(NUM_EPISODES):
    log_say(f"Running inference, recording eval episode {episode_idx + 1} of {NUM_EPISODES}")

    # Run the policy inference loop for one episode
    record_loop(
        robot=robot,
        events=events,
        fps=FPS,
        policy=policy,
        preprocessor=preprocessor,
        postprocessor=postprocessor,
        dataset=dataset,
        control_time_s=EPISODE_TIME_SEC,
        single_task=TASK_DESCRIPTION,
        display_data=True,
    )

    dataset.save_episode()
# Clean up
robot.disconnect()
dataset.push_to_hub()
Step 3: Evaluation in Simulation (if applicable)
For simulation environments like LIBERO:
lerobot-eval \
--output_dir=./eval_logs/ \
--env.type=libero \
--env.task=libero_10 \
--eval.batch_size=1 \
--eval.n_episodes=10 \
--policy.path=${HF_USER}/my_pi05_policy \
--policy.n_action_steps=10
Important Configuration Details
Dataset Requirements for π₀.₅
- State Features: Required. π₀.₅ uses state information in the language prompt.
- Image Features: Optional but recommended. Can have multiple cameras.
- Action Features: Required. Must match your robot's action space.
- Task Descriptions: Required. Each episode should have a task description.
Model Architecture Details
- Base Model: PaliGemma-2B (vision-language model)
- Action Expert: Gemma-300M (action prediction)
- Chunk Size: 50 action steps predicted per inference
- Image Resolution: 224x224 (automatically resized)
- Max State/Action Dim: 32 (smaller dimensions are padded)
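For the padding behavior, here is a rough sketch of what happens to a 7-dim state (illustrative only, not the library's internal code):

import torch
import torch.nn.functional as F

MAX_STATE_DIM = 32

# A 7-dim robot state is zero-padded on the right up to max_state_dim
state = torch.randn(7)
padded = F.pad(state, (0, MAX_STATE_DIM - state.shape[-1]))
print(padded.shape)  # torch.Size([32])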
Normalization Modes
π₀.₅ uses different normalization for different feature types:
- VISUAL: IDENTITY (no normalization)
- STATE: QUANTILES (uses q01, q99)
- ACTION: QUANTILES (uses q01, q99)
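Conceptually, QUANTILES normalization maps each feature's [q01, q99] range to [-1, 1], which is more robust to outliers than mean/std scaling. A rough sketch of the idea (not the library's exact implementation):

import numpy as np

def quantile_normalize(x, q01, q99):
    # Map [q01, q99] to [-1, 1]; values beyond the quantiles fall outside [-1, 1]
    return 2.0 * (x - q01) / (q99 - q01) - 1.0

actions = np.array([0.1, 0.5, 0.9])
print(quantile_normalize(actions, q01=0.0, q99=1.0))  # [-0.8  0.   0.8]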
Troubleshooting
Issue: "QUANTILES normalization mode requires q01 and q99 stats"
Solution: Run the quantile augmentation script:
python src/lerobot/datasets/v30/augment_dataset_quantile_stats.py \
--repo-id=${HF_USER}/your_dataset_name
Or use MEAN_STD normalization instead:
--policy.normalization_mapping='{"ACTION": "MEAN_STD", "STATE": "MEAN_STD", "VISUAL": "IDENTITY"}'
Issue: Out of Memory (OOM) during training
Solutions:
- Reduce batch size: --batch_size=16 or --batch_size=8
- Enable gradient checkpointing: --policy.gradient_checkpointing=true
- Use bfloat16: --policy.dtype=bfloat16
- Reduce image resolution (if applicable)
Issue: State dimension mismatch
Solution: π₀.₅ automatically pads state/action to max dimensions (32). Ensure your state/action dimensions are ≤ 32, or adjust --policy.max_state_dim and --policy.max_action_dim.
Issue: Training loss not decreasing
Solutions:
- Check dataset quality and task descriptions
- Increase training steps: --steps=5000 or more
- Adjust learning rate (default: 2.5e-5)
- Verify pretrained model: try lerobot/pi05_libero if your task is similar to manipulation
Issue: Inference actions are too slow or jerky
Solutions:
- Enable Real-Time Chunking (RTC) - see the RTC documentation
- Reduce --policy.n_action_steps (default: 50)
- Use a compiled model: --policy.compile_model=true
Example Workflow Summary
1. Prepare Dataset (600+ rows)
# Record or prepare your dataset
# Ensure it has task descriptions and proper episode structure
2. Add Quantile Statistics
python src/lerobot/datasets/v30/augment_dataset_quantile_stats.py \
  --repo-id=${HF_USER}/my_dataset
3. Train Policy
lerobot-train \
  --dataset.repo_id=${HF_USER}/my_dataset \
  --policy.type=pi05 \
  --policy.pretrained_path=lerobot/pi05_base \
  --output_dir=./outputs/pi05_training \
  --steps=3000 \
  --batch_size=32 \
  --policy.device=cuda
4. Evaluate Policy
lerobot-record \
  --robot.type=your_robot \
  --policy.path=./outputs/pi05_training/checkpoints/last/pretrained_model \
  --dataset.repo_id=${HF_USER}/eval_results \
  --dataset.num_episodes=10
Quick Reference: Training Command Template
# Basic training
lerobot-train \
--dataset.repo_id=${HF_USER}/your_dataset \
--policy.type=pi05 \
--policy.pretrained_path=lerobot/pi05_base \
--output_dir=./outputs/pi05_training \
--job_name=pi05_training \
--policy.repo_id=${HF_USER}/my_pi05_policy \
--policy.compile_model=true \
--policy.gradient_checkpointing=true \
--policy.dtype=bfloat16 \
--steps=3000 \
--batch_size=32 \
--policy.device=cuda \
--wandb.enable=true
Quick Reference: Inference Command Template
# Real robot inference
lerobot-record \
--robot.type=your_robot_type \
--robot.port=/dev/ttyACM1 \
--robot.id=my_robot_id \
--policy.path=${HF_USER}/my_pi05_policy \
--dataset.repo_id=${HF_USER}/eval_results \
--dataset.single_task="Your task description" \
--dataset.num_episodes=10
Note: This guide assumes you have datasets with at least 600 rows. For smaller datasets, you may need to adjust training parameters (fewer steps, smaller batch size) or consider data augmentation techniques.