Steering Vision-Language-Action Models for Safe Robotic Execution

Inference-time activation steering for pretrained VLA models to influence robotic behavior without retraining.

Category
Robot learning
Role
Developed and evaluated activation-space interventions for safety- and behavior-aware robot control.
Technologies
Vision-Language-Action Models, Activation Engineering, PyTorch, ROS

Core idea

The project explored whether semantic directions in a VLA model’s latent activation space can steer robot behavior at inference time, avoiding costly fine-tuning while preserving task behavior.
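To make the idea of a "semantic direction" concrete, here is a minimal sketch. It assumes the direction is the normalized difference between mean activations collected under two contrasting conditions, which is one common construction and not necessarily the one used in this project. The function names, the toy tensors, and the scale `alpha` are all hypothetical.

```python
import torch


def contrastive_direction(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """Unit-norm difference of mean activations between two contrasting sets.

    Both inputs have shape (num_samples, d_model).
    """
    diff = pos_acts.mean(dim=0) - neg_acts.mean(dim=0)
    return diff / diff.norm()


def steer(hidden: torch.Tensor, direction: torch.Tensor, alpha: float) -> torch.Tensor:
    """Shift each hidden state by alpha along the steering direction."""
    return hidden + alpha * direction


# Toy stand-ins for activations recorded under two conditions (assumption:
# in practice these would be captured from a layer of the pretrained model).
torch.manual_seed(0)
d_model = 8
offset = torch.zeros(d_model)
offset[0] = 3.0  # the two conditions differ mainly along dimension 0
pos_acts = torch.randn(16, d_model) + offset
neg_acts = torch.randn(16, d_model) - offset

direction = contrastive_direction(pos_acts, neg_acts)

# Apply the shift to a batch of hidden states at inference time.
hidden = torch.randn(4, d_model)
steered = steer(hidden, direction, alpha=2.0)
```

Because the direction has unit norm, `alpha` directly sets how far each hidden state moves along it. No model weights change, which is why this kind of intervention avoids fine-tuning.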

Methods

The implementation used contrastive activation vectors, conditional activation steering, and PyTorch forward hooks on mid-to-late transformer layers. The approach was evaluated with a Franka Panda arm using the Pi 0.5 VLA model.
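The following sketch shows how a conditional steering step might be attached with a PyTorch forward hook. It is an illustration under stated assumptions, not the project's implementation. A small stack of `nn.TransformerEncoderLayer` modules stands in for the VLA backbone, the steering and condition vectors are random placeholders, and the gating rule (cosine similarity of the mean-pooled hidden state against a threshold) is a hypothetical choice. `ConditionalSteeringHook` and `run_with_hook` are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalSteeringHook:
    """Forward hook that adds a steering vector only when a condition fires."""

    def __init__(self, steer_vec, cond_vec, alpha=4.0, threshold=0.0):
        self.steer_vec = steer_vec / steer_vec.norm()
        self.cond_vec = cond_vec / cond_vec.norm()
        self.alpha = alpha
        self.threshold = threshold

    def __call__(self, module, inputs, output):
        # Some transformer layers return tuples; the hidden state comes first.
        hidden = output[0] if isinstance(output, tuple) else output
        # Mean-pool over the sequence axis: (batch, seq, d_model) -> (batch, d_model).
        pooled = hidden.mean(dim=1)
        sim = F.cosine_similarity(pooled, self.cond_vec.unsqueeze(0), dim=-1)
        # Per-sample gate: 1.0 where the condition fires, 0.0 elsewhere.
        gate = (sim > self.threshold).to(hidden.dtype).view(-1, 1, 1)
        steered = hidden + gate * self.alpha * self.steer_vec
        if isinstance(output, tuple):
            return (steered,) + tuple(output[1:])
        # Returning a value from a forward hook replaces the layer output.
        return steered


def run_with_hook(model, layer, hook, x):
    """Run one forward pass with the hook attached, then detach it."""
    handle = layer.register_forward_hook(hook)
    try:
        with torch.no_grad():
            return model(x)
    finally:
        handle.remove()


# Toy stand-in model (assumption): four encoder layers, hook on the third.
torch.manual_seed(0)
d_model = 16
model = nn.Sequential(*[
    nn.TransformerEncoderLayer(d_model, nhead=2, dim_feedforward=32,
                               dropout=0.0, batch_first=True)
    for _ in range(4)
]).eval()

x = torch.randn(2, 5, d_model)
steer_vec = torch.randn(d_model)  # placeholder for a contrastive vector
cond_vec = torch.randn(d_model)   # placeholder for a condition vector

with torch.no_grad():
    baseline = model(x)

# Extreme thresholds, used here only to show both branches of the gate.
always = ConditionalSteeringHook(steer_vec, cond_vec, threshold=-1.0)
never = ConditionalSteeringHook(steer_vec, cond_vec, threshold=1.0)
steered = run_with_hook(model, model[2], always, x)
unchanged = run_with_hook(model, model[2], never, x)
```

Removing the hook handle after each pass restores the unmodified model, so steering can be switched on or off per request. When the gate does not fire, the output matches the baseline, which is how a conditional scheme can leave behavior untouched outside the targeted situations.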

Why it matters

The work highlights a practical path toward interpretable, lightweight behavior modification for robotic policies, especially where retraining is expensive or safety constraints change after deployment.