Core idea
The project explored whether semantic directions in the latent activation space of a vision-language-action (VLA) model can steer robot behavior at inference time, without costly fine-tuning and without degrading the underlying task behavior.
Methods
The implementation combined three pieces: contrastive activation vectors, conditional activation steering, and PyTorch forward hooks attached to mid-to-late transformer layers. The approach was evaluated on a Franka Panda arm running the Pi 0.5 VLA model.
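To make the mechanics concrete, here is a minimal sketch of how the three pieces can fit together. It is not the project's code: the toy residual stack stands in for the real model (it is not the Pi 0.5 architecture), and the names `ToyBlock`, `capture_mean_activation`, `make_conditional_hook`, and `run_steered`, as well as the cosine-similarity gate, the layer index, and the values of `alpha` and `threshold`, are all assumptions chosen for illustration. Only `register_forward_hook` and the other PyTorch calls are real API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class ToyBlock(nn.Module):
    """Toy residual block standing in for one transformer layer (hypothetical)."""

    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Linear(dim, dim)

    def forward(self, x):
        return x + torch.tanh(self.ff(x))


dim = 16
model = nn.Sequential(*[ToyBlock(dim) for _ in range(4)]).eval()
layer = model[2]  # assumed "mid-to-late" layer of this toy stack


def capture_mean_activation(model, layer, inputs):
    """Return the layer's output averaged over batch and token dimensions."""
    captured = []
    handle = layer.register_forward_hook(
        lambda module, args, output: captured.append(output.detach())
    )
    with torch.no_grad():
        model(inputs)
    handle.remove()
    return captured[0].mean(dim=(0, 1))


# Contrastive activation vector: mean activation on inputs that show the
# desired behavior minus mean activation on inputs that do not.
# Random tensors of shape (batch, tokens, dim) stand in for real embeddings.
pos_inputs = torch.randn(8, 5, dim) + 1.0
neg_inputs = torch.randn(8, 5, dim) - 1.0
steer = (capture_mean_activation(model, layer, pos_inputs)
         - capture_mean_activation(model, layer, neg_inputs))
steer = steer / steer.norm()  # unit-length steering direction


def make_conditional_hook(direction, condition, alpha, threshold):
    """Add alpha * direction only to samples whose pooled activation has
    cosine similarity above `threshold` with the `condition` vector."""

    def hook(module, args, output):
        pooled = output.mean(dim=1)  # (batch, dim)
        sim = F.cosine_similarity(pooled, condition.unsqueeze(0), dim=-1)
        gate = (sim > threshold).to(output.dtype).view(-1, 1, 1)
        # Returning a tensor from a forward hook replaces the layer's output.
        return output + alpha * gate * direction

    return hook


def run_steered(model, layer, x, direction, condition, alpha, threshold):
    """Run one forward pass with the steering hook attached, then detach it."""
    handle = layer.register_forward_hook(
        make_conditional_hook(direction, condition, alpha, threshold)
    )
    try:
        with torch.no_grad():
            return model(x)
    finally:
        handle.remove()


x = torch.randn(2, 5, dim)
with torch.no_grad():
    baseline = model(x)

# For simplicity the condition vector is the steering direction itself.
# A threshold below -1 always fires the gate; one above 1 never fires it.
steered = run_steered(model, layer, x, steer, steer, alpha=4.0, threshold=-2.0)
unsteered = run_steered(model, layer, x, steer, steer, alpha=4.0, threshold=2.0)
```

Because the hook is removed after each call, the model's weights and default behavior are untouched; steering exists only for the duration of the hooked forward pass. In a real policy the condition vector would typically be derived separately from the steering direction, so that the intervention fires only in the contexts where a behavior change is wanted.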
Why it matters
The work highlights a practical path toward interpretable, lightweight behavior modification for robotic policies, especially where retraining is expensive or safety constraints change after deployment.