Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning