Testing Vector Environment Information Handling in OpenAI Gym

This test suite validates how OpenAI Gym’s vectorized environments report episode-end information. It covers both synchronous and asynchronous vector environments, checking the final_observation and _final_observation entries of the infos dictionary when one or more sub-environments terminate or are truncated on the same step.

Test Coverage Overview

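The suite consists of two parametrized tests: test_vector_env_info, which runs once per vectorization mode (asynchronous and synchronous), and test_vector_env_info_concurrent_termination, which runs with 1, 2, and 3 sub-environments expected to end on the same step.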

Key areas tested include:
  • Synchronous and asynchronous vector environment behavior
  • Final observation handling during environment termination
  • Concurrent termination scenarios with multiple environments
  • Verification that final_observation and _final_observation are NumPy arrays with one entry per sub-environment
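
The final-observation checks listed above can be condensed into a small helper. This is a minimal sketch rather than part of the test suite: the function name final_observation_info_is_consistent is hypothetical, and the sample infos dictionary is hand-built from plain lists instead of coming from a real vector environment.

```python
def final_observation_info_is_consistent(infos, terminateds, truncateds):
    """Check the mask/None convention that the tests assert on.

    Hypothetical helper for illustration; it only reads the two keys used
    in the test file ("final_observation" and "_final_observation").
    """
    finals = infos["final_observation"]
    mask = infos["_final_observation"]
    # Both entries must have one slot per sub-environment.
    if not (len(finals) == len(mask) == len(terminateds) == len(truncateds)):
        return False
    for final, flagged, terminated, truncated in zip(
        finals, mask, terminateds, truncateds
    ):
        ended = bool(terminated) or bool(truncated)
        # The mask must be set exactly for the sub-environments that ended.
        if bool(flagged) != ended:
            return False
        # Sub-environments that are still running carry no final observation.
        if not ended and final is not None:
            return False
    return True


# Hand-built example: only the second of three sub-environments terminated.
sample_infos = {
    "final_observation": [None, [0.1, -0.2, 0.21, 0.3], None],
    "_final_observation": [False, True, False],
}
consistent = final_observation_info_is_consistent(
    sample_infos,
    terminateds=[False, True, False],
    truncateds=[False, False, False],
)
print(consistent)
```

Because the helper only iterates and indexes, it should accept the NumPy arrays that a vector environment returns as well as the plain lists used here.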

Implementation Analysis

The implementation uses pytest’s parametrize marker to run each test against several configurations. The first test builds its CartPole environments with gym.vector.make; the second wraps environments created by the make_env helper from tests.vector.utils in a SyncVectorEnv, so that it can check both individual and concurrent termination.
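
The concurrent-termination test relies on how its action list is built: sub-environments that should end together all receive the same action. The sketch below isolates that construction. The function name build_actions and its range check are illustrative additions; the list expression itself mirrors the one in the test file.

```python
def build_actions(num_envs, concurrent_ends):
    """Give the first `concurrent_ends` sub-environments action 0, the rest action 1."""
    if not 0 <= concurrent_ends <= num_envs:
        raise ValueError("concurrent_ends must be between 0 and num_envs")
    return [0] * concurrent_ends + [1] * (num_envs - concurrent_ends)


# The same three cases that the test is parametrized over.
for concurrent_ends in (1, 2, 3):
    print(concurrent_ends, build_actions(3, concurrent_ends))
```

Assuming the sub-environments start from identical seeded states, those given the same action on every step should follow the same trajectory and therefore end on the same step.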

Technical patterns include:
  • Parametrized test cases for different async modes
  • Controlled environment stepping with fixed seeds
  • Verification of numpy array types and structures
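
The fixed-seed stepping pattern above re-seeds the action space before every sample, so each step draws the same action. The standard-library analog below illustrates the effect. It uses random.Random as a stand-in for the action space's generator and is not Gym's sampling code; the function name sample_with_reseed is hypothetical.

```python
import random

SEED = 42


def sample_with_reseed(rng, steps, n_actions=2):
    """Draw `steps` actions, resetting the generator state before each draw."""
    draws = []
    for _ in range(steps):
        rng.seed(SEED)  # same starting state on every iteration
        draws.append(rng.randrange(n_actions))
    return draws


draws = sample_with_reseed(random.Random(), steps=5)
# Every draw is identical because the generator restarts from SEED each time.
print(len(set(draws)) == 1)
```

Seeding once before the loop would instead give a repeatable but varying action sequence. Re-seeding inside the loop trades that variety for a constant action.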

Technical Details

Testing infrastructure includes:
  • pytest for test organization and execution
  • NumPy for array validation
  • OpenAI Gym’s vector environment implementations
  • Custom environment creation utilities
  • Environment configuration: CartPole-v1, 3 environments, up to 50 steps, seed 42

Best Practices Demonstrated

The test suite exemplifies several testing best practices in the context of reinforcement learning environments.

Notable practices include:
  • Systematic state verification after environment steps
  • Explicit testing of edge cases in termination scenarios
  • Consistent seed management for reproducibility
  • Coverage of both synchronous and asynchronous modes through a single parametrized test

openai/gym

tests/vector/test_vector_env_info.py

            
import numpy as np
import pytest

import gym
from gym.vector.sync_vector_env import SyncVectorEnv
from tests.vector.utils import make_env

ENV_ID = "CartPole-v1"
NUM_ENVS = 3
ENV_STEPS = 50
SEED = 42


@pytest.mark.parametrize("asynchronous", [True, False])
def test_vector_env_info(asynchronous):
    env = gym.vector.make(
        ENV_ID, num_envs=NUM_ENVS, asynchronous=asynchronous, disable_env_checker=True
    )
    env.reset(seed=SEED)
    for _ in range(ENV_STEPS):
        # Re-seeding the action space before each sample makes every step
        # draw the same action, so the rollout is repeatable.
        env.action_space.seed(SEED)
        action = env.action_space.sample()
        _, _, terminateds, truncateds, infos = env.step(action)
        if any(terminateds) or any(truncateds):
            # Both the values and their mask have one slot per sub-environment.
            assert len(infos["final_observation"]) == NUM_ENVS
            assert len(infos["_final_observation"]) == NUM_ENVS

            assert isinstance(infos["final_observation"], np.ndarray)
            assert isinstance(infos["_final_observation"], np.ndarray)

            for i, (terminated, truncated) in enumerate(zip(terminateds, truncateds)):
                if terminated or truncated:
                    # The mask flags the sub-environments that just ended.
                    assert infos["_final_observation"][i]
                else:
                    # Sub-environments still running have no final observation.
                    assert not infos["_final_observation"][i]
                    assert infos["final_observation"][i] is None

    # Release the environment's resources (worker processes in asynchronous mode).
    env.close()


@pytest.mark.parametrize("concurrent_ends", [1, 2, 3])
def test_vector_env_info_concurrent_termination(concurrent_ends):
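    # Sub-environments that must end together receive the same action (0);
    # the remaining ones receive a different action (1).
    actions = [0] * concurrent_ends + [1] * (NUM_ENVS - concurrent_ends)
    # Every sub-environment is created with the same seed.
    envs = [make_env(ENV_ID, SEED) for _ in range(NUM_ENVS)]
    envs = SyncVectorEnv(envs)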

    for _ in range(ENV_STEPS):
        _, _, terminateds, truncateds, infos = envs.step(actions)
        if any(terminateds) or any(truncateds):
            for i, (terminated, truncated) in enumerate(zip(terminateds, truncateds)):
                if i < concurrent_ends:
                    assert terminated or truncated
                    assert infos["_final_observation"][i]
                else:
                    assert not infos["_final_observation"][i]
                    assert infos["final_observation"][i] is None
            return