
Implementing a Dynamic Test Discovery Pattern in the AutoGPT Benchmark Suite

This module implements automated discovery and loading of challenge classes for AGBenchmark using Pytest. At import time it loads both built-in and WebArena challenges, attaches each challenge class to the module so that Pytest can collect it, and records each challenge's primary category in a mapping.

Test Coverage Overview

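The module is the single entry point through which Pytest discovers challenge classes. It does not contain test logic of its own.
  • Handles both built-in and WebArena challenge types
  • Relies on challenge classes following Pytest's Test* naming pattern, as stated in the module docstring
  • Maps each challenge name to its primary category (the first entry in its category list)
  • Attaches each challenge class to the module under the class's own name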

Implementation Analysis

The implementation uses Python's importlib to obtain a reference to the current module and itertools.chain to iterate over both challenge sources in a single loop. Pytest then collects the attached classes through its normal discovery patterns, while the two challenge sources stay in separate loader functions.
  • Classes attached to the module with setattr
  • Both loaders consumed in one loop via chain
  • Category mapping built in the same loop
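
The attach step described above can be sketched in isolation. This is a minimal illustration, not the benchmark's own code: the loader functions, the class names, and the throwaway module created with types.ModuleType are all hypothetical stand-ins, used so the example runs without AGBenchmark installed.

```python
import types
from itertools import chain


def load_builtin():
    """Hypothetical loader: yields dynamically created challenge classes."""
    for name in ("TestReadFile", "TestWriteFile"):
        yield type(name, (), {"source": "builtin"})


def load_webarena():
    """Hypothetical second loader standing in for a separate challenge source."""
    yield type("TestWebArenaLogin", (), {"source": "webarena"})


def attach_classes(module, *loaders):
    """Attach every class yielded by the loaders to `module` under its own name."""
    for cls in chain(*loaders):
        setattr(module, cls.__name__, cls)
    return module


# A throwaway module object stands in for the real test module (assumption).
demo_module = attach_classes(
    types.ModuleType("demo_test"), load_builtin(), load_webarena()
)

# Names that a Test*-style class matcher would pick up from the module.
attached_names = sorted(n for n in vars(demo_module) if n.startswith("Test"))
```

Because the classes end up as ordinary module attributes, a collector that scans the module namespace sees them exactly as if they had been written out by hand.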

Technical Details

Key technical components include:
  • Python importlib for dynamic loading
  • Pytest for test discovery
  • A module-level logger created with logging.getLogger(__name__)
  • itertools.chain for iterating over both challenge sources in sequence
  • Global DATA_CATEGORY mapping
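
The category mapping can be illustrated with stand-in types. The sketch below assumes only what the listing shows: each challenge exposes a name and a list of categories whose first element has a value attribute. The Category enum, the ChallengeInfo dataclass, and the sample challenge names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Hypothetical categories; the real enum lives in AGBenchmark."""
    CODING = "coding"
    RETRIEVAL = "retrieval"
    WEB = "web"


@dataclass
class ChallengeInfo:
    """Hypothetical stand-in for a challenge's metadata object."""
    name: str
    category: list


def build_category_map(infos):
    """Map each challenge name to the value of its first (primary) category."""
    return {info.name: info.category[0].value for info in infos}


infos = [
    ChallengeInfo("TestReadFile", [Category.CODING, Category.RETRIEVAL]),
    ChallengeInfo("TestSearch", [Category.WEB]),
]
category_map = build_category_map(infos)
```

Note that indexing with [0] raises IndexError for a challenge whose category list is empty, so this pattern depends on every challenge declaring at least one category.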

Best Practices Demonstrated

The implementation showcases several testing best practices:
  • Clean separation of concerns between challenge sources
  • Iteration over both sources with chain, which avoids building a combined list first
  • A module-scoped logger named after the module
  • Maintainable category mapping structure
  • Conformance to Pytest naming conventions
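
The point about iterator chains can be made concrete with a small sketch. It assumes the loaders are generators, which the listing does not confirm; the source function and its labels are hypothetical. With generator inputs, chain pulls items one at a time and only moves to the second source once the first is exhausted.

```python
from itertools import chain

calls = []  # records the order in which items are actually produced


def source(label, items):
    """Hypothetical generator-based loader that logs each item as it is yielded."""
    for item in items:
        calls.append(f"{label}:{item}")
        yield item


combined = chain(source("a", [1, 2]), source("b", [3]))

produced_before = list(calls)       # snapshot before anything is consumed
first = next(combined)              # pulls exactly one item from the first source
produced_after_first = list(calls)  # snapshot after one item
rest = list(combined)               # drains the remainder of both sources
```

If a loader instead returns a fully built list, chain still works the same way from the caller's side, but the laziness benefit disappears for that source.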

significant-gravitas/autogpt

classic/benchmark/agbenchmark/generate_test.py

            
"""
AGBenchmark's test discovery endpoint for Pytest.

This module is picked up by Pytest's *_test.py file matching pattern, and all challenge
classes in the module that conform to the `Test*` pattern are collected.
"""

import importlib
import logging
from itertools import chain

from agbenchmark.challenges.builtin import load_builtin_challenges
from agbenchmark.challenges.webarena import load_webarena_challenges

logger = logging.getLogger(__name__)

DATA_CATEGORY = {}

# Load challenges and attach them to this module
for challenge in chain(load_builtin_challenges(), load_webarena_challenges()):
    # Attach the Challenge class to this module so it can be discovered by pytest
    module = importlib.import_module(__name__)
    setattr(module, challenge.__name__, challenge)

    # Build a map of challenge names and their primary category
    DATA_CATEGORY[challenge.info.name] = challenge.info.category[0].value