
Testing AI Chat Model Integration in gpt-engineer

This test suite validates the core AI interaction functionality in gpt-engineer, focusing on chat model responses and token usage tracking. It verifies correct initialization of the AI class, handling of first and follow-up messages, and monitoring of usage costs.

Test Coverage Overview

The test suite provides comprehensive coverage of the AI class functionality:
  • Tests AI initialization and first response generation
  • Validates subsequent message handling and responses
  • Verifies token usage tracking and cost calculations
  • Covers core interaction patterns with the chat model

Implementation Analysis

The implementation uses pytest’s monkeypatch fixture to isolate the tests from external dependencies. It employs langchain’s FakeListChatModel for predictable responses and follows the Arrange-Act-Assert pattern for a clear test structure; a standalone sketch of the fake model follows the list below.
  • Mocks chat model creation for controlled testing
  • Implements sequential response verification
  • Uses step-based interaction testing
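
For reference, FakeListChatModel simply replays a fixed list of canned responses in order, which is what makes the sequential assertions in the tests deterministic. A minimal standalone sketch of that behavior, independent of gpt-engineer:

from langchain_community.chat_models.fake import FakeListChatModel

fake = FakeListChatModel(responses=["response1", "response2"])

# Each invocation returns the next canned response, so assertions are deterministic.
assert fake.invoke("any prompt").content == "response1"
assert fake.invoke("another prompt").content == "response2"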

Technical Details

Key technical components include:
  • pytest testing framework
  • langchain’s BaseChatModel and langchain_community’s FakeListChatModel
  • Monkeypatch for dependency injection
  • Token usage tracking system (a toy sketch follows this list)
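
The tests only touch usage tracking through ai.token_usage_log.usage_cost(). As a rough illustration of the kind of accounting such a log performs, here is a toy sketch; the ToyTokenUsageLog class, its flat per-token rate, and the whitespace tokenization are all invented for this example and are not gpt-engineer’s actual implementation:

class ToyTokenUsageLog:
    # Hypothetical flat rate; real pricing varies by model and by
    # prompt vs. completion tokens.
    COST_PER_TOKEN = 0.00003

    def __init__(self):
        self._total_tokens = 0

    def update_log(self, prompt: str, answer: str) -> None:
        # Crude whitespace split stands in for a real tokenizer.
        self._total_tokens += len(prompt.split()) + len(answer.split())

    def usage_cost(self) -> float:
        return self._total_tokens * self.COST_PER_TOKEN


log = ToyTokenUsageLog()
log.update_log("system prompt user prompt", "response1")
assert log.usage_cost() > 0  # mirrors the first assertion in test_token_logging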

Best Practices Demonstrated

The test suite exemplifies several testing best practices:
  • Proper test isolation using dependency injection (a generic sketch follows this list)
  • Clear test case organization and naming
  • Comprehensive assertion coverage
  • Effective mocking of external dependencies
  • Progressive complexity in test scenarios
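
The isolation pattern generalizes beyond this suite. A generic sketch of dependency injection with monkeypatch, where the Service class and its fetch method are invented names for this illustration:

class Service:
    def fetch(self) -> str:
        raise RuntimeError("would hit the network")


def test_fetch_is_isolated(monkeypatch):
    # arrange: inject a stub so the test never touches the real dependency
    monkeypatch.setattr(Service, "fetch", lambda self: "stubbed")

    # act
    result = Service().fetch()

    # assert
    assert result == "stubbed"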

gpt-engineer-org/gpt-engineer

tests/core/test_ai.py

from langchain.chat_models.base import BaseChatModel
from langchain_community.chat_models.fake import FakeListChatModel

from gpt_engineer.core.ai import AI


def mock_create_chat_model(self) -> BaseChatModel:
    # Stand-in for AI._create_chat_model: replays canned responses
    # instead of calling a real LLM endpoint.
    return FakeListChatModel(responses=["response1", "response2", "response3"])


def test_start(monkeypatch):
    # arrange
    monkeypatch.setattr(AI, "_create_chat_model", mock_create_chat_model)

    ai = AI("gpt-4")

    # act
    response_messages = ai.start("system prompt", "user prompt", step_name="step name")

    # assert
    assert response_messages[-1].content == "response1"


def test_next(monkeypatch):
    # arrange
    monkeypatch.setattr(AI, "_create_chat_model", mock_create_chat_model)

    ai = AI("gpt-4")
    response_messages = ai.start("system prompt", "user prompt", step_name="step name")

    # act
    response_messages = ai.next(
        response_messages, "next user prompt", step_name="step name"
    )

    # assert
    assert response_messages[-1].content == "response2"


def test_token_logging(monkeypatch):
    # arrange
    monkeypatch.setattr(AI, "_create_chat_model", mock_create_chat_model)

    ai = AI("gpt-4")

    # act
    response_messages = ai.start("system prompt", "user prompt", step_name="step name")
    usage_cost_after_start = ai.token_usage_log.usage_cost()
    ai.next(response_messages, "next user prompt", step_name="step name")
    usage_cost_after_next = ai.token_usage_log.usage_cost()

    # assert
    assert usage_cost_after_start > 0
    assert usage_cost_after_next > usage_cost_after_start
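
Assuming the project’s test dependencies are installed, the suite can be run directly with pytest tests/core/test_ai.py from the repository root.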