
Testing CLI Learning Collection Implementation in gpt-engineer

This test suite validates the collect_learnings functionality in the GPT Engineer CLI module, focusing on data collection and analytics tracking. It ensures proper handling of model outputs, user feedback, and logging mechanisms.

Test Coverage Overview

The test suite provides comprehensive coverage of the collect_learnings function, validating:
  • Analytics tracking integration with Rudder
  • Model output processing and storage
  • File repository interactions
  • Data structure integrity for collected learnings

Implementation Analysis

The testing approach uses pytest’s monkeypatch fixture to mock external dependencies. It follows a structured pattern for validating complex data transformations and storage operations, with particular attention to JSON serialization and file system interactions. Note that in the source file shown below, the test body itself is currently commented out.
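The core mocking pattern can be sketched in isolation. This is a minimal, hypothetical example of the technique: `rudder_analytics` here is a fake namespace standing in for the real analytics module, and `send_learning_event` is an illustrative function, not the actual `collect_learnings`.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

# Hypothetical stand-in for the rudder_analytics module used by the real test.
rudder_analytics = SimpleNamespace(track=lambda **kwargs: None)

def send_learning_event(properties):
    # Code under test would report an event to the analytics client like this.
    rudder_analytics.track(event="learning", properties=properties)

# Mirrors monkeypatch.setattr(rudder_analytics, "track", MagicMock()):
# swap the real tracker for a mock so no network call is made.
rudder_analytics.track = MagicMock()

send_learning_event({"model": "test_model", "temperature": 0.5})

# The mock records every call, so the test can assert on count and arguments.
assert rudder_analytics.track.call_count == 1
assert rudder_analytics.track.call_args[1]["event"] == "learning"
```

In the real test, pytest's `monkeypatch.setattr` performs the same swap but also restores the original attribute automatically after the test finishes.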

Technical Details

Key technical components include:
  • pytest framework with monkeypatch fixture
  • FileRepositories for temporary storage
  • JSON data handling and validation
  • Mock objects for analytics tracking
  • File system operations testing
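The JSON handling the test exercises can be shown with a small sketch. The step name `"simple_gen"` and the dictionary shapes below mirror the commented-out test, but this snippet is a standalone illustration, not the real `dbs.logs` object.

```python
import json

# The test stores a serialized chat transcript keyed by step name.
code = "this is output\n\nit contains code"
logs = {"simple_gen": json.dumps([{"role": "system", "content": code}])}

# Round-trip: the stored value is valid JSON and preserves embedded newlines.
restored = json.loads(logs["simple_gen"])
assert restored[0]["content"] == code

# The assertion style used in the test: the JSON-escaped code string
# appears verbatim inside the serialized log entry.
assert json.dumps(code) in logs["simple_gen"]
```

Serializing with `json.dumps` before storage is what makes the later substring assertion (`json.dumps(code) in learnings.logs`) reliable, since both sides use the same escaping.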

Best Practices Demonstrated

The test implementation showcases several testing best practices:
  • Isolation of external dependencies through mocking
  • Comprehensive assertion checking
  • Clean setup and teardown patterns
  • Structured data validation
  • Clear separation of test concerns
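One assertion-checking pattern worth highlighting is how the test compares tracked properties while ignoring a volatile field. A minimal sketch, with made-up payload values:

```python
# Strip the nondeterministic "timestamp" key before comparing the payload
# actually sent to analytics against the independently computed expectation.
tracked = {"model": "test_model", "temperature": 0.5, "timestamp": "2024-01-01T00:00:00"}
expected = {"model": "test_model", "temperature": 0.5, "timestamp": "2024-01-02T12:00:00"}

a = {k: v for k, v in tracked.items() if k != "timestamp"}
b = {k: v for k, v in expected.items() if k != "timestamp"}
assert a == b
```

Filtering out fields that legitimately differ between runs keeps the equality assertion strict on everything that should be deterministic.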

gpt-engineer-org/gpt-engineer

tests/applications/cli/test_collect.py

"""
Tests the collect_learnings function in the cli/collect module.
"""

import pytest

# def test_collect_learnings(monkeypatch):
#     monkeypatch.setattr(rudder_analytics, "track", MagicMock())
#
#     model = "test_model"
#     temperature = 0.5
#     steps = [simple_gen]
#     dbs = FileRepositories(
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#         OnDiskRepository("/tmp"),
#     )
#     dbs.input = {
#         "prompt": "test prompt\nwith newlines",
#         "feedback": "test feedback",
#     }
#     code = "this is output\n\nit contains code"
#     dbs.logs = {steps[0].__name__: json.dumps([{"role": "system", "content": code}])}
#     dbs.memory = {"all_output.txt": "test workspace\n" + code}
#
#     collect_learnings(model, temperature, steps, dbs)
#
#     learnings = extract_learning(
#         model, temperature, steps, dbs, steps_file_hash=steps_file_hash()
#     )
#     assert rudder_analytics.track.call_count == 1
#     assert rudder_analytics.track.call_args[1]["event"] == "learning"
#     a = {
#         k: v
#         for k, v in rudder_analytics.track.call_args[1]["properties"].items()
#         if k != "timestamp"
#     }
#     b = {k: v for k, v in learnings.to_dict().items() if k != "timestamp"}
#     assert a == b
#
#     assert json.dumps(code) in learnings.logs
#     assert code in learnings.workspace


if __name__ == "__main__":
    pytest.main(["-v"])