Testing Automated Test Failure Analysis and Resolution in AlphaCodium

This module implements error analysis and automated test failure resolution in the AlphaCodium project. It sends the failing test's error output to the model, parses the YAML response, cleans the proposed fix, validates it with Python's ast module, and retries the whole step when an attempt raises an exception.

Test Coverage Overview

Coverage centers on the run_analyze_and_fix_test_failure coroutine, which processes the error string of a failed test and extracts a corrected solution from the model's response.

Key areas covered include:

YAML response parsing and validation
Code solution processing and cleaning
AST validation of generated solutions
Diff generation between previous and new solutions
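The cleaning step listed above can be illustrated with a small standalone helper. This is a minimal sketch, not the project's code: the name strip_code_fence is hypothetical, and unlike the listing below it computes the marker length instead of using fixed slice offsets.

```python
FENCE = "`" * 3  # the three-backtick marker, built here to keep the example readable


def strip_code_fence(text: str, lang: str = "python") -> str:
    """Remove a leading fence marker and trailing fence or quote debris."""
    # Drop trailing quotes, backticks, spaces and newlines first.
    text = text.rstrip("'` \n")
    marker = FENCE + lang
    if text.startswith(marker):
        # Drop the marker, then the newline that normally follows it.
        text = text[len(marker):].lstrip("\n")
    elif text.startswith(lang + "\n"):
        # Some responses keep only the bare language tag on the first line.
        text = text[len(lang):].lstrip("\n")
    return text


raw = FENCE + "python\nprint('hi')\n" + FENCE
cleaned = strip_code_fence(raw)
print(cleaned)
```

Because rstrip removes every trailing quote and backtick, a solution whose last character is a quote would lose it; the AST check that follows the cleaning step is what catches such damage.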

Implementation Analysis

The stage runs as an async coroutine inside a retry loop: each failed attempt is logged and retried, and the exception is re-raised after the third failure.

Key implementation patterns include:

Async/await pattern for test execution
YAML parsing for structured responses
AST validation for code correctness
Difflib integration for solution comparison
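The retry pattern can be sketched in isolation as follows. The names retry_async and flaky are hypothetical and exist only for illustration; the limit of three attempts mirrors the counter_retry > 2 check in the listing below.

```python
import asyncio


async def retry_async(make_call, max_attempts: int = 3):
    """Await make_call() until it succeeds or max_attempts failures occur."""
    attempt = 0
    while True:
        try:
            return await make_call()
        except Exception as exc:
            attempt += 1
            print(f"attempt {attempt} failed: {exc}")
            if attempt >= max_attempts:
                # Re-raise the last error once the retry budget is spent.
                raise


calls = {"count": 0}


async def flaky():
    """Stand-in for the model call: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = asyncio.run(retry_async(flaky))
print(result)
```

Passing a zero-argument callable rather than a coroutine object matters here, because a coroutine object can be awaited only once and each retry needs a fresh one.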

Technical Details

Testing infrastructure utilizes:

Python’s ast module for code validation
YAML for response parsing
difflib for code comparison
Custom logging configuration
Settings management through config_loader
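As an illustration of the difflib step, the sketch below builds a unified diff between two versions of a solution. The sample strings and the file labels "previous" and "fixed" are made up for the example.

```python
import difflib

prev_solution = "def add(a, b):\n    return a - b\n"
new_solution = "def add(a, b):\n    return a + b\n"

# unified_diff yields lines lazily; nothing is computed until it is consumed.
diff = difflib.unified_diff(
    prev_solution.splitlines(keepends=True),
    new_solution.splitlines(keepends=True),
    fromfile="previous",
    tofile="fixed",
)
patch = "".join(diff)
print(patch)
```

In the listing below the join into a patch string is commented out, so the generator is created but never consumed.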

Best Practices Demonstrated

The test implementation showcases robust error handling and validation practices.

Notable practices include:

Structured error handling with retries
Comprehensive logging
Clean code organization with separate prompt selection
Fallback mechanisms for code parsing
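The parsing fallback can be sketched as a standalone function. This is a simplified illustration under stated assumptions: parse_with_fallback is a hypothetical name, and it catches SyntaxError specifically, whereas the listing below uses a bare except.

```python
import ast
from typing import Optional


def parse_with_fallback(code: str) -> Optional[str]:
    """Return code that parses, retrying once without its last line."""
    try:
        ast.parse(code)
        return code
    except SyntaxError:
        pass
    # A truncated or stray final line is a common failure, so drop it and retry.
    trimmed = "\n".join(code.splitlines()[:-1]).rstrip("'` \n")
    try:
        ast.parse(trimmed)
        return trimmed
    except SyntaxError:
        return None


print(parse_with_fallback("x = 1\ny = ("))
print(parse_with_fallback("x = (\ny = ("))
```

Returning None for unrecoverable input lets the caller decide whether to log and give up, which is what the stage does when both parse attempts fail.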

codium-ai/alphacodium

alpha_codium/gen/stages/indirect/run_analyze_and_fix_test_failure.py

            
import ast
import difflib
import functools
import logging
import yaml
from alpha_codium.llm.ai_invoker import send_inference
from alpha_codium.log import get_logger
from alpha_codium.settings.config_loader import get_settings

logger = get_logger(__name__)


async def run_analyze_and_fix_test_failure(self, problem, error_str):
    counter_retry = 0
    while True:
        try:
            problem['error_str'] = error_str
            f = functools.partial(self._run, problem=problem, prompt=choose_prompt())
            response_analyze_failure, _ = await send_inference(f)
            problem['error_str'] = ''

            response_analyze_failure = response_analyze_failure.rstrip("'` \n")  # remove trailing spaces and newlines from yaml response
            if response_analyze_failure.startswith("```yaml"):
                response_analyze_failure = response_analyze_failure[8:]
            response_analyze_failure_yaml = yaml.safe_load(response_analyze_failure)
            problem['response_analyze_failure'] = response_analyze_failure
            code_recent_solution = response_analyze_failure_yaml['fixed_code'].rstrip("'` \n")

            # some cleaning
            if code_recent_solution.startswith("```python"):
                code_recent_solution = code_recent_solution[10:]
            elif code_recent_solution.startswith("python"):
                code_recent_solution = code_recent_solution[6:]
            try:
                ast.parse(code_recent_solution)
            except Exception:  # the solution does not parse as-is; retry without its last line
                code_recent_solution_fallback = '\n'.join(code_recent_solution.splitlines()[:-1]).rstrip("'` \n")
                try:
                    ast.parse(code_recent_solution_fallback)
                    code_recent_solution = code_recent_solution_fallback
                except Exception:  # the fallback does not parse either; give up on this response
                    logger.error(f"Invalid code:\n{code_recent_solution}")
                    return problem
            problem['code_recent_solution'] = code_recent_solution

            # diff patch
            diff = difflib.unified_diff(problem['code_prev_solution'].splitlines(keepends=True),
                                        problem['code_recent_solution'].splitlines(keepends=True))
            # patch = ''.join(diff)
            # if get_settings().solve.reduce_verbose:
            #     logger.debug(f"diff:\n{patch}")
            # else:
            #     logger.info(f"diff:\n{patch}")

            return problem
        except Exception as e:
            logger.error(f"'analyze_and_fix_test_failure' stage, counter_retry {counter_retry}, Error: {e}")
            counter_retry += 1
            if counter_retry > 2:
                raise

def choose_prompt():
    if get_settings().get("solve.use_direct_solutions", False):
        return "code_contests_prompt_analyze_and_fix_direct"
    else:
        return "code_contests_prompt_analyze_and_fix"