Back to Repositories

Validating Local File Ingestion System in Private-GPT

This test suite validates the local file ingestion functionality in Private-GPT, focusing on file handling, environment configuration, and ingestion controls. It ensures proper document processing and environment-based feature toggles.

Test Coverage Overview

The test suite provides comprehensive coverage of local file ingestion scenarios.

Key areas tested include:
  • File ingestion in allowed folders
  • Ingestion feature toggle functionality
  • Environment variable handling
  • Document count verification

Implementation Analysis

The implementation uses pytest fixtures and FastAPI TestClient for HTTP endpoint testing. The approach combines system-level file operations with API testing, utilizing subprocess for script execution and environment manipulation.

Key patterns include:
  • Fixture-based file path management
  • Environment variable control
  • Subprocess execution validation
  • Response assertion checks

Technical Details

Testing tools and configuration:
  • pytest for test framework
  • FastAPI TestClient for API testing
  • subprocess module for script execution
  • pathlib for file system operations
  • Environment variable configuration for feature toggles

Best Practices Demonstrated

The test suite exemplifies strong testing practices through proper isolation and cleanup.

Notable practices include:
  • Clean setup and teardown with fixture usage
  • Environment isolation for tests
  • Clear assertion messages
  • Modular helper functions
  • Comprehensive error handling

zylon-ai/private-gpt

tests/server/ingest/test_local_ingest.py

            
import os
import subprocess
from pathlib import Path

import pytest
from fastapi.testclient import TestClient


@pytest.fixture
def file_path() -> str:
    return "test.txt"


def create_test_file(file_path: str) -> None:
    with open(file_path, "w") as f:
        f.write("test")


def clear_log_file(log_file_path: str) -> None:
    if Path(log_file_path).exists():
        os.remove(log_file_path)


def read_log_file(log_file_path: str) -> str:
    with open(log_file_path) as f:
        return f.read()


def init_structure(folder: str, file_path: str) -> None:
    clear_log_file(file_path)
    os.makedirs(folder, exist_ok=True)
    create_test_file(f"{folder}/${file_path}")


def test_ingest_one_file_in_allowed_folder(
    file_path: str, test_client: TestClient
) -> None:
    allowed_folder = "local_data/tests/allowed_folder"
    init_structure(allowed_folder, file_path)

    test_env = os.environ.copy()
    test_env["PGPT_PROFILES"] = "test"
    test_env["LOCAL_INGESTION_ENABLED"] = "True"

    result = subprocess.run(
        ["python", "scripts/ingest_folder.py", allowed_folder],
        capture_output=True,
        text=True,
        env=test_env,
    )

    assert result.returncode == 0, f"Script failed with error: {result.stderr}"
    response_after = test_client.get("/v1/ingest/list")

    count_ingest_after = len(response_after.json()["data"])
    assert count_ingest_after > 0, "No documents were ingested"


def test_ingest_disabled(file_path: str) -> None:
    allowed_folder = "local_data/tests/allowed_folder"
    init_structure(allowed_folder, file_path)

    test_env = os.environ.copy()
    test_env["PGPT_PROFILES"] = "test"
    test_env["LOCAL_INGESTION_ENABLED"] = "False"

    result = subprocess.run(
        ["python", "scripts/ingest_folder.py", allowed_folder],
        capture_output=True,
        text=True,
        env=test_env,
    )

    assert result.returncode != 0, f"Script failed with error: {result.stderr}"