
Validating Embedding Model Resolution in LlamaIndex

This test suite validates embedding model resolution in LlamaIndex, verifying that resolve_embed_model correctly instantiates or passes through the different embedding model types: MockEmbedding, HuggingFaceEmbedding, and OpenAIEmbedding.

Test Coverage Overview

The test suite provides comprehensive coverage of the embedding model resolution logic.

Key areas tested include:
  • Null input handling with MockEmbedding fallback
  • String-based model initialization for HuggingFace
  • Direct instance passing for both HuggingFace and OpenAI embeddings
  • Edge case handling for different input types
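The resolution rules listed above can be sketched in isolation. This is a hypothetical stand-in, not the real llama_index implementation: the classes and the `"local:<model>"` string convention are illustrative assumptions mirroring what the tests exercise.

```python
# Hypothetical sketch of the resolution logic under test; MockEmbedding and
# HuggingFaceEmbedding here are stand-ins for the real llama_index types.
class MockEmbedding:
    """Fallback returned when no model is given."""

class HuggingFaceEmbedding:
    """Stand-in for a locally loaded HuggingFace model."""
    def __init__(self, model_name: str = "default") -> None:
        self.model_name = model_name

def resolve_embed_model(embed_model=None):
    """Mirror the resolution rules exercised by the test suite."""
    if embed_model is None:
        return MockEmbedding()          # null input -> mock fallback
    if isinstance(embed_model, str):
        # assume "local" (optionally "local:<model>") selects a HF model
        _, _, name = embed_model.partition(":")
        return HuggingFaceEmbedding(model_name=name or "default")
    return embed_model                  # instances pass through unchanged

print(type(resolve_embed_model(None)).__name__)     # MockEmbedding
print(type(resolve_embed_model("local")).__name__)  # HuggingFaceEmbedding
```

The isinstance checks in the real tests assert exactly these three outcomes: mock fallback, string-based construction, and pass-through of existing instances.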

Implementation Analysis

The implementation uses pytest’s MonkeyPatch to mock embedding model initializations, enabling isolated testing of the resolution logic. The testing approach employs mock implementations of HuggingFaceEmbedding and OpenAIEmbedding constructors to avoid actual model loading.

Key patterns include:
  • Dependency injection through monkeypatching
  • Type verification using isinstance checks
  • Systematic test case organization

Technical Details

Testing tools and configuration:
  • pytest framework with MonkeyPatch fixture
  • Mock implementations for embedding constructors
  • Type hints for improved code clarity
  • Custom mock functions for HuggingFace and OpenAI embeddings

Best Practices Demonstrated

The test suite exemplifies several testing best practices in Python.

Notable practices include:
  • Isolation of external dependencies
  • Clear test case organization
  • Comprehensive type checking
  • Efficient mock implementations
  • Proper use of pytest fixtures
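Dependency isolation with automatic cleanup can also be shown with the standard library's `unittest.mock`, a stdlib analogue of the monkeypatch fixture used in the suite; `NetworkClient` is a hypothetical class introduced purely for illustration.

```python
# Isolating an external dependency with unittest.mock.patch.object,
# which restores the original attribute on context exit (the same
# guarantee pytest's monkeypatch fixture provides at test teardown).
from unittest.mock import patch

class NetworkClient:
    def fetch(self, url: str) -> str:
        raise RuntimeError("real network call")

client = NetworkClient()
with patch.object(NetworkClient, "fetch", return_value="cached"):
    # Inside the context the external dependency is stubbed out...
    print(client.fetch("https://example.com"))  # cached
# ...and automatically restored afterwards.
```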

run-llama/llama_index

llama-index-core/tests/embeddings/todo_hf_test_utils.py
from typing import Any, Dict

# pants: no-infer-dep
from llama_index.core.embeddings.mock_embed_model import MockEmbedding
from llama_index.core.embeddings.utils import resolve_embed_model
from llama_index.embeddings.huggingface import (
    HuggingFaceEmbedding,
)  # pants: no-infer-dep
from llama_index.embeddings.openai import OpenAIEmbedding  # pants: no-infer-dep
from pytest import MonkeyPatch


def mock_hf_embeddings(self: Any, *args: Any, **kwargs: Any) -> None:
    """Mock HuggingFaceEmbeddings."""
    super(HuggingFaceEmbedding, self).__init__(
        model_name="fake",
        tokenizer_name="fake",
        model="fake",
        tokenizer="fake",
    )
    return


def mock_openai_embeddings(self: Any, *args: Any, **kwargs: Any) -> None:
    """Mock OpenAIEmbedding."""
    super(OpenAIEmbedding, self).__init__(
        api_key="fake", api_base="fake", api_version="fake"
    )
    return


def test_resolve_embed_model(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setattr(
        "llama_index.embeddings.huggingface.HuggingFaceEmbedding.__init__",
        mock_hf_embeddings,
    )
    monkeypatch.setattr(
        "llama_index.embeddings.openai.OpenAIEmbedding.__init__",
        mock_openai_embeddings,
    )

    # Test None
    embed_model = resolve_embed_model(None)
    assert isinstance(embed_model, MockEmbedding)

    # Test str
    embed_model = resolve_embed_model("local")
    assert isinstance(embed_model, HuggingFaceEmbedding)

    # Test BaseEmbedding instance (HuggingFace)
    embed_model = resolve_embed_model(HuggingFaceEmbedding())
    assert isinstance(embed_model, HuggingFaceEmbedding)

    # Test BaseEmbedding
    embed_model = resolve_embed_model(OpenAIEmbedding())
    assert isinstance(embed_model, OpenAIEmbedding)