
Validating Formula Recognition Components in PaddleOCR

This test suite validates the UniMERNet model implementation in PaddleOCR, focusing on both the backbone and head components. It checks that the Donut-Swin backbone encodes formula images into features of the expected shape and that the head generates token sequences from those features.

Test Coverage Overview

The test suite covers the UniMERNet model’s core components:
  • Backbone testing with DonutSwinModel configuration and output validation
  • Head component testing with UniMERNetHead for text generation
  • Shape verification for encoder features and output tensors
  • Integration testing between backbone and head components (see the chained sketch after this list)
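
The file below exercises the two components in isolation; a minimal chained sketch of the integration (reusing the constructor arguments and tensor shapes from the tests below, and wrapping the backbone features in DonutSwinModelOutput the way the head fixture does) could look like this:

def test_unimernet_backbone_head_integration():
    backbone = DonutSwinModel(
        hidden_size=1024,
        num_layers=4,
        num_heads=[4, 8, 16, 32],
        add_pooling_layer=True,
        use_mask_token=False,
    )
    head = UniMERNetHead(
        max_new_tokens=5,
        decoder_start_token_id=0,
        temperature=0.2,
        do_sample=False,
        top_p=0.95,
        encoder_hidden_size=1024,
        is_export=False,
        length_aware=True,
    )
    backbone.eval()
    head.eval()
    with paddle.no_grad():
        # Encoder features of shape [1, 126, 1024], as asserted in the backbone test.
        features = backbone(paddle.randn([1, 1, 192, 672]))[0]
        tokens = head(DonutSwinModelOutput(last_hidden_state=features))
        # Mirrors the [1, 6] shape asserted in the head test below.
        assert tokens.shape == [1, 6]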

Implementation Analysis

The tests use pytest fixtures for test setup and dependency injection. Each test puts its model into evaluation mode and runs the forward pass inside a paddle.no_grad() context, which avoids building gradient graphs and keeps validation memory-efficient.

Key patterns include separate validation of the backbone and head components, explicit shape assertions on their outputs, and fully specified model configurations.
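
A stripped-down sketch of this pattern (the fixture name and the toy paddle.nn.Linear layer here are illustrative, not part of the test file) looks like this:

import paddle
import pytest


@pytest.fixture
def dummy_input():
    # Reusable random tensor injected into any test that names this fixture.
    return paddle.randn([1, 8])


def test_layer_output_shape(dummy_input):
    layer = paddle.nn.Linear(8, 4)  # fully specified, minimal configuration
    layer.eval()                    # inference-time behavior
    with paddle.no_grad():          # no gradient graph is built
        out = layer(dummy_input)
    assert out.shape == [1, 4]      # explicit shape assertion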

Technical Details

Testing infrastructure includes:
  • Paddle framework for tensor operations
  • Pytest for test organization and execution
  • Custom fixtures for sample image and encoder feature generation
  • DonutSwinModel backbone with configurable parameters
  • UniMERNetHead for sequence generation with customizable settings (see the parametrized sketch after this list)
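
Because both components take their settings as constructor arguments, the same checks can be swept over several configurations with pytest.mark.parametrize. A hedged sketch, which assumes the output length is always the decoder start token plus max_new_tokens generated tokens (as the fixed [1, 6] assertion in the head test below suggests):

@pytest.mark.parametrize("max_new_tokens", [3, 5, 8])
def test_unimernet_head_token_budget(encoder_feat, max_new_tokens):
    head = UniMERNetHead(
        max_new_tokens=max_new_tokens,
        decoder_start_token_id=0,
        temperature=0.2,
        do_sample=False,
        top_p=0.95,
        encoder_hidden_size=1024,
        is_export=False,
        length_aware=True,
    )
    head.eval()
    with paddle.no_grad():
        result = head(encoder_feat)
    # Assumed relation: one start token plus max_new_tokens generated tokens.
    assert result.shape == [1, max_new_tokens + 1]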

Best Practices Demonstrated

The test suite exemplifies several testing best practices:
  • Isolation of component testing for better maintainability
  • Use of fixtures for reusable test data
  • Explicit shape assertions for validation
  • Proper model evaluation mode setting
  • Memory-efficient testing with no_grad contexts
  • Clear documentation of test purposes and parameters

paddlepaddle/paddleocr

tests/test_formula_model.py

import sys
import os
from pathlib import Path
from typing import Any

import paddle
import pytest

current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(current_dir, "..")))
from ppocr.modeling.backbones.rec_donut_swin import DonutSwinModel, DonutSwinModelOutput
from ppocr.modeling.heads.rec_unimernet_head import UniMERNetHead


@pytest.fixture
def sample_image():
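    # Random single-channel formula image batch: [batch, channels, height, width].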
    return paddle.randn([1, 1, 192, 672])


@pytest.fixture
def encoder_feat():
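    # Random encoder features with the [1, 126, 1024] shape produced by the backbone.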
    encoded_feat = paddle.randn([1, 126, 1024])
    return DonutSwinModelOutput(
        last_hidden_state=encoded_feat,
    )


def test_unimernet_backbone(sample_image):
    """
    Test UniMERNet backbone.

    Args:
        sample_image: sample image to be processed.
    """
    backbone = DonutSwinModel(
        hidden_size=1024,
        num_layers=4,
        num_heads=[4, 8, 16, 32],
        add_pooling_layer=True,
        use_mask_token=False,
    )
    backbone.eval()
    with paddle.no_grad():
        result = backbone(sample_image)
        encoder_feat = result[0]
        assert encoder_feat.shape == [1, 126, 1024]


def test_unimernet_head(encoder_feat):
    """
    Test UniMERNet head.

    Args:
        encoder_feat: encoder feature from unimernet backbone.
    """
    head = UniMERNetHead(
        max_new_tokens=5,
        decoder_start_token_id=0,
        temperature=0.2,
        do_sample=False,
        top_p=0.95,
        encoder_hidden_size=1024,
        is_export=False,
        length_aware=True,
    )

    head.eval()
    with paddle.no_grad():
        result = head(encoder_feat)
        assert result.shape == [1, 6]