
Validating Formula Recognition Components in PaddleOCR

This test suite validates the UniMERNet formula recognition model in PaddleOCR, covering the backbone and the head separately. It checks that the Donut-Swin backbone turns a formula image into encoder features of the expected shape, and that the head generates a token sequence of the expected length from such features.

Test Coverage Overview

The test suite covers the following aspects of the UniMERNet model’s core components:

Backbone testing with DonutSwinModel configuration and output validation
Head component testing with UniMERNetHead for text generation
Shape verification for encoder features and output tensors
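Shape compatibility between backbone and head: the head is tested on a synthetic encoder feature with the same shape as the backbone output, not on a live backbone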
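The expected encoder shape [1, 126, 1024] for a 192 x 672 input can be sanity-checked by hand. The following is a minimal sketch, assuming a Swin-style layout with a patch size of 4 and a 2x patch-merging step after every stage except the last. Those values and the helper name encoder_seq_len are assumptions for illustration and are not read from the test file.

```python
def encoder_seq_len(height: int, width: int,
                    patch_size: int = 4, num_stages: int = 4) -> int:
    """Token count a Swin-style encoder would emit for one image.

    Assumes patch embedding divides each side by `patch_size` and that
    every stage except the last halves each side again (patch merging).
    """
    total_stride = patch_size * 2 ** (num_stages - 1)  # 4 * 2**3 = 32
    return (height // total_stride) * (width // total_stride)


# Image size used by the sample_image fixture: 192 x 672 pixels.
print(encoder_seq_len(192, 672))  # 6 * 21 = 126
```

Under these assumptions the 192 x 672 image reduces to a 6 x 21 grid, which matches the sequence length of 126 asserted in the backbone test.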

Implementation Analysis

The tests use pytest fixtures to build their inputs, which pytest injects into each test function by argument name. Each model is switched to evaluation mode with eval(), and its forward pass runs inside a paddle.no_grad() context so that no gradients are tracked, which keeps memory use low.

Key patterns include separate validation of the backbone and head, explicit shape assertions, and models built from explicitly specified configurations (for example, max_new_tokens=5 to keep generation short).
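Because the two tests are independent, the shape produced by the backbone test and the shape fed to the head test are linked only by matching literals. One way to make that link explicit is sketched below. The constant names and the check_shape helper are hypothetical and do not exist in PaddleOCR, and the "+ 1" for the decoder start token is an assumption inferred from the test asserting a length of 6 with max_new_tokens=5.

```python
# Hypothetical shared constants; all names here are illustrative only.
BATCH_SIZE = 1
ENCODER_SEQ_LEN = 126
ENCODER_HIDDEN_SIZE = 1024
MAX_NEW_TOKENS = 5

# Shape asserted by the backbone test and produced by the encoder_feat fixture.
ENCODER_FEAT_SHAPE = [BATCH_SIZE, ENCODER_SEQ_LEN, ENCODER_HIDDEN_SIZE]
# Assumption: one start token followed by MAX_NEW_TOKENS generated tokens.
HEAD_OUTPUT_SHAPE = [BATCH_SIZE, MAX_NEW_TOKENS + 1]


def check_shape(name, actual, expected):
    """Assert that two shapes match, with a readable failure message."""
    actual, expected = list(actual), list(expected)
    assert actual == expected, f"{name}: expected {expected}, got {actual}"


# A shape is passed as a plain list here, as in the tests' own assertions.
check_shape("encoder_feat", [1, 126, 1024], ENCODER_FEAT_SHAPE)
check_shape("head_output", [1, 6], HEAD_OUTPUT_SHAPE)
```

With shared constants, changing the image size or token budget in one place would update both the fixture and the assertions that depend on it.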

Technical Details

Testing infrastructure includes:

Paddle framework for tensor operations
Pytest for test organization and execution
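Custom fixtures that generate a random sample image of shape [1, 1, 192, 672] and a random encoder feature of shape [1, 126, 1024]
DonutSwinModel backbone configured through hidden_size, num_layers, num_heads, add_pooling_layer, and use_mask_token
UniMERNetHead for sequence generation, configured through settings such as max_new_tokens, decoder_start_token_id, do_sample, and encoder_hidden_size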

Best Practices Demonstrated

The test suite exemplifies several testing best practices:

Isolation of component testing for better maintainability
Use of fixtures for reusable test data
Explicit shape assertions for validation
Switching models to evaluation mode with eval() before inference
Memory-efficient testing with no_grad contexts
Docstrings that state each test's purpose and arguments

paddlepaddle/paddleocr

tests/test_formula_model.py

            
import os
import sys

import paddle
import pytest

# Add the repository root to sys.path so that `ppocr` can be imported when the
# tests are run from the tests/ directory.
current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(current_dir, "..")))
from ppocr.modeling.backbones.rec_donut_swin import DonutSwinModel, DonutSwinModelOutput
from ppocr.modeling.heads.rec_unimernet_head import UniMERNetHead


@pytest.fixture
def sample_image():
    # Random single-channel image batch: [batch, channels, height, width].
    return paddle.randn([1, 1, 192, 672])


@pytest.fixture
def encoder_feat():
    # Random stand-in for the backbone output: [batch, sequence length, hidden size].
    encoded_feat = paddle.randn([1, 126, 1024])
    return DonutSwinModelOutput(
        last_hidden_state=encoded_feat,
    )


def test_unimernet_backbone(sample_image):
    """
    Test UniMERNet backbone.

    Args:
        sample_image: sample image to be processed.
    """
    backbone = DonutSwinModel(
        hidden_size=1024,
        num_layers=4,
        num_heads=[4, 8, 16, 32],
        add_pooling_layer=True,
        use_mask_token=False,
    )
    backbone.eval()
    with paddle.no_grad():
        result = backbone(sample_image)
        # The first element of the output is taken as the encoder feature.
        encoder_feat = result[0]
        assert encoder_feat.shape == [1, 126, 1024]


def test_unimernet_head(encoder_feat):
    """
    Test UniMERNet head.

    Args:
        encoder_feat: encoder feature from unimernet backbone.
    """
    head = UniMERNetHead(
        max_new_tokens=5,
        decoder_start_token_id=0,
        temperature=0.2,
        do_sample=False,
        top_p=0.95,
        encoder_hidden_size=1024,
        is_export=False,
        length_aware=True,
    )

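    head.eval()
    with paddle.no_grad():
        result = head(encoder_feat)
        # Expected length is 6 for max_new_tokens=5; the extra position is
        # presumably the decoder start token.
        assert result.shape == [1, 6]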