
Testing Model Accuracy Metrics Implementation in PaddleOCR

This test module implements metric calculation for PaddleOCR’s model evaluation, focusing on top-1 and top-k accuracy. It also covers averaging these metrics across multiple cards during distributed evaluation.

Test Coverage Overview

The test suite covers accuracy metric calculations for deep learning model evaluation.

  • Tests top-1 and top-k accuracy computations (a minimal sketch follows this list)
  • Validates softmax output processing
  • Handles distributed evaluation scenarios
  • Supports variable class numbers and prediction thresholds
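
As a quick illustration of the first two items above, a smoke test along these lines exercises the top-1/top-k path on dummy data; the shapes, seed, and function name are illustrative and not taken from the actual test file.

import paddle
import paddle.nn.functional as F


def test_topk_accuracy_smoke():
    paddle.seed(0)
    batch_size, classes_num, topk = 8, 10, 5
    logits = paddle.randn([batch_size, classes_num])
    label = paddle.randint(0, classes_num, shape=[batch_size, 1], dtype="int64")

    # softmax normalizes logits into probabilities before accuracy is computed
    probs = F.softmax(logits)
    top1 = paddle.metric.accuracy(probs, label=label, k=1)
    topk_acc = paddle.metric.accuracy(probs, label=label, k=min(topk, classes_num))

    # both metrics are scalar tensors in [0, 1]; top-k is never below top-1
    assert 0.0 <= float(top1) <= 1.0
    assert float(top1) <= float(topk_acc)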

Implementation Analysis

The implementation uses PaddlePaddle’s metric utilities for accuracy calculations.

Key patterns include:
  • Flexible topk parameter handling
  • Distributed computation support using all_reduce operations (sketched after this list)
  • Ordered dictionary result storage
  • Softmax activation processing
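
The all_reduce pattern mentioned above can be sketched in isolation. This is only an illustration of the averaging idea, with a hypothetical helper name; all_reduce is used as an in-place sum across cards, followed by division by the world size.

import paddle
import paddle.distributed as dist


def average_across_cards(metric):
    # hypothetical helper: average a per-card scalar metric over all cards
    world_size = dist.get_world_size()
    if world_size > 1:
        # all_reduce sums the tensor in place across all ranks
        dist.all_reduce(metric, op=dist.ReduceOp.SUM)
        metric = metric / world_size
    return metric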

Technical Details

Testing infrastructure includes:

  • PaddlePaddle metric APIs
  • Distributed evaluation tools
  • OrderedDict for results management
  • F.softmax for probability normalization (illustrated after this list)
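
The last two items can be illustrated with made-up values: F.softmax turns each row of logits into a probability distribution, and OrderedDict preserves the order in which metrics are registered.

import paddle
import paddle.nn.functional as F
from collections import OrderedDict

logits = paddle.to_tensor([[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]])
probs = F.softmax(logits)
print(probs.sum(axis=1))    # each row sums to ~1.0

fetchs = OrderedDict()
fetchs["top1"] = paddle.to_tensor([0.5])
fetchs["top5"] = paddle.to_tensor([0.9])
print(list(fetchs.keys()))  # ['top1', 'top5'], insertion order preserved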

Best Practices Demonstrated

The test implementation showcases robust evaluation practices.

  • Proper handling of distributed environments
  • Flexible parameter configuration (see the clamping example after this list)
  • Clear separation of computation steps
  • Efficient result aggregation
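
One concrete outcome of the flexible parameter configuration is that the requested top-k is clamped to the class count, which also determines the result key name. The helper below is a tiny stand-in of mine that mirrors the k = min(topk, classes_num) line in the source file shown next.

def clamp_topk(topk, classes_num):
    # the requested k can never exceed the number of classes
    k = min(topk, classes_num)
    return k, "top{}".format(k)


assert clamp_topk(5, 1000) == (5, "top5")
assert clamp_topk(5, 3) == (3, "top3")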

paddlepaddle/paddleocr

test_tipc/supplementary/metric.py

import paddle
import paddle.nn.functional as F
from collections import OrderedDict


def create_metric(
    out,
    label,
    architecture=None,
    topk=5,
    classes_num=1000,
    use_distillation=False,
    mode="train",
):
    """
    Create measures of model accuracy, such as top1 and top5

    Args:
        out(variable): model output variable
        label(variable): ground-truth label variable
        architecture(dict): model architecture config (only referenced by the
            commented-out GoogLeNet / distillation branch below)
        topk(int): usually top5
        classes_num(int): num of classes
        use_distillation(bool): whether to use distillation training
        mode(str): mode, train/valid

    Returns:
        fetchs(dict): dict of measures
    """
    # if architecture["name"] == "GoogLeNet":
    #     assert len(out) == 3, "GoogLeNet should have 3 outputs"
    #     out = out[0]
    # else:
    #     # just need student label to get metrics
    #     if use_distillation:
    #         out = out[1]
    softmax_out = F.softmax(out)

    fetchs = OrderedDict()
    # set top1 to fetchs
    top1 = paddle.metric.accuracy(softmax_out, label=label, k=1)
    # set topk to fetchs
    k = min(topk, classes_num)
    topk = paddle.metric.accuracy(softmax_out, label=label, k=k)

    # multi cards' eval: average the per-card metrics across all cards
    if mode != "train" and paddle.distributed.get_world_size() > 1:
        # all_reduce sums the metric tensors in place across cards; dividing
        # by the world size afterwards yields the global mean
        paddle.distributed.all_reduce(top1, op=paddle.distributed.ReduceOp.SUM)
        paddle.distributed.all_reduce(topk, op=paddle.distributed.ReduceOp.SUM)
        top1 = top1 / paddle.distributed.get_world_size()
        topk = topk / paddle.distributed.get_world_size()

    fetchs["top1"] = top1
    topk_name = "top{}".format(k)
    fetchs[topk_name] = topk

    return fetchs
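
A hypothetical single-card usage of create_metric, with random logits and labels standing in for real model outputs (not part of the original file):

import paddle

logits = paddle.randn([16, 1000])
labels = paddle.randint(0, 1000, shape=[16, 1], dtype="int64")
metrics = create_metric(logits, labels, topk=5, classes_num=1000, mode="valid")
for name, value in metrics.items():
    print("{}: {:.4f}".format(name, float(value)))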