Testing Parallel WaveGAN Vocoder Training Pipeline in Coqui-AI TTS

This test suite validates the training functionality of the Parallel WaveGAN vocoder model in the Coqui-AI TTS system. It covers model initialization, training execution, and checkpoint restoration capabilities.

Test Coverage Overview

The test suite provides comprehensive coverage of the Parallel WaveGAN vocoder training pipeline.

Key areas tested include:
  • Configuration initialization and validation
  • Single-epoch training execution
  • Model checkpoint management
  • Training restoration from saved checkpoints

Integration points cover data loading, model training, and checkpoint handling mechanisms.

Implementation Analysis

The tests validate the training workflow end to end through the CLI interface, using a two-phase strategy: an initial training run, followed by a second run restored from the saved checkpoint.

Technical patterns include:
  • Dynamic device selection for CUDA compatibility
  • Configuration serialization and management
  • Automated checkpoint discovery and handling
  • Resource cleanup after test completion
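The dynamic device selection above lets the same test run on CUDA and CPU-only machines. A minimal sketch of the idea, assuming a hypothetical helper `pick_device_id` (the repository's real helper is `get_device_id` in its `tests` package):

```python
import os

def pick_device_id() -> str:
    """Return a CUDA device index as a string, or "" to force CPU.

    Hypothetical stand-in for the repository's `get_device_id` helper:
    an empty CUDA_VISIBLE_DEVICES hides all GPUs from the process.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "0"
    except ImportError:
        pass
    return ""

# The returned value plugs into the environment of the CLI training command:
env = {**os.environ, "CUDA_VISIBLE_DEVICES": pick_device_id()}
```

Selecting the device via an environment variable, rather than a flag, keeps the training script itself device-agnostic.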

Technical Details

Testing tools and configuration:
  • Custom ParallelWaveganConfig for model parameters
  • CLI-based training execution
  • LJSpeech dataset for training data
  • CUDA device management
  • Automated output path handling
  • Dynamic checkpoint management
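The dynamic checkpoint management listed above boils down to a glob-and-mtime pattern: each training run writes into a timestamped folder, and the most recently modified one is the run to continue. A self-contained sketch (the helper name `latest_run_dir` is illustrative, not from the repository):

```python
import glob
import os
import tempfile
import time

def latest_run_dir(output_path: str) -> str:
    """Return the most recently modified run folder under output_path.

    The trailing "*/" in the glob pattern matches directories only.
    """
    candidates = glob.glob(os.path.join(output_path, "*/"))
    if not candidates:
        raise FileNotFoundError(f"no run folders in {output_path}")
    return max(candidates, key=os.path.getmtime)

# Demo: two fake run folders with explicitly ordered mtimes
root = tempfile.mkdtemp()
run_a = os.path.join(root, "run_a")
run_b = os.path.join(root, "run_b")
os.makedirs(run_a)
os.makedirs(run_b)
now = time.time()
os.utime(run_a, (now - 100, now - 100))  # older run
os.utime(run_b, (now, now))              # newer run
found = latest_run_dir(root)
```

Setting the mtimes explicitly with `os.utime` keeps the demo deterministic regardless of filesystem timestamp resolution.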

Best Practices Demonstrated

The test implementation showcases several testing best practices for deep learning systems.

Notable practices include:
  • Isolated test environment with controlled parameters
  • Deterministic configuration management
  • Proper resource cleanup
  • Comprehensive workflow validation
  • Efficient test data handling
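The cleanup practice deserves one caveat: the test calls `shutil.rmtree` unconditionally at the end, so a failed training run leaves its outputs behind. A hedged sketch of a `try`/`finally` variant (`run_with_cleanup` and `train_fn` are hypothetical names, not part of the repository):

```python
import os
import shutil
import tempfile

def run_with_cleanup(train_fn, output_path: str) -> None:
    """Run a training callable, then remove its output folder even on failure.

    `train_fn` stands in for any callable that writes into output_path;
    the finally-block guarantees cleanup whether or not it raises.
    """
    try:
        train_fn(output_path)
    finally:
        if os.path.isdir(output_path):
            shutil.rmtree(output_path)

# Demo: the fake "training" step writes one artifact, then cleanup removes it
out = tempfile.mkdtemp()
run_with_cleanup(lambda p: open(os.path.join(p, "ckpt.pth"), "w").close(), out)
```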

coqui-ai/tts

tests/vocoder_tests/test_parallel_wavegan_train.py

import glob
import os
import shutil

from tests import get_device_id, get_tests_output_path, run_cli
from TTS.vocoder.configs import ParallelWaveganConfig

config_path = os.path.join(get_tests_output_path(), "test_vocoder_config.json")
output_path = os.path.join(get_tests_output_path(), "train_outputs")

config = ParallelWaveganConfig(
    batch_size=4,
    eval_batch_size=4,
    num_loader_workers=0,
    num_eval_loader_workers=0,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1,
    seq_len=2048,
    eval_split_size=1,
    print_step=1,
    print_eval=True,
    data_path="tests/data/ljspeech",
    output_path=output_path,
)
config.audio.do_trim_silence = True
config.audio.trim_db = 60
config.save_json(config_path)

# train the model for one epoch
command_train = f"CUDA_VISIBLE_DEVICES='{get_device_id()}' python TTS/bin/train_vocoder.py --config_path {config_path} "
run_cli(command_train)

# find the most recent training run folder
continue_path = max(glob.glob(os.path.join(output_path, "*/")), key=os.path.getmtime)

# restore the model and continue training for one more epoch
command_train = (
    f"CUDA_VISIBLE_DEVICES='{get_device_id()}' python TTS/bin/train_vocoder.py --continue_path {continue_path} "
)
run_cli(command_train)

# clean up training artifacts
shutil.rmtree(continue_path)