Testing Parallel WaveGAN Vocoder Training Pipeline in Coqui-AI TTS

This test suite validates the training functionality of the Parallel WaveGAN vocoder model in the Coqui-AI TTS system. It covers model initialization, training execution, and checkpoint restoration capabilities.

Test Coverage Overview

The test suite provides comprehensive coverage of the Parallel WaveGAN vocoder training pipeline.

Key areas tested include:
  • Configuration initialization and validation
  • Single-epoch training execution
  • Model checkpoint management
  • Training restoration from saved checkpoints

Integration points cover data loading, model training, and checkpoint handling mechanisms.

Implementation Analysis

The tests validate the training workflow end to end through the CLI interface, using a two-phase strategy: an initial training run, followed by a second run restored from the saved checkpoint.

Technical patterns include:
  • Dynamic device selection for CUDA compatibility
  • Configuration serialization and management
  • Automated checkpoint discovery and handling
  • Resource cleanup after test completion
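The dynamic device selection above lets the same test run on CUDA and CPU-only machines. A minimal sketch of the idea, assuming a hypothetical helper `pick_device_id` (the repository's real helper is `get_device_id` in its `tests` package):

```python
import os

def pick_device_id() -> str:
    """Return a CUDA device index as a string, or "" to force CPU.

    Hypothetical stand-in for the repository's `get_device_id` helper:
    an empty CUDA_VISIBLE_DEVICES hides all GPUs from the process.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "0"
    except ImportError:
        pass
    return ""

# The returned value plugs into the environment of the CLI training command:
env = {**os.environ, "CUDA_VISIBLE_DEVICES": pick_device_id()}
```

Selecting the device via an environment variable, rather than a flag, keeps the training script itself device-agnostic.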

Technical Details

Testing tools and configuration:
  • Custom ParallelWaveganConfig for model parameters
  • CLI-based training execution
  • LJSpeech dataset for training data
  • CUDA device management
  • Automated output path handling
  • Dynamic checkpoint management
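The dynamic checkpoint management listed above boils down to a glob-and-mtime pattern: each training run writes into a timestamped folder, and the most recently modified one is the run to continue. A self-contained sketch (the helper name `latest_run_dir` is illustrative, not from the repository):

```python
import glob
import os
import tempfile
import time

def latest_run_dir(output_path: str) -> str:
    """Return the most recently modified run folder under output_path.

    The trailing "*/" in the glob pattern matches directories only.
    """
    candidates = glob.glob(os.path.join(output_path, "*/"))
    if not candidates:
        raise FileNotFoundError(f"no run folders in {output_path}")
    return max(candidates, key=os.path.getmtime)

# Demo: two fake run folders with explicitly ordered mtimes
root = tempfile.mkdtemp()
run_a = os.path.join(root, "run_a")
run_b = os.path.join(root, "run_b")
os.makedirs(run_a)
os.makedirs(run_b)
now = time.time()
os.utime(run_a, (now - 100, now - 100))  # older run
os.utime(run_b, (now, now))              # newer run
found = latest_run_dir(root)
```

Setting the mtimes explicitly with `os.utime` keeps the demo deterministic regardless of filesystem timestamp resolution.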

Best Practices Demonstrated

The test implementation showcases several testing best practices for deep learning systems.

Notable practices include:
  • Isolated test environment with controlled parameters
  • Deterministic configuration management
  • Proper resource cleanup
  • Comprehensive workflow validation
  • Efficient test data handling
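The cleanup practice deserves one caveat: the test calls `shutil.rmtree` unconditionally at the end, so a failed training run leaves its outputs behind. A hedged sketch of a `try`/`finally` variant (`run_with_cleanup` and `train_fn` are hypothetical names, not part of the repository):

```python
import os
import shutil
import tempfile

def run_with_cleanup(train_fn, output_path: str) -> None:
    """Run a training callable, then remove its output folder even on failure.

    `train_fn` stands in for any callable that writes into output_path;
    the finally-block guarantees cleanup whether or not it raises.
    """
    try:
        train_fn(output_path)
    finally:
        if os.path.isdir(output_path):
            shutil.rmtree(output_path)

# Demo: the fake "training" step writes one artifact, then cleanup removes it
out = tempfile.mkdtemp()
run_with_cleanup(lambda p: open(os.path.join(p, "ckpt.pth"), "w").close(), out)
```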

coqui-ai/tts

tests/vocoder_tests/test_parallel_wavegan_train.py

import glob
import os
import shutil

from tests import get_device_id, get_tests_output_path, run_cli
from TTS.vocoder.configs import ParallelWaveganConfig

config_path = os.path.join(get_tests_output_path(), "test_vocoder_config.json")
output_path = os.path.join(get_tests_output_path(), "train_outputs")

config = ParallelWaveganConfig(
    batch_size=4,
    eval_batch_size=4,
    num_loader_workers=0,
    num_eval_loader_workers=0,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1,
    seq_len=2048,
    eval_split_size=1,
    print_step=1,
    print_eval=True,
    data_path="tests/data/ljspeech",
    output_path=output_path,
)
config.audio.do_trim_silence = True
config.audio.trim_db = 60
config.save_json(config_path)

# train the model for one epoch
command_train = f"CUDA_VISIBLE_DEVICES='{get_device_id()}' python TTS/bin/train_vocoder.py --config_path {config_path} "
run_cli(command_train)

# find the most recent training run folder
continue_path = max(glob.glob(os.path.join(output_path, "*/")), key=os.path.getmtime)

# restore the model and continue training for one more epoch
command_train = (
    f"CUDA_VISIBLE_DEVICES='{get_device_id()}' python TTS/bin/train_vocoder.py --continue_path {continue_path} "
)
run_cli(command_train)

# clean up training artifacts
shutil.rmtree(continue_path)