
Testing Japanese Phoneme Conversion Implementation in Coqui-AI TTS

This test suite validates Japanese text phonemization in the Coqui-AI TTS system, checking that Japanese text is converted to the expected phonetic representation. It runs nine cases covering punctuation, numerals, Latin and Greek letters, and a currency amount.

Test Coverage Overview

Each case pairs a Japanese input string with the exact romanized phoneme string that the phonemizer is expected to return.

Key areas tested include:
  • Basic Japanese text and punctuation
  • Quotation brackets (「」) and other symbols
  • Numerical expressions, such as the date 8月22日
  • Mixed character sets (Hiragana, Katakana, Kanji)
  • A currency amount ($12.34) and non-Japanese letters (Latin xyz, Greek αβγ)

Implementation Analysis

The implementation uses Python’s unittest framework with a data-driven approach. Test cases are stored in a multi-line string (_TEST_CASES) and processed line by line, with each line containing input text and expected phonetic output separated by a forward slash.

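Each line is split on the slash, the input half is passed to japanese_text_to_phonemes, and the result is compared with the expected half using assertEqual. Because all cases run inside a single test method, the first mismatch stops the loop and the remaining cases are not checked.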
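The parsing step can be sketched in isolation. This is a minimal illustration, not the project's code: parse_cases is a hypothetical helper, and SAMPLE_CASES is a two-line excerpt of the test data, so the sketch runs without TTS installed.

```python
# Two cases copied from the test data; each line is "<input>/<expected phonemes>".
SAMPLE_CASES = """
そうですね!/so:desune!
クジラは哺乳類です。/kujirawahonyu:ruidesu.
"""


def parse_cases(raw):
    """Hypothetical helper: split the block into (text, expected) pairs."""
    pairs = []
    # strip() drops the leading and trailing blank lines of the literal.
    for line in raw.strip().split("\n"):
        # Unpacking raises ValueError unless the line contains exactly one "/".
        text, expected = line.split("/")
        pairs.append((text, expected))
    return pairs


cases = parse_cases(SAMPLE_CASES)
for text, expected in cases:
    print(f"{text!r} -> {expected!r}")
```

One consequence of this format is that an input containing a slash cannot be expressed without changing the delimiter, since the two-name unpacking would fail.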

Technical Details

Testing tools and configuration:
  • Python unittest framework
  • japanese_text_to_phonemes function from TTS.tts.utils.text.japanese.phonemizer
  • String-based test case storage
  • Line-by-line parsing and validation
  • Direct assertion checking

Best Practices Demonstrated

The test implementation showcases several testing best practices:

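  • Coverage of several input categories in a small number of cases
  • Clear input/output separation with a single delimiter
  • Maintainable test case format: adding a case means adding one line
  • Compact test execution structure: one loop and one assertion
  • Self-contained test data, with no external fixture files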
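Since the loop runs inside a single test method, an assertEqual failure ends the method at that case. One possible refinement, which is not part of the original file, is unittest's subTest context manager, which reports each failing case separately. The sketch below assumes a stand-in function, fake_text_to_phonemes, in place of the real phonemizer so that it runs without TTS installed; the class and variable names are illustrative only.

```python
import unittest

# Stand-in for the real phonemizer (an assumption for this sketch).
# It only knows the two sample inputs below.
_FAKE_OUTPUT = {
    "そうですね!": "so:desune!",
    "クジラは哺乳類です。": "kujirawahonyu:ruidesu.",
}


def fake_text_to_phonemes(text):
    return _FAKE_OUTPUT[text]


_SAMPLE_CASES = """
そうですね!/so:desune!
クジラは哺乳類です。/kujirawahonyu:ruidesu.
"""


class TestSample(unittest.TestCase):
    def test_cases(self):
        for line in _SAMPLE_CASES.strip().split("\n"):
            text, phone = line.split("/")
            # subTest labels each case, so one failure does not hide the rest.
            with self.subTest(text=text):
                self.assertEqual(fake_text_to_phonemes(text), phone)


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSample)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The trade-off is a slightly longer loop body in exchange for failure reports that name the offending input.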

coqui-ai/tts

tests/text_tests/test_japanese_phonemizer.py

            
import unittest

from TTS.tts.utils.text.japanese.phonemizer import japanese_text_to_phonemes

_TEST_CASES = """
どちらに行きますか?/dochiraniikimasuka?
今日は温泉に、行きます。/kyo:waoNseNni,ikimasu.
「A」から「Z」までです。/e:karazeqtomadedesu.
そうですね!/so:desune!
クジラは哺乳類です。/kujirawahonyu:ruidesu.
ヴィディオを見ます。/bidioomimasu.
今日は8月22日です/kyo:wahachigatsuniju:ninichidesu
xyzとαβγ/eqkusuwaizeqtotoarufabe:tagaNma
値段は$12.34です/nedaNwaju:niteNsaNyoNdorudesu
"""


class TestText(unittest.TestCase):
    def test_japanese_text_to_phonemes(self):
        # Each non-empty line of _TEST_CASES is "<input text>/<expected phonemes>".
        for line in _TEST_CASES.strip().split("\n"):
            text, phone = line.split("/")
            self.assertEqual(japanese_text_to_phonemes(text), phone)


if __name__ == "__main__":
    unittest.main()