TTS Translation guide for developer : Sippy Software, Inc.

Translating phrases in the IVR applications works almost the same way as for ordinary applications that use gettext. However, gettext is not a simple package, and it contains so many utilities that a short manual is worthwhile.

Helper scripts

To simplify the translation process, the ivrd folder contains a po directory with several helper scripts.

ivrd/po/xgetpo.sh - gathers all phrases from all Python source files in the ivrd directory and its subfolders and merges any new changes into the ivrd/po/ivrd.pot file.
ivrd/po/mergepo.sh - merges changes from the ivrd/po/ivrd.pot file into the ivrd/po/${LANG}.po file. The script requires one parameter: the two-letter language code.
ivrd/po/compilepo.sh - compiles the translated ivrd/po/${LANG}.po file into the ssp/locale/${LANG}/LC_MESSAGES/ivrd.mo file. The script requires one parameter: the two-letter language code.
ivrd/prompt_utils.py - runs several useful tests and gathers statistics on prompt sets.
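
The contents of the shell scripts are not reproduced in this guide. As a rough sketch, assuming they wrap the standard GNU gettext tools (xgettext, msginit, msgmerge and msgfmt), the command lines they run could be assembled as follows. The function names and the exact choice of flags are illustrative assumptions, not the scripts' actual contents.

```python
# Hypothetical sketch: command lines the helper scripts might run.
# Assumes the scripts wrap the standard GNU gettext tools; the real
# scripts may use different flags.

POT = "ivrd/po/ivrd.pot"


def xgettext_cmd(sources, pot=POT):
    """Extract phrases from Python sources into the .pot template."""
    return ["xgettext", "--language=Python", "--output=" + pot] + list(sources)


def mergepo_cmd(lang, po_exists, pot=POT):
    """Create a new <lang>.po, or update an existing one, from the template."""
    po = "ivrd/po/%s.po" % lang
    if po_exists:
        return ["msgmerge", "--update", po, pot]
    return ["msginit", "--no-translator", "--locale=" + lang,
            "--input=" + pot, "--output-file=" + po]


def compilepo_cmd(lang):
    """Compile <lang>.po into the binary ivrd.mo catalog."""
    po = "ivrd/po/%s.po" % lang
    mo = "ssp/locale/%s/LC_MESSAGES/ivrd.mo" % lang
    return ["msgfmt", "--output-file=" + mo, po]


if __name__ == "__main__":
    print(" ".join(mergepo_cmd("tr", po_exists=False)))
    print(" ".join(compilepo_cmd("tr")))
```

Each function only builds an argument list; passing it to subprocess.run() would execute the corresponding tool.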

Preparing for translation to a new language

Suppose the ivrd applications are to be translated into Turkish (language code tr). This assumption is used throughout the rest of this guide.

First, the translation file has to be created. This can be done by running:

  $ cd ivrd/po
  $ ./mergepo.sh tr

This creates the ivrd/po/tr.po file. It is a plain text file, so it can be edited with any text editor.

Translating the translation file

Now that you have tr.po, the translation can be done. The file contains a header like this:

msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2007-08-23 12:44+0300\n"
"PO-Revision-Date: 2007-08-23 12:44+0300\n"
"Last-Translator: Automatically generated\n"
"Language-Team: none\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=ASCII\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"

Note that you must change the Project-Id-Version value; otherwise the compiler will issue a warning.
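
A forgotten placeholder is easy to catch before compiling. The following is a minimal sketch, assuming the header keeps the layout shown above; the function names are hypothetical and not part of the ivrd helper scripts.

```python
# Hypothetical pre-compile check for the Project-Id-Version placeholder.
import re


def project_id_version(po_text):
    """Return the Project-Id-Version header value, or None if absent."""
    # In the .po file the header line ends with a literal backslash-n.
    match = re.search(r'^"Project-Id-Version: (.*?)\\n"$', po_text, re.MULTILINE)
    return match.group(1) if match else None


def has_placeholder_version(po_text):
    """True if the header still carries the generated default value."""
    return project_id_version(po_text) == "PACKAGE VERSION"


if __name__ == "__main__":
    sample = 'msgid ""\nmsgstr ""\n"Project-Id-Version: PACKAGE VERSION\\n"\n'
    print(has_placeholder_version(sample))
```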

The rest of the file consists of the phrases to translate. Phrases come in two forms. Simple phrases look like this:

msgid "Some English text"
msgstr ""

and for plural forms:

msgid "There is one apple"
msgid_plural "There are %n apples"
msgstr[0] ""

The msgid* entries are the phrases to translate and the msgstr* entries are the translations (they are empty for the moment). Turkish has no plural form, so only a single msgstr[0] entry is offered for translation.
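
The Plural-Forms header field decides which msgstr[N] entry is used for a given count. As an illustration, the sketch below hard-codes two rules: the Turkish rule from the header above and the common English rule (nplurals=2; plural=(n != 1);), which is included only for comparison. The table and the function name are hypothetical.

```python
# Illustration of how a Plural-Forms rule selects the msgstr index.
# Each rule is (nplurals, function mapping a count to the msgstr index).
PLURAL_RULES = {
    "tr": (1, lambda n: 0),            # nplurals=1; plural=0;
    "en": (2, lambda n: int(n != 1)),  # nplurals=2; plural=(n != 1);
}


def msgstr_index(lang, n):
    """Return the index N of the msgstr[N] entry used for the count n."""
    nplurals, rule = PLURAL_RULES[lang]
    index = rule(n)
    if not 0 <= index < nplurals:
        raise ValueError("plural rule returned an out-of-range index")
    return index


if __name__ == "__main__":
    for count in (1, 5):
        print(count, msgstr_index("tr", count), msgstr_index("en", count))
```

With nplurals=1 every count maps to msgstr[0], which is why tr.po contains only that one entry.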

That was a quick overview of the file contents, so you can now start translating. All the empty msgstr strings have to be filled in with the translated phrases. You are free to use any encoding, but do not forget to specify the correct encoding name in the charset subfield of the Content-Type field.
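
A mismatch between the declared charset and the actual file encoding can be detected by decoding the file with the declared name. This is a minimal sketch with hypothetical function names; it only checks that the bytes decode, not that the translations are correct.

```python
# Hypothetical check: does the .po file decode with its declared charset?
import re


def declared_charset(po_bytes):
    """Return the charset named in the Content-Type header, or None."""
    match = re.search(rb"charset=([A-Za-z0-9_.:-]+)", po_bytes)
    return match.group(1).decode("ascii") if match else None


def decodes_cleanly(po_bytes):
    """True if the whole file decodes using the declared charset."""
    charset = declared_charset(po_bytes)
    if charset is None:
        return False
    try:
        po_bytes.decode(charset)
    except (UnicodeDecodeError, LookupError):
        # Wrong bytes for this charset, or a charset name Python does not know.
        return False
    return True


if __name__ == "__main__":
    with open("ivrd/po/tr.po", "rb") as po_file:
        print(decodes_cleanly(po_file.read()))
```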

Notes for translators

At this point the phrases are handed over to a human translator. The notes for translators can be found here.

TTS language module

To support conversions such as number to text, date to text and duration to text, a language-specific Python module has to be created.

The module must be named after the two-letter language code in uppercase, so for Turkish you have to create the TR.py file and place it in the ivrd/TextSynth directory. Use the existing language modules as a starting point for the new one. The module has the following requirements:

The _phrase_noop() function has to be defined, and it must convert your language-specific phrases into Unicode (unless ASCII is sufficient). All words and phrases must be wrapped in _phrase_noop() calls. Also, you must not use any TTS features in this module, to avoid infinite recursion.

When using a non-ASCII encoding, you must declare it on the second line of the module:

#!/usr/local/bin/python
# -*- coding: UTF-8 -*-

These methods have to be implemented:

sayNumber()
sayDigits()
sayDuration()
sayDatetime() (this is used by the Voicemail app for now)

The TextSynth/__init__.py file has to be modified to support your new module.

The information needed to create the module is summarized here.

Prompt creation

After the translations have been completed and placed in the po/tr.po file, the translation has to be compiled:

$ cd ivrd/po
$ ./compilepo.sh tr
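
To confirm that the compiled catalog can be loaded, Python's standard gettext module can open it directly. This is a minimal sketch: the domain name ivrd and the ssp/locale directory are taken from the ivrd.mo path given earlier, the function name is hypothetical, and how ivrd itself loads its catalogs is not covered by this guide.

```python
# Hypothetical check that the compiled catalog loads with Python's gettext.
import gettext


def load_ivrd_catalog(lang, localedir="ssp/locale"):
    """Load <localedir>/<lang>/LC_MESSAGES/ivrd.mo.

    With fallback=True a missing .mo file yields a NullTranslations
    object, which returns every msgid unchanged, instead of raising.
    """
    return gettext.translation(
        "ivrd", localedir=localedir, languages=[lang], fallback=True
    )


if __name__ == "__main__":
    catalog = load_ivrd_catalog("tr")
    # Prints the Turkish text if the catalog was found and the phrase is
    # translated; otherwise prints the English msgid unchanged.
    print(catalog.gettext("Some English text"))
```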

The last step is to create the prompt directory:

$ mkdir ~ssp/prompts/ivrd/tr

Now the prompt_utils.py script can be used to generate the prompt list:

$ cd ivrd
$ ./prompt_utils.py -l tr list unmapped

This creates the unmapped-tr.html file, which contains all the phrase chunks in Turkish together with the corresponding English phrases.

This is the point where the narrator starts recording.

Registering prompts

After the prompts have been recorded, they have to be placed in the ssp/prompts/ivrd/tr folder in two formats: signed linear 16-bit, 8000 Hz, mono, and G.729-encoded.
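
Recordings that arrive in the wrong sample format are a common source of trouble. The sketch below assumes the narrator delivers WAV files and checks them with Python's standard wave module before any conversion. The container and file extensions that ivrd actually expects are not specified in this guide, and the function name is hypothetical.

```python
# Hypothetical check, assuming the recordings are delivered as WAV files.
import wave


def is_ivr_format(path):
    """True if the WAV file is mono, 16-bit, 8000 Hz PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1        # mono
            and wav.getsampwidth() == 2    # 2 bytes per sample = 16 bit
            and wav.getframerate() == 8000
        )
```

wave.open() raises wave.Error for files that are not uncompressed PCM WAV, so callers may want to treat that exception as a failed check as well.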

Then the ssp/prompts/ivrd/tr/prompt_map.txt file has to be created. The first line of the file must state the encoding used for the phrase chunks; the prompt mappings follow:

# encoding: utf-8
file1|First phrase
file2|second phrase
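
The file layout shown above is simple enough to validate before running prompt_utils.py. The following is a minimal sketch of a parser for that layout; it is not the code ivrd uses, the function name is hypothetical, and details such as how blank lines are treated are assumptions.

```python
# Hypothetical parser for the prompt_map.txt layout shown above.
import re


def parse_prompt_map(data):
    """Parse prompt_map.txt content given as bytes.

    Returns (encoding, mapping), where mapping is {phrase: file name}.
    """
    first, _, rest = data.partition(b"\n")
    match = re.match(rb"#\s*encoding:\s*(\S+)", first)
    if match is None:
        raise ValueError("first line must declare the encoding")
    encoding = match.group(1).decode("ascii")

    mapping = {}
    for line in rest.decode(encoding).splitlines():
        if not line.strip():
            continue  # assumption: blank lines are ignored
        name, sep, phrase = line.partition("|")
        if not sep:
            raise ValueError("missing '|' in line: %r" % line)
        mapping[phrase] = name
    return encoding, mapping


if __name__ == "__main__":
    with open("ssp/prompts/ivrd/tr/prompt_map.txt", "rb") as map_file:
        encoding, mapping = parse_prompt_map(map_file.read())
    print(encoding, len(mapping))
```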

After this, you can run the following command again to make sure that all phrases are mapped to audio prompts:

./prompt_utils.py -l tr list unmapped
