TEXT-TO-SPEECH SYNTHESIS
This project is part of the TeNeT group's initiative for developing local language speech interface systems. Speech synthesis forms part of a speech interface system, enabling machines to generate interactive voice responses in the user's native language.
Our mission is to develop unrestricted Text-to-Speech systems for Indian languages and Indian English for use in various local language applications, such as applications for the visually challenged, IVRS, applications on memory-constrained devices (PDAs, mobile devices), and computer-aided learning for rural kiosks.
Though a TTS system is generally targeted at one particular language, in India, with 18 officially recognised languages and hundreds of dialects, it is very difficult to build a separate speech synthesizer for each language. The focus is also on developing a common multilingual corpus with support for multiple Indian languages and on building appropriate language-specific linguistic analysis modules for text-to-speech synthesis.
Current research work
Quality Improvement Experiments for TTS
Text-to-Speech synthesis using syllable-like units
This work is being implemented using the Festvox voice building framework. For Indian languages, syllable units are a much better choice than units such as diphones, phones, and half-phones. We use a new "syllable-like" speech unit that is suitable for concatenative speech synthesis. These units are generated automatically using a group delay based segmentation algorithm and acoustically correspond to the form C*VC* (C: consonant, V: vowel). The effectiveness of the unit is demonstrated by synthesizing natural-sounding speech in Tamil, a regional Indian language. A significant quality improvement is obtained if bisyllable units are used in addition to monosyllables, with results far superior to the traditional diphone-based approach.
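As a rough illustration of the C*VC* shape described above, the following Python sketch checks and splits a symbolic consonant/vowel string into syllable-like units. It is not the group delay based segmentation algorithm, which operates on the speech signal itself. The function names and the rule for dividing consonants between neighbouring units are assumptions made only for this example.

```python
import re
from typing import List

# One syllable-like unit: any number of consonants, exactly one vowel,
# then any number of consonants (the C*VC* form).
UNIT_PATTERN = re.compile(r"C*VC*")


def is_syllable_like(cv: str) -> bool:
    """Return True if a string of 'C'/'V' labels has the form C*VC*."""
    return UNIT_PATTERN.fullmatch(cv) is not None


def split_units(cv: str) -> List[str]:
    """Split a string of 'C'/'V' labels into C*VC* units.

    Assumed rule (hypothetical, for illustration only): between two
    vowels, a single consonant starts the next unit; in a cluster, the
    first consonant closes the previous unit and the rest start the next.
    """
    vowel_positions = [i for i, ch in enumerate(cv) if ch == "V"]
    if not vowel_positions:
        return []  # no vowel, so no syllable-like unit can be formed

    boundaries = [0]
    for prev, nxt in zip(vowel_positions, vowel_positions[1:]):
        n_consonants = nxt - prev - 1
        cut = prev + 1 if n_consonants <= 1 else prev + 2
        boundaries.append(cut)
    boundaries.append(len(cv))

    return [cv[a:b] for a, b in zip(boundaries, boundaries[1:])]
```

Under the assumed rule, split_units("CVCCVC") yields two units of the form CVC, and every unit returned satisfies is_syllable_like.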
Text-to-Speech synthesis on Embedded systems
In this work we are looking at a new prototype for developing TTS synthesizers for embedded systems using Flite, a low-footprint text-to-speech system. We are working on two methods by which the new system can be implemented on a low-resource device with limited memory and computing power -
Synthesized wave files for Hindi, Telugu & Indian English using the Festival system (16 kHz, mono)
Concatenative synthesis using cluster unit selection
1. N. Sridhar Krishna, Hema A. Murthy and Timothy A. Gonsalves, Text-to-Speech in Indian Languages, in the proceedings of International Conference on Natural Language Processing, ICON-2002, Mumbai, pp. 317-326, December 18-21, 2002.
2. N. Sridhar Krishna and Hema A. Murthy, Duration Modelling of Indian Languages Hindi and Telugu, in the proceedings of 5th ISCA Speech Synthesis Workshop, Carnegie Mellon University, Pittsburgh, pp. 197-202, June 14-16, 2004.
3. N. Sridhar Krishna and Hema A. Murthy, A New Prosodic Phrasing Model for Indian Language Telugu, in the proceedings of International Conference on Spoken Language Processing, October 4-7, 2004.
4. M. Nageshwara Rao, Samuel Thomas, T. Nagarajan and Hema A. Murthy, Text-to-speech synthesis using syllable-like units, in the proceedings of National Conference on Communications, IIT Kharagpur, India, pp. 277-280, Jan 2005.
5. Samuel Thomas, Hema A. Murthy and C. Chandra Sekhar, Distributed speech synthesis for embedded devices - an analysis, in the proceedings of National Conference on Communications, IIT Kharagpur, India, pp. 273-276, Jan 2005.
6. Samuel Thomas, M. Nageshwara Rao, Hema A. Murthy and C. S. Ramalingam, Natural Sounding TTS based on Syllable-like Units, to appear in the proceedings of 14th European Signal Processing Conference, Florence, Italy, Sep 2006.
External links & references
Source and documentation for the Festival TTS framework and the Flite project are downloadable at these links
All queries related to Festival TTS are available at
For general queries related to Text-to-Speech synthesis
Dr. C. S. Ramalingam
N. Sridhar Krishnan
M. Nageshwara Rao
Y. R. Venugopalakrishna
M. V. Vinodh