DHVANI: Problems, Future Work and How You Can Help
1. The speech output is rather slow right now. The reason is that the basic
sounds were recorded individually instead of as parts of words. The next
version aims to record them as parts of words, which should speed up the speech
output and also reduce the size of the database to about 500 KB.
2. Dynamic speed and pitch modification need to
be incorporated. We have TD-PSOLA code running on a prototype and could add it
to the next version, but somebody please tell us whether this method is patented
and therefore not amenable to free distribution.
3. Consonant-consonant junctions are currently
not very satisfactory: either there are large gaps between the sounds, or sounds
get swallowed. This needs attention. The next version will record more
of these cluster characters.
4. Multiple voices need to be added. A major
challenge here is to automate the process of going from recording to setting up
the database. The main obstacle here is that of automatic segmentation of the
recording. Work is currently in progress on this front.
5. The phonetic description used by this version
is cumbersome; the next version needs a friendlier phonetic description,
something akin to transcription in Roman script, possibly with markup for
duration and pitch.
6. Text-to-phonetics modules need to be written
for all Indian languages. Currently we have only Hindi and Kannada; all other
languages still need to be addressed.
7. As mentioned on the technical
description page, the Hindi text-to-phonetics algorithm needs to be improved
for compound words, i.e., compound words need to be identified and
decomposed into basic words, and the algorithm must be run on each basic word.
It is not clear how this can be done without using a lexicon.
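To make item 7 concrete, here is a minimal sketch of what a lexicon-based decomposition could look like. It is not part of Dhvani: the function name decompose, the toy romanized lexicon and the "fewest pieces" rule are all assumptions made for illustration, and a real Hindi lexicon and script handling would be needed in practice.

```python
from typing import List, Optional, Set


def decompose(word: str, lexicon: Set[str]) -> Optional[List[str]]:
    """Split `word` into the fewest lexicon entries; return None if impossible."""
    # best[i] holds the shortest known decomposition of word[:i], or None.
    best: List[Optional[List[str]]] = [None] * (len(word) + 1)
    best[0] = []  # the empty prefix needs no pieces
    for end in range(1, len(word) + 1):
        for start in range(end):
            prefix = best[start]
            piece = word[start:end]
            if prefix is None or piece not in lexicon:
                continue
            candidate = prefix + [piece]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[len(word)]


# Hypothetical toy lexicon in Roman transcription (an assumption, not Dhvani data).
LEXICON = {"raj", "kumar", "lok", "sabha"}

if __name__ == "__main__":
    for w in ("rajkumar", "loksabha", "xyz"):
        print(w, "->", decompose(w, LEXICON))
```

Each basic word returned could then be passed separately to the text-to-phonetics algorithm. A word that cannot be split returns None and would have to be handled as a single word.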
If you can help with any of the above, please
contact Ramesh Hariharan (ramesh@csa.iisc.ernet.in).