diphones, male voice.
The following phoneme symbols are assumed in the us3 diphone sets. They
differ slightly from the SAMPA alphabet, since American English is not
British English.
SYMBOL    PRONOUNCED LIKE IN
p         drop      proxy
t         plot      tromp
4         later     (flapped allophone of t)
k         rock      crop
b         cob       box
d         nod       dot
g         jog       gospel
f         prof      fox
s         boss      sonic
S         wash      shop
tS        notch     chop
T         cloth     thomp
v         salve     volley
z         was       zombie
Z         garage    jacques
dZ        dodge     jog
D         clothe    thy
m         palm      mambo
n         john      novel
N         bong
l         doll      lockwood
l=        little
r         star      roxanne
j         yacht
w         show      womble
h         harm
r=        her       urgent
i         even
A         arthur
u         oodles
I         illness
E         else
{         apple
V         nut
U         good
@         about
EI        able
AI        island
OI        oyster
@U        over
aU        out
O         all
-Julian Assange <proff@iq.org>
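
The symbol table above includes several two-character symbols (tS, dZ,
l=, r=, EI, AI, OI, @U, aU) that begin with, or contain, characters that
are also symbols in their own right. The following is a minimal sketch,
not part of the us3 distribution, of how an unsegmented transcription
could be split into us3 symbols by always trying the longest symbol
first. The names US3_SYMBOLS and split_phonemes are hypothetical, and
the idea of an unsegmented input string is an assumption made only for
illustration.

```python
# Symbols copied from the us3 table above (42 entries).
US3_SYMBOLS = [
    "p", "t", "4", "k", "b", "d", "g", "f", "s", "S", "tS", "T", "v", "z",
    "Z", "dZ", "D", "m", "n", "N", "l", "l=", "r", "j", "w", "h", "r=",
    "i", "A", "u", "I", "E", "{", "V", "U", "@", "EI", "AI", "OI", "@U",
    "aU", "O",
]

# Longest symbols first, so "tS" is tried before "t" and "@U" before "@".
_BY_LENGTH = sorted(US3_SYMBOLS, key=len, reverse=True)


def split_phonemes(text):
    """Split an unsegmented string into us3 symbols (hypothetical helper).

    Raises ValueError if some character cannot start any us3 symbol.
    """
    result = []
    pos = 0
    while pos < len(text):
        for symbol in _BY_LENGTH:
            if text.startswith(symbol, pos):
                result.append(symbol)
                pos += len(symbol)
                break
        else:
            raise ValueError(
                "no us3 symbol at position %d in %r" % (pos, text)
            )
    return result
```

For example, split_phonemes("tSAp") yields the symbols tS, A, p. Note
that longest-match splitting cannot tell tS apart from t followed by S,
so input that already separates its symbols avoids the ambiguity.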
synthesis system.
This voice provides an American English male voice using the MBROLA
synthesis method. It uses a modified CMU lexicon for pronunciations.
Prosodic phrasing is provided by a statistically trained model using
part of speech and local distribution of breaks. Intonation is
provided by a CART tree predicting ToBI accents and an F0 contour
generated from a model trained from natural speech. The duration
model is also trained from data using a CART tree.
The quality of this voice is not as high as that of us1 and us2.
This voice can be activated via (voice_us3_mbrola)
-Julian Assange <proff@iq.org>
synthesis system.
This voice provides an American English male voice using the MBROLA
synthesis method. It uses a modified CMU lexicon for pronunciations.
Prosodic phrasing is provided by a statistically trained model using
part of speech and local distribution of breaks. Intonation is
provided by a CART tree predicting ToBI accents and an F0 contour
generated from a model trained from natural speech. The duration
model is also trained from data using a CART tree.
This voice can be activated via (voice_us2_mbrola)
-Julian Assange <proff@iq.org>
synthesis system.
This voice provides an American English female voice using the MBROLA
synthesis method. It uses a modified CMU lexicon for pronunciations.
Prosodic phrasing is provided by a statistically trained model using
part of speech and local distribution of breaks. Intonation is
provided by a CART tree predicting ToBI accents and an F0 contour
generated from a model trained from natural speech. The duration
model is also trained from data using a CART tree.
This voice can be activated via (voice_us1_mbrola)
-Julian Assange <proff@iq.org>
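
The three voices described above are selected with (voice_us1_mbrola),
(voice_us2_mbrola) and (voice_us3_mbrola). As a rough sketch, the
snippet below builds an argument list that selects one of them and
speaks a sentence in Festival's batch mode. It assumes a festival
binary on the PATH with MBROLA and the corresponding voice installed;
the helper name festival_command and the VOICES mapping are
hypothetical, and the snippet only constructs the arguments without
running anything.

```python
# Voice selector functions named in the descriptions above.
VOICES = {
    "us1": "voice_us1_mbrola",
    "us2": "voice_us2_mbrola",
    "us3": "voice_us3_mbrola",
}


def festival_command(voice, text):
    """Build an argv list for Festival batch mode (hypothetical helper).

    Each parenthesised argument is a Scheme expression evaluated in
    order: first the voice selector, then SayText.
    """
    selector = VOICES[voice]
    # Escape backslashes and double quotes for the Scheme string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return [
        "festival",
        "-b",
        "(%s)" % selector,
        '(SayText "%s")' % escaped,
    ]
```

On a machine where the assumptions hold, the returned list can be
handed to subprocess.run; an unknown voice key raises KeyError.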
Festival 1.4.0 has the following improvements over the previous release (1.3.1, January 1999):
o distributed under a free X11-type licence
o generalization of stats modules, ngram, CART, wfst with viterbi so they
can be shared more easily
o Tidy up of Utterance/Relation/Item architecture
o Initial JSAPI support
o Three new us voices using MBROLA databases
o Tilt code overhaul
o XML load for Relations
o Fringe graphic display (ALPHA) released separately
http://www.cstr.ed.ac.uk/projects/fringe.html
Changes from 12.15:
* Improved I/O performance on uncompressed data
* Play script can now handle spaces in filenames
* Improved default output quality of ADPCM files
* Added support for ALSA audio devices
* Several bug fixes to AIFF files
* Resample effect was greatly improved and now SoX does a great job
on almost all resamples.
* Added A-law support to .au files plus bug fixes
* Updated OS/2 support
* Added auto-configure support. Greatly enhanced cross-platform support.
* Improved 16-bit DOS compiler support
* Added swap effect
* Combined play and rec script and added more options
* Fixed bugs in low/high/band-pass filters and avg effect.
work on all architectures. Tested on pmax, i386 and alpha (no
big-endian machines!) and gives identical results, although not
identical to the binary-only 0.76. If there are any big-endian people
who want to test this, please let me know.
Lots of patches because 64-bit longs just "Don't Work(tm)" - just use
ints everywhere. I'm in communication with the author on how to fix
this at his end.