NEW Amazon Polly TTS Service

wardmundy

Nerd Uno
Joined
Oct 12, 2007
Messages
19,206
Reaction score
5,228
Playing with Amazon Polly TTS service today. In a word, UNBELIEVABLE! Yep, it's that good. First year is free for 5 million characters a month. After that, it's $4 for a million characters of text-to-speech. Dirt cheap!
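For anyone who wants to plug in their own volumes, here is a rough Python sketch of the pricing as quoted above. The function name is made up for illustration, and billing any overage during the free year at the same $4 rate is an assumption, not something confirmed in this thread.

```python
FREE_CHARS_PER_MONTH = 5_000_000   # free allowance quoted in the post (first year only)
PRICE_PER_MILLION_USD = 4.00       # rate quoted in the post after the free year

def monthly_cost_usd(chars: int, in_free_year: bool) -> float:
    """Estimate one month's bill from the figures quoted in the post.

    Assumption: characters beyond the free allowance in year one are
    billed at the same per-million rate.
    """
    if chars < 0:
        raise ValueError("chars must be non-negative")
    billable = max(0, chars - FREE_CHARS_PER_MONTH) if in_free_year else chars
    return billable / 1_000_000 * PRICE_PER_MILLION_USD
```

So a system that speaks the full 5 million characters every month would cost nothing in year one and about $20 a month afterward.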



If you'd like to try it out, call 1-843-606-0199. The IVR introduction uses an Amazon Polly MP3 file converted to GSM for Asterisk. For comparison, choose option 1 in the IVR for today's news headlines, which uses Google's best TTS voice. I think you'll agree that it's apples and oranges (and pretty lousy oranges at that). This is running on a Raspberry Pi 3, by the way.

There's a good tutorial on PHP implementation here. In the same install directory, you also have to install Amazon's AWS SDK using Composer:
Code:
curl -sS https://getcomposer.org/installer | php
php composer.phar require aws/aws-sdk-php

Next, edit speak_text.php and add the following lines to the end of the file to convert the generated MP3 to something that can be used with Asterisk:
Code:
// convert MP3 to GSM for Asterisk
system('sox text.mp3 -r 8000 -c 1 /var/lib/asterisk/sounds/custom/text.gsm');

The sox command writes the generated text.gsm file straight into the Asterisk sounds directory, /var/lib/asterisk/sounds/custom.
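If you would rather drive the conversion from a script outside of speak_text.php, the same step can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes sox is on the PATH and was built or installed with MP3 support.

```python
import subprocess
from pathlib import Path

SOUNDS_DIR = Path("/var/lib/asterisk/sounds/custom")  # directory named in the post

def sox_command(mp3_path, sounds_dir=SOUNDS_DIR):
    """Build the sox argument list: resample to 8 kHz, mix down to mono, write GSM."""
    src = Path(mp3_path)
    dest = Path(sounds_dir) / (src.stem + ".gsm")
    return ["sox", str(src), "-r", "8000", "-c", "1", str(dest)]

def convert_for_asterisk(mp3_path, sounds_dir=SOUNDS_DIR):
    """Run sox and return the path of the GSM file; raises CalledProcessError if sox fails."""
    cmd = sox_command(mp3_path, sounds_dir)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])
```

Passing the arguments as a list avoids shell quoting problems if the file name ever contains spaces.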
 

krzykat

Telecom Strategist
Joined
Aug 2, 2008
Messages
3,149
Reaction score
1,238
Could you modify your IVR so that it's 1 for Google's best TTS, and then perhaps 2 or another digit for Amazon Polly TTS, so that it's a direct apples-to-apples comparison?
 

wardmundy

krzykat said: Could you modify your IVR so that it's 1 for Google's best TTS, and then perhaps 2 or another digit for Amazon Polly TTS, so that it's a direct apples-to-apples comparison?

Good idea. I'll put it on "the list." I've also got the IBM TTS offering about ready to go as well. Just wrestling with how to keep all the credentials straight so they can be used in multiple apps.

A lot of these TTS apps were written almost a decade ago, and the design was different for every TTS provider. Now we're reworking them so that the methodology works like this. An API script for news or weather or whatever will check which TTS engine you want to use. Then it will look up your credentials for that TTS provider. If successful, it will then do the query and scrape the results, depositing them in a plain text "results" file in /tmp. Then the TTS engine will be called to convert the text file into an Asterisk-compatible sound file. Finally, Asterisk (and maybe 3CX one day soon) will play the sound file to the caller and delete the /tmp files.
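To make that flow concrete, here is a minimal Python sketch of the steps just described. None of these names come from the actual Incredible PBX scripts: the function names, the dictionary layout for credentials and engines, and the results.txt file name are all placeholders for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def run_tts_app(engine, credentials, fetch_text, engines, tmp_dir=None):
    """Fetch text, save it as a results file, and have the chosen engine voice it.

    engine      -- name of the TTS provider selected for this app (hypothetical)
    credentials -- dict mapping provider name to its stored credentials
    fetch_text  -- callable returning the text to speak (news, weather, ...)
    engines     -- dict mapping provider name to a callable
                   (text_file, creds, out_dir) -> path of the sound file
    Returns (work_dir, sound_file).
    """
    if engine not in engines:
        raise ValueError(f"unknown TTS engine: {engine}")
    creds = credentials.get(engine)
    if creds is None:
        raise LookupError(f"no credentials stored for {engine}")

    # Scratch directory under /tmp (or tmp_dir) holding the results file and audio.
    work = Path(tempfile.mkdtemp(prefix="tts-", dir=tmp_dir))
    results = work / "results.txt"
    results.write_text(fetch_text(), encoding="utf-8")

    sound = engines[engine](results, creds, work)
    return work, Path(sound)

def cleanup(work_dir):
    """Remove the scratch directory once the PBX has played the sound file."""
    shutil.rmtree(work_dir, ignore_errors=True)
```

Keeping the query, the credential lookup, and the engine call as separate pieces is what lets one news or weather script work with any provider.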

In the words of our fearless leader, it's gonna be great. :chef:
 

krzykat

I've been thinking through some of that philosophy and wonder if there's not a better method for implementation. Right now each script must be modified to "fit" the method used by the TTS system. What if there were a universal one that simply plugged in, so your TTS could be selected from a menu and used everywhere? Right now we've got Flite, Google, Amazon, and who knows what is next. I think instead we need to just call TTS("Connect"), with the text converted by whichever TTS provider you set in another admin menu selection.

Flite: Flite("Please enter your account number.")
Google: agi(googletts.agi,"Please enter your account number.",en)
Swift: swift("Please enter your account number.")

Instead make it:

TTS("Please enter your account number.")
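As a rough illustration of the idea, a single wrapper could map the selected engine onto the three call styles listed above. This Python sketch only renders the dialplan string; the tts() function and the TEMPLATES table are hypothetical, and real dialplan quoting rules are more involved than the simple check here.

```python
# Call styles copied from the examples above; {text} is the quoted prompt.
TEMPLATES = {
    "flite": "Flite({text})",
    "google": "agi(googletts.agi,{text},en)",
    "swift": "swift({text})",
}

def _quote(text):
    """Wrap the prompt in double quotes, refusing text that would break the quoting."""
    if '"' in text:
        raise ValueError("prompt text must not contain double quotes")
    return f'"{text}"'

def tts(text, engine="flite"):
    """Return the dialplan application string for the selected engine."""
    try:
        template = TEMPLATES[engine.lower()]
    except KeyError:
        raise ValueError(f"unsupported TTS engine: {engine}") from None
    return template.format(text=_quote(text))
```

Adding a new provider then means adding one entry to the table rather than touching every script that speaks.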

OK - oops - upon looking for a Polly example, I found that what I'm describing here already exists, I guess in the newer FPBX.

I think this should be added to Incredible PBX so that when the new flavor comes out, you simply update your TTS selection and you're done.
 

wardmundy

A few issues I see with that are the following:

(1) We are no longer operating in a FreePBX-centric universe. We want something more generic that will work for Wazo and 3CX as well as FreePBX.

(2) You may not always want to use the same TTS engine for all your apps. Some are better at certain tasks than others. When you need it, some are also more customizable than others and in different ways.

(3) It may be that we need a default selection and then a means to override it for certain applications.

(4) With Incredible PBX, we want TTS apps to work out of the box. That means the default needs to be FLITE, Festival, or PICO. Then perhaps the user can insert credentials and use Google, IBM, or Amazon.
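Points (2) through (4) could be combined into one selection rule, sketched here in Python. The function name and argument layout are invented for illustration, and falling back to a local engine when cloud credentials are missing is an assumed policy, not a description of how Incredible PBX behaves.

```python
LOCAL_ENGINES = ("flite", "festival", "pico")   # work out of the box, per point (4)
CLOUD_ENGINES = ("google", "ibm", "amazon")     # need user-supplied credentials

def choose_engine(app, default="flite", overrides=None, credentials=None):
    """Pick the TTS engine for one app.

    overrides   -- dict mapping app name to a preferred engine (point 3)
    credentials -- dict mapping cloud engine name to its credentials
    """
    overrides = overrides or {}
    credentials = credentials or {}
    wanted = overrides.get(app, default)

    if wanted in LOCAL_ENGINES:
        return wanted
    if wanted in CLOUD_ENGINES and credentials.get(wanted):
        return wanted
    # Unknown engine or missing credentials: fall back to something that always works.
    return default if default in LOCAL_ENGINES else LOCAL_ENGINES[0]
```

For example, `choose_engine("news", overrides={"news": "amazon"})` returns "flite" until Amazon credentials are supplied, after which the same call returns "amazon".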
 

krzykat

Agreed. I think options are always best. A default with the ability to override it seems the right move and, from what I'm hearing, the flavor of the day has just moved to Amazon.

FYI ... 5 million characters - I wanted to equate that to something more common, so I've come up with:

In rough terms, each page of a standard-format hardcover book has about 300-350 words, and each word averages five characters plus a space. So a typical book page has, say, 1,500 to 1,800 characters, not counting spaces. If we take 250 pages as a standard book length, that's maybe 400,000 characters per book if you don't count the spaces and 500,000 if you do. So 5 million characters is roughly 10 books' worth per month.

The average narration speed of an audiobook is 150 words per minute. So 350 words per page * 250 pages = 87,500 words per book, divided by 150 = 583 minutes per book. Ten books is therefore about 5,830 minutes of synthesized speech. With only 10 hours / day * 60 minutes / hour * 20 work days / month = 12,000 work minutes per month, this thing could be talking for nearly half the work minutes every month without incurring any charge. I can't imagine any small business exceeding that.
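Here is the same back-of-the-envelope estimate as a few lines of Python, in case anyone wants to swap in their own assumptions. The constants are the round numbers from this post; the variable names are just for illustration.

```python
WORDS_PER_PAGE = 350
PAGES_PER_BOOK = 250
CHARS_PER_BOOK = 500_000        # rounded figure from the post, spaces included
WORDS_PER_MINUTE = 150          # audiobook narration speed used in the post
FREE_CHARS_PER_MONTH = 5_000_000

words_per_book = WORDS_PER_PAGE * PAGES_PER_BOOK            # 87,500 words
minutes_per_book = words_per_book / WORDS_PER_MINUTE        # about 583 minutes
books_per_month = FREE_CHARS_PER_MONTH / CHARS_PER_BOOK     # 10 books
speech_minutes = books_per_month * minutes_per_book         # about 5,833 minutes

work_minutes = 10 * 60 * 20     # 10 hours/day, 20 work days/month = 12,000
share_of_work_time = speech_minutes / work_minutes          # a bit under one half

print(f"{speech_minutes:.0f} minutes of speech, "
      f"{share_of_work_time:.0%} of the work month")
```

The small difference from the 5,830 figure in the post comes from rounding 583.3 minutes per book down to 583 before multiplying.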
 

wardmundy

How Good Is Polly in Real Time?

 

wardmundy

New Live Demo IVR on CloudAtCost with Wazo and Polly apps for News (option 5) and Weather (option 6):

[Image: C-2F6IdXYAAt_bL.jpg]
 

Mark Thompson

New Member
Joined
Dec 29, 2014
Messages
29
Reaction score
4
Hi Ward,

Polly TTS integration with Incredible PBX is great news!

Are you planning a single tarball install for other Incredible platforms such as Incredible 13-12 Ubuntu?

Thanks,

mark
 

wardmundy

Polly TTS uses very new Linux technology. I suspect it will fail with Ubuntu 14.04 but I haven't tried it. Upgrading from 14.04 to 16.04 is a gigantic can of worms so the short answer is no. The longer answer is it's time to move to either Wazo or 3CX, both of which use the latest and greatest Debian 8. If your existing system works great and you have a proven backup strategy in place, there's no need to jump ship. Just don't expect all the latest and greatest technologies to work. :scooter:
 
