PIONEERS Exploring Speech to Text

wardmundy

Nerd Uno
Joined
Oct 12, 2007
Messages
19,168
Reaction score
5,199


I don't use F-A-N-T-A-S-T-I-C quite as often as Steve Jobs, but Lefteris Zafiris has really outdone himself this time around. His new Asterisk AGI script gives you near-perfect speech-to-text recognition using Google's speech recognition service. And it's FREE!


1. Install the AGI script by logging into your server and issuing the following commands:

Code:
cd /root
wget --no-check-certificate https://github.com/downloads/zaf/asterisk-speech-recog/asterisk-speech-recog-0.4.tar.gz
tar zxvf asterisk-speech*
cd asterisk-speech-recog-0.4
cp speech-recog.agi /var/lib/asterisk/agi-bin/.
cd /etc/asterisk
nano -w extensions_custom.conf


2. Now add the following sample code to /etc/asterisk/extensions_custom.conf at the top of the [from-internal-custom] context:

Code:
exten => 77325,1,Answer()
exten => 77325,n,flite("Say something in English, when done press the pound key.")
exten => 77325,n(record),agi(speech-recog.agi,en-US)
exten => 77325,n,Noop(= Script returned: ${status} , ${id} , ${confidence} , ${utterance} =)
;exten => 77325,n,GotoIf($["${status}" = "0"]?success:fail)
exten => 77325,n,flite("${utterance}")
exten => 77325,n,flite("Have a nice day! Good bye.")
exten => 77325,n,hangup

exten => 2255,1,Answer()
exten => 2255,2,Wait(1)
exten => 2255,3,flite("Say the number you wish to call. Then press the pound key.")
exten => 2255,4(record),agi(speech-recog.agi,en-US)
exten => 2255,5,Noop(= Script returned: ${status} , ${id} , ${confidence} , ${utterance} =)
exten => 2255,6,Set(NUM2CALL=${utterance})
exten => 2255,7,SayDigits("${NUM2CALL}")
exten => 2255,8,Background(vm-star-cancel)
exten => 2255,9,Background(vm-tocallnum)
exten => 2255,10,Read(PROCEED,beep,1)
exten => 2255,11,GotoIf($["foo${PROCEED}" = "foo1"]?12:13)
exten => 2255,12,Goto(outbound-allroutes,${NUM2CALL},1)
exten => 2255,13,hangup


3. Reload your dialplan: asterisk -rx "dialplan reload"

4. Now pick up a phone and dial S-P-E-A-K. When prompted, say a few words and press #. The speech-to-text script will pass your memorable words to Google, have it converted to text, and then say it back to you using Flite's Egor.

5. Next, pick up a phone and dial C-A-L-L. When prompted, say a phone number to dial and press #. Listen to the playback of the number. If it is correct, press 1 to place the call. Stay tuned for loads of apps!
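
A side note on the C-A-L-L example: ${utterance} goes straight into SayDigits and Goto, so anything in the transcription that isn't a digit (spaces, hyphens, stray words) could end up in the number being dialed. Here's a minimal shell sketch, not part of Lefteris's script, of stripping a transcription down to digits. The function name digits_only is made up for illustration, and it assumes Google returns numerals rather than spelled-out words:

```shell
# Keep only the digits from a recognized utterance.
# $1 = the transcription text; prints the digits followed by a newline.
digits_only() {
    printf '%s' "$1" | tr -cd '0-9'
    printf '\n'
}

digits_only "555-1212"        # prints 5551212
digits_only "1 800 555 1212"  # prints 18005551212
```

Inside the dialplan itself, Asterisk's FILTER function should be able to do the same job, along the lines of Set(NUM2CALL=${FILTER(0-9,${utterance})}), though that is untested against this script.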

:party::party::party::party::party:
 

tbrummell

Guru
Joined
Jan 8, 2011
Messages
1,275
Reaction score
339
Ohhh, can't wait for someone to make it transcribe a voicemail that is left and then email it with the email notification. Let the waiting begin!
 

randy7376

Defnyddiwr Gweithredol
Joined
Sep 29, 2010
Messages
864
Reaction score
144
tbrummell said: "Ohhh, can't wait for someone to make it transcribe a voicemail that is left and then email it with the email notification. Let the waiting begin!"

That's exactly what I was thinking when I read Ward's portion of this thread!

I wish I had more time to play... :)
 

rossiv

Guru
Joined
Oct 26, 2008
Messages
2,624
Reaction score
139
I just tried it on my PIAF2 box and it works! Voice transcriptions were almost perfect on my tries as were the numbers for Speak2Dial. YAY!
 

wardmundy

Nerd Uno
Good question. There's not much info from Google on this, mostly from third parties. The AGI script essentially masquerades as the Chrome web browser to use Google's freely available public service. Google could certainly add a layer of encryption if they wanted to keep the public out. There now are patent trolls to deal with as well. My guess is you probably can expect the same sort of Wild West ride that everyone came to know and love with Google Voice. :cowboyb:
 

wardmundy

Nerd Uno
Speech to Text for Voicemails

For those that want to experiment, here's a very rough cut at what would be needed to transcribe voicemails.

1. Install the perl script that's included in the open source tarball above:

Code:
cd /root/asterisk-speech-recog-0.4/samples
cp speech-recog-cli.pl /usr/local/sbin/.


2. Copy any Asterisk voicemail message to a temporary folder. You'll find the messages in the directory tree for a particular extension (702 in this example), and they look like this:

Code:
/var/spool/asterisk/voicemail/default/702/INBOX/msg0000.wav


3. Run the following two commands. The first converts the voicemail to FLAC, a sound format Google supports. The second passes the sound file to Google via the open source perl script, which does the heavy lifting of transcribing the voicemail message:

Code:
flac --best --sample-rate=8000 msg0000.wav -o msg0000.flac
speech-recog-cli.pl msg0000.flac | head -2 | tail -1 | cut -f 2 -d ":"


4a. The raw output from using speech-recog-cli.pl will look like this:

Code:
Openning msg0000.flac
utterance  : here is a sample voicemail message that I'm going to leave after the tone have a nice day
status     : 0
confidence : 0.9633785  <-- The likelihood that the transcription is accurate, 96% in this case
id         : ac9869b4a460bae157a793245bdc0f36


4b. After massaging with | head -2 | tail -1 | cut -f 2 -d ":", you get this:

Code:
 here is a sample voicemail message that I'm going to leave after the tone have a nice day
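
The head/tail/cut pipeline above depends on the utterance being on line 2 and containing no colon of its own. As an alternative, here is a small sketch that pulls a field out by name instead. get_field and sample_output are names invented for this example; the sample text is just the raw output shown in step 4a:

```shell
# Print the value of one named field from speech-recog-cli.pl output on stdin.
# $1 = field name (utterance, status, confidence or id).
get_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")          # split on the FIRST colon only
            if (pos == 0) next            # skip lines with no colon at all
            name = substr($0, 1, pos - 1)
            gsub(/[ \t]+$/, "", name)     # trim padding after the field name
            if (name == key) {
                value = substr($0, pos + 1)
                sub(/^[ \t]+/, "", value)
                sub(/[ \t\r]+$/, "", value)
                print value
                exit
            }
        }'
}

# Stand-in for the real script so the sketch runs without calling Google.
sample_output() {
    cat <<'EOF'
Openning msg0000.flac
utterance  : here is a sample voicemail message that I'm going to leave after the tone have a nice day
status     : 0
confidence : 0.9633785
id         : ac9869b4a460bae157a793245bdc0f36
EOF
}

sample_output | get_field utterance
sample_output | get_field confidence   # prints 0.9633785
```

With the real script it would be used as speech-recog-cli.pl msg0000.flac | get_field utterance.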


Enjoy! :hat:
 
Joined
Jun 29, 2009
Messages
258
Reaction score
0
Very nice! Another possible use of this might be to allow a user to dial a feature code, record some speech, get it transcribed, and have it e-mailed to the address at which they normally receive voicemail notifications (as defined in the extension's settings).

I haven't played with this yet (much too early in the morning) but the possibilities are quite interesting.
 

darmock

PIAF Developer
Joined
Oct 18, 2007
Messages
2,892
Reaction score
98
Dictation! I never even thought of that and it would be the most useful application for me. I have been doing some thinking about how to integrate this with an IVR, but my skills are not up to that. If Google doesn't pull the rug out on this one, IVRs are sure to be the most requested feature.

Yep, and one of the most litigious ones also. Just imagine the patent trolls crawling out of the woodwork suing everyone in sight. All in the name of the almighty dollar. I keep wondering how Google has gotten away with it..... Of course they have more lawyers on tap than our freedom loving government.... (sarcasm intentional)

I can see them going after all the IP addresses that use this service and suing the lot. Amazing what blanket search warrants can do and they sure are easy to get unless you are the government and you don't need one any more..... Course we could use the Tor network......

Tom

Sorry in a ranting mood this morning
 

darmock

PIAF Developer
The short answer is yes, it does not have all the dependencies. The long answer is about 20 pages long. Can 1757 get all the new dependencies installed? Yes.

Keep a list if you get it working. It may be something simple or complex.

Tom
 

darmock

PIAF Developer
Unmodified, it will work with PIAF 2062X. Unknown if it will work with 2060X or 2061X. A definite maybe.

It may not work at all with any prior version of PIAF without modifications, up to and including modification of dialplans, installation of new dependencies, and recompilation of multiple programs. I just can't predict it.

What in the world are you doing installing this on a production system anyway? That is not a good thing ever. This code does not even qualify as alpha.... It is just something to play around with until it is more formalized.

Unfortunately we don't have the resources to test with anything other than the current version of PIAF. So for clarity, we are only testing on the current version of PIAF, which today is 2.0.6.2.x. We no longer develop new stuff for anything other than the 2.0.6.2.x and above tree.

The 1.7.5.7.X tree is the last stable version of the 1.7 tree and it is in security fix only mode. We no longer actively develop new products for it.

Any new programs that we have released in the last couple of weeks or so only work on 2.0.6.2.X or above. So if you have a 2060x or 2061x box it might be time to upgrade it to CentOS 6.2 using update-source. You would need to update the kernels, then let it do a yum update. Then let update-source continue, as dahdi gets broken when you update the kernel and requires a recompile. Of course you can just do it all by hand if that is what you want. It is your box and you have a right to do anything to it.


Tom
 

rg00dman

Guru
Joined
May 25, 2010
Messages
38
Reaction score
0
Well, this will be another weekend I am not doing what the wife wants me to :) Just wondering: in the first example given, could it be configured so that instead of reading back the number, you say someone's name and it checks the phone book and calls them? I can't imagine it will be too hard (I hope). I will give it a go later, but if anyone has any ideas how, that would be great.

Had my PBX for over 3 years now and it just keeps getting better and better.
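
One possible starting point for the name lookup, sketched in shell. Everything here is an assumption for illustration: the lookup_number function, the "name|number" text file format, and the sample entries. It does not read the actual PIAF phone book, which would need its own lookup:

```shell
# Look up a recognized name in a simple phone book file.
# $1 = recognized name, $2 = file with one "name|number" entry per line.
# Prints the number and returns 0 on a match, returns 1 otherwise.
lookup_number() {
    awk -F'|' -v want="$1" '
        tolower($1) == tolower(want) { print $2; found = 1; exit }
        END { exit found ? 0 : 1 }
    ' "$2"
}

# Throwaway phone book with made-up entries.
phonebook=$(mktemp)
printf '%s\n' 'Mom|5551212' 'Office|5550100' > "$phonebook"

lookup_number "mom" "$phonebook"                        # prints 5551212
lookup_number "nobody" "$phonebook" || echo "no match"  # prints no match

rm -f "$phonebook"
```

The match is case-insensitive but otherwise exact, so a transcription that comes back slightly different from the stored name will not be found.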
 
Joined
Jun 29, 2009
Messages
258
Reaction score
0
For those who want to experiment with a speech to e-mail (text) application for your own personal use, you might try dropping something like this into extensions_custom.conf. I am only posting this as an example of what might work, and because of darmock's comment I'm specifically NOT saying anyone should actually try this. If you do try it then it's at your own risk (including any legal risks):

Code:
exten => 788,1,Answer
exten => 788,n,Macro(user-callerid,)
exten => 788,n,Noop(CallerID is ${AMPUSER})
exten => 788,n,Set(DICTEMAIL=${DB(AMPUSER/${AMPUSER}/dictate/email)})
exten => 788,n,Set(NAME=${DB(AMPUSER/${AMPUSER}/cidname)})
; exten => 788,n,Playback(silence/1&after-the-tone&custom/say-msg-prs-pound)
exten => 788,n,agi(speech-recog.agi,en-US)
exten => 788,n,Noop(= Script returned: ${status} , ${id} , ${confidence} , ${utterance} =)
exten => 788,n,System(echo "${utterance}" | mail -s "Dictation from ${NAME} converted to text with ${confidence} confidence" ${DICTEMAIL})
exten => 788,n,Playback(goodbye)
exten => 788,n,hangup

(EDIT: Uncomment the commented out line if you use my suggestion from post #24 in this thread)

For this to work there must be a valid e-mail address in the "Dictation Services"/"Email Address" setting of the calling extension. Since this is simply a suggestion of what might work, there is no error checking and no announcements of any kind (I personally hate the quality of "Flite" synthesized speech, so you won't find it in any of my examples). You dial STT ("Speech To Text") and if the moon and the planets are all in proper alignment AND you don't hang up prematurely (after you stop talking you must wait until you hear "goodbye") your speech converted to text just might be e-mailed to the Dictation Services e-mail address for your extension.
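
The subject line in the example above drops the raw ${confidence} value (something like 0.9633785) into the e-mail. Here is a small sketch of turning that into a percentage before mailing. The function names and the sample caller name are hypothetical, and this is just one way it might be done:

```shell
# Convert a 0..1 confidence score into a whole-number percentage.
confidence_pct() {
    awk -v c="$1" 'BEGIN { printf "%.0f%%\n", c * 100 }'
}

# Build the subject line used for the dictation e-mail.
# $1 = caller name, $2 = confidence score as returned by the AGI script.
dictation_subject() {
    printf 'Dictation from %s converted to text with %s confidence\n' \
        "$1" "$(confidence_pct "$2")"
}

dictation_subject "Ward" 0.9633785
# prints: Dictation from Ward converted to text with 96% confidence
```

A helper script built around this could be called from System() in place of the inline echo | mail pipeline.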

Please check with YOUR lawyer before you assume it's okay to use this, especially in any application that's even remotely commercial. I personally don't care if someone else builds on this, but someone else might. I also happen to think that our so-called "intellectual property" laws need to be seriously overhauled or even abolished altogether (ideas are not the same thing as property and never will be, no matter how many lawyers are willing to stand up in court and tell that lie under oath), but I'm too old to be trying to start any reform movements. The fact that we are in this state is just one symptom of the real cancer on our society, which is the amount of influence big corporations have over our elected officials, but that's another rant for another forum (and this thought would not have even occurred to me in relation to this topic had it not been for darmock's comment).
 

darmock

PIAF Developer
This is completely consistent behavior for my recent appointment as court jester. If things had gone pear-shaped, my post would have started as "Help my boss is really mad at me now..." Luckily God has a soft spot for idiots.


Ouch butt hurt when I fell off the chair laughing too hard! You owe me 400 quatloos for medical expenses.....

Not to mention idiot developers...... sigh sometimes it is easier to just slam my head in the door......

I understand however.

Tom :crazy:
 

wardmundy

Nerd Uno
Install seems to proceed just fine from there, but when trying out the sample feature code, S-P-E-A-K I get "Say something in English, when done press the pound key." Immediately followed by "Have a nice day! Good bye." It doesn't wait for me to say anything. Thinking it might be a permissions problem, I changed the script ownership to asterisk:asterisk 0777, the full asterisk log shows the script exits with code 0 indicating no errors, but I can't get it to work on this system. Anyone have any ideas? Is it possible that this PIAF version doesn't have the necessary dependencies, i.e. flac?

I had reworked the download to make it simpler from GitHub. :crazy: But I substituted the production code link for the experimental one. I tested the latter and it worked fine. So it may be that this was a bug in the current production code. You might try downloading the latest and greatest to see if that fixes the problem. There's nothing that needs to be done to a base PIAF2 install to get it working.
 

darmock

PIAF Developer
this thought would not have even occurred to me in relation to this topic had it not been for darmock's comment).

I take no credit for the original thought. We were "informed" about what had happened to other players in the VOIP market who ran afoul of the little bastard patent trolls. Of course this entire thread is hypothetical and no one should ever try anything that even might be patented without paying royalty..... I bet someone patented the concept of fire and will come after you for your next BBQ.

Just have to remember the jpg format and the patent trolls who went after websites and suggested (blackmailed) them into buying a license for the site. When they tried that with several websites I help with, we deleted all the jpgs and went over to other open source unencumbered formats. You want a laugh? Look at VP8 from Google and the lawsuits over that.

Sorry way off topic.... Back to the regularly scheduled programming.


Tom
 
Joined
Jun 29, 2009
Messages
258
Reaction score
0
Tried an install to PIAF ver. 1.7.5.6 PURPLE using the directions above. The first problem was the wget wouldn't work until I changed it to:
Code:
wget --no-check-certificate
Install seems to proceed just fine from there, but when trying out the sample feature code, S-P-E-A-K I get "Say something in English, when done press the pound key." Immediately followed by "Have a nice day! Good bye." It doesn't wait for me to say anything. Thinking it might be a permissions problem, I changed the script ownership to 0777, the full asterisk log shows the script exits with code 0 indicating no errors, but I can't get it to work on this system. Anyone have any ideas? Is it possible that this PIAF version doesn't have the necessary dependencies, i.e. flac?

I don't know what PiaF has installed but I do know you need to have the following:

flac and sox (try using which to see if they are installed, if not then yum install them)

The two perl modules mentioned in the "use" statements in /var/lib/asterisk/agi-bin/speech-recog.agi
Hopefully you know how to install Perl modules.

And you already figured out that the ownership of /var/lib/asterisk/agi-bin/speech-recog.agi should be asterisk:asterisk

There may be other dependencies, but those are the ones I am aware of after a quick glance at the agi file.
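
The checks described above can be sketched in shell. check_deps and list_perl_modules are names invented here; the second one simply pulls module names out of the "use" lines of whatever file it is given, so it makes no assumption about which modules the script actually needs:

```shell
# Report whether each named command is on the PATH.
# Returns 1 if anything is missing, 0 otherwise.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Print the Perl modules named in "use Module::Name" lines of a script.
# Lowercase pragmas such as strict and warnings are skipped.
list_perl_modules() {
    sed -n 's/^[[:space:]]*use[[:space:]]\{1,\}\([A-Z][A-Za-z0-9_:]*\).*/\1/p' "$1" | sort -u
}

check_deps flac sox || echo "Install anything listed as missing, e.g. with yum."
```

Point list_perl_modules at /var/lib/asterisk/agi-bin/speech-recog.agi, then try perl -MSome::Module -e 1 for each name it prints; an error there means that module still needs installing.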
 
Joined
Jun 29, 2009
Messages
258
Reaction score
0
Just one more thought, for those who dislike flite as much as I do...

There is a stock Asterisk recording where Allison says "Say your temporary message and then press the pound key". It is in the /var/lib/asterisk/sounds/en directory and the filename is say-temp-msg-prs-pound.wav (it may have other extensions also).

Now, I'm not telling anyone to do this, but if one were precise with an audio editor (such as Audacity), one might be able to expertly cut out the word "temporary" by, say, deleting the portion of the file from approximately 0.4576 seconds to 0.9623 seconds or thereabouts (be sure to "find zero crossings" before deleting if you don't want a click). One might also want to normalize the result to -6.0 dB.

If you then uploaded the resulting recording (which you might have named say-msg-prs-pound.wav) using the system recordings module… well, I'll leave the rest to your imagination. [Possible hint: Playback(silence/1&after-the-tone&custom/say-msg-prs-pound) just before the agi call.]

Is cutting one word out of an existing recording that comes with Asterisk legal? In this day and age, darned if I know.
 

rg00dman

Guru
Installing flac made it work for me on the following setup


*******************************************************************
*              PBX in a Flash Version Daemon Status               *
*                      Running Asterisk 1.4                       *
*******************************************************************
* Asterisk   * ONLINE * Zaptel     * ONLINE * MySQL      * ONLINE *
* SSH        * ONLINE * Apache     * ONLINE * Iptables   * ONLINE *
* Fail2ban   * ONLINE * IP Connect * ONLINE * Ip6tables  * ONLINE *
* BlueTooth  * ONLINE * Hidd       * ONLINE * NTPD       * ONLINE *
* Sendmail   * ONLINE * Samba      * ONLINE * Webmin     * ONLINE *
* Ethernet0  * ONLINE * Ethernet1  * N/A    * Wlan0      * N/A    *
*******************************************************************
* Running Asterisk Version : Asterisk 1.4.21.2
* Asterisk Source Version : 1.4.21.2
* Zaptel Source Version : 1.4.12.1
* Libpri Source Version : 1.4.7
* Addons Source Version : 1.4.7
********************************************************************
CentOS release 5.6 (Final) :32 Bit Kernel: 2.6.18-194.8.1.el5


Well I get an all circuits are busy message but that's probably down to some of my more creative dial plans
 