FOOD FOR THOUGHT HOWTO Transcribe Asterisk VM

ghurty

Senior Member
Joined
Jan 13, 2009
Messages
852
Reaction score
4
Is there any way of tying a speech-to-text app into the Asterisk voicemail system?

That way, when a voicemail gets emailed, it also sends a rough transcription of it (the same way Google Voice does).



Thanks
 
Joined
Apr 17, 2009
Messages
829
Reaction score
9
Why not just have the system email you the voicemail? No need to create extra steps. ;)
 

ghurty

Senior Member
Joined
Jan 13, 2009
Messages
852
Reaction score
4
When you are viewing it on a cell phone, it is much easier to get a basic idea of what the voice mail was about by glancing at a rough transcription.
 
Joined
Apr 17, 2009
Messages
829
Reaction score
9
Sorry. I'm so used to my HTC Touch Pro and having Windows on it that I read emails and type on it while driving to customer sites all day, so a little old voicemail doesn't slow me down. I just click on the attachment and it plays.

But I do also see where a transcription would be nice.
 

wardmundy

Nerd Uno
Joined
Oct 12, 2007
Messages
19,206
Reaction score
5,226
Speech to text (without a system trained to the speaker) is hit and miss at best... and it's still pretty rough. Much of it depends upon the voice quality and consistency of the speaker. Google Voice is as good as it gets, and it's not very good. If you really need it, I'd forward my calls to Google Voice and let it do the transcription. To see how it works and try it yourself, see our article AND the results.
 

cjkeeme

Guru
Joined
Jun 18, 2008
Messages
203
Reaction score
0
Has there been any progress in this area, or is it best to simply hire a company that has real people doing the transcriptions?

I have a client needing this feature.
 

parker

New Member
Joined
Apr 30, 2009
Messages
75
Reaction score
0
Has anyone ever tried the speech recognizer program that's referenced by the "zoip" freepbx module?
 

lowno

Guru
Joined
Feb 18, 2009
Messages
125
Reaction score
8
Here is how to modify your PBX to add voicemail transcription. This is not my work, but I have cleaned it up. It will not work if you delete your voicemails after they are emailed. Also, I had to edit the email body in the voicemail.conf file because, for some reason, editing the vm_email.inc file changed nothing. There is probably a good reason for that, but I just don't know it. Anyway, these instructions worked on my 1.8.3 system.

Instructions

Step 1
Install the dependencies needed to compile and install PocketSphinx:

yum install libtool

Step 2
Download the latest versions of PocketSphinx and SphinxBase and extract them. You can find the latest versions on their website: http://cmusphinx.sourceforge.net/wiki/download
You can also download the packages using wget (the URLs are quoted so the shell does not treat the "?" as a wildcard):
wget "http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/sphinxbase/?view=tar" -O sphinxbase.tgz
wget "http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/pocketsphinx/?view=tar" -O pocketsphinx.tgz

And extract:
tar xvfz sphinxbase.tgz

tar xvfz pocketsphinx.tgz

Step 3
You can then compile and install sphinxbase:
cd sphinxbase/

./autogen.sh

make

make install

Then compile and install pocketsphinx:
cd ../pocketsphinx/

./autogen.sh

make

make install
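
Before moving on, it may be worth confirming that the build actually landed on the PATH. This is only a minimal sketch, assuming the default /usr/local prefix used by "make install"; have_cmd is a hypothetical helper name, not part of PocketSphinx.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds if the named program
# can be found on the current PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pocketsphinx_continuous; then
    echo "pocketsphinx_continuous found at $(command -v pocketsphinx_continuous)"
else
    echo "pocketsphinx_continuous not found; check that /usr/local/bin is on PATH"
fi
```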

Step 4
Download an acoustic model that pocketsphinx will use to transcribe the voicemails.
wget http://downloads.sourceforge.net/pr...ustic Model/communicator_4000_20080321.tar.gz

And extract and move the acoustic model somewhere we can use it later:
tar xvfz communicator*.tar.gz

mv Communicator_40.cd_cont_4000 /var/lib/asterisk/communicator/
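
A quick sanity check on the moved directory can catch a failed download or a nested folder early. The sketch below is an assumption-laden illustration: check_model is a hypothetical helper, and the file names it looks for (mdef, means, variances, transition_matrices) are taken from the PocketSphinx log shown later in this thread, so they may not match every model.

```shell
#!/bin/sh
# check_model is a hypothetical helper: it lists any expected model file
# missing from the directory given as $1 and returns non-zero if any is absent.
check_model() {
    dir=$1
    missing=0
    for f in mdef means variances transition_matrices; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return $missing
}

if check_model /var/lib/asterisk/communicator; then
    echo "acoustic model looks complete"
else
    echo "acoustic model directory is incomplete"
fi
```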

Step 5
We’ll next need to create a script that will perform the transcription, and save the transcript as a text file so we can include the text in an email. You can view the script below:

nano /sbin/voicemail-transcribe.sh

#!/bin/sh

voicemaildir=/var/spool/asterisk/voicemail/$1/$2/INBOX

echo "$(date) : $voicemaildir" >> /var/log/voicemail-notify.log

for audiofile in "$voicemaildir"/*.wav; do
    # Skip the unexpanded pattern when the mailbox has no recordings
    [ -f "$audiofile" ] || continue
    # Swap only the trailing .wav for .transcript
    transcriptfile="${audiofile%.wav}.transcript"
    # For each message.wav we check if message.transcript
    # exists
    if [ ! -f "$transcriptfile" ]; then
        # If not, we create it (errors are appended to the log,
        # not written over it)
        /usr/local/bin/pocketsphinx_continuous -infile "$audiofile" \
            -hmm /var/lib/asterisk/communicator \
            -samprate 8000 2>> /var/log/asterisk/voicemail-notify.log \
            > "$transcriptfile"

        # Now we can do whatever we want with the new transcription
        echo $(cut -d: -f2 "$transcriptfile")
    fi
done

Then, ensure that the script is executable:
chmod +x /sbin/voicemail-transcribe.sh
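
To see what the script does with each recording, the two text manipulations it relies on can be tried in isolation. This is a sketch only: transcript_path and transcript_text are hypothetical helper names, and the sample line assumes the recognizer prints each result as an utterance ID, a colon, and then the recognized text.

```shell
#!/bin/sh
# transcript_path: map /path/msg0001.wav to /path/msg0001.transcript
transcript_path() {
    printf '%s\n' "${1%.wav}.transcript"
}

# transcript_text: keep only the second colon-separated field of each line
# read from stdin (assumed input format: "<utterance id>: <recognized text>")
transcript_text() {
    cut -d: -f2
}

transcript_path /var/spool/asterisk/voicemail/default/100/INBOX/msg0001.wav
printf '000000000: please call me back\n' | transcript_text
```

Once a real message is sitting in a mailbox, the script itself can be run by hand in the same way the mail script calls it, for example "/sbin/voicemail-transcribe.sh default 100", where 100 is a made-up mailbox number.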

Step 6
We can then modify the voicemail email template to include a section for the voicemail transcription. This is an important step because we’ll be using a secondary script that will insert the voicemail transcript into the email after the keywords “Voicemail Transcription:”.

Add the following text to /etc/asterisk/vm_email.inc at the end of the “emailbody=” line. Note that this must be one continuous line.

\nExtension: ${VM_MAILBOX}\nVoicemail Transcription: \n\n

Step 7
We’ll then need to create a secondary script that will grab the email generated by Asterisk and modify it to include the voicemail transcription. You can view the script below:

nano /sbin/vm-modify.sh

#!/bin/sh
# Save the email that Asterisk pipes in on stdin
cat > /tmp/voicemail.tmp

# Grab the extension of the voicemail
EXT=$(awk '/Extension:/ {print $2; exit}' /tmp/voicemail.tmp)

# Transcribe the voicemail for the given extension
/sbin/voicemail-transcribe.sh default "$EXT" > /tmp/transcribe.tmp

# Append the outgoing email with the voicemail transcription
sed "/Voicemail Transcription:/ r /tmp/transcribe.tmp" /tmp/voicemail.tmp \
| /usr/sbin/sendmail -t

# Clean up temporary files
rm -f /tmp/voicemail.tmp /tmp/transcribe.tmp

Then, ensure that the script is executable:
chmod +x /sbin/vm-modify.sh
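
The extension lookup and the sed "r" insertion can be tried on a fake message without touching Asterisk or sendmail. A minimal sketch, assuming the email body contains the "Extension:" and "Voicemail Transcription:" lines from Step 6; extract_ext and insert_transcript are hypothetical helper names and the sample text is invented.

```shell
#!/bin/sh
# extract_ext: print the second word of the first line containing "Extension:"
extract_ext() {
    awk '/Extension:/ {print $2; exit}' "$1"
}

# insert_transcript: print the message in $1 with the contents of file $2
# inserted after each line matching "Voicemail Transcription:"
insert_transcript() {
    sed "/Voicemail Transcription:/ r $2" "$1"
}

msg=$(mktemp)
txt=$(mktemp)
printf 'Extension: 100\nVoicemail Transcription: \n\n' > "$msg"
printf ' please call me back\n' > "$txt"

echo "extension: $(extract_ext "$msg")"
insert_transcript "$msg" "$txt"

rm -f "$msg" "$txt"
```

The transcript lands on the line directly after the keyword, which is why the template in Step 6 has to contain that exact text.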

Step 8
You’ll then need to modify the Asterisk voicemail configuration to use our script instead of the default sendmail agent:

nano /etc/asterisk/voicemail.conf

And add the mailcmd option to use the script:
[general]

…

mailcmd=/sbin/vm-modify.sh
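
To double-check that the option was saved where expected, the value can be read back out of the file. This is just a sketch: mailcmd_value is a hypothetical helper, and it assumes the option sits on a line of its own in the form "mailcmd=...".

```shell
#!/bin/sh
# mailcmd_value: print the value of every "mailcmd = ..." line in file $1
mailcmd_value() {
    sed -n 's/^[[:space:]]*mailcmd[[:space:]]*=[[:space:]]*//p' "$1"
}

conf=/etc/asterisk/voicemail.conf
if [ -r "$conf" ]; then
    echo "mailcmd is: $(mailcmd_value "$conf")"
else
    echo "$conf is not readable on this machine"
fi
```

Asterisk presumably needs to reload its voicemail configuration before the change takes effect; from the Asterisk CLI, "module reload app_voicemail.so" should do it.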
 

KUMARULLAL

Guru
Joined
Feb 20, 2008
Messages
243
Reaction score
28
Is it possible to get this working in an OpenVZ container?
I installed it in an OpenVZ container. However, it depends on a sound card and uses ALSA to find the card.
I have installed alsa-utils on the Debian server that is the host node (Proxmox).
How do I assign ALSA to the container?
When I run (in the OpenVZ container):
/usr/local/bin/pocketsphinx_continuous

This is what I get: (Partial output)
Code:
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(520): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(908): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(308): Allocating 137543 * 20 bytes (2686 KiB) for word entries
INFO: dict.c(323): Reading main dictionary: /usr/local/share/pocketsphinx/model/lm/en_US/cmu07a.dic
INFO: dict.c(212): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(326): 133436 words read
INFO: dict.c(332): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(335): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
INFO: ngram_model_dmp.c(242):     5001 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(291):   436879 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(317):   418286 = LM.trigrams read
INFO: ngram_model_dmp.c(342):    37293 = LM.prob2 entries read
INFO: ngram_model_dmp.c(362):    14370 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(382):    36094 = LM.prob3 entries read
INFO: ngram_model_dmp.c(410):      854 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(466):     5001 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(367): /usr/local/bin/pocketsphinx_continuous COMPILED ON: Aug  9 2011, AT: 16:57:36

ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
Error opening audio device default for capture: No such file or directory
FATAL_ERROR: "continuous.c", line 242: Failed top open audio device
Any ideas?
Thanks in advance.
 

TheShniz

Guru
Joined
Nov 15, 2007
Messages
560
Reaction score
2
I can't help you there, but I will mention that the accuracy was soooooo very poor (on a bare metal install) that I found it completely unusable... now if someone has some ideas on how to increase accuracy, I'd love to hear!


This is a big item on my wishlist, but accuracy has to be significantly improved.
 

KUMARULLAL

Guru
Joined
Feb 20, 2008
Messages
243
Reaction score
28
Hi Randy7376,
I have read the post you sent a couple of days earlier. I also installed the snd_dummy but the container is specifically looking for ALSA. It did not work.
Thanks anyway for your response.
 