TUTORIAL Use IBM Watson/BlueMix for Email Transcription

Ramsey F · Feb 14, 2017

This is a script I put together to email voicemail notifications with a transcription from the IBM Watson/BlueMix Speech to Text service. The script is not elegant, but it works. To use it for several numbers, make a copy of the script for each number, set the variables in each copy, and add one cron entry per copy so that all of them run.
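
For the multi-number setup, something along these lines might save some typing. It is only a sketch: the helper name make_copies, the file name vm_transcribe.sh, the paths, and the extension numbers are all made up for illustration. It copies a master script once per extension, rewrites the PN= line in each copy, and prints one cron line per copy to paste into crontab -e.

```shell
#!/bin/bash
# Sketch only: the master path, destination directory and extensions
# passed to this helper are hypothetical examples.
make_copies() {
    local master="$1" dest_dir="$2"
    shift 2
    local pn copy
    for pn in "$@"; do
        copy="$dest_dir/vm_transcribe_$pn.sh"
        # Copy the master script, pointing the copy at this extension
        sed "s/^PN=.*/PN=$pn/" "$master" > "$copy"
        chmod +x "$copy"
        # Print the cron line to paste into `crontab -e`
        echo "* * * * * $copy"
    done
}

# Example (hypothetical paths):
# make_copies /usr/local/bin/vm_transcribe.sh /usr/local/bin 702 703
```

The other variables (send_to, subj, and so on) still have to be edited by hand in each copy if they differ per number.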

No warranties expressed or implied. You are on your own.

First, disable the existing voicemail email under IPBX settings > Users > General by removing your email address. Further instructions are in the script.

Code:

#! /bin/bash

################################
######### Instructions #########
################################
### Set variables below
### Place this script somewhere on the Wazo server
### Execute command:  crontab -e
### Place on bottom line: * * * * * /path/to/script
### - Note: above runs script once per minute
### To save and exit (assuming crontab opened the nano editor):
### Press:  CTRL-O (letter o)
### Press:  Enter/Return
### Press:  CTRL-X
### If you want to process multiple phone numbers
###  - Place multiple copies of this script with unique names
###############################################
######### Set the following variables #########
###############################################
#-- Set Wazo phone number to be processed
PN=702

#-- Set the IBM Watson BlueMix credentials
#-- Sign up here:  https://console.ng.bluemix.net/catalog/services/speech-to-text
#--   Speech-to-Text may be under additional services
API_USERNAME="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
API_PASSWORD="xxxxxxxxxxxx"
#-- Set IBM Watson band model - Narrowband Only!!
#--  Get from here: https://www.ibm.com/watson/developercloud/doc/speech-to-text/input.shtml
BAND=en-US_NarrowbandModel

#-- Send the email to:
send_to="[email protected]"

#-- The email FROM field should say:
from="Wazo <[email protected]>"

#-- The email SUBJECT field should say:
subj="Wazo Voicemail"

#-- Attach the WAV file to the email?
attach=YES
##############################################
######### End of variables to be set #########
##############################################

# voicemail directory
vm_dir="/var/spool/asterisk/voicemail/default"
# place processed file in phone directory
vm_processed="$vm_dir/$PN/vm_processed"
# inbox for new voicemails for the phone
INBOX="$vm_dir/$PN/INBOX"
# temporary list of the voicemails
tmp_list=$(mktemp /tmp/vm_list.XXXXX)
# temporary file for the JSON response from Watson
tmp_audio=$(mktemp /tmp/audio.txt.XXXXX)
# temporary outgoing email file
tmp_mail=$(mktemp /tmp/tmpmail.XXXXX)
# temporary processed file for cleaning
tmp_new_processed=$(mktemp /tmp/new_processed.XXXXX)

### Process New VM's ###

# Get a list of existing vm's
ls -1 "$INBOX" | grep "\.wav$" > "$tmp_list"

# Read tmp_list
cat "$tmp_list" | while read -r fn
do
   # Get date of vm file from epoch
   date=$(stat --printf=%Y "$INBOX/$fn")

   # If date and fn are in vm_processed, skip it
   # (-s: stay quiet if the list does not exist yet; -xF: match the whole line literally)
   grep -qsxF "$date $fn" "$vm_processed" && continue


   #-- Get Transcription --#
   #-- Taken from:  https://jrklein.com/2015/08/17/asterisk-voicemail-transcription-via-ibm-bluemix-speech-to-text-api/
   CURL_OPTS=""

   # Send WAV to Watson Speech to Text API. Must use the "Narrowband" (8 kHz) model since the WAV is sampled at 8 kHz.
   # Note: -k turns off TLS certificate verification.
   curl -s $CURL_OPTS -k -u "$API_USERNAME:$API_PASSWORD" -X POST \
       --limit-rate 40000 \
       --header "Content-Type: audio/wav" \
       --data-binary @"$INBOX/$fn" \
       "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true&model=$BAND" 1>"$tmp_audio"

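   # Extract transcript results from JSON response
   # (a response can hold several "transcript" lines; join them so TRANSCRIPT stays on one line)
   TRANSCRIPT=$(grep transcript "$tmp_audio" | sed 's#^.*"transcript": "##g' | sed 's# "$##g' | paste -sd ' ' -)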
   #-- End Transcription --#

   # Asterisk creates a text file of information about each message
   # The information is in the form of variable assignment
   # First, add the transcription to the end of that file to store it
   # Next, read in and evaluate each variable from the file
   textfile="${fn%.wav}.txt"
   echo "TRANSCRIPT=$TRANSCRIPT" >> "$INBOX/$textfile"
   # The process substitution at the bottom of the loop dumps the text file,
   # removes the lines starting with ";" and "[",
   # then adds a quote "'" after the first equals sign "="
   while read -r ln ; do
       # add a trailing quote to the line
       ln="$ln'"
       # evaluate the variable
       eval "$ln"
   done < <(grep -v -e "^;" -e "^\[" "$INBOX/$textfile" | sed "s|=|='|")

   # Build a simple email
   # might upgrade this to HTML
   echo "Voicemail received at $(date -d @"$origtime")" > "$tmp_mail"
   echo "Received from: $(echo "$callerchan" | sed 's|Motif\/||' | sed 's|-.*$||')" >> "$tmp_mail"
   echo "CallerID: $callerid" >> "$tmp_mail"
   echo "Transcript: $TRANSCRIPT" >> "$tmp_mail"
   [[ "${attach^^}" == "YES" ]] && echo "Listen below"  >> "$tmp_mail"
   echo " "  >> "$tmp_mail"
   echo " "  >> "$tmp_mail"

   # Mail it
   # Different command if attaching the WAV file
   [[ "${attach^^}" == "YES" ]] && cat "$tmp_mail" | mutt -a "$INBOX/$fn" -s "$subj" -e "my_hdr From:$from" -- "$send_to"
   [[ "${attach^^}" != "YES" ]] && cat "$tmp_mail" | mutt -s "$subj" -e "my_hdr From:$from" -- "$send_to"

   # Add it to processed list
   echo "$date $fn" >> "$vm_processed"
done

#-- Remove Old Entries From the Processed List --#
#-- When messages are listened to or deleted, they are still in the processed list
#-- This routine cleans the processed list

# Repeating steps from above because state of vm's might have changed during processing

# List all vm files into a tmp list
ls -1 "$INBOX" | grep "\.wav$" > "$tmp_list"

# Read one at a time
cat "$tmp_list" | while read -r fn
do
   # Get date from epoch
   date=$(stat --printf=%Y "$INBOX/$fn")

   # If the date and fn are in vm_processed, put them in new_processed
   # (-xF: match the whole line literally; -s: stay quiet if the list is missing)
   grep -qsxF "$date $fn" "$vm_processed" && echo "$date $fn" >> "$tmp_new_processed"

done

# Replace the vm_processed with the new_processed
test -e "$tmp_new_processed" && mv "$tmp_new_processed" "$vm_processed"

# Clean up temp files so /tmp doesn't get overrun
# temporary list of the voicemails
rm -rf "$tmp_list"
# temporary file for the JSON response from Watson
rm -rf "$tmp_audio"
# temporary outgoing email file
rm -rf "$tmp_mail"


exit 0
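
One thing to watch: the script reads Asterisk's per-message text file by wrapping each value in single quotes and passing the line to eval. A caller ID or transcript that itself contains a single quote will break that quoting. A possible alternative, assuming the file really is plain key=value lines as the comments in the script describe, is to pull out only the fields the email needs with sed. The helper name get_field is hypothetical and is not part of the script above.

```shell
#!/bin/bash
# get_field FILE KEY: print the value of the first "KEY=value" line in FILE.
# Hypothetical helper; assumes KEY contains no regex metacharacters.
get_field() {
    local file="$1" key="$2"
    # -n with the p flag prints only lines where the KEY= prefix was stripped
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Possible use inside the main loop (variable names match the script):
# origtime=$(get_field "$INBOX/$textfile" origtime)
# callerid=$(get_field "$INBOX/$textfile" callerid)
# callerchan=$(get_field "$INBOX/$textfile" callerchan)
```

Because nothing is evaluated, quotes and other shell characters in the values come through as plain text.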

Good luck.

wardmundy · Feb 19, 2017

Nice job, @Ramsey F. Can't wait to try it.
