
Speech-to-Text Dir Assistance

Discussion in 'Add-On Install Instructions' started by wardmundy, Jan 30, 2012.

  1. wardmundy Nerd Uno

    Speech-to-Text Directory Assistance Comes to Asterisk


    If you are running an existing copy of Incredible PBX, be sure to load this update after performing the steps in the article:

    Code:
    cd /var/lib/asterisk/agi-bin
    wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
    tar zxvf callwho21.tgz
    rm callwho21.tgz
  2. lgaetz Pundit

    Excellent. I haven't tried it yet, but I assume there will be lots of issues trying to convert proper names to text. I wonder if it's possible to incorporate some sort of fuzzy matching into the MySQL query, so that if the search is off by a single character it would still yield a match. Maybe run an exact search and, if no results come up, then run a fuzzy search. I guess we are going to have to start building contact databases with a field for alternate pronunciations.
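The "exact search first, fuzzy search only if nothing comes up" idea can be sketched outside the database. The Python below is a minimal illustration, not part of the CallWho script: the function names are made up, the directory is a plain list, and "off by a single character" is read as one inserted, deleted, or substituted character.

```python
def within_one_edit(a, b):
    """Return True if a and b differ by at most one insertion, deletion, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    # Make `a` the shorter (or equal-length) string.
    if len(a) > len(b):
        a, b = b, a
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Same length: allow exactly one substituted character.
        return a[i + 1:] == b[i + 1:]
    # `b` is one character longer: allow one extra character in `b`.
    return a[i:] == b[i + 1:]


def lookup(utterance, names):
    """Return exact (case-insensitive) matches; fall back to one-edit matches only if there are none."""
    spoken = utterance.strip().lower()
    exact = [n for n in names if n.lower() == spoken]
    if exact:
        return exact
    return [n for n in names if within_one_edit(n.lower(), spoken)]
```

With a directory of "American Airlines" and "Delta Airlines", a transcription of "american airline" finds nothing exactly, so the fallback runs and returns "American Airlines".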
  3. wardmundy Nerd Uno

  4. wardmundy Nerd Uno

    Wasn't Too Hard

    OK. Here's the query you'd need to use in nv-callwho.php to implement soundex() lookups. Works great!

    Code:
    $query = "SELECT * FROM user1 where strcmp(soundex(name), soundex('$dialcode')) = 0 order by name asc";
  5. lgaetz Pundit

    Never heard of soundex before. I did a bit of reading and some trial and error and came up with this query:
    Code:
    SELECT * FROM `user1` WHERE LEFT(SOUNDEX(name),4) LIKE LEFT(SOUNDEX("american"),4) LIMIT 1
    which results in:
    Code:
    1     American Airlines     *     8004337300
    As shown above, this query is not ready for prime time. First off, the SOUNDEX() function is tailored to English. Secondly, the query only compares the initial sound of the strings. I think the approach to take is to break up the strings into their constituent words, get soundex values for each word and search for the initial sounds for all words. Any MySQL gurus out there?

    D'Oh! Ward beat me by 60 seconds.
  6. wardmundy Nerd Uno

    Great minds...
  7. lzaf Guru

    The "right" (TM) :rolleyes: way to do that, and what is actually used in cases like this in speech recognition applications, is the use of grammar rules. By defining a grammar for your application you can specify words and patterns of words to be recognised in a certain way. For instance, we can have Joshua or Josh always recognised as Joshua, Mark and Marky recognised the same way, we can make surnames optional, we can have Delta air lines always recognised as "delta air lines" and American airlines as "american airlines", and so on. In the case of 'Directory Assistance', a grammar well defined by the user can make the database lookup a lot more reliable.

    There is already a defined format for grammar syntax by the W3C.

    Asterisk has an internal API for speech recognition that supports grammar rules.

    Google's STT engine supports grammars, but as far as I know it supports only some predefined grammars which cannot be edited by the user. (This might not be true; I'll have to search a bit deeper in their code, but the only thing I have seen so far is the use of predefined grammars.)

    The speech recognition AGI script for Asterisk has no support for grammars yet. It is high on my TODO list but it's not yet implemented. My first thought was to use the internal Asterisk speech recognition API, but that doesn't seem very practical, especially for an AGI script. What I'm actually planning to do is to add support for Augmented Backus-Naur Form (ABNF) grammars without the use of any 3rd party APIs. This will take some time and it all depends on the amount of free time I'll have over the next few weeks. ;)
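Until grammar support exists, the Josh/Joshua kind of normalisation described above can be approximated with a plain lookup table applied to the transcription before the database search. This is only a stand-in for a real grammar, not ABNF and not anything the AGI script provides; the table contents and the function name below are hypothetical.

```python
# Hypothetical alias table: each spoken variant maps to the form stored in the directory.
ALIASES = {
    "josh": "joshua",
    "marky": "mark",
    "delta airlines": "delta air lines",
}


def canonical(utterance):
    """Lower-case the utterance, collapse runs of whitespace, then apply the alias table."""
    key = " ".join(utterance.lower().split())
    return ALIASES.get(key, key)
```

Anything not listed in the table passes through unchanged, so the lookup that follows behaves exactly as before for names without aliases.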
  8. Phone_User Guru

    My thoughts would be to use PHP to get an array of the words, then do the SQL query finding all entries with the first word, then from that subset find the second word, etc.

    php has this function

    str_word_count("Your String", 2);

    would return an array with
    [0] => Your
    [5] => String

    With the numbers being the starting positions of each word.
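The same word-by-word narrowing can be sketched in Python for anyone who wants to try the idea before touching nv-callwho.php. This is an illustration under assumptions: the directory is an in-memory list rather than a SQL table, and both function names are made up.

```python
import re

# Letters, apostrophes and hyphens count as word characters, roughly as in PHP's str_word_count().
WORD = re.compile(r"[A-Za-z'-]+")


def words_with_positions(text):
    """Map each word's starting offset to the word itself."""
    return {m.start(): m.group() for m in WORD.finditer(text)}


def narrow(utterance, names):
    """Filter the directory one spoken word at a time, keeping names that contain every word."""
    remaining = list(names)
    for word in words_with_positions(utterance.lower()).values():
        remaining = [n for n in remaining if word in WORD.findall(n.lower())]
    return remaining
```

Because each pass only looks at what survived the previous one, the word order in the utterance does not matter: "Mary Smith" still finds an entry stored as "Smith, Mary".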
  9. lgaetz Pundit

    Great stuff, I wish I understood it all. Dragon is going to hate you.
  10. wardmundy Nerd Uno

    Standard Soundex returns 4 characters, but MySQL's SOUNDEX() returns a longer string for longer input. So... you may want to experiment with a substring of the result to get broader hits. For example, if you want to obtain all the Smiths when you say "Smith John" or when you just say "Smith" and you have "Smith John" and "Smith Mary" in your database, then use the function below. With MySQL's implementation of SOUNDEX(), searching for "Smith" would not return "Smith John" without using this substring approach.

    Code:
    $query = "SELECT * FROM user1 where strcmp(substring(soundex(name),1,4), substring(soundex('$dialcode'),1,4)) = 0 order by name asc";
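To see what the four-character comparison does and does not group together, it can help to try it on codes you already know. The Python sketch below mirrors the substring(..., 1, 4) test in the query; the function name is made up, and the sample codes are MySQL values reported elsewhere in this thread rather than anything computed here.

```python
def same_prefix(code_a, code_b, length=4):
    """Compare only the first `length` characters of two soundex codes."""
    return code_a[:length] == code_b[:length]


# 'american' vs 'american airlines': the first four characters agree.
print(same_prefix("A5625", "A56256452"))

# 'delta' vs 'delta airlines': the padding zero in D430 does not match the 6 in D436452.
print(same_prefix("D430", "D436452"))
```

The second case suggests that a short first word, whose code is padded with zeros, may still miss longer entries, so it is worth testing short names such as the Smith example against your own data.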
    You can retrieve and install our latest version of the CallWho AGI script by logging into your server as root and...

    Code:
    cd /var/lib/asterisk/agi-bin
    wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
    tar zxvf callwho21.tgz
    rm callwho21.tgz
  11. lgaetz Pundit

    More playing with MySQL and soundex(). It looks like soundex() will analyze the string and return a value that starts with a leading non-digit representing the initial sound of the string and groups of 3-4 digits representing each word in the string. soundex() will return as a minimum 4 characters: the leading non-digit, digits representing the word, and trailing zeros if required to make up 4. The trailing zeros are not included in soundex() values of complex strings. In my testing I never got a soundex() value with zeros in it except where it was required to make up the minimum four characters. Also, it is important to strip punctuation, specifically apostrophes (and probably others), from the utterance before searching or MySQL will choke.
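An alternative to stripping apostrophes is to pass the utterance as a bound parameter, so that a quote in a name like O'Brien is treated as data and never as part of the SQL. The sketch below uses Python's sqlite3 module purely as a stand-in for MySQL; the column names, the second sample row, and the function name are hypothetical.

```python
import sqlite3


def find_by_name(conn, utterance):
    """Look up a name with a bound parameter so quotes in the utterance cannot break the query."""
    cur = conn.execute(
        "SELECT name, phone FROM user1 WHERE name = ? COLLATE NOCASE ORDER BY name ASC",
        (utterance,),
    )
    return cur.fetchall()


# In-memory stand-in for the directory table; columns and the O'Brien row are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user1 (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO user1 VALUES (?, ?)",
    [("American Airlines", "8004337300"), ("O'Brien Pat", "5550100")],
)
```

Stripping punctuation may still be useful for the phonetic comparison itself, but with bound parameters it is no longer needed just to keep the query from failing.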

    The MySQL soundex() values of various strings are as follows:
    'american airlines' = A56256452
    'american air lines' = A56256452
    'airlines american' = A64525625
    'air lines american' = A64525625
    'american' = A5625
    'airlines' = A6452
    If I were to speak "American Airlines" I would want the database to match all of the above entries if present in the database.

    Looking at another example:
    'delta airlines' = D436452
    'delta' = D430
    'airlines' = A6452
    'delta airlines delta' = D436452343
    In this case the word delta returns a soundex() with a trailing zero which is not present in the concatenated form. There is also a stray '3' that appears in the concatenated form that I can't account for. When processing soundex() values for searching within a larger string, you have to strip the leading non-digit and trailing zero if present.

    There are also cases that just don't work well using soundex:
    'honey' = H500
    'bee' = B000
    'honey bee' = H510
    'bee honey' = B500
    In this case once you strip zeros and non-digits there is nothing useful left to search for. Worse tho, is that the complex versions are not a simple concatenation of the two simple strings.
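One reading that is consistent with every value listed above: the function ignores spaces, keeps only the first letter of the whole string as a letter, and turns every later consonant into a digit, with no cap on length. Under that reading the unexplained '3' in 'delta airlines delta' would simply be the d of the second 'delta', and 'honey bee' is H510 because the b of 'bee' becomes a 1. The Python below is an approximation written for illustration, not MySQL's source. The function name is made up, the choice not to let vowels separate repeated digits is an assumption, and the sketch has only been checked against the codes quoted in this thread.

```python
# Letter-to-digit table for A..Z; '0' marks letters that produce no digit.
CODES = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "01230120022455012623010202"))


def soundex_unbounded(text):
    """Approximate a MySQL-style soundex: one code for the whole string, no length cap."""
    letters = [c for c in text.upper() if c in CODES]  # spaces and punctuation are ignored
    if not letters:
        return ""
    result = letters[0]            # only the very first letter is kept as a letter
    last = CODES[letters[0]]
    for c in letters[1:]:
        digit = CODES[c]
        if digit != "0" and digit != last:
            result += digit
            last = digit           # assumption: vowels do not reset `last`
    return result.ljust(4, "0")    # pad short codes to four characters
```

Seen this way, the code for a multi-word entry is not built from per-word codes at all, which would explain why the combined values are not a simple concatenation of the single-word ones.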

    So getting back on track, if I speak "american airlines" probably the best order of search is:
    1. Do an initial search with the unmodified utterance
    2. Break the utterance into words and search for presence of raw words in any order, so Smith, Mary matches Mary, Smith.
    3. Do a search with the soundex(utterance)
    4. Break the utterance into individual words, soundex() each word, strip the leading non-digits and trailing zeros, and search for the presence of each/all soundex() value
    Breaking up strings in MySQL is beyond my skill, and my limited search reveals that MySQL can't break a delimited string. Any MySQL gurus know how to approach this?
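If the word splitting is done in the calling script rather than in MySQL, the search order above becomes a list of matchers tried in turn. The Python sketch below shows only the first two steps (unmodified utterance, then raw words in any order); the soundex steps could be appended to the same list. All names are hypothetical and the directory is an in-memory list.

```python
import re


def tokens(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact(utterance, name):
    """Step 1: same words in the same order."""
    return tokens(utterance) == tokens(name)


def all_words_present(utterance, name):
    """Step 2: every spoken word appears in the entry, in any order."""
    return set(tokens(utterance)) <= set(tokens(name))


def tiered_search(utterance, names, tiers):
    """Try each matcher in order and return the hits from the first tier that finds any."""
    for matches in tiers:
        hits = [n for n in names if matches(utterance, n)]
        if hits:
            return hits
    return []
```

Stopping at the first tier that returns anything keeps the broader, noisier matchers from running when a stricter one has already found the entry.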
  12. wardmundy Nerd Uno

    Just thinking out loud...

    Soundex is not perfect. It's a computer algorithm with all the limitations that come with that. For example, Katherine and Catherine don't match using soundex even though they sound exactly the same.

    Most folks won't want to match every single airline in a database when they say American Airlines or Delta Airlines. Using the 4-character soundex, saying American Airlines will match American, American Air, American Air Lines, and American Airlines. I think that's probably the best we can do. I've included a link in my previous post to the current version of the code.
  13. lgaetz Pundit

    Agreed, it can get complex really quick. My thinking was that the user may not know how a name is stored in the database, they may say "Mary Smith" but they will want to get a hit if it's stored as "Smith, Mary". If it is assumed the user knows how the name is stored, none of this is necessary. I can see this routine being used in an IVR to navigate a company directory, so assuming knowledge on the part of the user may not work in some situations.
  14. wardmundy Nerd Uno

    Got it now. Try this one...

    Code:
    cd /var/lib/asterisk/agi-bin
    wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
    tar zxvf callwho21.tgz
    rm callwho21.tgz

    The way this works is that, if the words spoken (Mary Smith) don't match anything in the database, then it turns the words around (Smith Mary) and tries again. In this way, we avoid returning every airline if someone says American Airlines when there is an actual match initially.
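The retry logic described above can be sketched in a few lines of Python. This is not the code inside the tarball: the function name is made up, the directory is a list, and plain case-insensitive comparison stands in for the soundex test the real script uses.

```python
def lookup_with_reversal(utterance, names):
    """Match the words as spoken; if nothing is found, reverse the word order and retry."""
    by_key = {}
    for name in names:
        # Normalise case and spacing so "Smith  Mary" and "smith mary" share a key.
        by_key.setdefault(" ".join(name.lower().split()), []).append(name)
    spoken = utterance.lower().split()
    for attempt in (spoken, spoken[::-1]):
        hits = by_key.get(" ".join(attempt), [])
        if hits:
            return hits
    return []
```

Because the reversed form is only tried after the spoken form fails, an utterance that already matches an entry never triggers the second lookup.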
  15. lgaetz Pundit

    Nice. Did you reject the idea of doing an initial search on the utterance without using soundex()? As the query exists now an utterance of "American Airlines" will match any stored entry starting with the word 'American'?
  16. lgaetz Pundit

    The example provided does match if the leading non-digit is stripped:
    'Katherine' = K365
    'Catherine' = C365
    'Katherine Catherine' = K3652365
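Stripping the leading letter (and any padding zeros) and then looking for the remaining digits inside a longer code can be sketched as follows. The function names are made up, and the codes used to try it out are the MySQL values quoted in this thread.

```python
def digits_only(code):
    """Drop the leading letter and any trailing padding zeros from a soundex code."""
    return code[1:].rstrip("0")


def sounds_contained(word_code, entry_code):
    """True if the word's digits appear anywhere inside the entry's digits."""
    digits = digits_only(word_code)
    return bool(digits) and digits in digits_only(entry_code)
```

With the values above, 'Katherine' (K365) and 'Catherine' (C365) both reduce to 365, and the digits of 'airlines' (A6452) are found inside 'american airlines' (A56256452). A word like 'bee' (B000) reduces to nothing and is rejected rather than matching everything. Short digit runs can also turn up by coincidence inside unrelated entries, so this probably belongs last in the search order.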
  17. ghurty Senior Member

    This is a fun tool.

    I am looking at nv-callwho.php and trying to figure out how to edit it so that, instead of using flite or swift, it uses Google's TTS engine.

    Thanks
  18. wardmundy Nerd Uno

    Haven't gotten that far yet, but... SOON. One of the issues is that Google breaks up speech into separate files of about 10 words each. I don't think it will be a big deal, but we'll have a look.
  19. ghurty Senior Member

    I tried just swapping in the googletts.agi, but that didn't work, as the command structure is different than flite or swift.

    I'll fool around a bit more.

    Thanks
  20. wardmundy Nerd Uno

    Here's the syntax:

    Code:
    exten => 444,n,agi(googletts.agi,"Have a nice day! Good bye.",en)
    As you can see, this is dialplan code which is the "normal way" that AGI scripts are called. The complexity goes up considerably when you want to call a Perl AGI script from within a PHP AGI script. There's probably a way, but it's gonna be U-G-L-Y. It has memory leak written all over it... even if it works. This might need to await some additional perl magic from lzaf.
