
Pocketsphinx recognizes random phrases in a silence


You need to use keyword spotting mode.

Pocketsphinx supports a keyword spotting mode in which you specify a list of keywords to look for. The advantage of this mode is that you can specify a detection threshold for each keyword, so that keywords can be spotted in continuous speech. All other modes will try to match the speech against the grammar even if the audio contains words that are not in the grammar. The keyword list looks like this:

oh mighty computer /1e-40/
hello world /1e-30/
other phrase /1e-20/

To run pocketsphinx with a keyword list, use:

pocketsphinx_continuous -inmic yes -dict dict.dict -hmm /home/pi/zero_ru.cd_cont_4000 -kws keyword.list

A threshold must be specified for every keyphrase. For shorter keyphrases you can use smaller thresholds, like 1e-1; for longer ones the threshold must be bigger. The threshold has to be tuned to balance between false alarms and missed detections; the best way to tune it is to run against a prerecorded audio file.
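For example, assuming a prerecorded file test.wav (16 kHz, 16-bit mono; the filename is just an illustration) and the same dictionary and model paths as above, you can decode it offline with -infile and rerun it while adjusting the thresholds until the balance of false alarms and missed detections looks right:

```shell
pocketsphinx_continuous -infile test.wav -dict dict.dict -hmm /home/pi/zero_ru.cd_cont_4000 -kws keyword.list
```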

For the best accuracy it is better to use keyphrases of 3-4 syllables. Phrases that are too short are easily confused.

Related :

So, as a launch-line, you want something like this:

gst-launch alsasrc device=hw:1 ! queue ! audioconvert ! audioresample ! "audio/x-raw-int, rate=16000, width=16, depth=16, channels=1" ! tee name=t ! queue ! audioresample ! "audio/x-raw-int, rate=8000" ! vader name=vader auto-threshold=true ! pocketsphinx lm=/home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm dict=/home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic hmm=/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k name=listener ! fakesink dump=1 t. ! valve drop=0 ! queue ! wavenc ! filesink location=test.wav async=0

The idea is that the one alsasrc represents your microphone, and since access to it can be exclusive, you should not start another alsasrc on the same device. This also makes for better code, since you want to record from the same device you are doing speech detection on (as I understand it), and by using the same source you are guaranteed that this will be the case.

Now, in order to do this in python, you should probably set up the pipeline "properly" rather than doing parse-launch, meaning you have to instantiate each element:

self.recording_valve = gst.element_factory_make('valve')

Add them to your pipeline:

self.pipeline.add (self.recording_valve)

And then link them:

self.recording_valve.link(self.next_element)

...which is what parse-launch does internally, however, in your start and stop recording, you can now do:

self.recording_valve.set_property('drop', False)

To start recording, and

self.recording_valve.set_property('drop', True)

To stop.

(Remember to set drop to True initially, since it is False by default)

The last piece you will need to put all this together is to know about the tee and its request pads. A tee can have any number of src pads, and you can link one like this (assuming the tee is stored as self.tee):

tee_src_pad = self.tee.get_request_pad('src%d')
tee_src_pad.link(self.recording_valve.get_pad('sink'))

Basically, you request a src pad from the tee and link it to the sink pad of the next element. You will need to do this once for each of the tee's two branches.

Get that all up and running, and you should have something that works.


You can use the following function to filter out the data:

function filterData(data, key) {
    var result = [];
    $.each(data, function(index, rcd) {
        result.push([rcd['itemName'], parseFloat(rcd[key])]);
    });
    return result;
}

Then for the Highcharts series you can just write:

{
    index: 0,
    name: "Jan",
    data: filterData(data, "Jan")
}


See this jsFiddle for code and demonstration.

You can use [A-Z].+?\.

This will match any uppercase letter, followed by any other characters, until it finds the . character. The ? in the regex creates what's known as a lazy match (i.e., it stops as soon as the next part of the pattern is found).

The problem with yours is the .*. That is a greedy match, so it will try to match as much as possible.
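A quick sketch in Python's re module (the patterns behave the same in most regex flavors), comparing the lazy and greedy versions on a made-up string:

```python
import re

text = "The Cat sat. The Dog ran. done"

# Lazy: .+? stops at the first "." after the uppercase letter.
lazy = re.findall(r"[A-Z].+?\.", text)    # ['The Cat sat.', 'The Dog ran.']

# Greedy: .* runs to the end and backtracks to the last ".".
greedy = re.findall(r"[A-Z].*\.", text)   # ['The Cat sat. The Dog ran.']

print(lazy)
print(greedy)
```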

If the set of phrases is fixed and does not contain long phrases, you can build not one trie but n tries, where n is the maximum number of words in one phrase.

In the i-th trie, store the i-th word of each phrase. Call this the trie with label i.

To process a query with m words, consider the following algorithm:

  1. For each phrase, store the lowest label of a trie where a word from this phrase was found. Denote it d[j], where j is the phrase index. Initially d[j] = -1 for every phrase j.
  2. Search for the first query word in each of the n tries.
  3. For each phrase j, find a trie label that is greater than d[j] and whose trie contains the word from this phrase. If there are several such labels, pick the smallest one. Denote it c[j].
  4. If there is no such label, the phrase cannot be matched. Mark this case with d[j] = n + 1.
  5. If there is such a c[j] (so c[j] > d[j]), assign d[j] = c[j].
  6. Repeat steps 2-5 for every remaining query word.
  7. Every phrase with -1 < d[j] < n is matched.

This is not very optimal. To improve performance, store only the live entries of the d array: after the first word, keep only the phrases matched by that word, and instead of assigning d[j] = n + 1, delete the index j. Then process only the stored phrase indexes.
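A minimal Python sketch of the algorithm above, with plain dicts standing in for the tries and without the pruning optimization (the phrase and query data are made up):

```python
def match_phrases(phrases, query):
    # n = maximum number of words in one phrase.
    n = max(len(p) for p in phrases)

    # tries[i] maps a word to the set of phrase indexes whose i-th word it is.
    # Plain dicts stand in for real tries here.
    tries = [{} for _ in range(n)]
    for j, phrase in enumerate(phrases):
        for i, word in enumerate(phrase):
            tries[i].setdefault(word, set()).add(j)

    # d[j] = lowest trie label matched so far for phrase j;
    # -1 means "nothing matched yet", n + 1 means "cannot be matched".
    d = [-1] * len(phrases)
    for word in query:
        for j in range(len(phrases)):
            if d[j] == n + 1:
                continue  # phrase already ruled out
            # Smallest label c > d[j] whose trie contains this word for
            # phrase j; None if there is no such label (step 4).
            c = next((i for i in range(d[j] + 1, n)
                      if j in tries[i].get(word, ())), None)
            d[j] = c if c is not None else n + 1

    # Every phrase with -1 < d[j] < n is matched (step 7).
    return [j for j in range(len(phrases)) if -1 < d[j] < n]
```

For example, with phrases = [["hello", "world"], ["oh", "mighty", "computer"]], the query ["mighty", "computer"] matches only the second phrase, while ["computer", "mighty"] matches nothing because the words are out of order.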

