I have been fiddling around with a speech recognition program for a while now and i always had something that bothered me. Here is my code i have been using. It is the example code from code.activestate.com
from win32com.client import constants import win32com.client import pythoncom """Sample code for using the Microsoft Speech SDK 5.1 via COM in Python. Requires that the SDK be installed; it's a free download from http://microsoft.com/speech and that MakePy has been used on it (in PythonWin, select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1). After running this, then saying "One", "Two", "Three" or "Four" should display "You said One" etc on the console. The recognition can be a bit shaky at first until you've trained it (via the Speech entry in the Windows Control Panel.""" class SpeechRecognition: """ Initialize the speech recognition with the passed in list of words """ def __init__(self, wordsToAdd): self.speaker = win32com.client.Dispatch("SAPI.SpVoice") self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer") self.context = self.listener.CreateRecoContext() self.grammar = self.context.CreateGrammar() self.grammar.DictationSetState(0) self.wordsRule = self.grammar.Rules.Add("wordsRule", constants.SRATopLevel + constants.SRADynamic, 0) self.wordsRule.Clear() [ self.wordsRule.InitialState.AddWordTransition(None, word) for word in wordsToAdd ] self.grammar.Rules.Commit() self.grammar.CmdSetRuleState("wordsRule", 1) self.grammar.Rules.Commit() self.eventHandler = ContextEvents(self.context) self.say("Started successfully") def say(self, phrase): self.speaker.Speak(phrase) class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")): def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result): newResult = win32com.client.Dispatch(Result) print "You said: ",newResult.PhraseInfo.GetText() if __name__=='__main__': wordsToAdd = [ "One", "Two", "Three", "Four" ] speechReco = SpeechRecognition(wordsToAdd) while 1: pythoncom.PumpWaitingMessages()
My problem is you always need to provide the words in which you want it to be able to recognise. Is there any way to just try and let it decypher what you are saying and print that to screen rather then having a set list of words that can be recognised. I made a chat-bot a while ago and i think it would be great if i could intergrate it but that would be a lot easier if i could stop the way i need to provide a list of words that it can recognise, for this one it will only recognise one, two, three and four. So not that functional.
Any help would be greatly appreciated