Hello everyone. I'm new to Python and so far, I simply love it! I'm still just getting the knack, though, and there's a lot more I want to know. I've decided to use it mainly for my robots' programming scripts. Right now I'm stuck on a simple script to get my Windows Vista machine to recognize phrases I say and reply to them using TTS. It's a learning process in itself.

Bear with me; most of the time I feel like I haven't got a clue when I'm writing a Python script. It's the first programming language I'm learning.

Here's the script I'm using on my computer. I'll implement it into my robot later, once I get it to work. I modified it from two Python sample scripts by Inigo Surguy and Peter Parente, so a lot of credit goes to them, at least until I become savvy enough to write my own speech recognition and TTS scripts... What I want it to do is wait for me to say "Hello, Aelita", which is what I call my laptop, and respond by addressing me by name and asking how I am doing. I'm also trying to get it to tell me the time whenever I say "Time." It is as follows:

#Test Script for computer virtual personality.
#Credits to Inigo Surguy (inigosurguy@hotmail.com) and Peter Parente for original
#Speech Recognition and TTS scripts.
#Further commentary by Surguy
from win32com.client import constants
import win32com.client
import pythoncom

import pyTTS
impor time

tts = pyTTS.Create()

tts.Rate = 1

tts.Volume = 90

tts.GetVoiceNames()

tts.SetVoiceByName('MS-Anna-1033-20-DSK')

"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
    Requires that the SDK be installed (it's a free download from
            http://www.microsoft.com/speech
    and that MakePy has been used on it (in PythonWin,
    select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

    After running this, then saying "One", "Two", "Three" or "Four" should
    display "You said One" etc on the console. The recognition can be a bit
    shaky at first until you've trained it (via the Speech entry in the Windows
    Control Panel."""
class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        [ self.wordsRule.InitialState.AddWordTransition(None, word) for word in wordsToAdd ]
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started
        self.say("Started successfully")
    """Speak a word or phrase"""
    def say(self, phrase):
        self.speaker.Speak(phrase)

"""The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported. """
class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """Called when a word/phrase is successfully recognized  -
        ie it is found in a currently open grammar with a sufficiently high
        confidence"""
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
            
if __name__=='__main__':
    wordsToAdd = [ "Hello Aelita", "Fine", "How are you?", "Not well", "Awful",
                   "Bad", "Not good", "Not too well", "Thank you", "Thanks", "Time" ]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()

if speechReco("Hello Aelita"):
    greeting1 = "Hello, Lore-enn. How are you today?"
    tts.Speak(greeting1)
    if speechReco("Fine"):
        great = "Great!"
        tts.Speak(great)
        offr1 = "Is there anything I can do for you?"
        tts.Speak(offr1)
    if speechReco("Not well") or ("Awful") or ("Bad") or ("Not Good") or ("Not too well"):
        sorry1 = "That's too bad"
        sorry2 = "I'm so sorry."
        tts.Speak(sorry1) or (sorry2)
        onDuty1 = "If there's anything or need, just let me know."
        tts.Speak(onDuty1)
        if speechReco("Thanks") or ("Thank you"):
            hum1 = "Don't mention it"
            hum2 = "No problem"
            hum3 = "Your welcome"
            tts.Speak(hum1) or (hum2) or (hum3)

    if speechReco("Time"):
        timeStr1 = "The time is " + time.asctime(),
        timeStr2 = "It's " + time.asctime()
        timeStr3 = "Right now, it's " + time.asctime()
        tts.Speak(timeStr1) or (timeStr2) or (timeStr3)

It's mostly working, in that it opens up Microsoft Speech Recognition and recognizes the words I say, and as far as I know the script is free from syntax errors and exceptions. The problem I'm having now is getting the computer's voice to say something back. As of now, the phrase just shows up in the MSR window, but the computer doesn't say anything back. I think it has to do with the "speechReco" I write for when the computer's supposed to catch a certain phrase. I don't think it's the right term. So MSR is recognizing my words, but the Python script isn't, hence no TTS response. I'd appreciate it if someone could help me out with this script, and/or tell me if there's some sort of Python syntax dictionary or glossary out there that lists all the different words you can write in a Python script, when to use them, and what they do.

Once again, bear with me. I'm trying to learn as much as I can about this stuff. Thanks.

It's good code, seagull (or should I say "Lore-enn"?). You're asking a question that has more to do with the speech recognition module than Python, so I can't directly help. But when I'm stuck on such, I usually try out the help files.

In this case, it looks as if tts.Speak() isn't working like you'd want. So from the command line:

import pyTTS

tts = pyTTS.Create()
help(tts.Speak)

and see if you get anything useful.

BTW, there is a typo in the line "impor time" --> should be "import time". I don't think that'll fix the code, though.

Hope it helps,
Jeff

Thanks for the input, jrcagle. Name's actually Loren, but I wanted the TTS to pronounce it right. :)

The script still doesn't seem to be working. I'm not sure it's pyTTS that's not working, since it always works fine in the interactive window... But even Surguy's original speech recognition script didn't work completely when I downloaded it directly and tried it. The Microsoft Speech Recognition for Vista window still opened up and recognized the phrases I said, but I got no response. In Surguy's script, the computer was supposed to print "You said " on the screen, followed by "One," "Two," "Three," or "Four," depending on which of the four numbers you said. But all that came up in my Microsoft Speech Recognition was "One," "Two," "Three," or "Four," and it does that anyway, even without the Python script. No window came up or anything that said "You said One" or "You said Two." I wonder if it has anything to do with Microsoft Speech Recognition, which opens every time I run the script. Maybe it's interfering somehow, because I don't think Surguy intended his script to work with Microsoft Speech Recognition on a Windows Vista machine...

Does anyone know what would happen if I tried it on a different OS? What if I ran the script in Windows XP or 98 on Microsoft Virtual PC? I think that's what I'll try, and see what happens...

Okay, here's the code after I cleaned it up a bit. The computer's TTS still isn't giving any output whenever I say a phrase, like "Hello Aelita."

#Test Script for computer virtual personality.
#Credits to Inigo Surguy (inigosurguy@hotmail.com) and Peter Parente for original
#Speech Recognition and TTS scripts.
#Further commentary by Surguy
from win32com.client import constants
import win32com.client
import pythoncom

import time

"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
    Requires that the SDK be installed (it's a free download from
            http://www.microsoft.com/speech
    and that MakePy has been used on it (in PythonWin,
    select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

    After running this, then saying "One", "Two", "Three" or "Four" should
    display "You said One" etc on the console. The recognition can be a bit
    shaky at first until you've trained it (via the Speech entry in the Windows
    Control Panel."""
class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        [ self.wordsRule.InitialState.AddWordTransition(None, word) for word in wordsToAdd ]
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started
        self.say("Started successfully")
    """Speak a word or phrase"""
    def say(self, phrase):
        self.speaker.Speak(phrase)

"""The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported. """
class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """Called when a word/phrase is successfully recognized  -
        ie it is found in a currently open grammar with a sufficiently high
        confidence"""
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
            
if __name__=='__main__':
    wordsToAdd = [ "Hello Aelita", "Fine", "Good", "Great", "Wonderful", "How are you?",
                   "Not well", "Awful", "Bad", "Not good", "Not too well", "Thank you", "Thanks", "Time" ]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()

if RecognitionType == ("Hello Aelita"):
    self.say("Hello, Loren. How are you today?")
    if RecognitionType == ("Fine") or ("Great") or ("Wonderful") or ("Good"):
        say("Great")
        say("Is there anything I can do for you?")
    if RecognitionType == ("Not well") or ("Awful") or ("Bad") or ("Not Good") or ("Not too well"):
        say("That's too bad")
        say("If there's anything or need, just let me know.")

    if RecognitionType == ("Time"):
        say("The time is " + time.asctime(),)

The words come up in MSR, so I know it recognized what I said, but there's still no speech, other than "Started successfully" at the beginning when I first run the script. After that, my computer's completely silent. I'm still skeptical that I'm using the right syntax for when the computer's supposed to recognize my phrase and say something. Right now I'm typing "if RecognitionType == ("Hello Aelita")" and I don't think 'RecognitionType' is the right command word for it. Does this script work on anyone else's PC?
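One thing that may explain the silence: in the script above, the `if RecognitionType ...` block sits after `while 1:`, which never exits, so those lines are never reached, and `RecognitionType` only exists as a parameter inside `OnRecognition`. A minimal sketch of the usual shape, assuming the recognized text can be read inside the callback (Surguy's fuller script, quoted later in this thread, reads it with `newResult.PhraseInfo.GetText()`). The names `RESPONSES`, `say` and `on_recognition` are hypothetical stand-ins so the sketch runs without SAPI:

```python
# Hypothetical stand-in for the SAPI event plumbing; no COM involved.
RESPONSES = {
    "Hello Aelita": "Hello, Loren. How are you today?",
    "Fine": "Great! Is there anything I can do for you?",
    "Not well": "That's too bad.",
}

spoken = []  # collects what would be sent to the TTS engine


def say(phrase):
    # In the real script this would be speaker.Speak(phrase).
    spoken.append(phrase)


def on_recognition(text):
    # Plays the role of ContextEvents.OnRecognition: it runs once per
    # recognized phrase, so the reply is chosen here, not after the loop.
    reply = RESPONSES.get(text)
    if reply is not None:
        say(reply)


# Simulate the recognizer firing two events.
for heard in ["Hello Aelita", "Fine"]:
    on_recognition(heard)

print(spoken)
```

The message pump loop then only has to keep events flowing; all the replies happen in the handler.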

I'm completely out of my waters -- I've never even used Vista (waiting for SP 2 to work out the bugs...).

That said, this looks iffy to me:

if RecognitionType == ("Hello Aelita"):
    self.say("Hello, Loren. How are you today?")
    if RecognitionType == ("Fine") or ("Great") or ("Wonderful") or ("Good"):
        say("Great")
        say("Is there anything I can do for you?")
    if RecognitionType == ("Not well") or ("Awful") or ("Bad") or ("Not Good") or ("Not too well"):
        say("That's too bad")
        say("If there's anything or need, just let me know.")

    if RecognitionType == ("Time"):
        say("The time is " + time.asctime(),)

Shouldn't it be 'self.say' for all of those?
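Something else in that block looks worth checking, as far as I can tell: `x == ("Fine") or ("Great")` doesn't compare x against both strings. Python reads it as `(x == "Fine") or "Great"`, and a non-empty string always counts as true, so the branch runs no matter what was said. A small sketch (the variable `heard` is made up for illustration):

```python
heard = "Awful"

# Parsed as (heard == "Fine") or "Great": the non-empty string "Great"
# counts as true, so this is truthy no matter what was heard.
always_true = bool(heard == "Fine" or "Great")

# A membership test compares against every option.
positive = heard in ("Fine", "Great", "Wonderful", "Good")
negative = heard in ("Not well", "Awful", "Bad", "Not good", "Not too well")

print(always_true, positive, negative)  # True False True
```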

Jeff

You're right, I guess it should've been self.say.
But anyhow, I've downloaded a more useful Python desktop speech recognition script from Inigo Surguy that I seem to have overlooked on the same website. Besides being more useful, part of the code lets you add your own commands and macros to the speech recognition database.

Now here's an interesting discovery I've made. I added "Hello Aelita" to the list of recognizable words with the command "tts.Speak('Hello, Loren. How are you today?')" in the actions window, or whatever you call it. Now, when I click on the Test button to see if it works, I get the computer's voice saying "Hello, Loren. How are you today?" But whenever I speak "Hello Aelita", the computer still recognizes the words, but there's no voice. Which is very interesting.

I think I'll download python and the necessary requirements to run the script onto my XP desktop computer and see if I get better results.

Jrcagle, I think you're right about the TTS module not working like I want it to. Apparently, any time I type tts.Speak("Blah, blah, blah") in the interactive window, I get a response: the speakers play a TTS voice that says "Blah, blah, blah." But when I add it to the speech recognition list of words to recognize and speak "Hello Aelita" into the microphone, nothing. It plays the voice when I click Test, so in theory it should be working perfectly. But I never get anything when I speak into the microphone.

But check this out. I added "google" to the list of words to be recognized. I then added "browseTo("www.google.com")" to the actions window. When I spoke "google" into the microphone, voilà! It went to Google just like I wanted it to. So I know it's not my Vista computer. At least, I think.

So I typed in "help(tts.Speak)", like you told me to, jrcagle. I got a reply, but it's half Greek to me. I got this:

Help on method Speak in module pyTTS.sapi:

Speak(self, text, *flags) method of pyTTS.sapi.SynthAndOutput instance
    Speaks text with the optional flag modifiers. Text can include XML commands.

I'm still a Python newbie: I get a very rough idea of what it means, but I'm still not at a level where I can take that information and find out how to make my computer talk to me like it should. If this is more of a speech-recognition-to-TTS issue than a Python one, does anyone know where I can go to learn how to fix this?

Yipee! I got it to work!

Here's what I did. I not only loaded the direct speech object in makepython, but I also applied the direct speech recognition. Also, I had to make sure that I had the interactive window selected, otherwise, nothing would happen.:P

Thanks for your help. Now to move on with my robot!:icon_lol:

Great debugging job!

Sorry, long post.:S

Okay, it's been a while since I've tested the code on my laptop computer. Since the actual construction of my autonomous robot is to begin very soon, I've been experimenting with the actual code that's going to go into the robot.

I know there's something wrong with my code because it keeps raising an exception:

Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 305, in RunScript
    debugger.run(codeObject, __main__.__dict__, start_stepping=1)
  File "C:\Python25\Lib\site-packages\pythonwin\pywin\debugger\__init__.py", line 60, in run
    _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
  File "C:\Python25\Lib\site-packages\pythonwin\pywin\debugger\debugger.py", line 631, in run
    exec cmd in globals, locals
  File "C:\Users\Owner\Desktop\Nina Interaction.py", line 109, in <module>
    "What time is it" : "speaker.Speak('The time is ' + time.asctime)"}
NameError: name 'self' is not defined

Which is puzzling because isn't (self) supposed to be a natural, integrated part of the python programming language?:-/

Here is my code. Once again, I make use of Inigo Surguy's speech recognition sample code, and some sample code from Christian Wyglendowski's timer scripts.

#  A code to Interact with an autonomous robot called Nina.
#  Modified from Inigo Surguy's sample code for speech recognition
#  and the timer sample scripts by Christian Wyglendowski (http://mail.python.org/pipermail/tutor/2004-November/033333.html)
#  Further commentary by Surguy.

# Sample code for speech recognition using the MS Speech API
# Inigo Surguy (inigosurguy@hotmail.com)
import time
import threading
class Timer(threading.Thread):
    def __init__(self, seconds):
        self.runTime = seconds
        threading.Thread.__init__(self)
    def run(self):
        time.sleep(self.runTime)
        
class GreetingTimer(Timer):
        def run(self):
            counter = self.runTime
            for sec in range(self.runTime):
                time.sleep(1.0)
                counter -= 1
            Greeting = 0

GT = GreetingTimer(10)

Greeting = 0    

from win32com.client import constants
import win32com.client
import pythoncom
    
"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
    Requires that the SDK be installed (it's a free download from
            http://www.microsoft.com/speech
    and that MakePy has been used on it (in PythonWin,
    select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

    After running this, then saying "One", "Two", "Three" or "Four" should
    display "You said One" etc on the console. The recognition can be a bit
    shaky at first until you've trained it (via the Speech entry in the Windows
    Control Panel."""
class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        [ self.wordsRule.InitialState.AddWordTransition(None, word) for word in wordsToAdd ]
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started
        self.say("Started Successfully")
    """Speak a word or phrase"""
    def say(self, phrase):
        self.speaker.Speak(phrase)

"""The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported. """
class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """Called when a word/phrase is successfully recognized  -
        ie it is found in a currently open grammar with a sufficiently high
        confidence"""
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)

def SetItems(self):
    self.items = pickle.load(open(self.SAVE_FILENAME))


while Greeting == 1:
    self.items = {"Fine" : "self.say('That's good to hear!'); Greeting = 0",
                  "Great" : "self.say('Fantasic!'); Greeting = 0",
                  "Wonderful" : "self.say('That's spectacular'); Greeting = 0",
                  "Not Good" : "self.say('That's too bad'); Greeting = 0",
                  "Terrible" : "speaker.Speak('I'm sorry'); Greeting = 0",
                  "Awful" : "speaker.Speak('Oh I'm very sorry'); Greeting = 0"}
    

if __name__=='__main__':
    wordsToAdd = [ "Hello Nina", "Fine", "Great", "Wonderful", "Not Good", "Terrible", "Awful",
                   "What time is it"]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()
        self.items = {"Hello Nina" : "Greeting = 1; GT.start",
                      "What time is it" : "speaker.Speak('The time is ' + time.asctime)"}

Here's how it's supposed to work:

My robot, Nina, understands several words in a human-interaction vocabulary (wordsToAdd). While Nina is waiting for things to be said to her (pythoncom.PumpWaitingMessages), there are two things that are available to say: "Hello Nina" and "What time is it". When you say "Hello Nina", Nina responds by saying hello and asking how you're doing. From that point on, you can only give six responses for how you're doing, three positive and three negative.

Timing is supposed to be an important part of Nina's interaction with Humans. When you say, hello Nina, and the Greeting stance is on, Nina waits ten seconds for you to respond, otherwise she "gets bored" and assumes she's not talking to anyone anymore, to put it in personified terms...

First off, does anyone know how to fix that bizarre exception? Secondly, while I'm working at it, does anyone have any pointers to polish up this script? I'd greatly appreciate any help. :)

sidenote:

Which is puzzling because isn't (self) supposed to be a natural, integrated part of the python programming language?

Not exactly. "Self" is a convention used by python programmers so that humans can read the code.

>>> class TempClass(object):
    def __init__(thingy):
        thingy.prop = 1
    def __str__(whatzit):
        return str(whatzit.prop)

    
>>> a = TempClass()
>>> print a
1
>>>

(In case it wasn't clear, __str__ gets called 'automagically' by print)

As you can see, __init__ and __str__ function just fine without using self. What's happening here is that the first parameter in a method is taken to be the object itself. When the method is called by

a.method(params)

Python calls the method like this

method(a,params)

So you can see that it really doesn't matter what that first thing is called to *Python* -- it just matters to us because we need to read it.
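To check that equivalence yourself, here is a small sketch (the `Greeter` class and its parameter names are made up for illustration). Strictly speaking the explicit form goes through the class, `Greeter.greet(g, ...)`, since a bare `greet` isn't a name on its own:

```python
class Greeter(object):
    def __init__(this, name):
        # "this" plays the role usually given to "self".
        this.name = name

    def greet(me, other):
        # "me" is again just the instance the method was called on.
        return "%s says hello to %s" % (me.name, other)


g = Greeter("Nina")
via_instance = g.greet("Loren")         # the usual spelling
via_class = Greeter.greet(g, "Loren")   # the instance passed explicitly

print(via_instance == via_class)  # True
```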

/sidenote

None of what I said is really relevant to your problem, though, because the error message you got isn't the real problem. Instead, the real problem is your line

"speaker.Speak('The time is ' + time.asctime)"

That line is supposed to call time.asctime and add it to the string "The time is ", except that without (), you aren't really calling it! Instead, what your line literally means is

"Take the string 'The time is ' and add the function time.asctime to it"

Which is of course a TypeError, since strings and functions can't be added.
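A quick way to see the difference without any of the speech objects, as a sketch only (`build_message` is a made-up helper name):

```python
import time


def build_message(call_it):
    # With parentheses the function runs and returns a string;
    # without them we try to add the function object itself.
    if call_it:
        return "The time is " + time.asctime()
    return "The time is " + time.asctime


try:
    build_message(False)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

message = build_message(True)
print(outcome)
print(message.startswith("The time is "))
```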

>>> import time
>>> help(time.asctime)
Help on built-in function asctime in module time:

asctime(...)
    asctime([tuple]) -> string
    
    Convert a time tuple to a string, e.g. 'Sat Jun 06 16:26:11 1998'.
    When the time tuple is not present, current time as returned by localtime()
    is used.

>>>

What you really need is

"speaker.Speak('The time is ' + time.asctime())"

Hope that helps,
Jeff

Thanks once again jrcagle! That worked.

I've had a lot of time to work up my robot's human interaction program. So far, Nina can say a thing or two about politics, health, romance and whatnot. She can even tell me a joke on request by picking a random joke from a joke list. It's getting to be quite amazing.

There's something else I'm trying to do now. I'm trying to get Nina to remember the names of people she meets. There really should be nothing to this in theory. Nina says, "Yes, we can be friends. What was your name again?" Then she waits for the human's response, captures the name with get.text and then repeats it to the human to make sure she got it right: "Let me make sure I've got it right. Your name is 'Fred'?" Upon being given an accepted name, she adds it to one of two lists for people she meets, Friends and Aquaintences. Then she just saves this to her disk space, and every time she loads her code when powered on, the name is there. I know how to add items to lists, using the raw_input to capture the name, and the append.Friends('Fred') to add to her list of friends. So the code should theoretically look something like this:

Friends = [ ]
Aquaintences = [ ]

AwaitName = (0)
CheckName = (0)

# human says, lets be friends
speaker.Speak('All right we can do that. What was your name again?')
AwaitName = (1)

if AwaitName == (1):
    raw_input('name').GetText

AwaitName = (0)
CheckName = (1)

if CheckName == (1):
speaker.Speak("Let me make sure I have it right. Your name is ('name')?")

if #human says yes
append.Friends('name')
CheckName = (0)
speaker.Speak('All right. What do you want to do?')
else #human says "No, just Fred"
return to line 11

That was just a disgrace to python coding. I still have more to learn than I thought...

Anyway, the only thing I'm really having trouble with so far is how to make the raw_input pertain to speech recognition and not keyboard?

The speech recognition goes like this:

from wxPython.wx import *
import sys, time, math, string, win32com.client,win32event,pythoncom
from win32com.client import constants
import win32con
import cPickle, zlib

import string
import pickle
import win32api
import win32com.client
import traceback

import threading
import random

class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        try:
            # Exec the appropriate listbox entry
            exec app.items[newResult.PhraseInfo.GetText()]
        except:
            # If execution fails, display a messagebox with error and cause
            etype, value, tb = sys.exc_info()
            message = (str(etype)+":"+str(value)+
                      "\nat line "+`tb.tb_next.tb_lineno`+
                      "for text '"+newResult.PhraseInfo.GetText()+"'")
            dlg = wxMessageDialog(app.frame, 
                                  message,
                                  'Exception: '+str(etype),
                                  wxOK | wxICON_INFORMATION)
            dlg.ShowModal()
            dlg.Destroy()

"""Windows speech recognition application""" 
class MyApp(wxApp):
    ADD_BUTTON_ID = 10
    DELETE_BUTTON_ID = 20
    LISTBOX_ID = 30
    EDITOR_ID = 40
    TEST_BUTTON_ID = 50
    TURNON_BUTTON_ID = 60 
    TURNOFF_BUTTON_ID = 70
    SAVE_FILENAME = "save.p"
    def setItems(self):
        try:
            self.items = pickle.load(open(self.SAVE_FILENAME))
        except IOError:
            self.items = {"Hello Nina" : "speaker.Speak(random.choice(Greet1))",
                          "I'm fine thank you" : "speaker.Speak(random.choice(Greet3))" }
    def InitSpeech(self):
        listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        self.context = listener.CreateRecoContext()
        self.grammar = self.context.CreateGrammar()
        self.grammar.DictationSetState(0)
        self.ListItemsRule = self.grammar.Rules.Add("ListItemsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        events = ContextEvents(self.context)
        self.turnedOn = true
        self.SetWords()
    def SetWords(self):
        self.ListItemsRule.Clear()
        if self.turnedOn:
            print "Setting words - turned on"
            [ self.ListItemsRule.InitialState.AddWordTransition(None, word) for word in self.items.keys() ]
        else:
            print "Setting words - OFF"
            self.ListItemsRule.InitialState.AddWordTransition(None, "Turn on")
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("ListItemsRule", 1)
        self.grammar.Rules.Commit()

I think that's everything important. Anyone got any pointers?
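On the raw_input question, one possible shape is to skip raw_input entirely and let each recognized phrase drive a small state machine, since the recognizer already hands over text in OnRecognition. What follows is only a sketch under assumptions: `NameCatcher`, `hear` and the file name are hypothetical, the phrases are fed in by hand instead of coming from SAPI, and json is swapped in for the pickle used above (it needs Python 2.6 or newer; pickle would work the same way). Note that with a fixed command grammar a name can only be heard if it is in the word list, so capturing arbitrary names would probably need dictation mode, which is not shown here.

```python
import json
import os
import tempfile


class NameCatcher(object):
    """Hypothetical helper: feed it each recognized phrase in turn."""

    def __init__(self, save_path):
        self.save_path = save_path
        self.state = "idle"        # idle -> await_name -> check_name
        self.candidate = None
        self.replies = []          # stands in for speaker.Speak(...)
        self.friends = self.load()

    def load(self):
        # Start with an empty list when nothing has been saved yet.
        if not os.path.exists(self.save_path):
            return []
        with open(self.save_path) as handle:
            return json.load(handle)

    def save(self):
        with open(self.save_path, "w") as handle:
            json.dump(self.friends, handle)

    def hear(self, text):
        if self.state == "idle" and text == "Let's be friends":
            self.replies.append("All right. What was your name again?")
            self.state = "await_name"
        elif self.state == "await_name":
            self.candidate = text
            self.replies.append("Your name is %s?" % text)
            self.state = "check_name"
        elif self.state == "check_name":
            if text == "Yes":
                self.friends.append(self.candidate)
                self.save()
                self.replies.append("All right. What do you want to do?")
                self.state = "idle"
            else:
                # Plays the part of "return to line 11": ask again.
                self.replies.append("Sorry. What was your name again?")
                self.state = "await_name"


path = os.path.join(tempfile.mkdtemp(), "friends.json")
nina = NameCatcher(path)
for phrase in ["Let's be friends", "Frank", "No", "Fred", "Yes"]:
    nina.hear(phrase)

reloaded = NameCatcher(path)   # simulates powering on again
print(reloaded.friends)
```

In the real program, `hear` would be called from OnRecognition with the recognized text, and the replies would go to the speaker instead of a list.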

Loren,

Your struggling with Surguy's example speech code was enough to make me write my own Python Speech Recognition module, so that speech recognition would be easier to work with.

The 'speech' module is available by typing 'easy_install speech' at the Windows command prompt (if you've got easy_install installed). You'll still need to do the Speech SDK install and run MakePy, but I'm trying to figure out how to eliminate the MakePy step :)

The project lives at http://pyspeech.googlecode.com and is on PyPI.

The code looks like this:

import speech

# a callback to run whenever a certain phrase is heard.
def command_callback(phrase, listener):
    speech.say("You said %s" % phrase) # speak out loud
listener1 = speech.listenfor(['some', 'various phrases', 'to listen for'],
        command_callback)

# a callback to run when any English is heard.
def dictation_callback(phrase, listener):
    if phrase == "stop please":
        speech.stoplistening(listener)
    else:
        print "Heard %s" % phrase
listener2 = speech.listenforanything(dictation_callback)

# Both listeners are running right now.  When the user says
# "stop please", listener2 will stop itself in the callback.
while speech.islistening(listener2):
    speech.pump_waiting_messages() # safe to call in a tight loop

# Turn off listener1 as well.
speech.stoplistening(listener1)

It works great -- I've successfully used the module to build a music robot that understands instructions like "Play me some Radiohead, any album."

Hope this helps! Feedback at pyspeech.googlecode.com would be appreciated.

Michael

:-O

Whoa! That looks really cool! I'll definitely have to try that!

Thanks, Michael! You're brilliant!

Loren

Problem:

I typed "easy_install speech" in the Windows command prompt and, sure enough, it said the command wasn't recognized. So I downloaded and ran the Python script that installs easy_install, and that ran fine. Then I tried typing "easy_install speech" in the Windows command prompt again, but it still isn't recognized.

I tried running the python script again, but it said it was already installed.

Any ideas? I'm running Vista Ultimate. I really would love to try your code, if I can get it running!

Your current problem is probably that the location of easy_install is not on your Windows path. For now, try typing the full path to where you installed easy_install, like this: C:\Program Files\<etc, etc>\easy_install speech
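
If you're not sure where easy_install ended up, a small script can tell you. This is only a sketch: it assumes the usual Windows layout, where easy_install.exe sits in a Scripts folder under the Python install directory, and the helper names scripts_dir and is_on_path are made up for this example.

```python
import os
import sys


def scripts_dir(prefix=None):
    """Return the Scripts folder under a Python install prefix."""
    # Assumption: the usual Windows layout, <python install dir>\Scripts.
    if prefix is None:
        prefix = sys.prefix
    return os.path.join(prefix, "Scripts")


def is_on_path(folder, path_value=None):
    """Return True if folder is one of the entries in the PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normcase(os.path.normpath(folder))
    entries = [os.path.normcase(os.path.normpath(entry))
               for entry in path_value.split(os.pathsep) if entry]
    return wanted in entries


folder = scripts_dir()
print("Scripts folder: %s" % folder)
print("On PATH: %s" % is_on_path(folder))
```

If the second line prints False, either add that folder to your PATH or keep typing the full path each time.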

Oh, yeah... Thanks Jlm699! It works fine now.

Loren

At first sight, the problem is that you refer to a variable self outside of a class block. In the loop while 1 at the end of the file, it should probably be speechReco.items.
Also, shouldn't the function def SetItem be a method of class speechRecognition?
The loop while Greeting == 1 is probably wrong too. It occurs at global level and uses a variable self, which doesn't exist at global level.
Hope this helps.
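
To make the point concrete, here is a stripped-down sketch with all the COM and speech parts left out. Only the names speechRecognition, SetItem, items and speechReco come from the script; the phrase table and the replies are placeholders invented for illustration.

```python
class speechRecognition(object):
    """Stand-in for the real class; it only keeps the phrase table."""

    def __init__(self, items):
        # Inside a method, 'self' is the instance being set up.
        self.items = dict(items)

    def SetItem(self, phrase, reply):
        # Defined in the class body, so it is a method and receives 'self'.
        self.items[phrase] = reply


# At global level there is no 'self'; use the name bound to the instance.
speechReco = speechRecognition({"Hello, Aelita": "Hello, Loren"})
speechReco.SetItem("Time", "Let me check the clock")

for phrase in sorted(speechReco.items):
    print("%s -> %s" % (phrase, speechReco.items[phrase]))
```

Writing self.items in that last loop instead of speechReco.items would raise a NameError, because self is only defined inside the methods.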

Sorry, I was trying to reply to the post at the end of page 1 :))

No problem. It's happened to me before too. :P

Loren,

Did I read correctly that you're using Vista Ultimate? I am working with another user of speech.py, Wilton, who says he can't get it to work in Vista. Have you successfully gotten speech recognition to work, using speech.py, in Vista?

If so, by any chance is your program run in a console window (a black, command-prompt-like window)? Wilton is unable to move or close the console window while speech.py is being used without getting lots of errors.

Thanks.
Michael

PS: 0.3.5 of speech.py is out at pyspeech.googlecode.com, which simplifies the code somewhat.

Hi Michael,

Your speech module runs fine on my PC.

What I'm trying to do in my script though, is use a command like "phrase.GetText()" so I can turn that spoken text into a string. Then I want to split that string and turn the words into a list, so I can find the individual words, and then replace them with other words.

For instance, if I tell my robot, "I love you," the script would get that text, turn it into a string, find the subject pronoun "I" and replace it with "You," then find the object pronoun "you" and replace it with "me." Then it would join the split sentence back into a string and answer using speech.say: "You love me."

However, when I try it, it says there's no such function as "GetText" that goes with "phrase".

I typed "help(speech)" after importing it in the command prompt, and it doesn't look like there's a "GetText()" function that goes with "phrase" in your module. Is it in the new version?

Or is there a way to work around it?

Thanks!

Loren

Hi Loren,

Sorry for the confusion. The first argument to your callback function *is* the spoken text. It's just a string. So you could do something like this, for example:

def callback(phrase, listener):
    print "The phrase that was heard is this:"
    print phrase
    if phrase == "I love you":
        speech.say("You love me!")

speech.listenforanything(callback)
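
Since the phrase is just a string, the swap you described can be done with plain string methods. Here is a rough sketch; flip_perspective and the SWAPS table are hypothetical names for this example and are not part of speech.py.

```python
# Hypothetical swap table built from the "I love you" example.
SWAPS = {"i": "you", "you": "me"}


def flip_perspective(phrase):
    """Swap pronouns word by word and return the reply as a string."""
    words = phrase.split()
    # Each word is looked up exactly once, so the "you" that replaces "I"
    # is not swapped a second time into "me".
    flipped = [SWAPS.get(word.lower(), word) for word in words]
    reply = " ".join(flipped)
    # Capitalize only the first letter and leave the rest untouched.
    return reply[:1].upper() + reply[1:]


print(flip_perspective("I love you"))
```

Inside your callback you could then call speech.say(flip_perspective(phrase)). Note that this treats every "you" as an object pronoun, so a sentence that starts with "You" will come out wrong.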

I've updated the documentation for listenfor() and listenforanything() so that help(speech) should be a little more... helpful :)

You can run 'easy_install speech==0.4.1' to get the updated version.

Let me know how it goes,
Michael
PS: When you are running your speech recognition script in Vista, what happens when you move around your console window? The only other Vista report I have says it behaves terribly.

Hi Michael.

I don't really use your speech module to move around my console window. I simply use it for my robot's control, and in that regard it works great.

Thanks for your input on the "phrase" feature. :) However, I'm hoping to write a program that's a little more flexible. You see, I'm working on a program for my robot, Nina, that will allow humans to 'socialize' with her like a chatbot. But instead of saying specific phrases and having to word your sentences just right, you say anything you want, and my robot is supposed to split the string of text from your voice and identify the parts of speech (articles, subject pronouns, verbs, object pronouns). Then she takes that string of words from dictation, replaces the subject pronoun "I" with "You" and the object pronoun "you" with "me," recompiles the list back into a string and echoes my sentence, but from her perspective (well, sort of... I mean it's a robot *shrug*). The result: "You love me."

This is just a basic experiment in my Nina A.I. endeavors. I have a thread for it, where I post my progress and ask for advice. If you're interested in seeing what I've tried so far, or if you have any pointers of your own, you can find it under this thread:

A.I.--self programming and artificial 'learning.'

Thanks again, Michael!

Loren
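
The part-of-speech step described above could be sketched roughly like this. It assumes a tiny hand-written word list rather than a real language library: LEXICON and tag_words are invented for this example, and the rule for "you" (subject before the first verb, object after it) is a crude guess that only suits simple sentences.

```python
# Toy lexicon; every entry is an assumption made for illustration.
LEXICON = {
    "a": "article", "an": "article", "the": "article",
    "i": "subject pronoun", "we": "subject pronoun",
    "me": "object pronoun", "us": "object pronoun",
    "love": "verb", "like": "verb", "see": "verb",
}


def tag_words(phrase):
    """Return a list of (word, tag) pairs for the words in phrase."""
    tagged = []
    seen_verb = False
    for word in phrase.split():
        key = word.lower()
        if key == "you":
            # "you" can be either role, so decide by position.
            tag = "object pronoun" if seen_verb else "subject pronoun"
        else:
            tag = LEXICON.get(key, "unknown")
        if tag == "verb":
            seen_verb = True
        tagged.append((word, tag))
    return tagged


for word, tag in tag_words("I love you"):
    print("%s: %s" % (word, tag))
```

With the roles known, the replacement step can treat a subject "you" differently from an object "you". Words missing from the lexicon come back tagged "unknown", so the list would need to grow a lot before it could handle free dictation.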
