Hello :-)!

I'd like to create an application for a mobile phone (the maximum price of the phone would be about 220 USD, it is for Central Europe; I will be testing the application on a Sony Ericsson K750i and maybe on a Motorola V500). The application must recognize speech and react properly to recognized numbers (calculate a control sum and respond). I'd like to use CMU Sphinx, but I still need to decide between approach [1] and approach [2]. Approach [1]: speech recognition on the mobile phone (PocketSphinx) and sending the results to a server (POST over HttpConnection; I also tried the Wireless Messaging API but it didn't work properly). Approach [2]: redirecting the speech from the mobile phone to a server (I thought about Digium + Asterisk, but Digium cards are rather expensive) and speech recognition on the server (Sphinx4).
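To make the "calculate control sum" step concrete, here is a minimal sketch in plain Java. It rests on two assumptions that are not stated anywhere above: that the recognizer returns digit words separated by spaces (e.g. "one two three"), and that the control sum is a Luhn (mod 10) check. The class name `ControlSum` and its methods are hypothetical; swap in whatever checksum scheme the numbers really use.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: turns a recognized hypothesis into digits and checks it.
final class ControlSum {
    private static final Map<String, Character> WORD_TO_DIGIT = new HashMap<>();
    static {
        String[] words = {"zero", "one", "two", "three", "four",
                          "five", "six", "seven", "eight", "nine"};
        for (int i = 0; i < words.length; i++) {
            WORD_TO_DIGIT.put(words[i], (char) ('0' + i));
        }
        WORD_TO_DIGIT.put("oh", '0'); // assumption: "oh" is spoken for zero
    }

    // Converts "seven nine nine" into "799"; rejects anything that is not a digit word.
    static String wordsToDigits(String hypothesis) {
        StringBuilder digits = new StringBuilder();
        for (String word : hypothesis.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            Character d = WORD_TO_DIGIT.get(word);
            if (d == null) {
                throw new IllegalArgumentException("Not a digit word: " + word);
            }
            digits.append(d.charValue());
        }
        return digits.toString();
    }

    // Luhn check (assumed scheme): the last digit is the control digit.
    static boolean isValid(String digits) {
        if (digits.isEmpty()) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = digits.length() - 1; i >= 0; i--) {
            char c = digits.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("Not a digit: " + c);
            }
            int d = c - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        String digits = wordsToDigits("seven nine nine two seven three nine eight seven one three");
        System.out.println(digits + " valid: " + isValid(digits));
    }
}
```

The same logic could run either in the MIDlet (approach [1]) or on the server (approach [2]). Note that a Java ME port would need the generics and `HashMap` replaced, since CLDC only offers `Hashtable`.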

I think I will choose option [2]; however, I'm looking for a different way of connecting, over the internet, because those Digium cards are expensive. I thought about GSM/3G or CDMA, but I'm not sure whether those would be good enough to ensure real-time speech between the user on the mobile phone and the program on the server.

Somebody suggested to me that a mobile phone with Skype would be a cheap and good option. I will be using this application in a place where they have an office with access to cable internet and fax, and the mobile phone would be several kilometers from the office. From my point of view, creating a wireless network with a range of several kilometers would be too expensive, and there may be too much signal distortion. And here I have a question for you. Let's assume I have a mobile phone. On this phone I have installed Skype, and the phone has access to the internet (GSM/3G or CDMA). I also have Skype and Sphinx4 on the server. How should I configure it so that Skype on the server automatically receives the call from the mobile phone and redirects it to Sphinx4, and the server can answer from time to time through Skype? In other words: a) how do I make Skype automatically receive a call from the mobile phone? b) how do I redirect speech from Skype to another application? (I also thought about Fring, which may be an alternative to Skype.)

Thanks very much for your answer in advance :-)!
Greetings :-)!

All 3 Replies

>Somebody suggested me that mobile phone with Skype would be cheap and good option.
Say thank you to that person, but your application will not be able to communicate with Skype, as their API is still not publicly available. There have been some suggestions that it will be open-sourced soon, but "soon" can be anything from one month to years.

As regards the libraries to be used, I do not think CMU Sphinx or PocketSphinx will help you, as you need something that is capable of communicating with Java Micro Edition. Besides, you are already getting some advice here.

Thanks for your answer :-)!

So what is the best way to establish a connection between the server and the mobile phone?

You say that Skype has closed API. But what about applications with similar functionality, e.g. using Fring?

I am aware of the fact that the CMU Sphinx libraries won't work with Java ME. But if I decide to establish a speech connection between the mobile phone and the server, and to send speech from the phone to the server so that the server can recognize it, there won't be any need to use the Sphinx libraries in the MIDlet (because they would be used on the server). The crucial thing is how to send speech from the phone to the server. I considered some options: 1. an ordinary call from the phone, received by Digium hardware - too expensive, 2. a call over the internet, i.e. a) wifi - too expensive to create a local wireless network, b) GSM/3G, CDMA - this is why I thought about Skype/Fring. (I also thought earlier that a MIDlet could somehow be useful for option 2b, but I don't see how to do it, i.e. how to get a benefit from a MIDlet similar to the one from using Skype/Fring. I guess it is not worth trying to create a sophisticated MIDlet, because it would be something like building a new Skype, which is much too complex.) Are there any options other than 1, 2a and 2b? What other possibilities are there for option 2b, apart from Skype/Fring? And what about Fring, can it be what I'm looking for (i.e. similar to Skype but with an open API)?

You say that I'm already getting some advice at voxforge.org? I asked "Are you really sure I can run jvoicexml on mobile phone?" and he answered "No... I just gave you pointers on where to look in developing your own app", so it looks like his only suggestion cannot be applied at all, and this is why I decided to abandon that topic. Or maybe you have an idea based on that topic about jvoicexml? Because I don't see any pointers given to me. Asterisk/FreeSWITCH, as far as I understand, require those Digium cards to receive a call from the mobile telephony network on the server. In general I don't see how I can benefit from this whole VoiceXML. I've been reading the CMU Sphinx web pages, and there were many things about XML for configuration and JSAPI as the API from Sun Microsystems, but I don't remember them writing anything about VoiceXML. Am I missing something important? I looked at the VoiceXML specification and I see it can be used to describe a dialogue between a computer and a human. However, I haven't yet found how to use it with Sphinx4. Never mind, I'm going to worry about it later, after establishing the connection and running a demo (e.g. HelloDigits.jar) over this connection.

However, the most important thing is still the same, i.e. how to establish a real-time speech connection between the mobile phone and the server.
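For what it's worth, if a true phone call turns out to be too hard to set up, one fallback is to push recorded audio over an ordinary data connection in small chunks. Below is a rough sketch of such a length-prefixed framing in plain Java. Everything in it is an assumption made for illustration: the class name `AudioFraming`, the chunk size limit, and the "zero length means end of utterance" convention. It does not solve the real-time problem by itself; it only shows how the phone side could write chunks and how the server side could reassemble them before handing the audio to a recognizer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wire format: [int length][length bytes of audio] ... [int 0]
final class AudioFraming {
    static final int MAX_CHUNK = 64 * 1024; // assumed upper bound per chunk

    // Phone side: write one chunk of recorded audio.
    static void writeChunk(DataOutputStream out, byte[] buf, int len) throws IOException {
        if (len <= 0 || len > MAX_CHUNK || len > buf.length) {
            throw new IllegalArgumentException("Bad chunk length: " + len);
        }
        out.writeInt(len);
        out.write(buf, 0, len);
    }

    // Phone side: mark the end of the utterance.
    static void writeEnd(DataOutputStream out) throws IOException {
        out.writeInt(0);
        out.flush();
    }

    // Server side: read chunks until the end marker and return the joined audio.
    static byte[] readAll(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        ByteArrayOutputStream audio = new ByteArrayOutputStream();
        while (true) {
            int len = in.readInt(); // throws EOFException if the stream is cut short
            if (len == 0) {
                return audio.toByteArray();
            }
            if (len < 0 || len > MAX_CHUNK) {
                throw new IOException("Bad chunk length: " + len);
            }
            byte[] chunk = new byte[len];
            in.readFully(chunk);
            audio.write(chunk, 0, len);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory round trip; a real setup would use a socket or HTTP stream.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeChunk(out, new byte[] {1, 2, 3}, 3);
        writeChunk(out, new byte[] {4, 5}, 2);
        writeEnd(out);
        byte[] audio = readAll(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(audio.length + " bytes reassembled");
    }
}
```

On the server, the reassembled bytes (or the stream itself) would then be passed to Sphinx4. How the phone captures the audio in the first place is a separate question that this sketch does not cover.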

