Hi DW.

Does anyone know how to detect a pre-recorded sound within an audio file? Take face detection as an example: to recognize a face, you first need a reference image to match against other images, and there are examples of that using OpenSURF as well as OpenCV. In my case, I want to find a sound within another audio recording.

What I mean is, let's say I have a pre-recorded audio clip or special sound, and I want to detect it in any audio file that is playing. How can I do that?

What you describe could be one of many different types of projects, each with a drastically different approach. For instance, the NSA has certain words, accents, and voiceprints that it wants to detect in the trillions of conversations it has stored. It's an inexact science and they need efficient algorithms to data mine terabytes of conversations to detect words like "bomb", try to match voices to known terrorists, etc., etc. in order to figure out what files need to be analysed by humans. In that scenario, they need to figure out what is "close enough" and they have so much data to go through that processing resources per file are a key factor. That's one type of project.

Another type of project is where you can devote lots of processing resources to one file and you know you are looking for an exact match, not a "close enough" match.

A third type of project might be a type of copyright infringement detection where you have a known sound effect and you are analyzing a file to see if it is used as one channel of input in a multi-channel recording. In that case, you might have a one second sound that you are looking for, but you would have to be able to separate the different channels and compare them rather than the aggregate sound. You might also have a sound, but there has been some changing/mixing going on. For example, the decibel level might have changed, some echoing or delaying could have been added, or the sound could have been sped up or slowed down. In that case, it might not be as simple as comparing the frequencies in the file being tested to the frequencies of the sound you are comparing it to. In these scenarios, you should consult a musician or a sound effect/mixing expert who is familiar with the different software that does all of this. An excellent computer programmer could not do the job adequately without consulting the folks I listed and without the software like Adobe Audition or whatever that does all that mixing/tweaking, along with knowledge of the various sound formats. Also, don't reinvent the wheel. It's a common enough task that almost certainly there are libraries out there already. You just need to know what you need, find an API that corresponds to that need, and use it.

I'm guessing, given your prior posts, that you are trying to detect the exact unmodified, unmixed sound for piracy reasons and that you would be comparing the contents of several consecutive mp3 frames for a match. If that's the case, you would need to parse the mp3 file, extract the relevant sound information, change those frames if needed so that the sampling rate, etc. matches, and do a byte by byte comparison using memcmp if you are programming in C or whatever the corresponding function is in VB.NET.
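To make the byte-by-byte comparison idea concrete, here is a minimal Python sketch rather than C or VB.NET. It assumes both the clip and the file have already been decoded to uncompressed PCM (for example, to WAV with some external tool) with the same sample rate, channel count, and sample width. The names `read_pcm` and `find_exact` are made up for this illustration. Keep in mind that this only finds a bit-identical copy: any re-encoding, volume change, or mixing alters the bytes and the search will miss it.

```python
import wave

def read_pcm(path):
    """Read a WAV file; return ((channels, sample_width, rate), raw_bytes)."""
    with wave.open(path, "rb") as w:
        params = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        return params, w.readframes(w.getnframes())

def find_exact(haystack: bytes, needle: bytes, frame_size: int = 2) -> int:
    """Return the frame index of the first frame-aligned exact match, or -1.

    frame_size is channels * sample_width in bytes (2 for 16-bit mono).
    """
    if not needle or frame_size <= 0:
        return -1
    start = 0
    while True:
        pos = haystack.find(needle, start)  # plays the role of memcmp in a loop
        if pos == -1:
            return -1
        if pos % frame_size == 0:           # ignore hits that straddle samples
            return pos // frame_size
        start = pos + 1
```

With real files you would call `read_pcm` on both, check that the two parameter tuples are equal, and pass `channels * sample_width` as `frame_size`. Dividing the returned frame index by the sample rate gives the position in seconds.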

But again, first things first. Define the exact problem and go from there. I could give you pointers on good data mining techniques only to find out that that is irrelevant.

In advance, I am not a sound guy, so if you are looking for advice on what libraries to use, etc., I'm not the guy to tell you. I have worked on such projects before as part of a team, and my contribution was the algorithm/data mining/statistics/probability/math part. We wasted a LOT of time on these projects because we did not share the same understanding of the project goals and assumptions, so nail that down first. And make sure your team (if any) and the client (if any) all have the same understanding of what those are.

You're back!

That's quite the project you are undertaking. If I ignore part of what you asked, my thought is that you need a cloud person to build out a cloud to do the analysis grunt work. This does not sound like anything you would run full time on a PC except to test a slice of the detection system you'll eventually move to your megawatt cloud sound smashing check system. (I'm not kidding.)

Thanks everyone, and @Null, you are right. Based on the types you mentioned, I think mine falls under the third type. There is one sound that I want to detect, and I can also save it at as many speeds as possible, so that even if a song is playing at a faster speed I can still detect whether the sound is present in it.

I don't know what this kind of process is called. I once saw something about signaling, which is also used in testing some scientific devices or something like that. I'm not sure whether it can help me in this situation.

"Signalling" could mean all sorts of things, including oscilloscopes, signal generators, and all sorts of cool, expensive, fun to play with toys, all very likely useless in helping you solve this problem. Maybe not, but that's my guess from the peanut gallery.

Attack the problem. To do that, figure out what the problem is, then mold a solution to that problem. If you're dealing with a 16-year-old illegally uploading a copyrighted song to YouTube, that's one problem. If you're involved in a multi-million dollar lawsuit where a musician stole your musician's sound, but says he created it on his own and you need to analyze the sound file to prove that, that's something else. If someone stole or is trying to steal or break your intellectual property / MP3-scrambling algorithm, that's a third kind of problem. These all require different approaches, and in many of them, the software coding/algorithm is but one part of the solution (i.e., you've figured out someone is stealing your music and you've sent them a cease and desist order, which they've ignored. Now what?) My guess is, again speculating, you'll need to budget a few bucks for some lawyers and a variety of consultants in addition to programmers.

Edited 1 Week Ago by AssertNull: spelling

Comments
Signal up.

Hmm, well, what you have referred to above is not quite what I want to do. I have a special sound (a beep), and I want to check whether each file being played contains this beep. This can be done maybe by just analyzing the mp3 file without playing it, if that is possible, or perhaps by opening and playing it without sending the sound to the speakers, and waiting to see whether the beep is heard within the playing sound. If the beep is heard, a message box can be displayed; if 20 seconds pass without hearing the beep, the application can stop playing the file because the beep wasn't detected.

That's what I'm trying to achieve here.
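One way to sketch the "listen for a beep for up to 20 seconds" idea, assuming the beep is a steady tone at a known frequency and the audio has already been decoded to a list of mono samples. It uses the Goertzel algorithm, a standard way to measure the strength of a single frequency. The function names, the 50 ms block size, and the 0.5 threshold are illustrative guesses, not tuned values, and a beep made of several tones or a chirp would need a different approach.

```python
import math

def sine(freq_hz, seconds, sample_rate):
    """Helper for experiments: generate a pure tone as a list of floats."""
    n = round(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def find_beep(samples, sample_rate, beep_hz,
              block_ms=50, max_seconds=20.0, threshold=0.5):
    """Return the time (seconds) of the first block dominated by beep_hz,
    or None if no such block occurs within max_seconds."""
    block = int(sample_rate * block_ms / 1000)
    limit = min(len(samples), int(sample_rate * max_seconds))
    for start in range(0, limit - block + 1, block):
        chunk = samples[start:start + block]
        energy = sum(x * x for x in chunk)
        if energy == 0:
            continue  # pure silence, nothing to measure
        # Fraction of the block's energy at beep_hz; close to 1.0 for a pure tone.
        fraction = 2.0 * goertzel_power(chunk, sample_rate, beep_hz) / (block * energy)
        if fraction >= threshold:
            return start / sample_rate
    return None
```

For a quick experiment, `find_beep(sine(300, 1.0, 8000) + sine(1000, 0.2, 8000), 8000, 1000)` scans one second of a 300 Hz tone followed by a 1000 Hz beep. Nothing here plays audio, so no sound reaches the speakers; the message box and the "give up after 20 seconds" behaviour would wrap around the `None` result.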

The mp3 format isn't that complicated. Your prior thread has a link to the layout. It's not trivial, but it's also not too complicated compared to a lot of the stuff I was talking about earlier, which clearly doesn't apply here. For your project to work, you need to be intimately familiar with the mp3 format and how to manipulate it: re-ordering frames, mono versus stereo, and the different ways of converting things. You are, after all, trying to make mp3's playable and unplayable, as well as possibly creating your own mp3 player. Your project seems to suggest that you feel that you will be facing skilled opponents trying to crack all of this. Many of your earlier posts suggested that you are NOT very familiar with how to isolate, analyze and manipulate mp3 frames and files, or at least you weren't at the time you wrote those posts. If that's still the case, I suggest putting this question and many of your other questions aside and learning regular old mp3 manipulation till you have that down before going to the extra, more complicated steps that you need to do for your project.

"This can be done maybe by just analyzing the mp3 file without playing it, if that is possible, or perhaps by opening and playing it without sending the sound to the speakers, and waiting to see whether the beep is heard within the playing sound."

That means you are writing your own mp3 player. If you're using someone else's player, the sound will be sent out to the speakers.

Actually, I think you might be able to do this with Audacity, an open source audio processing tool. If you know the waveform of the sound you are interested in (your beep in this case), then you can use that tool to scan for it in an audio stream. It also has the ability to filter it out if you wish.
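If you would rather script the "scan for a known waveform" idea yourself, one common technique is normalized cross-correlation: slide the known clip along the audio and score how well the shapes line up at each offset. The sketch below is plain Python with a made-up function name, `best_match`. It assumes mono samples at the same sample rate, and it is not a description of how Audacity works internally. Because the score is normalized, a change in volume alone does not hurt the match, but a change in speed would.

```python
import math

def best_match(signal, template):
    """Slide template over signal and return (offset, score) for the best
    normalized correlation. Score is near 1.0 for a same-shape match."""
    m = len(template)
    if m == 0 or len(signal) < m:
        return None, 0.0
    t_mean = sum(template) / m
    t = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t))
    best_offset, best_score = None, 0.0
    for off in range(len(signal) - m + 1):
        window = signal[off:off + m]
        w_mean = sum(window) / m
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        if t_norm == 0 or w_norm == 0:
            continue  # flat window or flat template: correlation undefined
        score = sum(a * b for a, b in zip(t, w)) / (t_norm * w_norm)
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset, best_score
```

This loop costs roughly signal length times template length, which is fine for short test clips but slow for whole songs. For real files an FFT-based correlation from an existing library would be the practical choice. You would also pick a score cutoff (say, something close to 1.0) below which you report "not found".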
