Hey guys,
Can anybody tell me how I should start with this project?
I will be using Java and C together (JNI) in NetBeans. The C code will be
embedded in the Java program, which provides the interface.

First I want to read the WAV file in C, sample it, and apply the FFT algorithm. The output should be a spectrogram.


Can anybody tell me how to start?


There must be a load of open source WAV readers around. In case you don't want to dig into other people's code, here is a brief description of the WAV format.

I guess that's enough to start with. If you have more specific questions, do ask.
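
As a rough illustration of the layout (a sketch, not taken from any particular reader): a WAV file starts with "RIFF", a 4-byte size, and "WAVE", followed by chunks that each have a 4-character ID, a 4-byte little-endian size, and a payload padded to an even length. The names `readLE32`, `ChunkLocation`, and `findChunk` below are made up for this example, and the file is assumed to be fully loaded into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: decode a 32-bit little-endian unsigned integer.
static std::uint32_t readLE32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical result type: where a chunk's payload starts and how long it is.
struct ChunkLocation {
    bool found;
    std::size_t offset;   // byte offset of the payload within the file
    std::uint32_t size;   // payload size in bytes, as stored in the chunk header
};

// Walks the chunks of a RIFF/WAVE file held in memory and returns the first
// chunk whose 4-character ID equals `id` (for example "fmt " or "data").
ChunkLocation findChunk(const std::vector<unsigned char>& file, const char* id) {
    ChunkLocation none = {false, 0, 0};
    if (file.size() < 12) return none;
    if (std::memcmp(&file[0], "RIFF", 4) != 0) return none;
    if (std::memcmp(&file[8], "WAVE", 4) != 0) return none;

    std::size_t pos = 12;  // the first chunk header follows "RIFF<size>WAVE"
    while (pos + 8 <= file.size()) {
        std::uint32_t size = readLE32(&file[pos + 4]);
        if (size > file.size() - (pos + 8)) return none;  // truncated chunk
        if (std::memcmp(&file[pos], id, 4) == 0) {
            ChunkLocation hit = {true, pos + 8, size};
            return hit;
        }
        // Skip the header, the payload, and the pad byte after odd-sized payloads.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u);
    }
    return none;
}
```

With something like this, `findChunk(file, "data")` gives the position and length of the raw sample bytes, and `findChunk(file, "fmt ")` gives the position of the format fields needed to interpret them.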

I am doing the same thing right now. Your project will consist of three things.

1) Reading the PCM data:

You can find a library or implement it yourself.

Here's what you will need to implement it yourself:

Best WAV file documentation - http://www.sonicspot.com/guide/wavefiles.html

You can really ignore everything except the fmt and data chunks. The fact chunk appears a lot in WAV files, and I haven't stumbled upon other types yet. BUT good coding habits mean your program should still account for them (or at least throw an exception, like mine does now).

One of my biggest problems was the whole endian issue, because most of the data is encoded in little endian. Write here if you need help with it.

OK, so once you can read the pure sample data, store it in an array in your WavReader class (or whatever you want to call it; by the way, dynamically allocating 2D arrays is tricky too) and have it return a pointer. (You can implement this in other ways; I did it this way.)


This is as far as I have got for now. I just finished my WavReader class today.

2) Now you can apply the FFT to your raw data, and you will get a three-dimensional array of data, something like [channel][frequency][time].

3) Now you will have to display this data. This is GUI programming. You might also want your program to play the sound and display the FFT diagram for that moment.

One thing to be aware of is that the FFT takes a lot of processing power, so start with an FFT on about 3-5 milliseconds of your sound data. That's 132.3-220.5 samples per window at a 44100 Hz sample rate, so in practice you round to a whole number of samples.
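
To make the window arithmetic and step 2 concrete, here is a minimal sketch under some assumptions that are not from the post: the window is rounded up to a power of two (which the simple radix-2 FFT below requires), frames do not overlap, and no window function is applied. The names `nextPowerOfTwo`, `windowSize`, `fft`, and `spectrogram` are made up for this example, and the recursive FFT is written for readability, not speed.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Smallest power of two that is >= n.
std::size_t nextPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p *= 2;
    return p;
}

// Samples covered by `milliseconds` of audio at `sampleRate` Hz,
// rounded up to a power of two. 3 ms at 44100 Hz is 132.3 samples.
std::size_t windowSize(double milliseconds, unsigned sampleRate) {
    double exact = milliseconds * sampleRate / 1000.0;
    return nextPowerOfTwo(static_cast<std::size_t>(std::ceil(exact)));
}

// In-place recursive radix-2 Cooley-Tukey FFT; a.size() must be a power of two.
void fft(std::vector<std::complex<double> >& a) {
    const std::size_t n = a.size();
    if (n < 2) return;
    std::vector<std::complex<double> > even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    fft(even);
    fft(odd);
    const double pi = 3.14159265358979323846;
    for (std::size_t k = 0; k < n / 2; ++k) {
        std::complex<double> t = std::polar(1.0, -2.0 * pi * k / n) * odd[k];
        a[k] = even[k] + t;
        a[k + n / 2] = even[k] - t;
    }
}

// Splits one channel into consecutive non-overlapping frames and returns the
// magnitudes indexed as [frame][bin], with bins 0 .. frameSize / 2.
std::vector<std::vector<double> > spectrogram(const std::vector<int>& samples,
                                              std::size_t frameSize) {
    std::vector<std::vector<double> > result;
    for (std::size_t start = 0; start + frameSize <= samples.size(); start += frameSize) {
        std::vector<std::complex<double> > frame(frameSize);
        for (std::size_t i = 0; i < frameSize; ++i) {
            frame[i] = std::complex<double>(samples[start + i], 0.0);
        }
        fft(frame);
        std::vector<double> magnitudes(frameSize / 2 + 1);
        for (std::size_t b = 0; b < magnitudes.size(); ++b) {
            magnitudes[b] = std::abs(frame[b]);
        }
        result.push_back(magnitudes);
    }
    return result;
}
```

Calling `spectrogram` once per channel gives the [channel][time][frequency] data described in step 2. Bin `b` of a frame corresponds to the frequency `b * sampleRate / frameSize`.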

Write here if you have more questions.

Thank you for the explanation. I got the technique; it was explained well.


Can you please pass the code on to me, so that I can refer to it properly?


Hey guys,
Could you please pass the code on to me too? I am really struggling with reading the data chunk.

Thanks in advance.


I have started the coding part.
First, I wanted to know how and where I should read and store the data from the WAV file.
Second, I can perform the FFT on numerical values. These numerical values will be in the file we created to store the data. Will they be stored in the form of an array?

If possible, can you paste some part of your code? If not, can you email it to me?



Here is some code for you guys. It's not written with the best coding techniques, so if you have any questions, post here (I started C++ a month ago, hehe).

*THIS CODE MIGHT HAVE MISTAKES... but it works on my computer*

It will also tell you what is wrong with your file if it's not working.
I commented the code, but not very well, so again, ask me if you have questions.

The class is called WavReader.

It supports uncompressed (compression code = 1) PCM data of 8/16/24/32 bits.

List of WavReader public methods:

WavReader(char * filename);
~WavReader();
void ClearData();
void LoadNew(char * filename);			
int GetNumberOfChannels();
int GetSampleRate();
int GetBitsPerSample();
int GetNumberOfSamples();
int *** GetDataPointer();

Here is the header file:

#ifndef WavReader_h
#define WavReader_h
#include <string>

using namespace std;


unsigned int UnsignedLittleToBigEndianConvert( char input[], int start, int length);

int SignedLittleToBigEndianConvert( char input[], int start, int length);

bool StringCompare(char input[], int start, int length, string str);


struct RiffChunk
{
    bool present;
    unsigned int size;
    char type[4];
};

struct DataChunk
{
    unsigned int size;
    bool present;
    unsigned int StartPosition;
};

struct FmtChunk
{
    bool present;
    unsigned int StartPosition;
    unsigned int size;
    unsigned int CompressionCode;
    unsigned int NumberOfChannels;
    unsigned int SampleRate;
    unsigned int BytesPerSecond;
    unsigned int BlockAllignment;
    unsigned int BitsPerSample;
    unsigned int BytesPerSample;
};

struct FactChunk
{
    bool present;
    unsigned int StartPosition;
    unsigned int size;
    unsigned int NumberOfSamples;
};

class WavReader
{
    private:
        RiffChunk headerr;
        FmtChunk format;
        FactChunk facts;
        DataChunk RawBinaryData;
        int  **ProcessedData;
        bool DataIsGood;

        void read(char binaryfile[]);

        void ProcessData(char binarydata[]);

    public:
        WavReader(char * filename);

        ~WavReader();

        void ClearData();

        void LoadNew(char * filename);

        int GetNumberOfChannels(){ return format.NumberOfChannels;}

        int GetSampleRate(){ return format.SampleRate;}

        int GetBitsPerSample(){ return format.BitsPerSample;}

        int GetNumberOfSamples(){
            if(format.NumberOfChannels == 0 || RawBinaryData.size == 0 || format.BytesPerSample == 0){
                return 0;
            }else{
                return RawBinaryData.size / (format.NumberOfChannels * format.BytesPerSample);
            }
        }

        int *** GetDataPointer(){ return &ProcessedData;}
};

#endif

And here is the cpp file:

#include <iostream>
#include <fstream>
#include "WavReader.h"

using namespace std;

// Despite the name, this decodes 'length' little-endian bytes starting at
// input[start] into a native unsigned integer.
unsigned int UnsignedLittleToBigEndianConvert( char input[], int start, int length){

    unsigned int answer = 0;
    unsigned int help = 0;   // value of the current byte
    unsigned int help2 = 1;  // place value of the current byte: 1, 256, 65536, ...

    for (int n = 0; n < length; n++) {
        help = (unsigned char)input[start+n];
        answer += (help*help2);
        help2 *= 256;
    }

    return answer;
}

// Same as above for signed values: the last (most significant) byte carries the sign.
int SignedLittleToBigEndianConvert( char input[], int start, int length){

    int answer = 0;
    int help = 0;
    int help2 = 1;

    for (int n = 0; n < length-1; n++){
        help = (int)(unsigned char)input[start+n];
        answer += (help*help2);
        help2 *= 256;
    }
    // Cast through signed char: plain char is unsigned on some platforms,
    // which would silently drop the sign of negative samples.
    answer += (int)(signed char)input[start+length-1]*help2;

    return answer;
}

/* Compares the string of length 'length' generated from the 'start' position
 in the char array input to the first 'length' bytes of the string 'str'.
 Returns true if equal.
 If 'str' is shorter than 'length' returns false. */
bool StringCompare(char input[], int start, int length, string str){
    if( length < 0 || (string::size_type)length > str.length() ){return false;}
    for( int i = 0; i < length; i++){
        if(input[start+i] != str[i]){return false;}
    }
    return true;
}


		  		  
WavReader::WavReader(char * filename)
    : ProcessedData(nullptr), DataIsGood(false) // ClearData() must never see an uninitialised pointer
{
    format.NumberOfChannels = 0;
    LoadNew(filename);
}


WavReader::~WavReader()
{
    ClearData();
}


void WavReader::ClearData(){

    // Free the sample arrays first, while format.NumberOfChannels still holds
    // the number of rows that ProcessData() allocated.
    if (ProcessedData != nullptr)
    {
        for (unsigned int i = 0; i < format.NumberOfChannels; i++)
        {
            delete[] ProcessedData[i];
        }
        delete[] ProcessedData;
        ProcessedData = nullptr;
    }

    headerr.present = false;
    format.present = false;
    facts.present = false;
    RawBinaryData.present = false;
    DataIsGood = false;
    format.CompressionCode = 0;
    format.NumberOfChannels = 0;
    format.SampleRate = 0;
    format.BytesPerSecond = 0;
    format.BlockAllignment = 0;
    format.BitsPerSample = 0;
    format.BytesPerSample = 0;
    format.size = 1000;
    format.StartPosition = 0;
    RawBinaryData.size = 0;
    facts.NumberOfSamples = 0;
    facts.size = 0;
}


void WavReader::LoadNew(char * filename){

    ifstream::pos_type size;
    char * memblock;

    ClearData();

    // ios::ate opens the file positioned at its end, so tellg() returns the file size
    ifstream file (filename, ios::in|ios::binary|ios::ate);
    if (file.is_open())
    {
        size = file.tellg();
        memblock = new char [size];
        file.seekg (0, ios::beg);
        cout << "Loading file...  \n";
        file.read (memblock, size);
        file.close();
        cout << "The complete file content is in memory. \n";

        read(memblock);

        delete[] memblock;
    }
    else cout << "Unable to open file. \n";
}
				    
				  void WavReader::read(char binaryfile[])
				  {
				   	   unsigned int CurrentPosition = 0, size = 0, maxsize = 0, ChunkName = 0;
				   	   
				   	   if ( !StringCompare( binaryfile, 0, 4, "RIFF")){
					   cout << "not a RIFF/WAV file. \n";
					   DataIsGood = false;
					   /* this is one of the only times when its ok to use goto,
					    trying to end the execution of the read function while in the if statement */
					   goto EndOfRead;
					   }else{
					   headerr.present = true;
					   headerr.size = UnsignedLittleToBigEndianConvert( binaryfile, 4, 4);
					   maxsize = headerr.size + 8;
					   }

					   CurrentPosition += 8;

					   if ( !StringCompare( binaryfile, CurrentPosition, 4, "WAVE")){
					   cout << "not a WAV file. \n";
					   DataIsGood = false;
					   goto EndOfRead;
					   }

					   CurrentPosition += 4;
				   	   
				   	   
   				       while(true)
		   				  {
						  if ( CurrentPosition + 8 > maxsize ){cout << "file end. \n"; goto CheckData;}
						  ChunkName = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition , 4);
						  size = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 4, 4);
						  if(size%2 != 0){size++;}
						  if ( CurrentPosition + 8 + size > maxsize){cout << "file ends abruptly. \n"; DataIsGood = false; goto EndOfRead;} 


						  switch (ChunkName){
						  /* Since the switch statement can only be used on integer values, I converted all the chunk names to unsigned int.
						  	 Since a chunk name is 4 bytes and an int is 4 bytes, it works. For example, the case below, '544501094',
							 is what you get if you read the 4 bytes "fmt " as a little-endian unsigned integer.
							 You can find a guide to all these WAV chunks at http://www.sonicspot.com/guide/wavefiles.html
							 Sometimes the website is down for 1 or 2 hours; come back later if it's down. */
						  case 544501094: // fmt chunk
						  	   cout << "fmt chunk detected \n";
						  	   if( format.present){cout << "more then 2 fmt chunks.... wtf? where did u get this file? \n"; DataIsGood = false; goto EndOfRead;}
							   if(size > 18){cout << "format not supported.... yet! \n"; DataIsGood = false; goto EndOfRead;}
							   format.present= true;
							   format.CompressionCode = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 8, 2);
							   format.NumberOfChannels = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 10, 2);
							   format.SampleRate = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 12, 4);
							   format.BytesPerSecond = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 16, 4);
							   format.BlockAllignment = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 20, 2);
							   format.BitsPerSample = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 22, 2);
							   format.BytesPerSample = UnsignedLittleToBigEndianConvert( binaryfile, CurrentPosition + 22, 2) / 8;
							   format.size = size;
							   format.StartPosition = CurrentPosition + 8;
						  	   break;
						  case 1635017060: // data chunk
						  	   cout << "data chunk detected \n";
						  	   if( RawBinaryData.present){cout << "more then 2  data chunks. not supported. \n"; DataIsGood = false; goto EndOfRead;}
							   RawBinaryData.present = true;
							   RawBinaryData.size = size;
							   RawBinaryData.StartPosition = CurrentPosition + 8;
						  	   break;
						  case 1952670054: // fact chunk
						  	   cout << "fact chunk detected \n";
						  	   if( facts.present){cout << "more then 2 fact chunks.... where did u get this file? \n"; DataIsGood = false; goto EndOfRead;}
							   facts.present = true;
							   facts.size = size;
							   facts.StartPosition = CurrentPosition + 8;
						  	   break;
						  case 1819697527: // wavl chunk
						  	   cout << "wavl chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1953393779: // slnt - silent chunk , noone uses these anymore
						  	   cout << "Silent chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 543520099: // cue point
						  	   cout << "cue point chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1953721456: // playlist chunk, coding support for these is a B***h
						  	   cout << "playlist chunk found, not supported ... yet! exiting \n" ;
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1953720684: // list chunk 
						  	   cout << "list chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1818386796: // label chunk 
						  	   cout << "label chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1702129518: // note chunk
						  	   cout << "note chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1954051180: // text chunk
						  	   cout << "text label chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1819307379: // sample
						  	   cout << "Sample chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  case 1953721961: // inst
						  	   cout << "Instrument chunk found, not supported ... yet! exiting \n";
							   DataIsGood = false;
							   goto EndOfRead;
						  default:
						  	   cout << "UNKNOWN CHUNK FOUND: \n";
							   cout << "Chunk Name: " << binaryfile[CurrentPosition] << binaryfile[CurrentPosition+1] << binaryfile[CurrentPosition+2] << binaryfile[CurrentPosition+3] << "\n";
							   cout << "Chunk Size: " << size << "\n";
							   cout << "Errors may occur with the presence of this chunk \n";
							   break;
						  };		   
						 CurrentPosition += (8 + size);		 		   
		   				   }
				   				
				   			
				  		CheckData:
						if ( RawBinaryData.present && format.present && format.CompressionCode == 1 && (format.BitsPerSample == 8 || format.BitsPerSample == 16 || format.BitsPerSample == 24 || format.BitsPerSample == 32 )){ 
						   	 DataIsGood = true; 
						}else{
 							  cout << "data format problem \n rawbinary data: " <<  RawBinaryData.present << " \n format: " << format.present << " \n compression code : " << format.CompressionCode << "\nBits per sample: " << format.BitsPerSample << "\n"; 
							  DataIsGood = false;
						}

						ProcessData(binaryfile);
						EndOfRead:	
						if( !DataIsGood){cout << "End of read. \n";}			
				  }
				  
				  
				  
				  
void WavReader::ProcessData(char binarydata[])
{
    // Nothing to convert; the channel check also guards the division below.
    if ( !DataIsGood || format.NumberOfChannels == 0 ) { return; }

    // number of samples per channel
    const unsigned int samples = RawBinaryData.size / (format.NumberOfChannels * format.BytesPerSample);
    // Progress is reported every tenth of the data. The step is at least 1 so that
    // the modulo below never divides by zero on files with fewer than 10 samples.
    const unsigned int step = (samples / 10 > 0) ? samples / 10 : 1;
    int k = 0;

    // this line and the following loop dynamically allocate a 2-dimensional array: ProcessedData[channel][sample]
    ProcessedData = new int* [format.NumberOfChannels];
    for (unsigned int i = 0; i < format.NumberOfChannels; i++)
    {
        ProcessedData[i] = new int[samples];
    }

    for (unsigned int i = 0; i < samples; i++)
    {
        if (i % step == 0) { cout << "Converting data: " << k*10 << "% done.\n"; k++; }
        for (unsigned int j = 0; j < format.NumberOfChannels; j++)
        {
            // samples are interleaved: one sample per channel, then the next sample per channel, ...
            int start = RawBinaryData.StartPosition + (j * format.BytesPerSample) + (i * (format.NumberOfChannels * format.BytesPerSample));
            ProcessedData[j][i] = SignedLittleToBigEndianConvert(binarydata, start, format.BytesPerSample);
        }
    }
}

And here's how to use it in main:
(I know that trial1.wav has 2 channels, which is why I output two columns of data.)

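#include <cstdlib>
#include <iostream>
#include "WavReader.h"

int main ()
{
  int *** cleandata;

  // the constructor takes a non-const char*, so pass a writable array, not a string literal
  char filename[] = "trial1.wav";
  WavReader wavfile(filename);
  cleandata = wavfile.GetDataPointer();

  // outputs the first 100 samples (fewer if the file is shorter) of both channels
  for( int i = 0; i < 100 && i < wavfile.GetNumberOfSamples(); i++){
    std::cout << i << " sample: " << (*cleandata)[0][i] << "  " << (*cleandata)[1][i] << "\n";
  }

  std::system("pause"); // Windows only
  return 0;
}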

THINGS TO BE AWARE OF
Unless the file you are trying to read is in the same folder as your .exe, specify the full path.
GetNumberOfSamples() returns the number of samples PER channel.
You cannot use the pointer after you call ClearData(), because it erases the data and the pointer then points to nothing.
ClearData() should only be called if you have RAM concerns.
Also, if you load a new file using LoadNew(), the pointer will point to the new data.
(This can actually be useful sometimes, for example if you are cycling through many files.)

For example, the following code WILL NOT WORK:

#include <cstdlib>
#include <iostream>
#include "WavReader.h"

int main ()
{
  int *** cleandata;

  char filename[] = "trial1.wav";
  WavReader wavfile(filename);
  cleandata = wavfile.GetDataPointer();
  wavfile.ClearData(); // the sample data is freed here...

  // ...so this loop reads memory that no longer exists
  for( int i = 0; i < 100; i++){
    std::cout << i << " sample: " << (*cleandata)[0][i] << "  " << (*cleandata)[1][i] << "\n";
  }

  std::system("pause");
  return 0;
}
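
If the samples need to outlive ClearData() or LoadNew(), one option is to copy them into containers that the caller owns before the reader frees them. This is only a sketch: `copySamples` is a made-up helper, not part of the WavReader class above, and it uses std::vector instead of hand-allocated 2D arrays.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `channels` rows of `samples` ints each into
// vectors owned by the caller. Returns an empty result for null or empty input.
std::vector<std::vector<int> > copySamples(int** data, int channels, int samples) {
    std::vector<std::vector<int> > copy;
    if (data == nullptr || channels <= 0 || samples <= 0) return copy;
    copy.reserve(static_cast<std::size_t>(channels));
    for (int c = 0; c < channels; ++c) {
        // The iterator-range constructor copies samples data[c][0 .. samples-1].
        copy.push_back(std::vector<int>(data[c], data[c] + samples));
    }
    return copy;
}
```

With the class above, the call would look like `copySamples(*wavfile.GetDataPointer(), wavfile.GetNumberOfChannels(), wavfile.GetNumberOfSamples())`, made before ClearData(). The vectors release their memory automatically when they go out of scope.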

Oh yeah, many things in this code can be done differently; for example, the endian conversion can be done using bit shifts. But I think the way I coded it is very simple to understand, even for a beginning programmer, and it helps you understand some basic ideas about how data is stored in a computer and in files.
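
For comparison, here is roughly what the bit-shift version mentioned above could look like. It is a sketch with made-up names (`readUnsignedLE`, `readSignedLE`), limited to values of 1 to 4 bytes, and it takes a pointer to the first byte instead of an array plus a start index.

```cpp
#include <cstdint>

// Decodes `length` (1..4) little-endian bytes into an unsigned value:
// byte n is shifted left by 8*n bits and OR-ed into the result.
std::uint32_t readUnsignedLE(const unsigned char* bytes, int length) {
    if (length < 1 || length > 4) return 0;
    std::uint32_t value = 0;
    for (int n = 0; n < length; ++n) {
        value |= static_cast<std::uint32_t>(bytes[n]) << (8 * n);
    }
    return value;
}

// Decodes `length` (1..4) little-endian bytes as a two's-complement signed value.
std::int32_t readSignedLE(const unsigned char* bytes, int length) {
    if (length < 1 || length > 4) return 0;
    std::int64_t value = readUnsignedLE(bytes, length);
    const int bits = 8 * length;
    // If the top bit of the encoded value is set, the number is negative:
    // subtract 2^bits to get the two's-complement interpretation.
    if (value & (std::int64_t(1) << (bits - 1))) {
        value -= std::int64_t(1) << bits;
    }
    return static_cast<std::int32_t>(value);
}
```

One caveat when decoding samples: 8-bit PCM in WAV files is stored unsigned, with 128 as the midpoint, so 8-bit samples are normally read with the unsigned function and then shifted by 128 instead of being treated as signed.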

I went through the code, but I think you haven't pasted some of the header files. Can you please paste the code with the file names and header file names?
You can even zip the code and mail it to my email ID.

Thank you, SwiftSilver, your info is a huge help. I face just one problem when running your code. When I try to read my WAV files, I get the message: "more then 2 fmt chunks.... wtf? where did u get this file?". I'm trying to process a WAV file received from a NOAA satellite.

Maybe you could attach your trial1.wav file here, so I could see what results I should get from processing it.

Thanks again

I think the output of this program should be an image containing the spectrogram, but it just prints pairs of numbers. How can the spectrogram be derived?
