Calculate modal value of integer vector

ravenous 0 Tallied Votes 1K Views Share

I have to calculate the modal value of std::vector< int > quite often, so I thought I'd share my general method for people to look at, use and/or comment. In my case, the vectors usually contain values in the range 0 - 255.

The code first creates a std::vector< int > of random integers, to use as sample data. Here, I'm using the std::transform() function to populate a vector with random values, via a simple function that takes one argument that determines in what range the random number will be.

A second std::vector< int > is made that will hold a histogram of the values in the original vector, then values in this vector are initialised to 0 . The histogram is populated by the loop on line 22. The mode of the data is found by using std::max_element() to find the most populated bin in the histogram. std::max_element() returns an iterator to the element with the maximum value in it, we can find which bins this is by subtracting histogram.begin() from it.

This approach works well when the range of the integers in the vector is quite small, since the memory taken by the histogram vector is proportional to this. If the range is large and/or sparse a std::map< int,int > can be used to store the histogram instead of a std::vector . This saves memory, but finding the maximum bin has to be done manually. (unless someone has a good STL-ish way to do this?)

Anyway, hope that's useful to someone

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>

int getRand(int range){   return rand()%(range);  }

int main()
{
    /* Seed the random number generator */
    srand(0);

    /* Generate vector of random int values */
    int range = 10;
    std::vector< int > v(100,range);
    std::transform(v.begin(),v.end(),v.begin(),getRand);

    /* Make a histogram of the data */
    std::vector< int > histogram(range,0);
    std::vector< int >::iterator it = v.begin();
    
    while(it != v.end())    histogram[*it++]++;

    /* Print out the frequencies of the values in v */
    std::copy(histogram.begin(),histogram.end(),std::ostream_iterator< int >(std::cout, " "));
    std::cout << std::endl;

    /* Find the mode */
    int mode = std::max_element(histogram.begin(),histogram.end()) - histogram.begin();
    std::cout << "mode = " << mode << std::endl;

    return 0;
}
vijayan121 1,152 Posting Virtuoso

A generalized (any uni-modal or multi-modal input sequence) version of essentially the same algorithm:

#include <map>
#include <iterator>
#include <algorithm>
#include <iostream>

// C++1x
template< typename INPUT_ITERATOR, typename OUTPUT_ITERATOR >
void copy_modal_values( INPUT_ITERATOR begin, INPUT_ITERATOR end, // input sequence
                        OUTPUT_ITERATOR result ) // modes to be placed in this sequence
{
  typedef typename std::iterator_traits<INPUT_ITERATOR>::value_type key_type ;
  std::map< key_type, int> frequency_map ;

  while( begin != end ) ++frequency_map[*begin++] ;

  int max_freq = 0 ;
  for( auto iter = frequency_map.begin() ; iter != frequency_map.end() ; ++iter )
    if( iter->second > max_freq ) max_freq = iter->second ;

  for( auto iter = frequency_map.begin() ; iter != frequency_map.end() ; ++iter )
    if( iter->second == max_freq ) *result++ = iter->first ;
}

// simple test driver
int main()
{
  const int a[] = { 1, 2, 6, 7, 6, 2, 8, 8, 6, 8, 2, 9 } ;
  enum { N = sizeof(a)/sizeof(*a) } ;
  std::cout << "modes of [ " ;
  std::for_each( a, a+N, [](int i ) { std::cout << i << ' ' ; } ) ;
  std::cout << "] are [ " ;
  copy_modal_values( a, a+N, std::ostream_iterator<int>( std::cout, " " ) ) ;
  std::cout << "]\n" ;
}
ravenous 266 Posting Pro in Training

Nice generalisation :o)

I use std::map for this purpose sometimes too, but not in such a generalised way though. Although, this code doesn't actually compile for me, it complains about line 30, so I changed that line to

std::copy(a, a+N, std::ostream_iterator<int>( std::cout, " " ));

and it has the desired effect.

Thanks again!

vijayan121 1,152 Posting Virtuoso

Although, this code doesn't actually compile for me, it complains about line 30

Yeah, the code is C++1X. I guess I should have mentioned that.

If you are using C++98, you should have also got errors for lines 17 and 20. For C++98, you would need to modify them to something like:

typedef typename std::map< key_type, int>::iterator iterator ;

for( iterator iter = frequency_map.begin() ; iter != frequency_map.end() ; ++iter )
// etc.
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.