there are 2 common ways to loop through a vector

1:

vector<int> myvec; // let's assume it's initialized
for(int i = 0; i != myvec.size( ); i++)
{
    // code
    // access by 'myvec[i]'
}

2:

vector<int> myvec; // let's assume it's initialized
for(vector<int> :: iterator i = myvec.begin( ); i != myvec.end( ); i++)
{
    // code
    // access by '*i'
}

what's the difference between the 2 methods, in what concerns me the first one is far more convenient since is cleaner and easier to use(no overhead with the '*', witch is also confusing sometimes, typically when u store pointers in your vector);
but in most of cases i've seen people using the second method. why?

Recommended Answers

All 14 Replies

Iterators are essentially pointers. They are more powerful, and safer, than accessing an element directly through its index.

They're roughly equivalent, although the second way has the advantage that the loop will look the same for many types of STL container, some of which don't have operator[] .

Depending on how you STL library implements things, I think the second version could involve two less addition/subtractions per iteration. In the version of the STL that I'm using, std::vector stores pointer to the beginning and (one-past) the end of the data. This means that when you call size() it does end - beginning to calculate the size. Also, when you do v[i] this effectively translates to *(v + i) , so there's another addition there. You have to do both of these on every iteration. The first version is storing the actual pointer (effectively) to the data that you want to get at, so you just have to dereference, which you have to do with the first version anyway. And the call to end() will just return the end pointer directly, so that just costs a function call, without the subtraction that you need to do size() .

Personally, I like use std::for_each() if I can:

std::vector< int > v(10);
std::for_each( v.begin(), v.end(), MyFunction);

In combination with lambda expressions I think that this can be clearer in many circumstances, and also more efficient in some circumstances.

Hope that helps :o)

It is also more efficient (don't use random-access, via an index, if it is not needed). It is also more idiomatic.

Also, the new C++ standard makes it a bit cleaner with the "auto" keyword, as so:

vector<int> myvec; // let's assume it's initialized
for(auto it = myvec.begin( ); it != myvec.end( ); ++it)
{
    // code
    // access by '*it'
}

Also, note that C++11 introduces yet a third and nicer way to do this for-loop, using the range-based syntax:

vector<int> myvec; // let's assume it's initialized
for(int& x : myvec)
{
    // code
    // access by 'x'
}

@ravenous: If the vector is not declared as "volatile", the evaluation of "size" is probably not done more than once. And since STL containers are templates (and thus have function definitions available to the compiler), functions like "begin()" "end()" or "size()" are most probably inlined (so you don't even get the cost of a function call). It is also possible that a clever compiler will be able to optimize the first version (with index) such that it boils down to exactly the same as the second version (with iterators), but, in theory, the iterator version has the best chance of being more efficient.

@ravenous: If the vector is not declared as "volatile", the evaluation of "size" is probably not done more than once. And since STL containers are templates (and thus have function definitions available to the compiler), functions like "begin()" "end()" or "size()" are most probably inlined (so you don't even get the cost of a function call). It is also possible that a clever compiler will be able to optimize the first version (with index) such that it boils down to exactly the same as the second version (with iterators), but, in theory, the iterator version has the best chance of being more efficient.

All true. Good work :o)

As a side note, how does the range-based for-loop do for efficiency? I'm guessing that it's going to be as good as it can be?

>>As a side note, how does the range-based for-loop do for efficiency? I'm guessing that it's going to be as good as it can be?

I believe the range-based for-loop is just syntactic sugar for the iterator version of the for-loop. By that I mean that these two loops are probably exactly the same:

for(int& x : myvec)
{
    // access by 'x'
}

// The above is most probably exactly the same as this:
for(auto it = myvec.begin(); it != myvec.end(); ++it)
{
    int& x = *it;

    // access by 'x'
}

//which is also exactly the same as:
for(auto it = myvec.begin(); it != myvec.end(); ++it)
{
    // access by '*it'
}

how exactly is it more safer to use iterators than indexes?
at this point,

vector<int> myvec; // let's assume it's initialized
for(auto it = myvec.begin( ); it != myvec.end( ); ++it)
{
    // code
    // access by '*it'
}

this seems the best one to me..(if iterators are really one step ahead indexes)

ps: thx mike, i didn't know auto could be used this way, i thought that keyword is useless

first off, sry for double post.

now, i've done a bit of testing with the 2 methods to see with one is faster, i'll let you see for yourself

//util.h

inline void CONSOLE_Print(const string& message)
{
	cout << message << endl;
}

// time

inline uint32_t GetTime( )
{
	return clock( ) / 1000;
}

inline uint32_t GetTicks( )
{
	return clock( );
}
#include <conio.h>
#include <Windows.h>
#include "util.h"

//#define ARGS

int main(int argc, char* argv[])
{
#ifdef ARGS
	for(uint32_t i = 0; i != argc; i++)
		CONSOLE_Print(argv[i]);
#endif
	uint32_t StartTime;
	MILLISLEEP(1500);
	
	CONSOLE_Print("Started initializing an empty vector of 100.000.000 ints");
	StartTime = GetTicks( );
	vector<int> vect(100000000, 0);
	CONSOLE_Print("Time: " + ToString(GetTicks( ) - StartTime) + " milliseconds");
	StartTime = GetTicks( );

	CONSOLE_Print("Started looping a vector of 100.000.000 ints(using '*it')");
	StartTime = GetTicks( );
	for(auto it = vect.begin( ); it != vect.end( ); it++)
	{
		*it = 1;
	}
	CONSOLE_Print("Time: " + ToString(GetTicks( ) - StartTime) + " milliseconds");
	StartTime = GetTicks( );

	CONSOLE_Print("Started looping a vector of 100.000.000 ints(using 'vect[it]')");
	StartTime = GetTicks( );
	for(int it = 0; it != vect.size( ); it++)
	{
		vect[it] = 1;
	}
	CONSOLE_Print("Time: " + ToString(GetTicks( ) - StartTime) + " milliseconds");
	StartTime = GetTicks( );

	getch( );
	return 0;
}

output:
http://www.picz.ro/show-image.php?id=40a3333d734f08e754162bc11c900ad4

it seems to me like the index method is both cleaner and faster, so what does this safety-nes consist in?

EDIT:

The main advantage is that iterator code works for all stl containers, while the array indexing operator [] is only available for vectors and deques. This means you are free to change the underlying container if you need to without having to recode every loop. It also means you can put your iteration code in a template and it will work for any container, not just for deques and vectors (and arrays of course).

this seems reasonable, but i still want to know about the safetyness

ps: thx mike, i didn't know auto could be used this way, i thought that keyword is useless

That is C++11

Here is the another way to do it in c++11 :)
using for_each and lambda

for_each(v.begin(), v.end(), [](int n) { cout << n << " "; });

and also you could use functor .. but as I have given example of lambda I think I dont need to give example of functor.

>>it seems to me like the index method is both cleaner and faster

I think your time measuring method is not good and your optimizations are not turned on, because your performance results are not correct.

I tried this code:

#include <iostream>
#include <vector>

#include "boost/date_time/posix_time/posix_time.hpp"

int main(int argc, char** argv) {
  
  boost::posix_time::ptime t = boost::posix_time::microsec_clock::local_time();
  std::vector<int> vect(100000000,0);
  boost::posix_time::time_duration dt = boost::posix_time::microsec_clock::local_time() - t;
  std::cout << "Initializing an empty vector of 100,000,000 ints in: " << dt.total_milliseconds() << " milliseconds" << std::endl;

  t = boost::posix_time::microsec_clock::local_time();
  for(int i = 0; i != vect.size(); ++i)
    vect[i] = 1;
  dt = boost::posix_time::microsec_clock::local_time() - t;
  std::cout << "Looping a vector 100,000,000 with vect[i] in: " << dt.total_milliseconds() << " milliseconds" << std::endl;

  t = boost::posix_time::microsec_clock::local_time();
  for(auto it = vect.begin(); it != vect.end(); ++it)
    *it = 1;
  dt = boost::posix_time::microsec_clock::local_time() - t;
  std::cout << "Looping a vector 100,000,000 with *it in: " << dt.total_milliseconds() << " milliseconds" << std::endl;

  t = boost::posix_time::microsec_clock::local_time();
  for(int& x : vect)
    x = 1;
  dt = boost::posix_time::microsec_clock::local_time() - t;
  std::cout << "Looping a vector 100,000,000 with range-based for-loop in: " << dt.total_milliseconds() << " milliseconds" << std::endl;

  t = boost::posix_time::microsec_clock::local_time();
  std::for_each(vect.begin(), vect.end(), [](int& x) { x = 1; });
  dt = boost::posix_time::microsec_clock::local_time() - t;
  std::cout << "Looping a vector 100,000,000 with for-each loop in: " << dt.total_milliseconds() << " milliseconds" << std::endl;

  return 0;
};

And compiled with the following:

$ g++ for_loop_perf.cpp -std=c++0x -O3 -lboost_date_time -o for_loop_perf

And the output is, as expected:

Initializing an empty vector of 100,000,000 ints in: 128 milliseconds
Looping a vector 100,000,000 with vect[i] in: 100 milliseconds
Looping a vector 100,000,000 with *it in: 43 milliseconds
Looping a vector 100,000,000 with range-based for-loop in: 43 milliseconds
Looping a vector 100,000,000 with for-each loop in: 43 milliseconds

My guess is that you didn't enable the optimizations and you are using MSVC (which is the worst optimizing compilers amongst the main-stream compilers (with GCC and ICC)).

yes, i have never considered optimizating(because i don't know how, so far, all i know is to code, i don't have much experience with compiler activities) and yes i am using MSVC since it was my very first complier used and i got confortable with. i might consider changing it tho. is GCC free? if yes, can you give me a direct link to it?
thx in advance

>>i have never considered optimizating

It shouldn't be your concern either. You shouldn't have to worry about optimizing your code until you have strong evidence that your performance is below what you need.

>>i don't have much experience

Gain more experience, worry about other things, like good software design, coding practices, expanding your knowledge of C++ syntax, etc.

>>i might consider changing it tho. is GCC free?

Well, this is no reason to switch compilers or IDE, but it is certainly good to learn to work with different compilers.

Yes, GCC is free. Under windows, you can use MinGW or Cygwin, for other OSes you can use GCC directly. Some IDEs can also be downloaded with MinGW included, like CodeBlocks.

wow, i didn't know gcc is known as mingw in windows, actually, i worked in mingv at school but i don't like it because it's interface is very poor designed, also i can hardly understand what errors and warnings are about there. so far, MSVC seems better from my point of view; tho thx for the effort, i really appreciate
ps: by "i don't have much experience" i meant that i don't have much experience of how the compiler works (linkage, transforming .cpp into .obj etc); right now i am focusing on soft desing coding practice and c++ knowledge(i've been since i started learning c++), but i don't feel so good when i have to deal with such things like linkage, debugging(although i know to insert breakpoints and such), i usually need to master thing to feel comfortable...

> there are 2 common ways to loop through a vector

Perhaps four.

#include <vector>
#include <algorithm>

void one( std::vector<int>& seq ) { for( auto& i : seq ) ++i ; }

void two( std::vector<int>& seq )
{ for( auto iter = seq.begin() ; iter != seq.end() ; ++iter ) ++*iter ; }

void three( std::vector<int>& seq )
{ for( std::vector<int>::size_type i = 0 ; i < seq.size() ; ++i ) ++seq[i] ; }

void four( std::vector<int>& seq )
{ std::for_each( seq.begin(), seq.end(), []( int& i ) { ++i ; } ) ; }

It is clearly a quality of implementation issue; but one could expect the generated code to be almost identical for all four.

Well not quite.

The GNU compiler generates identical code for one, two and four, but trips up on three (array subscript).

#include <vector>
#include <algorithm>

/**********************************************************************
>g++ --std=c++0x -Wall -Wextra -Werror --pedantic -c -S -O3 -fomit-frame-pointer
**********************************************************************/


void one( std::vector<int>& seq )
{ for( auto& i : seq ) ++i ; }
/*********************************************************************
    movl    4(%esp), %edx
    movl    (%edx), %eax
    movl    4(%edx), %edx
    cmpl    %edx, %eax
    je  L1
L3:
    addl    $1, (%eax)
    addl    $4, %eax
    cmpl    %eax, %edx
    jne L3
L1:
    rep
    ret
**********************************************************************/


void two( std::vector<int>& seq )
{ for( auto iter = seq.begin() ; iter != seq.end() ; ++iter ) ++*iter ; }
/*********************************************************************
    movl    4(%esp), %edx
    movl    (%edx), %eax
    movl    4(%edx), %edx
    cmpl    %edx, %eax
    je  L6
L8:
    addl    $1, (%eax)
    addl    $4, %eax
    cmpl    %eax, %edx
    jne L8
L6:
    rep
    ret
**********************************************************************/


void three( std::vector<int>& seq )
{ for( std::vector<int>::size_type i = 0 ; i < seq.size() ; ++i ) ++seq[i] ; }
/*********************************************************************
    movl    4(%esp), %eax
    movl    (%eax), %edx
    movl    4(%eax), %ecx
    subl    %edx, %ecx
    sarl    $2, %ecx
    testl   %ecx, %ecx
    je  L10
    sall    $2, %ecx
    xorl    %eax, %eax
L12:
    addl    $1, (%edx,%eax)
    addl    $4, %eax
    cmpl    %ecx, %eax
    jne L12
L10:
    rep
    ret
**********************************************************************/


void four( std::vector<int>& seq )
{ std::for_each( seq.begin(), seq.end(), []( int& i ) { ++i ; } ) ; }
/*********************************************************************
    movl    4(%esp), %eax
    movl    4(%eax), %edx
    movl    (%eax), %eax
    cmpl    %eax, %edx
    je  L17
L19:
    addl    $1, (%eax)
    addl    $4, %eax
    cmpl    %eax, %edx
    jne L19
L17:
    rep
    ret
**********************************************************************/

Haven't tested this with the Microsoft compiler, but if my memory serves me right it generates optimal code for three (array subscript), but trips up on four (algorithm).

Take your pick.

thx for the info vijayan121,
i also tried the fourth way and it doesn't work, it says 'for_each' is undefined.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.