memory issues with C++ algorithm in Windows

Question

azencot 0 Newbie Poster

14 Years Ago

Hello,

I am not sure if this is the place to ask.

I develop an algorithm in C++ under Windows XP with Visual studio 2008,

the algorithm is proved to be linear ( O(n) ),

there is a usage of memory during the algorithm ( all of it is being allocated dynamically before the algorithm starts )

and I measure the time it took to complete the computation and I observed the following -

1. using 4 MB of memory take X time of running.

2. using 40 MB of memory, only 4 MB is used, but it is scattered through all of the 40 MB,

( e.g. I mean that I can use any part of the 40 MB allocated, but guaranteed to use only 4 MB )

and this takes 1.3X time of running.

Does anyone have an explanation to this ? or references to read ?

maybe the amount of page faults involved ?

NOTE: this question was first posted here -
http://www.cplusplus.com/forum/windows/35330/
with no reply, so I am sorry if someone double viewed it.

Thanks,

Omri

algorithm c++ visual-studio

3 Contributors
10 Replies
186 Views
1 Week Discussion Span
Latest Post 14 Years Ago Latest Post by azencot

All 10 Replies

thelamb 163 Posting Pro in Training

14 Years Ago

1.3 times as long doesn't seem very significant, how long does the 'fastest' algorithm take? How many times did you run both algorithms.

Can you show us the exact difference in code between both approaches, your explanation is a little unclear to me.

Edited 14 Years Ago by thelamb because: n/a

thelamb 163 Posting Pro in Training

14 Years Ago

I didn't say 1.3 is not so long, but a difference of 30% on 0.2 seconds is not that much.

However since you ran it many times it is meaningful - so you can ignore that comment ;).

What gonefission posted is likely to be part of the 'problem'.
A cache miss won't generate a page fault though, so you won't see an increase of page faults with increasing cache misses.

Also, this may be silly, but just checking... you allocate the memory before starting the algorithm - do you then hand over a pointer to this block of memory? Or is it copied somewhere.

thelamb 163 Posting Pro in Training

14 Years Ago

Google gets me to this page:
http://oprofile.sourceforge.net/docs/intel-P4-events.php

I don't know how good it is.

I don't know how closed source your algorithm is, but I have VS2010 through uni and I wouldn't mind running the profile for you, if it's not too much of a hassle to get it to compile.

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

gonefission 0 Newbie Poster · Answer 1 · 2011-02-06T22:12:26+00:00

gonefission 0 Newbie Poster

14 Years Ago

http://en.wikipedia.org/wiki/Cache_miss#Cache_misses

azencot 0 Newbie Poster · Answer 2 · 2011-02-07T00:26:39+00:00

1.3 times as long doesn't seem very significant, how long does the 'fastest' algorithm take? How many times did you run both algorithms.
Can you show us the exact difference in code between both approaches, your explanation is a little unclear to me.

What do you mean 1.3 times is not so long ? it is 30% percent more than I expect.
Do you have any example to simple and known algorithms that behave in this way ?

The 'fastest' algorithm runs for 0.2 seconds and the 'slower' runs for 0.26 seconds.

The number of times I ran this checks ( if I understand you correct ) is estimated
to be some hundreds times.

It is a problem for me to show the differences between the executions using my code,
but I can explain a bit more.
The best way to describe my situation, is to take 10,000 points in an XY plane
and then to take 100,000 points in the same plane, these points gets their coordinate on the plane in a random fashion. The algorithm runs only on the first 10,000 points in the points array, and "sums" up the values in the 8 physically closest points. which means that this points can be anywhere in the points array, and most likely not neighbors in the points array.

I hope this helps and thanks for the quick interest.
Omri

azencot 0 Newbie Poster · Answer 3 · 2011-02-07T00:29:39+00:00

http://en.wikipedia.org/wiki/Cache_miss#Cache_misses

This is an interesting direction and I am going to check it,
BTW: the number of page faults hasn't changed dramatically between runs.

Thanks,
Omri

azencot 0 Newbie Poster · Answer 4 · 2011-02-07T00:55:07+00:00

I didn't say 1.3 is not so long, but a difference of 30% on 0.2 seconds is not that much.
However since you ran it many times it is meaningful - so you can ignore that comment ;).
What gonefission posted is likely to be part of the 'problem'.
A cache miss won't generate a page fault though, so you won't see an increase of page faults with increasing cache misses.

Also, this may be silly, but just checking... you allocate the memory before starting the algorithm - do you then hand over a pointer to this block of memory? Or is it copied somewhere.

The memory is allocated at the beginning and a pointer to the memory
is being used. There are some copies during the algorithm run, these copies are
needed for the algorithm and cannot be reduced.
I don't have currently any estimation of the amount of copied data in each run,
but it should not be too much comparing to the amount of allocated data.

Omri

azencot 0 Newbie Poster · Answer 5 · 2011-02-07T11:13:45+00:00

http://en.wikipedia.org/wiki/Cache_miss#Cache_misses

Do you know if it possible to get the cache misses counter value
for a specific process using windows API ?

The only reference I have found is here -
http://msdn.microsoft.com/en-us/library/bb385772.aspx
but it requires a specific version of visual studio which I don't
have and I don't plan to buy it only for this cause.

Thanks,
Omri

azencot 0 Newbie Poster · Answer 6 · 2011-02-07T15:43:27+00:00

Google gets me to this page:
http://oprofile.sourceforge.net/docs/intel-P4-events.php
I don't know how good it is.
I don't know how closed source your algorithm is, but I have VS2010 through uni and I wouldn't mind running the profile for you, if it's not too much of a hassle to get it to compile.

OProfile is a nice tool :), but unfortunately they don't support Windows,
and my code isn't supported under Linux currently.. :(

My code is pretty closed :(, so I can't share it,
but thanks a lot for the offer.

Omri

azencot 0 Newbie Poster · Answer 7 · 2011-02-18T12:36:37+00:00

http://en.wikipedia.org/wiki/Cache_miss#Cache_misses

Hello,

The issue has been solved !
After I arranged the data in the arrays in the same
way I access the arrays, the jitter in times has disappeared.

Thanks for the pointer,
Omri

memory issues with C++ algorithm in Windows

Recommended Answers Collapse Answers

All 10 Replies

Recommended Answers