How to get a thread's execution time
Hi everyone
I've looked through the Boost and C++ references and done some googling, but I cannot find a way to get a thread's execution time. Every option I found relates to system time, and I need only the execution time of the thread itself (measured inside the thread's context). Does anyone know how to get this?
Thanks in advance
It depends on the thread, but often the execution time is not measurable because it happens too quickly.
But generally you would call a time function when the thread starts and again when it ends; the execution time is the difference between those two readings. clock() returns a count of clock ticks, so divide by CLOCKS_PER_SEC to convert to seconds (on some systems one tick happens to be a millisecond). On MS-Windows you can use QueryPerformanceCounter().
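The start/stop approach described above can be sketched in portable C++ as follows. This is only an illustration: it swaps clock() and QueryPerformanceCounter() for std::chrono::steady_clock, and do_work and timed_thread_ms are hypothetical names, not part of any library. It measures wall-clock time, so any time the thread spends pre-empted is included.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker body: sleeps to stand in for real work.
inline void do_work(std::chrono::milliseconds pretend_work) {
    std::this_thread::sleep_for(pretend_work);
}

// Runs do_work() on a new thread and returns the wall-clock time, in
// milliseconds, measured from inside that thread.
inline double timed_thread_ms(std::chrono::milliseconds pretend_work) {
    assert(pretend_work.count() >= 0);
    double elapsed_ms = 0.0;
    std::thread worker([&] {
        const auto start = std::chrono::steady_clock::now();  // reading at thread start
        do_work(pretend_work);
        const auto stop = std::chrono::steady_clock::now();   // reading at thread end
        elapsed_ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
    });
    worker.join();  // after join() the value written by the worker is visible here
    return elapsed_ms;
}
```

Calling timed_thread_ms(std::chrono::milliseconds(20)) should report roughly 20 ms or a little more, depending on scheduling.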
Last edited by Ancient Dragon; Jul 3rd, 2009 at 1:37 pm.
You are in a multi-threaded environment. Other threads in your application are taking time slices, and other applications are pre-empting and taking time slices too. What I usually do is bump my application's priority up to real time, set that thread's priority to real time, and then call the function N = 10,000 times or so and divide the total by N to get an approximate time.
The RDTSC assembly instruction can be called before and after, and the difference taken!
    static uint tclkl, tclkh;

    void CpuDelaySet(void)
    {
        __asm {
            rdtsc               ; Read time-stamp counter
            mov tclkl,eax       ; Save low 32bits
            mov tclkh,edx       ; Save high 32bits
        };
    }

    uint CpuDelayCalc(void)
    {
        uint v;
        __asm {
            rdtsc               ; Read time-stamp counter
            sub eax,tclkl
            sbb edx,tclkh       ; edx:eax = total elapsed interval
            mov v,eax
        };
        return v;
    }
In its simplest form, the test loop is wrapped between the two calls.
Note that I'm only returning a 32-bit value, so the idea is not to overflow it: the maximum is 0xffffffff (4,294,967,295).
The idea is to repeat the same test N times. In this example I chose 10,000 times, but I recommend starting around 1,000 and working up only as long as the total time still fits in a 32-bit unsigned value!
This is essentially an average result. You are still being pre-empted, so the numbers will be all over the place from run to run, but they'll mostly be in the ballpark. I use this technique to see whether optimizations to my function make its time increase or decrease!
The RDTSC instruction reads a 64-bit counter that holds the number of clock cycles that have elapsed. It is accessible from the application ring, meaning non-system software has access to it! Win32 hasn't blocked access, so it is available for reading. The counter is set to zero at processor reset and merely rolls over to zero when the maximum count is reached!
    uint nRepeat = 10000;
    uint nTotTime;
    uint nOnce;
    double fTime;

    CpuDelaySet();
    for (uint n = 0; n < nRepeat; n++)
    {
        vD = MyFunc( vA, vB );      // function under test
    }
    nTotTime = CpuDelayCalc();

    nOnce = nTotTime / nRepeat;                     // truncated average
    // or, rounded to nearest:
    nOnce = (nTotTime + (nRepeat>>1)) / nRepeat;
    // or, as a floating-point average:
    fTime = ((double)nTotTime) / ((double)nRepeat);
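The same averaging idea can be sketched without the 32-bit overflow concern by keeping the total in a 64-bit value. The sketch below is an assumption-laden variant, not the poster's code: it swaps RDTSC for std::chrono::steady_clock, so the unit is nanoseconds rather than clock cycles, and my_func and average_ns_per_call are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical function under test; replace with the real one.
inline std::uint64_t my_func(std::uint64_t a, std::uint64_t b) {
    return a * 31u + b;
}

// Calls my_func() n_repeat times and returns the mean time per call in
// nanoseconds. The total is held in 64 bits, so it does not overflow the way
// a 32-bit cycle count can.
inline double average_ns_per_call(std::uint32_t n_repeat) {
    assert(n_repeat > 0);
    volatile std::uint64_t sink = 0;  // volatile keeps the loop from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (std::uint32_t n = 0; n < n_repeat; ++n) {
        sink = my_func(sink, n);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::int64_t total_ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return static_cast<double>(total_ns) / static_cast<double>(n_repeat);
}
```

As with the original, the result is an average that still includes any pre-emption, so it is best used to compare two versions of a function rather than as an absolute figure.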
Last edited by wildgoose; Jul 4th, 2009 at 1:39 pm. Reason: code twiddle
Hi,
OK, I don't need to fully understand how you do it in order to use it, but my purpose is different from benchmarking. The averaging is a very clever idea indeed, but I will not run the same function N times. I have N threads running the same function, and every thread needs to know how long it has already been executing. I don't think this method is applicable to such a case...
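The requirement stated here, each thread knowing how long it has been running, can be sketched with a small per-thread stopwatch. This is a minimal sketch under assumptions: ThreadStopwatch and run_until are hypothetical names, and the clock is wall-clock time (std::chrono::steady_clock), so it counts time spent pre-empted as well as time spent executing.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: each thread owns one and asks it how long it has run.
class ThreadStopwatch {
public:
    ThreadStopwatch() : start_(std::chrono::steady_clock::now()) {}

    // Wall-clock seconds since this stopwatch was constructed.
    double elapsed_seconds() const {
        const auto now = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(now - start_).count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};

// Worker body that loops until its own time budget is used up and returns
// the number of iterations it completed.
inline unsigned long run_until(double budget_seconds) {
    assert(budget_seconds >= 0.0);
    ThreadStopwatch watch;  // started when the worker body begins
    unsigned long iterations = 0;
    while (watch.elapsed_seconds() < budget_seconds) {
        ++iterations;       // placeholder for one unit of real work
    }
    return iterations;
}
```

Each of the N threads would call run_until (or hold its own ThreadStopwatch) from inside its thread function, so no state is shared between threads.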
AFAIK it is not possible to time a specific thread as if that thread were the only thing running on the operating system (Windows in this case). One reason is that the OS will perform thousands of context switches while the thread is running, so any time you calculate will include the time spent on all those other things as well. Any profiling you attempt will therefore give only approximations, not absolutes.
Anything you try will be ballpark. One workaround is to run the test as I described, then encode the approximate average into your program and add it to a bucket for each worker thread every time that thread performs the task. It won't be accurate, but it will be in the ballpark. I think it's the best you're going to do.
But keep in mind that it won't be accurate. Don't forget to take the samples in a release build with optimization turned on, but make sure the timing calls are outside the scope being optimized, or the optimizer will rearrange your code and the tracking tags won't be where you think they are!
If you're trying to monitor worker thread usage, then keeping a task count per thread would be just as effective!
You mentioned several threads doing the same job, which indicates worker threads. I'm assuming you have a number-crunching task, so find the number of CPUs you have and multiply by two. That is the number of worker threads you'll need for that one task to be most efficient and to run your processor dry. You can request which processor a thread runs on, but the operating system decides, though you can override it: over-request your threads, then read which CPU each one is running on. Once you have the distribution you want, release the ones you don't want! It's kind of crude, but it's the only way I know to override the operating system's logic, because, as I mentioned, you're only requesting a CPU; that doesn't mean it has to give it to you!
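The two suggestions above (a task count per thread, and twice as many workers as CPUs) can be sketched together. This is an illustration only: suggested_worker_count and run_workers are hypothetical names, the factor of two is the poster's rule of thumb rather than a guarantee of best throughput, and the sketch does not attempt the CPU-affinity trick, which needs platform-specific calls.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Worker count following the rule of thumb above: twice the number of CPUs.
// hardware_concurrency() may return 0 when the count is unknown, so fall
// back to assuming a single CPU in that case.
inline unsigned suggested_worker_count() {
    unsigned cpus = std::thread::hardware_concurrency();
    if (cpus == 0) cpus = 1;
    return cpus * 2;
}

// Spawns the suggested number of workers; each one performs tasks_per_worker
// placeholder tasks and counts them in its own slot. Returns the counts.
inline std::vector<unsigned> run_workers(unsigned tasks_per_worker) {
    const unsigned n = suggested_worker_count();
    assert(n > 0);
    std::vector<unsigned> done(n, 0);  // one task counter per thread
    std::vector<std::thread> pool;
    pool.reserve(n);
    for (unsigned i = 0; i < n; ++i) {
        pool.emplace_back([&done, i, tasks_per_worker] {
            for (unsigned t = 0; t < tasks_per_worker; ++t) {
                ++done[i];  // each thread writes only its own element
            }
        });
    }
    for (auto& th : pool) th.join();
    return done;
}
```

Because every thread touches only its own counter, no locking is needed; the counts are read only after all threads have been joined.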
A task counter would not solve the problem, because what I want is for the workers to stop after x elapsed time.
In the meanwhile I found this thread:
http://www.linuxforums.org/forum/lin...cess-time.html
which obviously is for Linux only. I don't want to read the /proc... file every time I need the execution time, so I am trying to use the clock_gettime(...) method. There is still the issue of system versus user time. Assuming this will not make a difference for me, I tried the POSIX method, but sometimes the difference between the two readings comes out negative, which does not make much sense. I tried to find the reference for this function to learn why this happens, but I didn't find it. Any idea (or link)?
From what I could understand there is no such (or similar) thing for Windows, so I am still limited, since I'm building a supposedly cross-platform library...
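For the POSIX side, a minimal sketch of reading the calling thread's CPU time with clock_gettime and CLOCK_THREAD_CPUTIME_ID is given below, together with a worker that stops once it has used a CPU-time budget. The names thread_cpu_ns and burn_cpu_for are hypothetical. One possible cause of a negative difference, which the post does not confirm, is subtracting only the tv_nsec fields: tv_nsec wraps back toward zero each time tv_sec increments, so the sketch folds both fields into a single nanosecond count before subtracting. On Windows, the Win32 function GetThreadTimes is documented as reporting per-thread kernel and user time; it is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>  // POSIX: clock_gettime, CLOCK_THREAD_CPUTIME_ID

// CPU time consumed so far by the calling thread, in nanoseconds.
// Returns -1 if the clock cannot be read.
inline std::int64_t thread_cpu_ns() {
    timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) return -1;
    // Combine seconds and nanoseconds so a later subtraction is never
    // thrown off by tv_nsec wrapping at a second boundary.
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Spins until the calling thread has used at least budget_ns of CPU time.
// Returns the CPU nanoseconds actually used, or -1 on a clock error.
inline std::int64_t burn_cpu_for(std::int64_t budget_ns) {
    assert(budget_ns >= 0);
    const std::int64_t start = thread_cpu_ns();
    if (start < 0) return -1;
    volatile unsigned long sink = 0;  // placeholder work the optimizer must keep
    std::int64_t used = 0;
    while (used < budget_ns) {
        for (int i = 0; i < 1000; ++i) sink = sink + 1;
        const std::int64_t now = thread_cpu_ns();
        if (now < 0) return -1;
        used = now - start;
    }
    return used;
}
```

Checking the clock once per batch of work, as the inner loop does, keeps the cost of the clock_gettime call small relative to the work itself; the batch size of 1000 is arbitrary.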
Last edited by rmlopes; Jul 6th, 2009 at 2:17 pm.