
Hello Philippe, Do you also plan to tackle the problem with QueryPerformanceCounter() on multi-core systems? QPC reports problematic/mismatching values on certain multi-core CPUs (e.g. Athlon X2). Cheers, Stephan ----- Original Message ----- From: "Philippe Vaucher" <philippe.vaucher@gmail.com> Newsgroups: gmane.comp.lib.boost.devel Sent: Sunday, October 29, 2006 6:16 PM Subject: Boost Timer Update

Hello, If I get it right, there's not much I can do about it? I don't think multi-core CPUs define some macro I could test in order to warn the user not to use the QueryPerformanceCounter timer... Anyway, 99% of users are expected to use microsec_timer, which uses posix_time::microsec_clock... QPC and timeGetTime() will be there for people who really need them. What's your opinion? Philippe
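[For context, a minimal sketch of the kind of measurement microsec_timer would wrap, using Boost.Date_Time's microsec_clock directly; the timed section is illustrative:]

#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main()
{
    using boost::posix_time::ptime;
    using boost::posix_time::microsec_clock;

    ptime start = microsec_clock::universal_time();
    // ... work being timed ...
    ptime stop = microsec_clock::universal_time();

    // time_duration knows its microsecond count directly
    std::cout << (stop - start).total_microseconds() << " microseconds\n";
}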

I forgot to add that my plan was simply to mention the issue in the documentation, and to emphasize that microsec_timer is the "main" timer and that the other ones are there for completeness' sake. Philippe

On 10/31/06, Philippe Vaucher <philippe.vaucher@gmail.com> wrote:
You could use SetThreadAffinityMask to force QPC to only run on one core, although I'm not sure what ramifications this might have for the timer's design and use. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/... --Michael Fawcett

This is interesting, but unfortunately it'd mean that the whole thread runs on one core, which most programmers very likely won't be happy with... and running the QPC timer in a thread of its own just looks like overkill to me. Maybe a middle-ground solution would be to provide some macro allowing the user to make the lib automatically use SetThreadAffinityMask... but I think that simply mentioning the issue in the documentation is better. Is there really that much of a need for a QPC-based timer? In my current state of mind I really provide it as an alternative to microsec_timer for those who specifically need it, but microsec_timer is portable and offers the same resolution as QPC... Philippe

On 10/31/06, Philippe Vaucher <philippe.vaucher@gmail.com> wrote:
Definitely something the user should be made aware of. I agree it's overkill for most, but it might be something users want (see below).
Is that really the case? Microsoft's own documentation states: "The default precision of the timeGetTime function can be five milliseconds or more, depending on the machine. You can use the timeBeginPeriod and timeEndPeriod functions to increase the precision of timeGetTime. If you do so, the minimum difference between successive values returned by timeGetTime can be as large as the minimum period value set using timeBeginPeriod and timeEndPeriod. Use the QueryPerformanceCounter and QueryPerformanceFrequency functions to measure short time intervals at a high resolution." I have not done any tests to verify that QPC is indeed more accurate over short intervals, but if that is the case, I think it should be provided. Note that games often base their physics calculations on elapsed time per frame, and they need to behave the same no matter the framerate. These intervals are often as small as 0.003 seconds, sometimes smaller. Perhaps of interest: NVIDIA has a Timer Function Performance test app that shows the performance of various timing methods. I have no clue whether the benchmark is well written, but the speed of the actual timing function may be of interest to some users, as well as its precision. http://developer.nvidia.com/object/timer_function_performance.html --Michael Fawcett
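[To make the quoted MSDN advice concrete, a rough sketch of raising timeGetTime's precision with timeBeginPeriod/timeEndPeriod; error handling is omitted and the function name is made up. Link against winmm.lib:]

#include <windows.h>
#include <mmsystem.h>   // timeGetTime, timeBeginPeriod; link with winmm.lib

// Request 1 ms timer resolution for the duration of the measurement,
// then restore the default, as the MSDN excerpt above suggests.
DWORD timed_section_ms()
{
    timeBeginPeriod(1);              // ask the system for 1 ms resolution
    DWORD start = timeGetTime();
    // ... work being timed ...
    DWORD elapsed = timeGetTime() - start;
    timeEndPeriod(1);                // must match the timeBeginPeriod call
    return elapsed;
}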

microsec_clock doesn't use timeGetTime()... it uses GetSystemTime() if I remember correctly. QueryPerformanceCounter does indeed have better resolution than timeGetTime(), and it also has less overhead... but unfortunately I don't know how it compares to GetSystemTime(). I will have to run some tests to determine that. At the moment my code offers:
- microsec_timer, which uses boost::posix_time::microsec_clock, itself based on GetSystemTime on windows and gettimeofday() on linux. I think that's the timer most users should use.
- second_timer, which uses boost::posix_time::second_clock; I forgot what that one uses underneath.
- qpc_timer, only available under windows, which uses QueryPerformanceCounter.
- tgt_timer, only available under windows, which uses timeGetTime.
And then I plan to add clock_timer, which would use std::clock... as for GetTickCount(), I don't think it'd be worth adding, as it's the worst win32 timer that exists. I'll give the nvidia timer test a shot in the next few days. Philippe
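[For reference, the usual QueryPerformanceCounter idiom that a qpc_timer would presumably wrap; a sketch, not the proposed library code:]

#include <windows.h>

// Measure an interval with QueryPerformanceCounter and convert the
// tick delta to seconds using the counter frequency.
double qpc_elapsed_seconds_demo()
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // ticks per second
    QueryPerformanceCounter(&start);
    // ... work being timed ...
    QueryPerformanceCounter(&stop);
    return double(stop.QuadPart - start.QuadPart) / double(freq.QuadPart);
}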

Philippe Vaucher wrote:
microsec_clock doesn't use timeGetTime()... it uses GetSystemTime() if I remember correctly.
I haven't tried it, but according to various links, GetSystemTime() only gives ~10ms or ~15ms precision (e.g. http://discuss.fogcreek.com/joelonsoftware3/default.asp?cmd=show&ixPost=85520). After Googling, I found this old but useful article on timers: http://www.ddj.com/dept/windows/184416651 In particular, this table summarizes the resolution: http://www.ddj.com/showArticle.jhtml?documentID=win0305a&pgno=17 Maybe GetSystemTime() should be renamed "centisec_timer" :) My personal vote would be for microsec_timer to be implemented based on QueryPerformanceCounter(), and we just make sure we point to the MSDN documentation. Regards, -Edward

Thank you for the links! Some results look bizarre though; timeGetTime() being at the same level as GetTickCount() seems quite weird to me. The graphs also seem to indicate QPC has one of the largest call overheads; would that be a concern? Anyway, this makes me wonder what the rationale is behind using GetSystemTimeAsFileTime() to implement date_time::microsec_clock? Does anyone know (or could an author enlighten us)? This also raises another question: if we choose to implement microsec_timer with QPC, what will we name the timer implemented with posix_time::microsec_clock (which is just a typedef for date_time::microsec_clock)? Philippe
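[For background, a sketch of reading GetSystemTimeAsFileTime(): it hands back a FILETIME in 100-nanosecond units since 1601-01-01 (UTC), though the actual update granularity is the coarser ~10-15 ms discussed above. The helper name is made up:]

#include <windows.h>

// Read the system clock as a single 64-bit count of 100-ns intervals.
unsigned long long system_time_100ns_ticks()
{
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);

    ULARGE_INTEGER ticks;           // widen the two 32-bit halves
    ticks.LowPart  = ft.dwLowDateTime;
    ticks.HighPart = ft.dwHighDateTime;
    return ticks.QuadPart;          // 100-ns units since January 1, 1601 (UTC)
}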

Philippe Vaucher wrote:
Sorry to be so quiet...it's not that I'm not interested, just too busy to really contribute. Anyway, the reason for GetSystemTimeAsFileTime is that it works. And it doesn't suffer from the rollover issue and the stability issues that the QPC stuff does in the face of multi-threading. So it just works and it gives enough precision for 90% of the applications.
I'll be happy to replace the current microsec_clock implementation as long as you can deal with rollover and multi-threading. Oh, and you can't add any data members to ptime which might expand its size. And since microsec_clock is all static, there could be some complications if you need to store data to handle rollover. A QPC-based implementation has been discussed for years, but I've never seen anyone actually implement something that works as reliably as GetSystemTimeAsFileTime... Jeff

Jeff Garland wrote:
I'll be happy to replace the current microsec_clock implementation as long as you can deal with rollover and multi-threading.
I guess it depends on what you're using the timer for. I really don't mind either way as long as the behaviour is well documented. When I'm using a "microsecond timer", it is because I want to measure short intervals, not long ones. But perhaps there should be clearer ways of specifying the precision/accuracy/resolution trade-offs desired? Philippe Vaucher wrote:
On second thought, apparently Java calls their QPC implementation "nano" instead. Perhaps we could use that terminology as well? On the QPC implementation, someone here (http://channel9.msdn.com/ShowPost.aspx?PostID=156175) suggested that it might be good enough to call SetThreadAffinityMask() to pin to cpu 0 and then set it back to the old value when done. I have no idea what kind of overhead it imposes, though. For more robust QPC implementations, the following two links have some more ideas. As Jeff suggests, this is not new, but this is the first time I've really looked seriously at the issue, as QPC always worked well enough for my profiling purposes. http://msdn.microsoft.com/msdnmag/issues/04/03/HighResolutionTimer/ http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q274323& Regards, -Edward
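[For illustration, a minimal sketch of the pin-and-restore idea from that Channel9 post; the helper name is made up and the overhead of the two affinity calls is untested:]

#include <windows.h>

// Temporarily pin the calling thread to CPU 0 so QueryPerformanceCounter
// always reads the same core's counter, then restore the old affinity.
LONGLONG read_qpc_pinned_to_cpu0()
{
    HANDLE thread = GetCurrentThread();
    DWORD_PTR old_mask = SetThreadAffinityMask(thread, 1); // bit 0 = CPU 0

    LARGE_INTEGER counter;
    QueryPerformanceCounter(&counter);

    SetThreadAffinityMask(thread, old_mask);               // restore
    return counter.QuadPart;
}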

At the moment I'm thinking about an API like this:
- portable: microsec_timer: documented as the most robust timer, not the best-resolution one, good enough for 90% of timings.
- portable: second_timer: this one is kind of obvious, no need to describe it much.
- portable: clock_timer: timer based on std::clock, with documentation about the clock() issues and resolutions.
- windows: qpc_timer: documented as best resolution, not thread safe, multi-core problem. Ability to set a macro that automatically calls SetThreadAffinityMask().
- windows: tgt_timer: documented as a good alternative to qpc (I think tgt > GetSystemTimeAsFileTime).
To discuss, but I may add these timers:
- windows: gst_timer: uses GetSystemTimeAsFileTime (actually microsec_clock already uses gst, but that may change, so why not offer an explicit one).
- windows: gtc_timer: uses GetTickCount().
- linux: gtod_timer: uses gettimeofday() (actually microsec_clock already uses gtod, but that may change, so why not offer an explicit one).
If you know of other linux timers that would be worth using, please mention them, thank you. I think that if each timer type is well documented regarding resolution, thread safety, multi-core issues and all the other gotchas these timers are giving us headaches about, then the user can choose the timer he wants and take the risk himself. Of course, I think we should suggest users use microsec_timer for portable code. I kind of like this approach because the user is free to make his choices and is aware of the issues... At the moment I'm not very happy about using QPC for microsec_timer because it looks like the pros (satisfies some people, not the majority?) don't outweigh the cons (all the issues with QPC).
On second thought, apparently Java calls their QPC implementation "nano" instead. Perhaps we could use that terminology as well?
I already thought a bit about this, and at the start my timers were named nanosec_timer or something, but then this idea collided with the date_time::microsec terminology and I decided to be consistent with other parts of boost. Philippe

Hello guys, First of all, happy new year! :) I thought it was time for a little update. I structured the whole thing a bit, and now there are 8 different timers:
Portable:
- typedef timer<microsec_device> microsec_timer; // boost::bla::microsec_clock
- typedef timer<second_device> second_timer; // boost::bla::second_clock
- typedef timer<clock_device> clock_timer; // std::clock()
Windows:
- typedef timer<qpc_device> qpc_timer; // QueryPerformanceCounter()
- typedef timer<tgt_device> tgt_timer; // timeGetTime()
- typedef timer<gstaft_device> gstaft_timer; // GetSystemTimeAsFileTime()
- typedef timer<gtc_device> gtc_timer; // GetTickCount()
POSIX:
- typedef timer<gtod_device> gtod_timer; // gettimeofday()
I created a tree like this at the moment:
boost/timer.hpp
boost/timer/devices.hpp
boost/timer/implementation.hpp
boost/timer/typedefs.hpp
boost/timer/devices/clock.hpp
boost/timer/devices/date_time.hpp
boost/timer/devices/GetSystemTimeAsFileTime.hpp
boost/timer/devices/GetTickCount.hpp
boost/timer/devices/gettimeofday.hpp
boost/timer/devices/QueryPerformanceCounter.hpp
boost/timer/devices/timeGetTime.hpp
And this brought me to some questions: What do you guys think about the structure? Should we pollute the boost namespace? Should I create a "timer" namespace inside the boost one? Should I create a "devices" namespace inside the timer one? I also have another question: can someone point me to another POSIX/linux timing api besides gettimeofday? Also, at the start I wanted to provide some way to get the overhead/resolution of each device from within the code, but then I removed it because I realized it's more trouble than it's worth, not to mention it probably won't be used. I decided to describe the overhead, resolution, pros/cons and issues in each device's header and in the upcoming documentation instead. Thank you, Philippe p.s: win32 apis like GetProcessTimes() or GetThreadTimes() aren't supported because there isn't much benefit in having them and they don't exist on win9x. Tell me if you think I should add them anyway.
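[To make the device-based design concrete, a guessed-at sketch of what the timer<Device> pattern could look like; the interface is inferred from the typedefs above, not taken from the actual code:]

#include <windows.h>

// Hypothetical device: anything exposing a time_type and a static now().
struct qpc_device
{
    typedef LONGLONG time_type;
    static time_type now()
    {
        LARGE_INTEGER counter;
        QueryPerformanceCounter(&counter);
        return counter.QuadPart;
    }
};

// The timer itself is device-agnostic; each typedef just picks a device.
template <typename Device>
class timer
{
public:
    timer() : start_(Device::now()) {}
    void restart() { start_ = Device::now(); }
    typename Device::time_type elapsed() const { return Device::now() - start_; }
private:
    typename Device::time_type start_;
};

typedef timer<qpc_device> qpc_timer;   // mirrors the typedefs listed above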

Oh, I forgot to ask about the names of the devices. "gstaft_device" for GetSystemTimeAsFileTime(), "qpc_device" for QueryPerformanceCounter(), "gtod_device" for gettimeofday()... do you guys think that's ok? I'm not very happy with this naming, but I can't find anything that sounds better. I'm open to ideas :) Philippe

I somehow guessed that answer, and I think I agree with it. Personally I'm for leaving it as it is, and more or less leaving my code as it is; that means people will use microsec_timer, which uses microsec_clock, for robust, good-quality code. If they really want more precision, they can use qpc_timer, which will be documented so people know about its issues. What do you think of this guideline?
Well, with all those conditions and what I know about QPC, it looks like an impossible challenge, and even if I succeeded it's likely that the overhead would be huge. I think I'll just go with the "convenience" solution where I offer to do the SetThreadAffinityMask calls for the user if he wants, etc. Philippe

Michael Fawcett wrote:
For what it's worth, I tried running the test on a dual processor Xeon, a dual core Athlon 64, and a single core Celeron D. In all cases QueryPerformanceCounter was the slowest, by at least a factor of 5 compared to the next slowest. GetTickCount was the fastest, and timeGetTime and the Pentium counter traded places in the middle depending on the computer. -- Daniel Wesslén

Yes, but this test seems to measure the api overhead and not the timer's precision... I don't know how much a big api overhead causes trouble when timing small intervals, but I expect that the better resolution of QPC outweighs its api overhead. Tell me if I misunderstood something. Philippe
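[For what it's worth, a crude sketch of the kind of per-call overhead measurement these tests perform; the iteration count and output format are arbitrary choices, not taken from the NVIDIA app:]

#include <windows.h>
#include <cstdio>

// Time a large number of QueryPerformanceCounter calls against the
// counter's own frequency to estimate the per-call overhead.
int main()
{
    LARGE_INTEGER freq, begin, end, dummy;
    QueryPerformanceFrequency(&freq);

    const int iterations = 1000000;
    QueryPerformanceCounter(&begin);
    for (int i = 0; i < iterations; ++i)
        QueryPerformanceCounter(&dummy);
    QueryPerformanceCounter(&end);

    double seconds = double(end.QuadPart - begin.QuadPart) / double(freq.QuadPart);
    std::printf("~%.0f ns per call\n", seconds * 1e9 / iterations);
}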

Me again! :) I made a small wiki for boost::timer, which is better than nothing as far as its current documentation goes :) You can see it at http://www.unitedsoft.ch/boost/wiki/ Philippe

Philippe Vaucher wrote:
Indeed it does.
As would I. It was surprising to me that the method that seems meant for measuring small intervals has such a comparatively large overhead.
Tell me if I misunderstood something.
No, no. I didn't mean much by it; I just thought I'd provide the information for completeness. Hence "for what it's worth." -- Daniel Wesslén

I'm not sure if you've read these already, but: http://www.gamedev.net/reference/programming/features/timing/ and http://www.ddj.com/dept/windows/184416651 Thanks, Michael Marcin

On 11/1/06, Michael Marcin <mike@mikemarcin.com> wrote:
That's a great link. Its tables illustrate many of the topics discussed in this thread, notably the accuracy of the timing functions and the call overhead itself. Clearly the user must make the choice based on his situation. I think Philippe was already working with that in mind. Thanks, --Michael Fawcett
participants (7)
-
Daniel Wesslén
-
Edward Lam
-
Jeff Garland
-
Michael Fawcett
-
Michael Marcin
-
Philippe Vaucher
-
Stephan Kaiser