On 03/05/13 04:19, Anurag Kalia wrote:
Rob Stewart-6 wrote
On May 1, 2013, at 4:33 PM, Anurag Kalia < anurag.kalia@ > wrote:
Vicente Botet wrote
Anurag Kalia wrote
As I have already mentioned, an application requiring absolute performance from a date class would use its operations. And nothing can beat a serial representation at comparisons, assignments and day/week arithmetic.
ymd representations, OTOH, optimize I/O. And an application requiring such optimization can simply employ a representation like: struct date { int y, m, d; };
which serves their purpose well enough.

Optimizing I/O, which is a relatively slow process anyway, is possibly misguided. The YMD format could be extracted from the serial format for reuse, but I imagine a given object being formatted for I/O in a single function call, à la strftime(), rather than in a series of formatting calls. Still, even the latter is possible with a formatter object which extracts the YMD values once and applies various formatting operations, as desired.

So, you are trying to say that optimizing I/O is not really rewarding since I/O by itself is already orders of magnitude slower? It makes sense to me. But might there not be cases where we have to store the output not to some output device, but only to a string in memory?

On 02/05/13 23:35, Anurag Kalia wrote:
Vicente Botet wrote
I don't see how the format at construction could improve the performance of I/O. Could you elaborate?

I am not at all suggesting that there would be any improvement in I/O performance; it would be chugging along at its own pace. What I meant to say was that a suboptimal output function is not going to affect I/O significantly, since our output function, even after taking two more CPU cycles, would still be significantly faster than the I/O itself, which takes hundreds of CPU cycles.
But I did come to the conclusion after the fact that optimizing I/O is a distraction. We are optimizing CPU cycles and in that scenario, efficient I/O operations are significant after all. It is mostly a big, never mind.
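To make the serial-versus-ymd trade-off concrete, here is a minimal sketch, assuming C++14 and a day count with day 0 = 1970-01-01 in the proleptic Gregorian calendar. The names serial_date, ymd, days_from_civil and civil_from_days are hypothetical and not taken from any library discussed in this thread; the conversion arithmetic is the widely circulated era-based algorithm, used here only for illustration.

```cpp
#include <cstdint>

// Field-based value, convenient for I/O.
struct ymd { int y; unsigned m; unsigned d; };

// Serial value: a single integer, so comparison and day/week
// arithmetic are plain integer operations.
struct serial_date { std::int32_t days; };

constexpr bool operator<(serial_date a, serial_date b) { return a.days < b.days; }
constexpr bool operator==(serial_date a, serial_date b) { return a.days == b.days; }
constexpr serial_date operator+(serial_date a, std::int32_t n) {
    return serial_date{a.days + n};
}

// ymd -> days since 1970-01-01.
constexpr std::int32_t days_from_civil(int y, unsigned m, unsigned d) {
    y -= m <= 2;  // treat January and February as months of the previous year
    const int era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);             // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // [0, 146096]
    return era * 146097 + static_cast<int>(doe) - 719468;
}

// days since 1970-01-01 -> ymd. This is the work a serial date has to do
// before it can be printed.
constexpr ymd civil_from_days(std::int32_t z) {
    z += 719468;
    const int era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const int y = static_cast<int>(yoe) + era * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const unsigned mp = (5 * doy + 2) / 153;
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    return ymd{y + (m <= 2), m, d};
}
```

A formatter object of the kind Rob describes would call something like civil_from_days once and then reuse the resulting y, m, d for every formatting operation.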
For the vast majority, however, serial representation is good enough. After caching, as you know, performance nears ymd in these scenarios too.

Caching suggests larger objects and, possibly, lower locality of reference.

By caching, I meant retrieving the y, m, d values explicitly, as well as some other properties of the date.

Could you please elaborate on the cache mechanism you have in mind? Where would you retrieve this information from? Would you use memoization? Or is the cache static and built at startup? Would the user be able to configure how many mappings could be stored? What would be the keys of the cache?
I commented about it under my GSoC application; but I never hit upon some of the questions that you have asked here so here I go.
My lazy, default choice is to have some fixed number of dates in the cache (they are not user-configurable). The algorithm by which they would be kept and the hashing function have not been decided.
But it wouldn't give me much. My attempt is at memoization. I want to infer data from other, already-accessed dates. For example, there can be at most 14 types of year-calendars. There are a number of properties due to the recurring nature of date arithmetic. Thus, I want to, say, store the first day of every year in the table and infer the y, m, d, weekday, ISO year, etc. of another date in the same year from that first day. It would be done lazily. And the key used would be the serial itself, because it is the only property we store in our date objects and it uniquely identifies any date.

You would need at least the ymd key and the days key: the first one to build your days date class and the other to get the user-land year/month/day attributes.
But as I said before in those comments, I have had little success until now. Dates already repeat their properties every 400 years. Let us see how much we can cut that period short.
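The calendar properties mentioned here (14 year-calendars, a 400-year period) can be illustrated with a small sketch. It is not the cache design itself, which the thread leaves undecided; all names (year_start, year_calendar, day_of_year and so on) are hypothetical, and the serial is assumed to count days from 1970-01-01.

```cpp
#include <cstdint>

constexpr bool is_leap(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

// Number of leap years in [1, y - 1]; valid for y >= 1.
constexpr int leaps_before(int y) {
    return (y - 1) / 4 - (y - 1) / 100 + (y - 1) / 400;
}

// Serial (days since 1970-01-01) of January 1 of year y, for y >= 1.
// This is the value a per-year table entry would hold.
constexpr std::int32_t year_start(int y) {
    return 365 * (y - 1970) + leaps_before(y) - leaps_before(1970);
}

// Weekday of a serial, 0 = Sunday; 1970-01-01 was a Thursday (4).
constexpr int weekday(std::int32_t serial) {
    return static_cast<int>(((serial + 4) % 7 + 7) % 7);
}

// One of 14 "year calendars": 7 possible weekdays for January 1,
// times leap or common year. Result is in [0, 13].
constexpr int year_calendar(int y) {
    return weekday(year_start(y)) + (is_leap(y) ? 7 : 0);
}

// Properties of any date in a year, inferred from the stored year start.
constexpr int day_of_year(std::int32_t serial, std::int32_t start) {
    return static_cast<int>(serial - start);  // 0-based
}
constexpr int weekday_from_start(std::int32_t serial, std::int32_t start) {
    return (weekday(start) + day_of_year(serial, start)) % 7;
}

// 400 Gregorian years contain 146097 days, a whole number of weeks,
// which is why every property repeats with that period.
static_assert(year_start(2400) - year_start(2000) == 146097, "days per 400 years");
static_assert(146097 % 7 == 0, "whole number of weeks");
```

With a table keyed by year and holding year_start, the month and day of a date follow from day_of_year and the leap flag, so one stored entry serves every date in that year.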
What do you see wrong with
date d(day(7),month(12),year(2013));
with respect to
date d = day(7) / month(12) / year(2013);

I think the first version is wordy.

There is just one additional character ;-)

Some may think the second version is "too cute" (someone in the Kona-date-lib thought so), but I think it is perfectly normal to try to push the syntax forward. The "<<" operator was overloaded too, and how! Dates are already written with separators. So it is not as if this library is trying to set a trend; it is merely following age-old conventions.

I think that for the standard this syntax must be proposed separately; that is, it is a SHOULD feature. For Boost I think it is a MUST so that we can experiment with it.

Have you explored the cache mechanism that Bloomberg implemented? It would be great if you could ask them for their implementation and for the performance tests they ran for their paper "Toward a Standard C++ 'Date' Class" (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3344.pdf). Memoization would be a bottleneck on multi-threaded systems.
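Returning to the two construction syntaxes compared above, the following sketch shows one way both could coexist. The types are hypothetical stand-ins, not the actual chrono::date interface; only the day / month / year order is covered, and no validation is performed here.

```cpp
// Strong types so that day(7), month(12) and year(2013) cannot be confused.
struct day   { int value; explicit constexpr day(int v)   : value(v) {} };
struct month { int value; explicit constexpr month(int v) : value(v) {} };
struct year  { int value; explicit constexpr year(int v)  : value(v) {} };

struct date {
    int y, m, d;
    // Constructor syntax: date d(day(7), month(12), year(2013));
    constexpr date(day dd, month mm, year yy)
        : y(yy.value), m(mm.value), d(dd.value) {}
};

// Intermediate result of day / month, waiting for a year.
struct day_month { day d; month m; };

constexpr day_month operator/(day d, month m) { return {d, m}; }

// Factory syntax: date d = day(7) / month(12) / year(2013);
constexpr date operator/(day_month dm, year y) { return date(dm.d, dm.m, y); }
```

Because each component has its own type, supporting other orderings would only mean adding further operator/ overloads and intermediate types; none of them can be confused with integer division.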
Take into account that this is not just about naming parameters but about defining the design space. Note that
day(32)
will fail and throw an exception. If you don't want the check to be done, you can use a no_check tag
day(7, no_check)
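A checked constructor paired with a tag-dispatched unchecked one could look roughly like this. It is a sketch only: the class layout, the exception type and the [1, 31] range are assumptions, not the library's actual definitions.

```cpp
#include <stdexcept>

// Tag type and tag object selecting the unchecked overload.
struct no_check_t {};
constexpr no_check_t no_check{};

class day {
public:
    // Checked: day(32) throws.
    explicit day(int v) : value_(v) {
        if (v < 1 || v > 31) {
            throw std::out_of_range("day must be in [1, 31]");
        }
    }

    // Unchecked: the caller promises that v is valid, so nothing is tested.
    constexpr day(int v, no_check_t) noexcept : value_(v) {}

    constexpr int value() const noexcept { return value_; }

private:
    int value_;
};
```

The tag costs nothing at run time; it only selects the overload, so the unchecked path compiles down to a plain store.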
In the same way that the library provides some constant objects for months, we could add constant objects for days, e.g. d_07, which are correct by definition. So the preceding examples could become
date d(d_7,dec,year(2013,no_check));
date d = d_7 / december / year(2013);
If no check is desired the following could construct a date efficiently.
date d(d_7,dec,2013,no_check);
Note that the operator/() factory doesn't allow you to avoid the validity check.

I realize that. But there is no need for bounds to be so intricately defined. A person either wants to define a date, or not. So I suggest leaving the checks out of day, month and year and letting them be checked only when constructing the actual date.
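The alternative suggested here, with unchecked components and a single validation when the date is built, might be sketched as follows. All names are hypothetical. Note that the combined check can reject 29 February 2013, which a per-component range check on day alone cannot.

```cpp
#include <stdexcept>

constexpr bool is_leap(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

// Length of month m (1..12) in year y; the caller must pass a valid month.
inline int days_in_month(int y, int m) {
    static const int len[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (m == 2 && is_leap(y)) ? 29 : len[m - 1];
}

// Plain wrappers: they only name the argument and perform no checks.
struct day   { int value; };
struct month { int value; };
struct year  { int value; };

struct date {
    int y, m, d;

    // The only place where validity is tested.
    date(day dd, month mm, year yy) : y(yy.value), m(mm.value), d(dd.value) {
        if (m < 1 || m > 12) {
            throw std::out_of_range("month must be in [1, 12]");
        }
        if (d < 1 || d > days_in_month(y, m)) {
            throw std::out_of_range("day is out of range for this month");
        }
    }
};
```

The cost is that an invalid day or month is detected later, at the point where the date is assembled rather than where the component was written.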
In my chrono::date library I check the bounds on all the date-related types, while allowing the check to be skipped with the specific no_check_t parameter.
We are not expecting, after all, that these day or month classes would be used independently. In fact, one of the strengths of the functional approach [ day(2013, 05, 03) ] is that it lets us drop the redundant usage of day, month and year in the syntax.
The problem is the ordering of the parameters, and performance: that is, how do you make the interface unambiguous without decreasing performance?

Best,
Vicente