Re: [Boost-users] SOLVED! Re: multi_index 1.33.0, known problems with RHEL 3 compiler (gcc 3.2.3)?

Hi Bert,
----- Original message -----
From: bert hubert
On Sat, Mar 11, 2006 at 10:03:21PM +0100, JOAQUIN LOPEZ MUÑOZ wrote:
2. You might try the equivalent expression:
waiters_by_ttd_index_t& ttdindex = boost::multi_index::get<KeyTag>(d_waiters);
Indeed, that solved the problem! I think what confused the older 'weaker' gcc is that all this code is within a template class. Thank you Joaquin! See http://mailman.powerdns.com/pipermail/pdns-users/2006-March/003140.html [...]
Good! Thanks for using Boost.MultiIndex, hope you're enjoying it.
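For reference, both spellings side by side in a minimal standalone sketch (the names Waiters and touchIndex are hypothetical, not the PowerDNS code): inside a class template the member form needs the extra template keyword on the dependent call, which is likely what tripped up the older gcc, while the free-function form sidesteps the issue.

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/tag.hpp>

using namespace boost::multi_index;

struct KeyTag {};

template<typename T>
struct Waiters
{
  typedef multi_index_container<
    T,
    indexed_by<
      ordered_unique< tag<KeyTag>, identity<T> >
    >
  > container_t;

  container_t d_waiters;

  void touchIndex()
  {
    // Free-function form: no dependent-name disambiguation needed.
    typename container_t::template index<KeyTag>::type& byKey1 =
      boost::multi_index::get<KeyTag>(d_waiters);

    // Member form: d_waiters has a dependent type, so 'template' is
    // required before get<> here.
    typename container_t::template index<KeyTag>::type& byKey2 =
      d_waiters.template get<KeyTag>();

    (void)byKey1;
    (void)byKey2;
  }
};

int main()
{
  Waiters<int> w;
  w.touchIndex();
}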
Another question, I now have this:

struct CacheEntry
{
  CacheEntry(const string& name, const vector<StoredRecord>& records)
    : d_name(name), d_records(records) {}

  string d_name;
  typedef vector<StoredRecord> records_t;
  records_t d_records;

  uint32_t getTTD() const
  {
    uint32_t earliest = numeric_limits<uint32_t>::max();
    for(records_t::const_iterator i = d_records.begin(); i != d_records.end(); ++i)
      earliest = min(earliest, i->d_ttd);
    return earliest;
  }
};

typedef multi_index_container<
  CacheEntry,
  indexed_by<
    hashed_unique< member<CacheEntry, string, &CacheEntry::d_name> >,
    ordered_non_unique< const_mem_fun<CacheEntry, uint32_t, &CacheEntry::getTTD> >
  >
> cache_t;
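As a point of reference only, the ordered index also allows pruning with a single range erase, at the price of dropping an entry as soon as its earliest record expires instead of pruning it partially (the partial pruning is discussed further down). A minimal sketch building on the cache_t above; the function name pruneAllExpired and the ttd <= now expiry convention are assumptions, not the actual PowerDNS code:

void pruneAllExpired(cache_t& d_cache, uint32_t now)
{
  // Index #1 is the ordered_non_unique index on getTTD(), so entries
  // come out sorted by the earliest ttd among their records.
  cache_t::nth_index<1>::type& ttdindex = d_cache.get<1>();

  // upper_bound(now) is the first entry whose earliest ttd is still in
  // the future; everything before it has at least one expired record
  // and is dropped wholesale here. The erase itself is linear in the
  // length of the erased range.
  ttdindex.erase(ttdindex.begin(), ttdindex.upper_bound(now));
}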
I then fill up a cache_t object, but I find that pruning it takes O(n^2) time based on how many entries I prune in one go. I'm now pruning once every 10 seconds, which takes 10 to 20 ms per try; if I prune once every 3 minutes, pruning takes 10 seconds!
The odd thing is that it appears to matter how much you prune in one go, without inserting something in the container in between. Perhaps the cost is amortized?
The pruning code:
[...]

So, the pruning code goes like:

/* 1 */ for(j = ttdindex.begin(); j != ttdindex.end(); ) {
          // leave to-be-pruned elements at the beginning
          // of ttdindex
        }
/* 2 */ ttdindex.erase(ttdindex.begin(), j);

Just to make sure: where do you observe the O(n^2) behavior, in /* 1 */, in /* 2 */, or in both? Can you time both parts separately?

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
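A minimal sketch of how the two parts might be timed separately, building on the cache_t sketched earlier; the wrapper name timedPrune is an assumption, and phase /* 1 */ is reduced here to a plain walk over the expired prefix rather than the real pruning logic:

#include <sys/time.h>
#include <cstdio>

// Microseconds elapsed between two gettimeofday() samples.
static unsigned int usecDiff(const struct timeval& a, const struct timeval& b)
{
  return (b.tv_sec - a.tv_sec) * 1000000 + (b.tv_usec - a.tv_usec);
}

void timedPrune(cache_t& d_cache, uint32_t now)
{
  cache_t::nth_index<1>::type& ttdindex = d_cache.get<1>();
  cache_t::nth_index<1>::type::iterator j;

  struct timeval t0, t1, t2;
  gettimeofday(&t0, 0);

  /* 1 */ // walk: leave to-be-pruned elements at the beginning of ttdindex
  for(j = ttdindex.begin(); j != ttdindex.end() && j->getTTD() <= now; ++j)
    ;

  gettimeofday(&t1, 0);

  /* 2 */ ttdindex.erase(ttdindex.begin(), j);

  gettimeofday(&t2, 0);

  printf("Walk took %uusec\n", usecDiff(t0, t1));
  printf("Erase took %u usec\n", usecDiff(t1, t2));
}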

On Sun, Mar 12, 2006 at 04:52:34PM +0100, JOAQUIN LOPEZ MUÑOZ wrote:
So, the pruning code goes like:
/* 1 */ for(j = ttdindex.begin(); j != ttdindex.end(); ) {
          // leave to-be-pruned elements at the beginning
          // of ttdindex
        }
/* 2 */ ttdindex.erase(ttdindex.begin(), j);
I've timed them separately: /* 1 */ is 'walk took', /* 2 */ is 'erase took'. It appears the two are more or less related, but /* 2 */ dominates the stats.

The number of items deleted is looked - partial, where partial is the number of records that were partially pruned and got a higher ttd because of the operation. So /* 1 */ does more than leave elements behind: it also kicks some elements upwards.

The list below gets interesting near the end.

Walk took 183usec Erase took 838 usec, looked: 166, quick: 101, full: 53, partial: 12, no: 0
Walk took 227usec Erase took 1793 usec, looked: 228, quick: 145, full: 66, partial: 17, no: 0
Walk took 175usec Erase took 1031 usec, looked: 199, quick: 116, full: 75, partial: 8, no: 0
Walk took 218usec Erase took 1316 usec, looked: 197, quick: 116, full: 67, partial: 14, no: 0
Walk took 181usec Erase took 1569 usec, looked: 202, quick: 126, full: 64, partial: 12, no: 0
Walk took 228usec Erase took 1405 usec, looked: 188, quick: 109, full: 63, partial: 16, no: 0
Walk took 182usec Erase took 1424 usec, looked: 192, quick: 128, full: 54, partial: 10, no: 0
Walk took 389usec Erase took 6870 usec, looked: 520, quick: 361, full: 143, partial: 16, no: 0
Walk took 287usec Erase took 6294 usec, looked: 473, quick: 344, full: 121, partial: 8, no: 0
Walk took 338usec Erase took 7356 usec, looked: 541, quick: 413, full: 120, partial: 8, no: 0
Walk took 349usec Erase took 5000 usec, looked: 403, quick: 272, full: 115, partial: 16, no: 0
Walk took 353usec Erase took 4873 usec, looked: 410, quick: 288, full: 103, partial: 19, no: 0
Walk took 355usec Erase took 6346 usec, looked: 450, quick: 330, full: 103, partial: 17, no: 0
Walk took 324usec Erase took 6318 usec, looked: 494, quick: 377, full: 103, partial: 14, no: 0
Walk took 389usec Erase took 6535 usec, looked: 463, quick: 339, full: 107, partial: 17, no: 0
Walk took 256usec Erase took 4742 usec, looked: 401, quick: 302, full: 90, partial: 9, no: 0
Walk took 187usec Erase took 1878 usec, looked: 214, quick: 151, full: 51, partial: 12, no: 0
Walk took 196usec Erase took 1501 usec, looked: 213, quick: 143, full: 56, partial: 14, no: 0
Walk took 491usec Erase took 7775 usec, looked: 601, quick: 412, full: 171, partial: 18, no: 0
Walk took 533usec Erase took 9407 usec, looked: 427, quick: 328, full: 89, partial: 10, no: 0
Walk took 2400usec Erase took 49265 usec, looked: 838, quick: 586, full: 232, partial: 20, no: 0
Walk took 2511usec Erase took 56049 usec, looked: 807, quick: 620, full: 175, partial: 12, no: 0
Walk took 4478usec Erase took 87982 usec, looked: 1699, quick: 1274, full: 372, partial: 53, no: 0

I'll try to graph this behaviour a bit. Thanks!

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl         Open and Closed source services

bert hubert wrote:
On Sun, Mar 12, 2006 at 04:52:34PM +0100, JOAQUIN LOPEZ MUÑOZ wrote:
So, the pruning code goes like:
/* 1 */ for(j = ttdindex.begin(); j != ttdindex.end(); ) {
          // leave to-be-pruned elements at the beginning
          // of ttdindex
        }
/* 2 */ ttdindex.erase(ttdindex.begin(), j);
I've timed them separately: /* 1 */ is 'walk took', /* 2 */ is 'erase took'. It appears the two are more or less related, but /* 2 */ dominates the stats.
The number of items deleted is looked - partial, where partial is the number of records that were partially pruned and got a higher ttd because of the operation. So /* 1 */ does more than leave elements behind: it also kicks some elements upwards.
The list below gets interesting near the end.
[...]

Certainly, your figures show an unexpected rise in execution time, beyond linear, as the number of looked-up elements grows. I've done some local tests with a similar data structure (one hashed index plus one ordered index) and erase() behaves, as it should, as O(n); there's no amortization in erase(): basically, erase(it0,it1) erases the range elements one by one. The /* 1 */ part should perform as O(n log n), the log n part due to the usage of replace(), still way below quadratic.

I'd check the following for anomalies:

1. Could you (just for a test) replace index #0, which is now hashed, with an ordered_unique index and rerun? A quadratic behavior could be the result of a very bad hashing scheme; this is extremely unlikely for your current code, but who knows.

2. Is the size of CacheEntry::d_records roughly the same between prunes? Could it be that the average number of records in CacheEntry elements is disproportionately larger on those prunes that take longer?

3. What percentage of the elements of d_cache is visited on each prune, i.e. what is the typical value of looked/d_cache.size()? The nonvisited elements shouldn't have any impact on phase /* 2 */, but they have a (minimal) impact on those operations of /* 1 */ that perform as O(n log n), such as replace(). Anyway, no amount of nonvisited elements can have such a huge impact on the figures.

Looking forward to your feedback,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
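To make the role of replace() concrete, here is one plausible shape for the two phases; a sketch only, under assumed names and an assumed ttd <= now expiry test, not the actual PowerDNS loop. Fully expired entries are left in the doomed prefix, while partially expired ones get their dead records dropped and are re-keyed with replace(), each re-keying costing O(log n) in the ordered index:

void pruneWithPartials(cache_t& d_cache, uint32_t now)
{
  typedef cache_t::nth_index<1>::type ttdindex_t;
  ttdindex_t& ttdindex = d_cache.get<1>();

  /* 1 */ // Walk the expired prefix: getTTD() <= now means at least one record expired.
  for(ttdindex_t::iterator j = ttdindex.begin();
      j != ttdindex.end() && j->getTTD() <= now; ) {
    CacheEntry ce = *j;              // elements are immutable in place, so work on a copy

    // Keep only the records that are still alive.
    CacheEntry::records_t keep;
    for(CacheEntry::records_t::const_iterator r = ce.d_records.begin();
        r != ce.d_records.end(); ++r)
      if(r->d_ttd > now)
        keep.push_back(*r);

    if(keep.empty())
      ++j;                           // fully expired: leave it in the doomed prefix
    else {
      ce.d_records.swap(keep);
      ttdindex.replace(j++, ce);     // partially expired: re-key in O(log n); the entry
                                     // moves into the still-alive part of the index
    }
  }

  /* 2 */ // Everything whose key is still <= now is fully expired; erase that
          // prefix in one linear pass.
  ttdindex.erase(ttdindex.begin(), ttdindex.upper_bound(now));
}

Note that the erase boundary in this sketch comes from upper_bound(now) rather than from the walk iterator, because a re-keyed entry can re-sort to a position before the point where the walk stopped.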

On Mon, Mar 13, 2006 at 05:22:35PM +0100, Joaquín Mª López Muñoz wrote:
I'd check the following for anomalies:
Joaquin, I'm busy tonight with other things so I'm only able to give you some approximate answers, for which I apologise. I hope to be able to give more definitive answers Wednesday.
1. Could you (just for a test) replace index #0, which is now hashed, with an ordered_unique index and rerun? A quadratic behavior could be the result of a very bad hashing scheme; this is extremely unlikely for your current code, but who knows.
The original container had ordered_unique as its first index and showed a similar (large) increase in pruning time when pruning more cache entries. I did not look for a power law or anything.
2. Is the size of CacheEntry::d_records roughly the same between prunes? Could it be that the average number of records in CacheEntry elements is disproportionately larger on those prunes that take longer?
The cache remains almost identical in size, around 50 million entries. The cache represents DNS records expiring; each run expires all DNS entries whose 'time to die' (ttd) has already passed. This holds, I think, a vital clue to the problem at hand. Records are inserted with ever-increasing ttds, generally. There are a number of common time-to-live values for DNS records, i.e. 5, 60, 300, 3600, 7200, 86400 seconds. This means that we have a number of 'striding' ttds being inserted: now + 5, now + 60, now + 300 ... now + 86400. Might this trigger a badly balanced tree? On a related note, I wonder if chopping off the lower half of any sort of ordered tree might trigger worst-case behaviour. Note that CacheEntry::getTTD(), which forms one index, might be slow to call.
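If CacheEntry::getTTD() does turn out to be expensive (it walks d_records on every key extraction), one common workaround, sketched here as a variation of the CacheEntry/cache_t shown earlier and not as what PowerDNS actually does, is to cache the earliest ttd in a data member and key the ordered index on that member:

struct CacheEntry
{
  CacheEntry(const string& name, const vector<StoredRecord>& records)
    : d_name(name), d_records(records), d_ttd(computeTTD()) {}

  string d_name;
  typedef vector<StoredRecord> records_t;
  records_t d_records;
  uint32_t d_ttd;    // cached earliest ttd; must be refreshed whenever d_records changes

  uint32_t computeTTD() const
  {
    uint32_t earliest = numeric_limits<uint32_t>::max();
    for(records_t::const_iterator i = d_records.begin(); i != d_records.end(); ++i)
      earliest = min(earliest, i->d_ttd);
    return earliest;
  }
};

typedef multi_index_container<
  CacheEntry,
  indexed_by<
    hashed_unique< member<CacheEntry, string, &CacheEntry::d_name> >,
    ordered_non_unique< member<CacheEntry, uint32_t, &CacheEntry::d_ttd> >
  >
> cache_t;

The trade-off is that every code path that rewrites d_records, such as the partial prune, has to recompute d_ttd before handing the copy to replace(), or the index key and the data drift apart.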
3. What percentage of the elements of d_cache is visited on each prune, i.e. what is the typical value of looked/d_cache.size()? The nonvisited elements shouldn't have any impact on phase /* 2 */, but they have a (minimal) impact on those operations of /* 1 */ that perform as O(n log n), such as replace(). Anyway, no amount of nonvisited elements can have such a huge impact on the figures.
Typically 500 elements of a 50 million element structure are visited, so about 1e-5. Things get crazy once more than 1000 elements are visited.

Thanks Joaquin, I hope to provide more measurements soon.

Regards,
bert

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl         Open and Closed source services
participants (3)
- bert hubert
- JOAQUIN LOPEZ MUÑOZ
- Joaquín Mª López Muñoz