Re: [Boost-users] SOLVED! Re: multi_index 1.33.0, known problems with RHEL 3 compiler (gcc 3.2.3)?

----- Original message ----- From: bert hubert <bert.hubert@netherlabs.nl> Date: Monday, March 13, 2006 8:32 pm Subject: Re: [Boost-users] SOLVED! Re: multi_index 1.33.0, known problems with RHEL 3 compiler (gcc 3.2.3)?
On Mon, Mar 13, 2006 at 05:22:35PM +0100, Joaquín Mª López Muñoz wrote:
I'd check the following for anomalies:
Joaquin, I'm busy tonight with other things so I'm only able to give you some approximate answers, for which I apologise. I hope to be able to give more definitive answers Wednesday.
Of course, come back when you can.
1. Could you (just for testing) replace index #0, which is now hashed, with an ordered_unique index and rerun? A quadratic behavior could be the result of a very bad hashing scheme: this is extremely unlikely for your current code, but who knows.
The original container had ordered_unique as its first index and showed a similar (large) increase in pruning time when pruning more cache entries. I did not look for a power law or anything.
OK, let's rule out hashing problems, then.
2. Is the size of CacheEntry::d_records roughly the same between prunes? Could it be that the average number of records in CacheEntry elements is disproportionately larger on those prunes that take longer?
The cache remains almost identical in size, around 50 million entries. The cache represents DNS records expiring; each run expires all DNS entries whose 'time to die' (ttd) is smaller than the current time.
I'm not referring here to d_entry.size(), but to the average x.d_records.size() for x in d_entry. How many records does a cache entry have on the average? Does this experience large variations?
This holds, I think, a vital clue to the problem at hand. Records are inserted with ever-increasing ttds, generally. There are a number of common time-to-live values for DNS records, i.e. 5, 60, 300, 3600, 7200 and 86400 seconds.
This means that we have a number of 'striding' ttds being inserted: now + 5, now + 60, now + 300 .. now + 86400. Might this trigger a badly balanced tree?
It shouldn't; otherwise it would be a bug in Boost.MultiIndex. Not that I categorically deny this possibility, but I tend to think the problem lies elsewhere. If we're left with nothing else, we could arrange a test for balancedness easily enough; please tell me when you're ready to instrument that.
On a related note, I wonder if chopping off the lower half of any sort of ordered tree might trigger worst case behaviour. Note that CacheEntry::getTTD() which forms one index might be slow to call.
The erase() part is not affected, as it doesn't invoke the comparison predicate nor the key extractor (getTTD in your case.)
3. What percentage of elements of d_cache are visited on each prune? I.e. what is the typical value of looked/d_cache.size()? The nonvisited elements shouldn't have any impact on phase /*2*/, but they have a (minimal) impact on those operations of /*1*/ performing as O(n log n), such as replace(). Anyway, no amount of nonvisited elements can have such a huge impact on the figures.
Typically 500 elements are visited of a 50 million element structure, so 1e-5. Things get crazy once > 1000 elements are visited.
Doesn't seem like large variations in d_entry.size() account for the problem, then.
Thanks Joaquin, I hope to provide more measurements soon.
This problem is intriguing me... especially the fact that you observe quadratic behavior in a linearly-decomposable (and hence O(n)-behaved) problem. I think the following easy test can shed some light. To state the facts: what you get is reasonably fast pruning when the pruning period (and thus the looked value) is low, with an explosion when the pruning period reaches a given threshold, around say looked=800, right? And this happens even (quote) "without inserting something in the container in between", right?

Why don't you try the following, then: when pruning time is above the threshold, make doPrune do the pruning in two stages (and time each separately):

1. up to now/2
2. up to now

According to the stated facts, making doPrune in two stages is about equivalent to making it in one fell swoop, which implies the summed time won't change much. On the other hand, it is also equivalent to having a halved, below-threshold pruning period, thus implying, contradictorily, that the summed time will be much less than with one-stage pruning. Why don't you do the test? Interesting things might come up...

Joaquín Mª López Muñoz
Telefónica, Investigación y Desarrollo

On Mon, Mar 13, 2006 at 10:43:34PM +0100, JOAQUIN LOPEZ MUÑOZ wrote:
I'm not referring here to d_entry.size(), but to the average x.d_records.size() for x in d_entry. How many records does a cache entry have on the average? Does this experience large variations?
No - the average will be 2 records. 1 record is common too, and a very small number of entries will have 40 or so, with everything in between as well.
This problem is intriguing me... specially the fact that you observe quadratic behavior in a linearly-decomposable (and hence O(n)-behaved) problem. I think the following, easy test, can shed some light: to state facts, what you get is reasonably fast pruning when pruning period (and thus looked value) is low, with an explosion when pruning period reaches a given threshold, around say looked=800, right? And this happens even (quote) "without inserting something in the container in between", right?
No, I was not clear enough then. If I prune more often, I note things are a lot faster. The only difference, as seen from the multi_index_container, is that items get inserted in between deletes.
Why don't you try then the following: when pruning time is above a threshold, make doPrune do the pruning in two stages (and time each separately):
I'll try to do one better. Testing is hard for me, as filling up the cache takes around 6 hours of 2 mbit/s network bandwidth, so I've added serialization to my cache. I'll try to dump a large cache and write a small wrapper that reads it in, so we can do experiments ad nauseam.
1. up to now/2. 2. up to now.
According to the stated facts, making doPrune in two stages is about equivalent to making it in one fell swoop, which implies the summed time won't change
I'll try this, let's see what happens. Thanks again!

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl              Open and Closed source services

The problem of quadratically slowing erases from boost::multi_index has been narrowed down to a FreeBSD-specific allocator problem, which is as yet unsolved. I'm trying to make this problem easy to reproduce but haven't managed to do so yet.

Kind regards,
bert
-- http://www.PowerDNS.com Open source, database driven DNS Software http://netherlabs.nl Open and Closed source services

Ok, regarding my previous message, I've now boiled down the problem of quadratically slower erases to something that does happen on FreeBSD 6.0 and doesn't happen on my Linux machines. The problem may be due to:

1) the boost::multi_index container (highly unlikely)
2) the g++ allocator
3) the FreeBSD libc, if the g++ allocator relies on it
4) the FreeBSD kernel

And I don't even know where to start. Output of the attached program on FreeBSD 6.0, g++ 4.1:

$ ./cache-test
Creating..
Copying 499950 nodes
100      162 usec        1.62 usec/erase
300      2547 usec       8.49 usec/erase
500      8132 usec       16.26 usec/erase
700      68262 usec      97.52 usec/erase
900      42949 usec      47.72 usec/erase
1100     201873 usec     183.52 usec/erase
1300     82376 usec      63.37 usec/erase
1500     398186 usec     265.46 usec/erase
1700     110701 usec     65.12 usec/erase

On Ubuntu Breezy, g++ 4.1:

$ ./cache-test
Creating..
Copying 499950 nodes
100      78 usec         0.78 usec/erase
300      238 usec        0.79 usec/erase
500      409 usec        0.82 usec/erase
700      527 usec        0.75 usec/erase
900      747 usec        0.83 usec/erase
1100     812 usec        0.74 usec/erase
1300     1112 usec       0.86 usec/erase
1500     1084 usec       0.72 usec/erase
1700     1545 usec       0.91 usec/erase

To reproduce, compile the following, which first seeds a container with ~500000 entries, copies it, erases n nodes, restores the original, erases n+200 nodes, etc.
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/key_extractors.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/format.hpp>
#include <boost/utility.hpp>
#include <boost/lexical_cast.hpp>
#include <sys/time.h>
#include <time.h>
#include <iostream>
#include <limits>
#include <vector>

using namespace std;
using namespace boost;
using namespace boost::multi_index;

// Simple microsecond stopwatch based on gettimeofday().
class DTime
{
public:
  void set()
  {
    gettimeofday(&d_set, 0);
  }

  unsigned int udiff()
  {
    struct timeval now;
    gettimeofday(&now, 0);
    return 1000000*(now.tv_sec - d_set.tv_sec) + (now.tv_usec - d_set.tv_usec);
  }

private:
  struct timeval d_set;
};

struct StoredRecord
{
  mutable uint32_t d_ttd;
  string d_string;

  bool operator<(const StoredRecord& rhs) const
  {
    return d_string < rhs.d_string;
  }
};

struct CacheEntry
{
  CacheEntry() {}
  CacheEntry(const string& name, const vector<StoredRecord>& records)
    : d_name(name), d_records(records)
  {}

  string d_name;
  typedef vector<StoredRecord> records_t;
  records_t d_records;

  // Key extractor for the ordered index: the earliest ttd of any record.
  uint32_t getTTD() const
  {
    uint32_t earliest = numeric_limits<uint32_t>::max();
    for(records_t::const_iterator i = d_records.begin(); i != d_records.end(); ++i)
      earliest = min(earliest, i->d_ttd);
    return earliest;
  }
};

typedef multi_index_container<
  CacheEntry,
  indexed_by<
    hashed_unique<member<CacheEntry, string, &CacheEntry::d_name> >,
    ordered_non_unique<const_mem_fun<CacheEntry, uint32_t, &CacheEntry::getTTD> >
  >
> cache_t;

int main()
{
  cache_t cache1, cache2;
  CacheEntry ce;
  StoredRecord sr;

  cout << "Creating..\n";
  for(unsigned int n = 0; n < 500000; ++n) {
    ce.d_name = lexical_cast<string>(random());
    sr.d_ttd = n/100 + (random() % 4)*300;
    ce.d_records.clear();
    ce.d_records.push_back(sr);
    sr.d_ttd = n/100 + (random() % 4)*300;
    ce.d_records.push_back(sr);
    cache1.insert(ce);
  }
  cout << "Copying " << cache1.size() << " nodes\n";
  cache2 = cache1;

  typedef cache_t::nth_index<1>::type cache_by_ttd_t;
  cache_by_ttd_t& ttdindex = cache1.get<1>();
  cache_by_ttd_t::iterator limit = ttdindex.begin();
  unsigned int udiff;

  // Erase n nodes from the ttd index, time it, restore, grow n.
  for(unsigned int n = 100; n < 5000; n += 200) {
    DTime dt;
    dt.set();
    limit = boost::next(ttdindex.begin(), n);
    ttdindex.erase(ttdindex.begin(), limit);
    udiff = dt.udiff();
    cout << format("%d %|30t|%d usec %|50t|%.02f usec/erase\n") % n % udiff % (1.0*udiff/n);
    cache1 = cache2;
  }
}

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl              Open and Closed source services
participants (2)
- bert hubert
- JOAQUIN LOPEZ MUÑOZ