On Mon, Mar 13, 2006 at 10:43:34PM +0100, Joaquín López Muñoz wrote:
I'm not referring here to d_entry.size(), but to the average x.d_records.size() for x in d_entry. How many records does a cache entry have on the average? Does this experience large variations?
No: the average is about 2 records per entry. Entries with a single record are popular too, and a very small number have 40 or so, with everything in between.
This problem is intriguing me... especially the fact that you observe quadratic behavior in a linearly-decomposable (and hence O(n)-behaved) problem. I think the following easy test can shed some light. To state the facts: you get reasonably fast pruning when the pruning period (and thus the looked value) is low, with an explosion when the pruning period reaches a given threshold, around say looked=800, right? And this happens even (quote) "without inserting something in the container in between", right?
No, I was not clear enough then. If I prune more often, things are a lot faster. The only difference, as seen from the multi_index_container, is that items get inserted in between the deletes.
Why don't you try then the following: when pruning time is above a threshold, make doPrune do the pruning in two stages (and time each separately):
I'll try to do one better. Testing is hard for me, as filling up the cache takes around 6 hours at 2mbit/s of network bandwidth, so I've added serialization to my cache. I'll try to dump a large cache and write a small wrapper that reads it in, so we can do experiments ad nauseam.
1. Prune entries with expiry up to now/2.
2. Prune the rest, up to now.
According to the stated facts, doing doPrune in two stages is about equivalent to doing it in one fell swoop, which implies the summed time won't change.
I'll try this, let's see what happens. Thanks again!

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl         Open and Closed source services