[intrusive] first impression and constant-time removal
Hello, I like the intrusive library. I'm refactoring my code to use it in many places. One of the benefits of boost::intrusive is the constant-time removal of items from a collection. The documentation doesn't tell you how to remove an element from a list in constant time, so I looked through the API. At first, "remove" looked like a good idea, but this function removes all instances that match a given value - so it is O(n). The unwary programmer will inadvertently use this function, not realizing the huge waste of CPU time.
From what I can tell, the way to remove an element in constant time is this:
if (element.is_linked()) list.remove(list.iterator_to(element)); It requires 3 function calls. How about adding the something like the following function to list? void remove_member(reference value) It might follow the same pattern as std::set's erase, which returns the erasure count (0 or 1) as a size_t. Summary: 1) I found the remove() function's name ambiguous. It isn't clear that the complexity is linear time - that it compares elements to a value rather than taking a reference to an actual element. The only hint is the const_reference (rather than a reference) argument. Would "remove_all_matches" would be a better name? 2) Removing an element from a list requires 3 function calls. Wouldn't an std::set-style erase(reference) be a good addition to the API? It could be called "remove_member" or simply "erase". 3) If nothing else, it might be wise to have a small section in the documentation that demonstrates constant-time element removal and warns against improper use of "remove". -Erik
On Thu, Feb 05, 2009 at 06:44:19AM -0800, Erik Cassel wrote:
1) I found the remove() function's name ambiguous. It isn't clear that the
It fully agrees with the use of the remove() name in the standard <algorithm> header. Calling it something else would break the convention and introduce unnecessary extra learning effort for people acquainted with the standard library. I am actually replying to add item 4: iterator_to is an extremely unintuitive and little informative name which I'm continually stumbling over. The convention is to name the functions according to what they return, so "to_iterator" or "get_iterator" or "iterator_of"[1] would make much more sense. But I guess it's too late to complain about that "bug" now that the library has undergone review and gotten accepted... [1] Note how "iterator_of" is declarative, indicating return value, whereas "iterator_to" is imperative and leads the reader to expect that something will be done with the iterator that is passed in as argument (compare with functions of similar names than one encounters, e.g. "fixed_string_to_int", which takes string as a source argument; functions returning X are conventionally named to_X -- e.g., browse through DateTime library [just a random example]). I see the logic behind a statement like iterator_to(X), but each time I have to use that function, I have to stop, think twice and assure myself that the naming logic really is the opposite of the convention and that the function really does what I need.
Zeljko, Thank you for your reply. I'm glad to learn that intrusive remove() follows a known convention in std. I also like your "iterator_of" suggestion. So that leaves me with 2 remaining suggestions, which I'll restate for the record: 1) Since constant-time removal is a rather common task, shouldn't the docs have an example of how to do it properly? You have to dig pretty deeply into the API to figure out a O(0) way to do it. 2) Wouldn't a wrapper function for constant-time element removal be a good addition to the library? Thanks again, -Erik -----Original Message----- From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Zeljko Vrba Sent: Thursday, February 05, 2009 7:44 AM To: boost-users@lists.boost.org Subject: Re: [Boost-users] [intrusive] first impression and constant-time removal On Thu, Feb 05, 2009 at 06:44:19AM -0800, Erik Cassel wrote:
1) I found the remove() function's name ambiguous. It isn't clear that the
It fully agrees with the use of the remove() name in the standard <algorithm> header. Calling it something else would break the convention and introduce unnecessary extra learning effort for people acquainted with the standard library. I am actually replying to add item 4: iterator_to is an extremely unintuitive and little informative name which I'm continually stumbling over. The convention is to name the functions according to what they return, so "to_iterator" or "get_iterator" or "iterator_of"[1] would make much more sense. But I guess it's too late to complain about that "bug" now that the library has undergone review and gotten accepted... [1] Note how "iterator_of" is declarative, indicating return value, whereas "iterator_to" is imperative and leads the reader to expect that something will be done with the iterator that is passed in as argument (compare with functions of similar names than one encounters, e.g. "fixed_string_to_int", which takes string as a source argument; functions returning X are conventionally named to_X -- e.g., browse through DateTime library [just a random example]). I see the logic behind a statement like iterator_to(X), but each time I have to use that function, I have to stop, think twice and assure myself that the naming logic really is the opposite of the convention and that the function really does what I need. _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Erik Cassel wrote:
Hello,
I like the intrusive library. I'm refactoring my code to use it in many places.
Thanks.
One of the benefits of boost::intrusive is the constant-time removal of items from a collection. The documentation doesn't tell you how to remove an element from a list in constant time, so I looked through the API.
Just like you erase it in std::list: erase(), you need an iterator. From the documentation: http://www.boost.org/doc/libs/1_37_0/doc/html/boost/intrusive/list.html#id30... iterator erase(const_iterator i) ; Effects: Erases the element pointed by i of the list. No destructors are called. Returns: the first element remaining beyond the removed element, or end() if no such element exists. Throws: Nothing. Complexity: Constant. Note: Invalidates the iterators (but not the references) to the erased element.
From what I can tell, the way to remove an element in constant time is this:
if (element.is_linked()) list.remove(list.iterator_to(element));
If you have a reference to an element: list.erase(list.iterator_to(element));
It requires 3 function calls. How about adding the something like the following function to list?
void remove_member(reference value)
I think it's redundant.
It might follow the same pattern as std::set's erase, which returns the erasure count (0 or 1) as a size_t.
That can be a good idea, since many lists have no constant-time size, so the user should be informed of how many elements have been erased. It's not std compliant but it does not break any code, since the function returned nothing.
2) Removing an element from a list requires 3 function calls. Wouldn't an std::set-style erase(reference) be a good addition to the API? It could be called "remove_member" or simply "erase".
You need to know if the element is linked or not. is_linked is not enough because the element could be linked in another list. The programmer must be sure that the element is in the list to form a correct iterator.
3) If nothing else, it might be wise to have a small section in the documentation that demonstrates constant-time element removal and warns against improper use of "remove".
Ok. That's a good idea.
-Erik
Regards, Ion
On Thu, Feb 05, 2009 at 07:11:03PM +0100, Ion Gaztañaga wrote:
That can be a good idea, since many lists have no constant-time size, so the user should be informed of how many elements have been erased. It's not std compliant but it does not break any code, since the function returned nothing.
I would be careful here because C++ allows void returns. An example: template<typename T> void f(T &t) { return t.remove(..); } If you change remove to return something else than void, the code above will fail to compile. This example is contrived, but things could break in more elaborate generic algorithm implementations. Heck, the simplest example of the usefulness of the above is early exit from a function, e.g.: if(something) return t.remove(..); which avoids compound statement.
Zeljko Vrba wrote:
On Thu, Feb 05, 2009 at 07:11:03PM +0100, Ion Gaztañaga wrote:
That can be a good idea, since many lists have no constant-time size, so the user should be informed of how many elements have been erased. It's not std compliant but it does not break any code, since the function returned nothing.
I would be careful here because C++ allows void returns. An example:
template<typename T> void f(T &t) { return t.remove(..); }
If you change remove to return something else than void, the code above will fail to compile. This example is contrived, but things could break in more elaborate generic algorithm implementations. Heck, the simplest example of the usefulness of the above is early exit from a function, e.g.:
Thanks for the warning. However, I doubt Interprocess is being used in generic code because unlike standard containers, I don't think there are no generic equivalents for these type of containers. If this only breaks very little code I think the change could be worth of it. Otherwise, for non-constant-time lists this operation would be very inefficient since users would need to call O(N) count() to know how many instances have been erased. Regards, Ion
Ion Gaztañaga wrote:
Thanks for the warning. However, I doubt Interprocess is being used in generic code because unlike standard containers, I don't think there are no generic equivalents for these type of containers.
Sorry, I obviously meant "intrusive" instead of "interprocess". Regards, Ion
Ion,
is_linked is not enough because the element could be linked in another list. The programmer must be sure that the element is in the list to form a correct iterator.
Good point. I see that it is incumbent upon the developer to make these decisions. I'll encapsulate my logic as needed. Thank you for your comments! -Erik
participants (3)
-
Erik Cassel
-
Ion Gaztañaga
-
Zeljko Vrba