Container optimized for "insert once"

Hi all, I'm looking for an associative container which is optimized for one-time insertion and avoids much of the overhead of std::map therefore. IIRC Andrei once published something like this: a vector of pairs that was sorted once and accessed by binary search. Is there something similar in boost? Stefan

From: Stefan Slapeta <stefan@slapeta.com>
I'm looking for an associative container which is optimized for one-time insertion and avoids much of the overhead of std::map therefore. IIRC Andrei once published something like this: a vector of pairs that was sorted once and accessed by binary search.
std::vector + std::sort + std::binary_search? (You'd populate the vector with your data, run std::sort on it, and then just use std::binary_search from that point forward.) Other search algorithms which might prove more efficient for your data type, of course.
Is there something similar in boost?
Not that I'm aware of, but that doesn't mean there isn't. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

Rob Stewart wrote:
From: Stefan Slapeta <stefan@slapeta.com>
I'm looking for an associative container which is optimized for one-time insertion and avoids much of the overhead of std::map therefore. IIRC Andrei once published something like this: a vector of pairs that was sorted once and accessed by binary search.
std::vector + std::sort + std::binary_search?
(You'd populate the vector with your data, run std::sort on it, and then just use std::binary_search from that point forward.)
Other search algorithms which might prove more efficient for your data type, of course.
FWIW, if you decide to go with the above, be sure to profile it against a std::map first. You may be surprised. Not to mention hash_map, if you have one.

Peter Dimov wrote:
Rob Stewart wrote:
[...] std::vector + std::sort + std::binary_search?
(You'd populate the vector with your data, run std::sort on it, and then just use std::binary_search from that point forward.)
Other search algorithms which might prove more efficient for your data type, of course.
FWIW, if you decide to go with the above, be sure to profile it against a std::map first. You may be surprised. Not to mention hash_map, if you have one.
I'm not so sure speed is the issue vs. size. Your typical std::map<> implementation will take about 3 pointers + int + sizeof(data) per node, whereas std::vector<> typically only has whole-container overhead. Unless your nodes are extremely large, that's not an insignificant amount of space overhead if you are only going to insert once. Dave

David B. Held wrote:
I'm not so sure speed is the issue vs. size. Your typical std::map<> implementation will take about 3 pointers + int + sizeof(data) per node, whereas std::vector<> typically only has whole-container overhead. Unless your nodes are extremely large, that's not an insignificant amount of space overhead if you are only going to insert once.
Yes. Furthermore, a default constructed vector is usually small and cheap, a map isn't (there is always at least one node instantiated). Stefan

Stefan Slapeta wrote:
David B. Held wrote:
I'm not so sure speed is the issue vs. size. Your typical std::map<> implementation will take about 3 pointers + int + sizeof(data) per node, whereas std::vector<> typically only has whole-container overhead. Unless your nodes are extremely large, that's not an insignificant amount of space overhead if you are only going to insert once.
Yes. Furthermore, a default constructed vector is usually small and cheap, a map isn't (there is always at least one node instantiated).
Your call. I was just saying that you shouldn't take claims that a vector<pair> is faster than a map for granted. BTW, pair<vector> has better cache coherency and hence slightly faster lookups, and you get to use the non-predicate version of lower_bound, which has lower abstraction penalty.

compressed_vector<> from boost::numeric::ublas could be appropriate... don't get much of a sense of how much from the original post. It is implemented with two vectors, one of indices and one of values. -t On Sep 8, 2004, at 10:17 AM, Stefan Slapeta wrote:
David B. Held wrote:
I'm not so sure speed is the issue vs. size. Your typical std::map<> implementation will take about 3 pointers + int + sizeof(data) per node, whereas std::vector<> typically only has whole-container overhead. Unless your nodes are extremely large, that's not an insignificant amount of space overhead if you are only going to insert once.
Yes. Furthermore, a default constructed vector is usually small and cheap, a map isn't (there is always at least one node instantiated).
Stefan
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

I'm looking for an associative container which is optimized for one-time insertion and avoids much of the overhead of std::map therefore. IIRC Andrei once published something like this: a vector of pairs that was sorted once and accessed by binary search.
std::vector + std::sort + std::binary_search?
[snip]
Is there something similar in boost?
Not that I'm aware of, but that doesn't mean there isn't.
I think there's a template called 'AssocVector' in Andrei's Loki library, which was used in place of std::map in Modern C++ Design. It was a vector of pairs, overlaid with the map interface, implemented as suggested above. Matt
participants (6)
-
David B. Held
-
Matthew Vogt
-
Peter Dimov
-
Rob Stewart
-
Stefan Slapeta
-
troy d. straszheim