[Multi-Index] Skip-Scan "virtual" ordered-non-unique index on top of composite ordered-unique index

We have BMIs of entries with complex composite "natural-keys". For example: typedef boost::multi_index_container< FooIdPtr, // FYI: boost::shared_ptr<FooId> bmi::indexed_by< EntityIndex<ByUid>::unique, // FYI: Surrogate/Primary Key EntityIndex<ByNKey>::of<FooId>::unique, // FYI: Natural Key // additional indexes by type, name and parent EntityIndex<ByType>::non_unique, EntityIndex<ByName>::non_unique, EntityIndex<ByParent>::non_unique > > Impl; The EntityIndex template utility simply factors the index definitions used in several BMIs. For example template <> struct EntityIndex<ByNKey> { template <typename T_IDENT> struct of { typedef hashed_unique< // natural key (logical primary key) tag<ByNKey>, identity< shared_ptr<T_IDENT> >, EntityIdHash<T_IDENT>, EntityIdPred<T_IDENT> > unique; }; }; template <> struct EntityIndex<ByParent> { typedef hashed_non_unique< tag<ByParent>, BOOST_MULTI_INDEX_CONST_MEM_FUN( EntityId, const Uid&, parent_uid // FYI: EntityId is the base class of FooId and co. ) > non_unique; }; After recently reading http://www.sqlite.org/draft/optoverview.html#skipscan and its Skip-Scan optimization, I wondered about changing my ByNKey unique index from hashed to ordered, and thought that in theory, I could perhaps avoid the maintenance and memory cost of the additional type+name+parent indexes, since all three are subsets of the ByNKey unique index, and using a similar skip-scan strategy, which might/would still provide reasonable performance, at least better than no index at all (i.e. a full scan). I know BMI does not provide this of course, but do you think Joaquin that it would be possible to one-day add such "virtual" ordered non-unique indexes on top of an existing ordered (possibly unique) composite index? Of course in this case, the composite'ness is hidden inside FooId, and not exposed to BMI (it used to be a true BMI composite index, but modeling the identity as an object made more sense and simplified things), so whether an index is a subset of another would need to be "configurable" in some way, to yield a "compatible key" with the right "prefix" keys to skip to the next "sub-range" to scan (hopefully that makes sense...). I think I would be able to write custom code on top of a ByNKey ordered-non-unique index to implement a subset of the functionality a true BMI ordered non-unique index provides, for my own needs (with some difficulty, but feasible nonetheless), but the kind of template meta-programming BMI uses to implements its indexes is beyond my current skill level, which is why I'm posting to see what the experts (i.e. Joaquin) thinks about this Skip-Scan virtual indexes idea. Thanks, --DD

Dominique Devienne <ddevienne <at> gmail.com> writes:
We have BMIs of entries with complex composite "natural-keys". [...] After recently reading http://www.sqlite.org/draft/optoverview.html#skipscan and its Skip-Scan optimization, I wondered about changing my ByNKey unique index from hashed to ordered, and thought that in theory, I could perhaps avoid the maintenance and memory cost of the additional type+name+parent indexes, since all three are subsets of the ByNKey unique index, and using a similar skip-scan strategy, which might/would still provide reasonable performance, at least better than no index at all (i.e. a full scan).
This is an interesting idea! Without making any internal change to Boost.MultiIndex indices, a decent approximation to skip-scanning can be achieved. The code pasted at the bottom provides a function skip_scan that you can use like this // scan elements with keys *,1,2,3 skip_scan<1>( c,[](const element& v){std::cout<<v<<" ";}, 1,2,3); // scan elements with keys *,*,2,3 skip_scan<2>( c,[](const element& v){std::cout<<v<<" ";}, 2,3); Whether this is good enough would require some profiling, but at least it saves you some additional indices. If you decide to test the thing please tell me back. Code requires C++11 support :-/ Joaquín M López Muñoz Telefónica Digital #include <boost/tuple/tuple.hpp> #include <boost/ref.hpp> #include <utility> // index_sequence code stolen from // http://stackoverflow.com/questions/19783205/ // why-sizeof-t-so-slow-implement-c14-make-index-sequence-without-sizeof template<int...> struct index_sequence{using type=index_sequence;}; template<typename T> using invoke=typename T::type; template<typename T,typename U> struct concate; template<int ...i,int ... j> struct concate<index_sequence<i...>,index_sequence<j...>>: index_sequence<i...,(j+sizeof...(i))...>{}; template<int n> struct make_index_sequence_help : concate< invoke<make_index_sequence_help<n/2>>, invoke< make_index_sequence_help<n-n/2>>
{}; template<> struct make_index_sequence_help<0>:index_sequence<>{}; template<> struct make_index_sequence_help<1>:index_sequence<0>{}; template<int n> using make_index_sequence=invoke<make_index_sequence_help<n>>; template<int N,typename CompositeKey,typename Value> auto key(const CompositeKey& ck,const Value& v)->decltype( boost::tuples::get<N>(ck.key_extractors())(v)) { return boost::tuples::get<N>(ck.key_extractors())(v); } template<int N,typename Container,typename F,typename... Args,int... I> void skip_scan_helper( const Container& c,F f,index_sequence<I...>,Args&&... args) { for(auto it=c.begin(),it_end=c.end();it!=it_end;){ for(auto rng=c.equal_range( boost::make_tuple( boost::cref(key<I>(c.key_extractor(),*it))..., std::forward<Args>(args)...)); rng.first!=rng.second;++rng.first){ f(*rng.first); } it=c.upper_bound( boost::make_tuple(boost::cref(key<I>(c.key_extractor(),*it))...)); } } template< int N,typename Container,typename F, typename... Args,typename Indices=make_index_sequence<N>
void skip_scan(const Container& c,F f,Args&&... args) { skip_scan_helper<N>(c,f,Indices(),std::forward<Args>(args)...); } // testing #include <boost/multi_index_container.hpp> #include <boost/multi_index/ordered_index.hpp> #include <boost/multi_index/member.hpp> #include <boost/multi_index/composite_key.hpp> #include <iostream> using namespace boost::multi_index; struct element { int x0,x1,x2,x3,x4; }; std::ostream& operator<<(std::ostream& os,const element& v) { os<<"("<<v.x0<<","<<v.x1<<","<<v.x2<<","<<v.x3<<","<<v.x4<<")"; return os; } typedef multi_index_container< element, indexed_by< ordered_non_unique< composite_key< element, member<element,int,&element::x0>, member<element,int,&element::x1>, member<element,int,&element::x2>, member<element,int,&element::x3>, member<element,int,&element::x4> > >
multi_t;
int main() { multi_t c={ {0,1,2,3,4}, {0,1,2,3,5}, {0,1,3,4,0}, {1,0,0,0,0}, {1,1,2,3,0}, {1,1,2,3,0}, {2,1,2,1,0}, {2,2,2,3,0}, }; skip_scan<1>( c,[](const element& v){std::cout<<v<<" ";}, 1,2,3); std::cout<<"\n"; skip_scan<3>( c,[](const element& v){std::cout<<v<<" ";}, 3); std::cout<<"\n"; skip_scan<2>( c,[](const element& v){std::cout<<v<<" ";}, 2,3); std::cout<<"\n"; }

On Thu, Nov 28, 2013 at 8:37 PM, Joaquin M Lopez Munoz <joaquin@tid.es>wrote:
Code requires C++11 support :-/
Variadic templates, including sizeof...() decltype trailing return uniform initialization lambdas template alias perfect forwarding with rvalue-refs auto type deduction You've gone all in :). You "almost" could use range-for loop! Needless to say, this is going to require more than my current VS2010. Thanks for your interest on this, and the concrete example. I'll try to digest it, at my own pace though :). --DD

Dominique Devienne <ddevienne <at> gmail.com> writes:
On Thu, Nov 28, 2013 at 8:37 PM, Joaquin M Lopez Munoz <joaquin <at> tid.es> wrote: Code requires C++11 support :-/
Needless to say, this is going to require more than my current VS2010.
This can be backported to C++03 using the awesome Boost.Preprocessor library. Code above has been tried with VS2012 but should work in your compiler. Joaquín M López Muñoz Telefónica Digital #include <boost/preprocessor/arithmetic/add.hpp> #include <boost/preprocessor/arithmetic/sub.hpp> #include <boost/preprocessor/logical/and.hpp> #include <boost/preprocessor/punctuation/comma_if.hpp> #include <boost/preprocessor/repetition/enum.hpp> #include <boost/preprocessor/repetition/enum_binary_params.hpp> #include <boost/preprocessor/repetition/enum_params.hpp> #include <boost/preprocessor/repetition/repeat.hpp> #include <boost/ref.hpp> #include <boost/tuple/tuple.hpp> #ifndef SKIP_SCAN_MAX_KEYS // user configurable #define SKIP_SCAN_MAX_KEYS 5 #endif #define SKIP_SCAN_TUPLE_FIRST(z,n,_) \ boost::cref(key<n>(c.key_extractor(),*it)) #define SKIP_SCAN_CLASS_INVOKE(z,m,n) \ template< \ typename Container,typename F \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,typename T) \
\ static void invoke( \ const Container& c,F f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_BINARY_PARAMS(m,const T, &t)) \ { \ for(auto it=c.begin(),it_end=c.end();it!=it_end;){ \ for(auto rng=c.equal_range( \ boost::make_tuple( \ BOOST_PP_ENUM(n,SKIP_SCAN_TUPLE_FIRST,~) \ BOOST_PP_COMMA_IF(BOOST_PP_AND(n,m)) \ BOOST_PP_ENUM_PARAMS(m,t))); \ rng.first!=rng.second;++rng.first){ \ f(*rng.first); \ } \ it=c.upper_bound( \ boost::make_tuple(BOOST_PP_ENUM(n,SKIP_SCAN_TUPLE_FIRST,~))); \ } \ }
#define SKIP_SCAN_CLASS_SPECIALIZE(z,n,_) \ template<> struct skip_scan_class<n> \ { \ BOOST_PP_REPEAT( \ BOOST_PP_ADD(BOOST_PP_SUB(SKIP_SCAN_MAX_KEYS,n),1), \ SKIP_SCAN_CLASS_INVOKE,n) \ }; #define SKIP_SCAN_OVERLOAD(z,m,_) \ template< \ int N,typename Container,typename F \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,typename T) \
\ void skip_scan( \ const Container& c,F f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_BINARY_PARAMS(m,const T, &t)) \ { \ skip_scan_class<N>::invoke( \ c,f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,t) \ ); \ }
template<int N,typename CompositeKey,typename Value> auto key(const CompositeKey& ck,const Value& v)->decltype( boost::tuples::get<N>(ck.key_extractors())(v)) { return boost::tuples::get<N>(ck.key_extractors())(v); } template<int N> struct skip_scan_class; BOOST_PP_REPEAT( BOOST_PP_ADD(SKIP_SCAN_MAX_KEYS,1),SKIP_SCAN_CLASS_SPECIALIZE,~) BOOST_PP_REPEAT( BOOST_PP_ADD(SKIP_SCAN_MAX_KEYS,1),SKIP_SCAN_OVERLOAD,~) #undef SKIP_SCAN_OVERLOAD #undef SKIP_SCAN_CLASS_SPECIALIZE #undef SKIP_SCAN_CLASS_INVOKE #undef SKIP_SCAN_TUPLE_FIRST // testing #include <boost/multi_index_container.hpp> #include <boost/multi_index/ordered_index.hpp> #include <boost/multi_index/member.hpp> #include <boost/multi_index/composite_key.hpp> #include <iostream> using namespace boost::multi_index; struct element { element(int x0,int x1,int x2,int x3, int x4): x0(x0),x1(x1),x2(x2),x3(x3),x4(x4){} int x0,x1,x2,x3,x4; }; std::ostream& operator<<(std::ostream& os,const element& v) { os<<"("<<v.x0<<","<<v.x1<<","<<v.x2<<","<<v.x3<<","<<v.x4<<")"; return os; } typedef multi_index_container< element, indexed_by< ordered_non_unique< composite_key< element, member<element,int,&element::x0>, member<element,int,&element::x1>, member<element,int,&element::x2>, member<element,int,&element::x3>, member<element,int,&element::x4> > >
multi_t;
static inline void print_element(const element& v){std::cout<<v<<" ";} int main() { multi_t c; c.emplace(0,1,2,3,4); c.emplace(0,1,2,3,5); c.emplace(0,1,3,4,0); c.emplace(1,0,0,0,0); c.emplace(1,1,2,3,0); c.emplace(1,1,2,3,0); c.emplace(2,1,2,1,0); c.emplace(2,2,2,3,0); skip_scan<1>( c,print_element, 1,2,3); std::cout<<"\n"; skip_scan<3>( c,print_element, 3); std::cout<<"\n"; skip_scan<2>( c,print_element, 2,3); std::cout<<"\n"; }

Joaquin M Lopez Munoz <joaquin <at> tid.es> writes:
Dominique Devienne <ddevienne <at> gmail.com> writes:
On Thu, Nov 28, 2013 at 8:37 PM, Joaquin M Lopez Munoz <joaquin <at> tid.es> wrote: Code requires C++11 support :-/
Needless to say, this is going to require more than my current VS2010.
This can be backported to C++03 using the awesome Boost.Preprocessor library. Code above has been tried with VS2012 but should work in your compiler.
Previous code still used auto, tailing return and decltype, fixed now. Joaquín M López Muñoz Telefónica Digital #include <boost/preprocessor/arithmetic/add.hpp> #include <boost/preprocessor/arithmetic/sub.hpp> #include <boost/preprocessor/logical/and.hpp> #include <boost/preprocessor/punctuation/comma_if.hpp> #include <boost/preprocessor/repetition/enum.hpp> #include <boost/preprocessor/repetition/enum_binary_params.hpp> #include <boost/preprocessor/repetition/enum_params.hpp> #include <boost/preprocessor/repetition/repeat.hpp> #include <boost/ref.hpp> #include <boost/tuple/tuple.hpp> #ifndef SKIP_SCAN_MAX_KEYS // user configurable #define SKIP_SCAN_MAX_KEYS 5 #endif #define SKIP_SCAN_TUPLE_FIRST(z,n,_) \ boost::cref(key<n>(c.key_extractor(),*it)) #define SKIP_SCAN_CLASS_INVOKE(z,m,n) \ template< \ typename Container,typename F \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,typename T) \
\ static void invoke( \ const Container& c,F f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_BINARY_PARAMS(m,const T, &t)) \ { \ typedef typename Container::iterator iterator; \ for(iterator it=c.begin(),it_end=c.end();it!=it_end;){ \ for(std::pair<iterator,iterator> rng=c.equal_range( \ boost::make_tuple( \ BOOST_PP_ENUM(n,SKIP_SCAN_TUPLE_FIRST,~) \ BOOST_PP_COMMA_IF(BOOST_PP_AND(n,m)) \ BOOST_PP_ENUM_PARAMS(m,t))); \ rng.first!=rng.second;++rng.first){ \ f(*rng.first); \ } \ it=c.upper_bound( \ boost::make_tuple(BOOST_PP_ENUM(n,SKIP_SCAN_TUPLE_FIRST,~))); \ } \ }
#define SKIP_SCAN_CLASS_SPECIALIZE(z,n,_) \ template<> struct skip_scan_class<n> \ { \ BOOST_PP_REPEAT( \ BOOST_PP_ADD(BOOST_PP_SUB(SKIP_SCAN_MAX_KEYS,n),1), \ SKIP_SCAN_CLASS_INVOKE,n) \ }; #define SKIP_SCAN_OVERLOAD(z,m,_) \ template< \ int N,typename Container,typename F \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,typename T) \
\ void skip_scan( \ const Container& c,F f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_BINARY_PARAMS(m,const T, &t)) \ { \ skip_scan_class<N>::invoke( \ c,f \ BOOST_PP_COMMA_IF(m) \ BOOST_PP_ENUM_PARAMS(m,t) \ ); \ }
template<int N,typename CompositeKey,typename Value> typename boost::tuples::element< N,typename CompositeKey::key_extractor_tuple
::type::result_type key(const CompositeKey& ck,const Value& v) { return boost::tuples::get<N>(ck.key_extractors())(v); }
template<int N> struct skip_scan_class; BOOST_PP_REPEAT( BOOST_PP_ADD(SKIP_SCAN_MAX_KEYS,1),SKIP_SCAN_CLASS_SPECIALIZE,~) BOOST_PP_REPEAT( BOOST_PP_ADD(SKIP_SCAN_MAX_KEYS,1),SKIP_SCAN_OVERLOAD,~) #undef SKIP_SCAN_OVERLOAD #undef SKIP_SCAN_CLASS_SPECIALIZE #undef SKIP_SCAN_CLASS_INVOKE #undef SKIP_SCAN_TUPLE_FIRST // testing #include <boost/multi_index_container.hpp> #include <boost/multi_index/ordered_index.hpp> #include <boost/multi_index/member.hpp> #include <boost/multi_index/composite_key.hpp> #include <iostream> using namespace boost::multi_index; struct element { element(int x0,int x1,int x2,int x3, int x4): x0(x0),x1(x1),x2(x2),x3(x3),x4(x4){} int x0,x1,x2,x3,x4; }; std::ostream& operator<<(std::ostream& os,const element& v) { os<<"("<<v.x0<<","<<v.x1<<","<<v.x2<<","<<v.x3<<","<<v.x4<<")"; return os; } typedef multi_index_container< element, indexed_by< ordered_non_unique< composite_key< element, member<element,int,&element::x0>, member<element,int,&element::x1>, member<element,int,&element::x2>, member<element,int,&element::x3>, member<element,int,&element::x4> > >
multi_t;
static inline void print_element(const element& v){std::cout<<v<<" ";} int main() { multi_t c; c.emplace(0,1,2,3,4); c.emplace(0,1,2,3,5); c.emplace(0,1,3,4,0); c.emplace(1,0,0,0,0); c.emplace(1,1,2,3,0); c.emplace(1,1,2,3,0); c.emplace(2,1,2,1,0); c.emplace(2,2,2,3,0); skip_scan<1>( c,print_element, 1,2,3); std::cout<<"\n"; skip_scan<3>( c,print_element, 3); std::cout<<"\n"; skip_scan<2>( c,print_element, 2,3); std::cout<<"\n"; }
participants (2)
-
Dominique Devienne
-
Joaquin M Lopez Munoz