[Multi-Index] New "hierarchical" indexing?
I'm using BMI containers with several indexes, notably an index on the parent of each entry. But some of my BMIs have a hierarchical parent-child relationship of their entries, and I need to answer hierarchical queries like is-child-of, is-descendant-of, is-ancestor-of, etc... In another use case, entries have a "scenario" key, where scenario is hierarchical, and I want all entries in BMI A from a given scenario and all its ancestors, with the scenarios being in BMI B (i.e. "cross-bmi"). Of course I can do a "full scan" but I'm looking for something hopefully faster/smarter. Has anyone experimented with providing such a specialized index? At least in one instance, I'm manually keeping a separate container with all the edges of the DAG to be able to answer such a query, but then I have to maintain this "index" separately from the BMI, which is precisely what BMI is designed to avoid. This is also a solution that is OK in this particular case (~ 1500 nodes, ~7500 edges), but may be too expensive for some other graphs with more nodes. Any input would be appreciated. Thanks, --DD
________________________________________ De: boost-users-bounces@lists.boost.org [boost-users-bounces@lists.boost.org] En nombre de Dominique Devienne [ddevienne@gmail.com] Enviado el: viernes, 10 de julio de 2009 20:29 Para: boost-users Asunto: [Boost-users] [Multi-Index] New "hierarchical" indexing?
In another use case, entries have a "scenario" key, where scenario is hierarchical, and I want all entries in BMI A from a given scenario and all its ancestors, with the scenarios being in BMI B (i.e. "cross-bmi"). Of course I can do a "full scan" but I'm looking for something hopefully faster/smarter.
Not entirely sure if the following is directly applicable to your
problem, but anyway. If one of the keys of an index has a hierarchical
nature, one can do child-of queries by taking advantage of compatible
sorting criteria. Let me introduce an example. Suppose we have entries
having an associated scenario which we simply model by a path
showing the hierarchical relations among them:
typedef std::vectorstd::string path_t;
typedef flyweight
entry_container_t;
Now we define: struct parent_t:scenario_t { parent_t(const scenario_t& s):scenario_t(s){} }; struct children { bool operator()(const parent_t& p,const scenario_t& s)const { return std::lexicographical_compare( p.get().begin(),p.get().end(), s.get().begin(),s.get().begin()+std::min(p.get().size(),s.get().size())); } bool operator()(const scenario_t& s,const parent_t& p)const { return std::lexicographical_compare( s.get().begin(),s.get().begin()+std::min(p.get().size(),s.get().size()), p.get().begin(),p.get().end()); } }; What is this? parent_t is just a sort of alias to scenario_t that allows us to differentiate between a "terminal" scenario and a scenario used for purposes of recursive lookup. children is a compatible (with the order of entry_container_t) compare predicate that levels off scenario paths in a way that, for instance, if we have parent "a" then "a/b", "a/d", "a/b/e" etc. are deemed equivalent to the parent. This allows us to locate and count children: // count children entries of scenario s c.count(parent_t(s),children()) Please find attached a complete example shwoing the technique. Is this of any help to your particular problem? Thanks for using Boost.MultiIndex. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo
On Sat, Jul 11, 2009 at 5:17 AM, JOAQUIN M. LOPEZ MUÑOZ
In another use case, entries have a "scenario" key, where scenario is hierarchical, and I want all entries in BMI A from a given scenario and all its ancestors, with the scenarios being in BMI B (i.e. "cross-bmi"). Of course I can do a "full scan" but I'm looking for something hopefully faster/smarter.
Not entirely sure if the following is directly applicable to your problem, but anyway.
Thank you for taking the time to provide a full exemple. That's really nice!
typedef multi_index_container< entry, indexed_by< ordered_non_unique
> > > entry_container_t;
I'm currently using hashed indices only. Does the technique of using a compatible key extend to them? (I suspect not in this case, but I thought I'd ask, just in case I'm misunderstanding.)
What is this? parent_t is just a sort of alias to scenario_t that allows us to differentiate between a "terminal" scenario and a scenario used for purposes of recursive lookup. children is a compatible (with the order of entry_container_t) compare predicate that levels off scenario paths in a way that, for instance, if we have parent "a" then "a/b", "a/d", "a/b/e" etc. are deemed equivalent to the parent.
I was thinking about such a Materialized Path impl, after reading http://www.dbazine.com/oracle/or-articles/tropashko4 recently. But I would not have been able to express it in an example as easily as you could. I appreciate the leg up.
Is this of any help to your particular problem?
I think it does. I'll need time to digest and adapt it, but it sounds like Materialized Path with an ordered index will avoid a full scan. Thanks again for your help. --DD
Dominique Devienne escribió:
I'm currently using hashed indices only. Does the technique of using a compatible key extend to them? (I suspect not in this case, but I thought I'd ask, just in case I'm misunderstanding.)
No, it doesn't extend to hashed indices, you have to stick with ordered ones. The crucial fact upon which the parent/children sorting criteria relies is that sorting on scenario_t induces a lexicographical order that lists subtrees together (if you think of paths as forming a directory tree, lexicographical order is equivalent to inorder traversal).
What is this? parent_t is just a sort of alias to scenario_t that allows us to differentiate between a "terminal" scenario and a scenario used for purposes of recursive lookup. children is a compatible (with the order of entry_container_t) compare predicate that levels off scenario paths in a way that, for instance, if we have parent "a" then "a/b", "a/d", "a/b/e" etc. are deemed equivalent to the parent.
I was thinking about such a Materialized Path impl, after reading http://www.dbazine.com/oracle/or-articles/tropashko4 recently. But I would not have been able to express it in an example as easily as you could. I appreciate the leg up.
I've only taken a cursory look at the reference you provide, but seems indeed that my proposal basically implements the materialized path idea.
Is this of any help to your particular problem?
I think it does. I'll need time to digest and adapt it, but it sounds like Materialized Path with an ordered index will avoid a full scan.
Good luck with your project, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo
participants (3)
-
Dominique Devienne
-
JOAQUIN M. LOPEZ MUÑOZ
-
joaquin@tid.es