Hello,

I am interested in working on the Boost.Trie project, which was started during GSoC 2013, and I am looking for someone who can give me some advice about what I should change or add to the current implementation, because I'm not very experienced in developing library code. However, I do not want to do this during GSoC, because I won't be available during the summer. I want to contribute in my spare time. If anyone thinks they can help me with this, please reply to this email.

Thank you,
Cosmin
2015-02-20 13:36 GMT+04:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>: <...>
However, I do not want do this during GSoC because I won't be available during the summer. I want to contribute during my spare time.
This sounds like a killer feature to me :) I'd gladly help you.

As I see, you've already started work on the Trie project by removing the Compare template parameter from trie_node. That's a good start!

Let's continue the code simplification. It looks like trie_reverse_iterator could be removed and std::reverse_iterator used instead. This will reduce the code size and improve maintainability.

The clear() and destroy() functions almost duplicate each other. It looks like clear() must work as destroy() does, so fix the clear() method and use it everywhere in the code instead of destroy(). Make sure that everything works well and the tests pass.

In case of trouble, or when you finish the task, notify me.

Good luck! I hope you'll enjoy the project!

--
Best regards,
Antony Polukhin
Hello,

I couldn't manage to run the tests on Windows using bjam, and I also couldn't run any kind of Boost test in Visual Studio. Every test I try to run receives an access violation error.

Cosmin

On 20 February 2015 at 11:58, Antony Polukhin <antoshkka@gmail.com> wrote:
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Hello,

I have finally managed to run the tests.

Cosmin

On 20 February 2015 at 18:11, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
Hello,

I have two observations about the removal of the reverse iterator.

trie_reverse_iterator has a method get_key() which calls the get_key() method of the base iterator. If std::reverse_iterator is used, this method can no longer be accessed, and if I remove it there is no way to retrieve the key of a specific object.

Also, trie_iterator doesn't act like a standard STL iterator over a mapped data structure. Those iterators return a pair<Key, Value> when dereferenced; trie_iterator returns only Value.

To address the points above, I am thinking of changing operator* and operator-> of trie_iterator to return pair<const std::vector<Key>, Value>. I am also thinking about adding a new member to the trie_iterator class, named current_path, which would represent the current path in the trie from the root to the node the iterator is pointing to. By doing so, std::reverse_iterator can be used instead of trie_reverse_iterator, and the key associated with the current node can be retrieved in better running time.

I would like to know your opinion about the changes I am proposing.

Thank you,
Cosmin

On 20 February 2015 at 20:21, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
2015-02-20 23:17 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Good points about reverse_iterator.

Returning a pair<> is STL-like, which is good. I'm just not sure what the first template parameter should be, but std::vector<Key> is OK for now.

current_path looks like overkill. Iterators must be small and cheap to copy; I'd recommend optimizing get_key() instead:

* make a pass to the tree root, counting elements
* resize the vector to the counted element count
* make one more pass to the root, filling the vector from the end with values

This algorithm will reduce the number of memory allocations and the memory consumption.

--
Best regards,
Antony Polukhin
Hello,

I have performed the modifications required to use std::reverse_iterator instead of trie_reverse_iterator [0]. Also, I have optimized the get_key function [1]. The overall diff can be found here [2]. Please give me some feedback about the code when you have time.

Also, there seems to be a problem with trie_iterator::operator->. It doesn't work, and I think it may cause memory corruption. I think the problem is that it returns &(operator*()), which is the address of a value allocated on the stack that gets destroyed when it goes out of scope. However, I don't know how it should be implemented; please tell me what you think about this operator. As a workaround, I have used operator* instead, and the code passes the tests.

Thank you,
Cosmin
Hello,

I forgot to add the links to the patches in the previous email.

[0] https://github.com/cosminBoaca/boost.trie/commit/8ac00253dbf37ede23a399bb59a...
[1] https://github.com/cosminBoaca/boost.trie/commit/8423ad3588eb8768f2f0424ca63...
[2] https://github.com/cosminBoaca/boost.trie/compare/BoostGSoC13:master...maste...

Thank you,
Cosmin

On 22 February 2015 at 15:23, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
Hello,

"clear() and destroy() functions almost duplicate each other. Looks like clear() must work as destroy(), so fix the clear() method and use it everywhere in code instead of destroy(). Make sure that everything works well, and tests pass."

I have removed the destroy function [0]. I am ready for other tasks. I know from last year that the trie class needs a partial template specialization for Value = void in order to optimize the memory usage of trie_set and trie_multiset. However, I think that task is quite difficult, and I would like to solve it later, once I have gained more experience with the project. If it is the easiest task available, though, I will start with it.

[0] https://github.com/cosminBoaca/boost.trie/commit/3ebcdf38d238919916a227273d8...

Thank you,
Cosmin
2015-02-22 16:23 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I have performed the modifications required in order to use std::reverse_iterator instead of trie_iterator. [0]. Also, I have optimized the get_key function [1]. The overall diff can be found here [2]. Please give me some feedback about the code when you have time.
Good, added a few minor comments.
Also, it seems to be a problem with trie_iterator::operator->. It doesn't work and I think it may cause memory corruption. I think the problem is that it returns &(operator *()) which is a value allocated on stack that gets destroyed when it goes out of scope. However, I don't know how it should be implemented. Please tell me what do you think about this operator. As a workaround, instead of using this operator I have used operator * and the code passes the tests.
As a temporary workaround, operator*() could return pair<const std::vector<Key>, Value&> by value.

The ideal, hard-to-implement solution would be to use Boost.Intrusive's slist (or to develop your own slist) and use it in the node instead of std::vector:

// pseudocode
template <class Key>
class slist: public slist_base_hook<> {
    slist* parent_;
    Key key_;

    slist_iterator begin() { return this; }
    slist_iterator end() { return 0; }
    // ...

    std::vector<Key> to_vector() {
        std::vector<Key> key_path;
        key_path.reserve(this->size());
        std::copy(this->begin(), this->end(), std::back_inserter(key_path));
        return key_path;
    }
};

template <class Key, class Value>
class node: public pair<const slist<const Key>, Value> {};

class trie_iterator {
    node* vnode;
    reference operator*() {
        return static_cast<pair<const slist<const Key>, Value>>(*vnode);
    }
};

But this can wait... A simple task: trie_iterator accepts 4 template parameters, while the Key and Value types seem to be enough. Try to remove two template parameters, and take care of the almost identical iterator_type and iter_type typedefs.

--
Best regards,
Antony Polukhin
Hello,
As a temporary workaround operator*() could return pair<const std::vector<Key>, Value& > by value.
operator*() works fine and already returns that; the problem is with operator->. Also, I don't know what you mean by returning it by value.

Thank you,
Cosmin

On 22 February 2015 at 18:58, Antony Polukhin <antoshkka@gmail.com> wrote:
Hello,

I have encountered some problems when trying to remove the two template parameters. I tried to do it this way: in the trie_iterator class I modified the following typedefs:

typedef trie_iterator<Key, Value> iterator;
typedef trie_iterator<Key, Value> iter_type;
typedef trie_iterator<Key, const Value> const_iterator;
typedef std::pair<std::vector<key_type>, Value&> reference;
typedef std::pair<std::vector<key_type>, Value&>* pointer;

And in the trie class:

typedef detail::trie_iterator<Key, Value> iterator;

I thought this was all I needed to modify in order to solve the task, but it doesn't work. Actually, the following code doesn't compile, and I can't figure out why:

boost::tries::trie<char, int> t;
boost::tries::trie<char, int>::const_iterator ci = t.end();

It raises a compile-time error saying it cannot convert from iterator to const_iterator. It seems that the end() overload returning iterator is called instead of the const one, and I can't figure out why. If I change end to cend, it works.

Thank you,
Cosmin
On 2/23/2015 8:55 AM, Cosmin Boaca wrote:
Overload resolution in C++ is based on the types of the arguments in the call (including the constness of the object), not on the type of the return value. If you have:

boost::tries::trie<char, int> const t;
boost::tries::trie<char, int>::const_iterator ci = t.end();

I will bet it works. The equivalent

boost::tries::trie<char, int> t;
boost::tries::trie<char, int>::const_iterator ci = static_cast<boost::tries::trie<char, int> const&>(t).end();

should also work.
Hello,

So the reason this worked until now is that const_iterator could be implicitly constructed from iterator?

Cosmin
Hello,

I really don't know what to do. I'm getting some compile errors that make no sense to me. For instance:

trie_map<int, int> tm;
trie<int, int> t;
trie_map<int, int>::const_iterator it(tm.cend()); // raises a compile error
trie<int, int>::const_iterator(t.cend());         // works fine

I have no idea why the compile error is raised there. Should I push the changes so you can take a look?

Thank you,
Cosmin
2015-02-23 22:04 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Push the changes and I'll take a look at the sources tomorrow.

--
Best regards,
Antony Polukhin
2015-02-23 23:04 GMT+04:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
The first error is in trie_set at line 50. There, trie_type t is used, which is trie<key_type, value_type>. trie_set::iterator is a const iterator, while trie<key_type, value_type> can also return non-const iterators. Take a look at line 50 in trie_set.hpp:

iterator begin() { return t.begin(); }

Here `iterator` is a `const_iterator`, while `t.begin()` returns a non-const iterator. The solution would be to call the `t.cbegin()` method; in that case a const_iterator will be returned. Investigate the problem further; the same errors occur with the non-const-qualified end(), rend() and rbegin().

A few more notes:

* Line 238 of trie.hpp requires C++11 because of std::ref. This can be relaxed if you rewrite the body: return std::pair<std::vector<key_type>, Value&>(get_key(), vnode->value);
* The Boost folder structure has changed since 2013. Rearrange the folders as described here: http://rrsd.com/blincubator.com/requirements_directory_structure/ . You may also take a look at https://github.com/apolukhin/Boost.DLL to see how the folders must look. In the Installation section of its Readme you'll find a simple way to install a new library/trie into an existing Boost installation.
* When that's done, try to set up autotesting using this manual: https://svn.boost.org/trac/boost/wiki/TravisCoverals . This will give you a simple way to auto-test the code on the Linux platform.

In case of any errors or problems, do not hesitate to contact me and ask questions.

--
Best regards,
Antony Polukhin
Hello,

I have performed the modifications in trie_set. However, do you think you would have time to compile this piece of code?

#include <boost/trie/trie.hpp>

int main() {
    boost::tries::trie<int, int> t;
    boost::tries::trie<int, int>::const_iterator it = t.cbegin();
    return 0;
}

I just can't get this code to compile. Please tell me what I should do. The compile errors refer to trie_node_type from the trie_iterator class, but I can't figure out how to define it other than the way it is.

Thank you,
Cosmin
Hello,

I think this task is harder than I expected. I have managed to compile the code from the last email, but there are problems with the assignment operators for the const iterator. Also, in order to compile it I needed to use std::remove_const. All these compile errors appear because of the following typedefs:

typedef trie_iterator<Key, Value> iterator;
typedef typename std::add_const<Value>::type const_value;
typedef trie_iterator<Key, const_value> const_iterator;

I can't find any other way to define the const_iterator. If you have another idea, please tell me. If not, I think we should leave the iterator with 4 template parameters as it was, work on other things, and maybe change it later. I've been stuck on this for a few days, and I feel I could implement more and come back to it later.

Thank you,
Cosmin
Hello,

I have managed to remove the template parameters and get all the tests compiling and running [0]. Also, I have fixed the begin(), end(), rbegin() and rend() functions in trie_set / trie_multiset [1], and I use std::ref in operator* only when the compiler supports rvalue references [2]. The solution with make_pair<vector<key_type>, Value&> works only on non-C++11 compilers, because in C++11 the reference is removed from the return type of make_pair.

Please review the code when you have time and tell me what I should do next. I think a move constructor for C++11 compilers would be a nice addition.

[0] https://github.com/cosminBoaca/boost.trie/commit/f9e0800c8dcc73a615e73a87175...
[1] https://github.com/cosminBoaca/boost.trie/commit/a3eeae277470788cbec6ccba2aa...
[2] https://github.com/cosminBoaca/boost.trie/commit/1bfda5490e18e3bcd51ec86883a...

Thank you,
Cosmin
2015-02-27 15:01 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Good progress, well done!

Instead of using make_pair, just call the pair constructor directly; this works in both C++98 and C++11:

return std::pair<std::vector<key_type>, Value&>(get_key(), vnode->value);

Move constructors and operators are a good addition; they can be added at any time. The administration-related tasks look more significant right now:

* The Boost folder structure has changed since 2013. Rearrange the folders as described here: http://rrsd.com/blincubator.com/requirements_directory_structure/ . You may also take a look at https://github.com/apolukhin/Boost.DLL to see how the folders must look. In the Installation section of its Readme you'll find a simple way to install a new library/trie into an existing Boost installation.
* When that's done, try to set up autotesting using this manual: https://svn.boost.org/trac/boost/wiki/TravisCoverals . This will give you a simple way to auto-test the code on the Linux platform.

Another significant and hard task:

* The trie_node structure looks too heavy: there are too many member variables. Try to simplify the trie_node and value_list_node structures, and think of possible optimizations of the trie_node structure for trie_set|trie_map|trie_multiset|trie_multimap. For example, the sets do not require values, so the pointers to value list nodes can be removed for trie_set.

P.S.: I'll take a look into the "boost::tries::trie<int, int>::const_iterator it = t.cbegin();" issue on Monday or Tuesday.

--
Best regards,
Antony Polukhin
P.S.: I'll take a look into the "boost::tries::trie<int,int>::const_iterator it = t.cbegin();" issue on Monday or Tuesday.
Hello,

I have fixed this issue. The iterators work exactly the same way they did before. I will start working on the other tasks today.

Thank you,
Cosmin
Hello,

I have modified the directory structure. I think it's OK right now.

"Another significant and hard task: the trie_node structure looks too heavy: there are too many member variables. Try to simplify the trie_node and value_list_node structures. Think of possible optimizations of the trie_node structure for trie_set|trie_map|trie_multiset|trie_multimap. For example, the sets do not require values, so the pointers to value list nodes can be removed for trie_set."
I have thought about this task too. Let me share some ideas with you to get feedback on them:

1. A Value = void specialization for trie_node, which would remove the pointers to value_list_head/tail and self_value_count.
2. Another specialization (or something like it) would also help trie_map. The map node should only contain a value_type value member; value_list_head/tail and self_value_count should be removed in this case too, but I don't know how to specialize the template (for which Key, Value types).
3. Given 2, trie_multiset could easily be implemented in terms of trie_map<Key, int>, maybe using private inheritance. The int would keep track of the frequency of the key prefix.
4. value_count could be removed completely, but that would greatly reduce the performance of the count_prefix function. In my opinion it shouldn't be removed.

I don't know much about pred_node, next_node and child_iter_of_parrent yet. I need to take more time to look at them to see whether any could be removed. I see that pred_node and next_node are used in trie_iterator, and child_iter_of_parrent is used to maintain the values of pred_node and next_node; however, I don't know what the trade-offs of removing each of them would be.

Looking forward to your feedback on the ideas above.

Thank you,
Cosmin
2015-03-05 2:55 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I have modified the directory strucure. I think it's ok right now.
Another significant and hard task: trie_node structure looks too heavy: there are too many member variables. Try to simplify trie_node and value_list_node structures. Think of a possible optimizations of the trie_node structure for trie_set|trie_map|trie_multiset|trie_multimap. For example all the sets do not require values, so pointers to value list nodes can be removed from trie_set in that case.
I have thought of this task too. I will share with you some ideas in order to get some feedback about them.
1. Value = void specialization for trie_node which would remove pointers to value_list_head/tail, self_value_count.
+1
2. Another specialization or something like this would be ok for trie_map also. The map_node should only contain a member value_type value. value_list_head/tail, self_value_count should be removed in this case to but I don't know how to specialize the template (for which key, value) type.
+1

How about adding an additional template parameter to the map node? Something like:

template <class Key, class Value, bool IsMulti> class map_node;
3. Considering 2, trie_multiset could be easily implemented in terms of trie_map<key, int>, maybe using private inheritance. The int would keep track of the frequency of the key prefix.
+1
4. value_count could be completely removed but this would greatly reduce the performance of count_prefix function. In my opinion this shouldn't be removed.
This is debatable. Let's solve it some time later.
I don't know too much about pred_node, next_node, and the child_iter_of_parrent. I need to take some more time to look at those things to see if any of them could be removed. I see that pred_node and next_node are used in trie_iterator and child_iter_of_parrent is used to maintain the values of pred_node and next_node. However, I don't know which would be the tradeofs of the removal for each of them.
Let's eat the elephant one bite at a time. There are more than enough tasks right now, even without *_node and child_iter_of_parrent.
Looking forward to your feedback about the ideas above.
Good progress! The Trie library is becoming better and better.

--
Best regards,
Antony Polukhin
Is there a place for one more contributor (me)? Should I somehow prove my skills?

2015-03-05 10:15 GMT+03:00 Antony Polukhin <antoshkka@gmail.com>:
2015-03-05 2:55 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I have modified the directory strucure. I think it's ok right now.
Another significant and hard task: trie_node structure looks too heavy: there are too many member variables. Try to simplify trie_node and value_list_node structures. Think of a possible optimizations of the trie_node structure for trie_set|trie_map|trie_multiset|trie_multimap. For example all the sets do not require values, so pointers to value list nodes can be removed from trie_set in that case.
I have thought of this task too. I will share with you some ideas in order to get some feedback about them.
1. Value = void specialization for trie_node which would remove pointers to value_list_head/tail, self_value_count.
+1
2. Another specialization or something like this would be ok for trie_map also. The map_node should only contain a member value_type value. value_list_head/tail, self_value_count should be removed in this case to but I don't know how to specialize the template (for which key, value) type.
+1
How about adding aditional template parameter to map node. Something like
template <class Key, class Value, bool IsMulty> class map_node;
3. Considering 2, trie_multiset could be easily implemented in terms of trie_map<key, int>, maybe using private inheritance. The int would keep track of the frequency of the key prefix.
+1
4. value_count could be completely removed but this would greatly reduce the performance of count_prefix function. In my opinion this shouldn't be removed.
This is discussable. Let's solve this some time later.
I don't know too much about pred_node, next_node, and child_iter_of_parrent. I need to take some more time to look at those to see whether any of them could be removed. I see that pred_node and next_node are used in trie_iterator, and child_iter_of_parrent is used to maintain the values of pred_node and next_node. However, I don't know what the tradeoffs of removing each of them would be.
Let's eat an elephant part by part. There's more than enough tasks right now, even without *_node and child_iter_of_parrent.
Looking forward to your feedback about the ideas above.
Good progress! Trie library becoming better and better.
-- Best regards, Antony Polukhin
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
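A minimal sketch of what idea 1 above could look like. The member names here follow this discussion, not the actual Boost.Trie source, so treat them as illustrative: a partial specialization of the node for `Value = void` simply drops the value-list bookkeeping.

```cpp
#include <cstddef>
#include <map>

// Generic node: carries value-list bookkeeping (names are illustrative,
// loosely modelled on the discussion above).
template <class Key, class Value>
struct trie_node {
    std::map<Key, trie_node*> children;  // child links, keyed by symbol
    trie_node* parent;
    Value* value_list_head;              // dropped in the void case below
    Value* value_list_tail;
    std::size_t self_value_count;
};

// Value = void partial specialization (idea 1): set-like tries store no
// values, so only a membership flag remains besides the links.
template <class Key>
struct trie_node<Key, void> {
    std::map<Key, trie_node*> children;
    trie_node* parent;
    bool key_ends_here;                  // marks "a key ends at this node"
};
```

A quick size comparison shows the saving: `sizeof(trie_node<char, void>)` is smaller than `sizeof(trie_node<char, int>)`.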
Are there any outlined plans already? I need to use a specialized radix tree (or caching trie) for a highly performance-critical task, and I was wondering if I could somehow get my code into Boost so that it can be reused. The Trie type itself could be an NVI interface, and the specializations merely exchangeable templates. What do you think? On Thu, Mar 5, 2015 at 11:21 AM, endight . <endight@gmail.com> wrote:
Is there place for one more contributor (for me) ? Should i someshow prove my skills?
2015-03-05 19:34 GMT+03:00 Kenneth Adam Miller <kennethadammiller@gmail.com> :
Are there any outlined plans already? I need to use a specialized radix tree (or caching trie) for a highly performance-critical task, and I was wondering if I could somehow get my code into Boost so that it can be reused. The Trie type itself could be an NVI interface, and the specializations merely exchangeable templates.
What do you think?
Boost.Trie is currently not optimized and probably has some issues. Any good suggestions, patches, feature requests, and documentation improvements are welcomed. This is open source, so to help, all you need to do is contact the developers and show the code. Cosmin Boaca is actively developing the library, so he could probably give you some ideas about stuff that he needs help with. -- Best regards, Antony Polukhin
Ok, where is the repo? Cosmin: Are you open to the idea of having the trie act as a container, adopting a superset of the operations that work on traditional STL containers? Are you open to the idea of having the trie be parameterizable with concurrency strategies? With value strategies that can even be additional data structures? Are you open to the idea of having one specialization that is highly optimized for integer keys, with separate implementations for x32/x64? I'm interested in each of these, and I have ideas about how to move forward implementing them. I'd also like to work on applying what we make to several already open libraries, by writing bindings to Python and some other languages. I'm interested in writing some protobuf/capnproto specifications and some data-type translations between them too. Can we discuss a work partitioning and roadmap to move forward? Are there any shared documents, like Google Docs, that discuss plans and ideas? On Thu, Mar 5, 2015 at 12:13 PM, Antony Polukhin <antoshkka@gmail.com> wrote:
On 5 March 2015 at 19:15, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
Ok, where is the repo?
Cosmin: Are you open to the idea of having the trie act as a container and sort of adopt a superset of the operations that work on traditional STL containers?
Are you open to the idea of having the trie be parameterizable with concurrency strategies? With value strategies that can even be an additional data structures?
I am not quite experienced in concurrent data structures. Honestly, I have never thought of that, but I am open to any discussion about implementation strategies. Maybe Antony could help us make design decisions and plan the implementation.
Are you open to the idea of having one specialization that is highly optimized for integer keys, with separate implementations for x32/x64?
When you say integer keys, do you mean a set of integers like {2, 3, 4}, or a single integer with the string in the trie being its bits?
I'm interested in each of these. And I have ideas about how to move forward implementing them. I'd also like to work on applying what we make toward several already open libraries, by writing some bindings to python and some other languages. I'm interested in writing some protobuf/capnproto specifications and some data-type translations between the vernacular too.
I am not familiar with protobuf / capnproto.
Can we discuss a work partitioning and road map to move forward? Are there any shared documents, like google docs that discuss plans and ideas?
We could start a discussion, but I can't commit to any deadlines for implementing certain features. I'm developing this data structure in my spare time, and I have a full undergraduate course load plus undergraduate TA duties.
2015-03-05 20:48 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
On 5 March 2015 at 19:15, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
Ok, where is the repo?
Cosmin: Are you open to the idea of having the trie act as a container and sort of adopt a superset of the operations that work on traditional STL containers?
Are you open to the idea of having the trie be parameterizable with concurrency strategies? With value strategies that can even be an additional data structures?
I am not quite experienced in concurrent data structures. Honestly, I have never thought of that, but I am open to any discussion about implementation strategies. Maybe Antony could help us make design decisions and plan the implementation.
Well, usually you make a data structure concurrent after you've polished the implementation, removed the unnecessary fields, and tuned the structure for size. The big problem with concurrent data structures is that a generic solution would probably be much slower than a problem-tuned one. Making a concurrent trie is worth discussing, but in any case, tuning the current implementation must come first.
I'm interested in each of these. And I have ideas about how to move forward implementing them. I'd also like to work on applying what we make toward several already open libraries, by writing some bindings to Python and some other languages. I'm interested in writing some protobuf/capnproto specifications and some data-type translations between them too.
I am not familiar with protobuf / capnproto.
Trie's API is not stable yet, so do not rush with that task. -- Best regards, Antony Polukhin
Sure, agreed. On Fri, Mar 6, 2015 at 3:44 AM, Antony Polukhin <antoshkka@gmail.com> wrote:
Well, just conjecture, but I would guess that any one trie_node would not make much sense outside of its context. As in, a trie node has to make sense with respect to its surroundings. Say you use either the tree representation or a contiguous array. A copy of a single element would take only a few bytes out of the contiguous array. You can't do indexing or any of the other magic that a trie is meant to do with only one of them. On Fri, Mar 6, 2015 at 1:21 PM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
Hello,
Why is the trie_node noncopyable ?
Cosmin
On 6 March 2015 at 20:32, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
Well, just conjecture, but I would guess that any one trie_node would not make much sense outside of its context. As in, a trie node has to make sense with respect to its surroundings. Say you use either the tree representation or a contiguous array. A copy of a single element would take only a few bytes out of the contiguous array. You can't do indexing or any of the other magic that a trie is meant to do with only one of them.
I don't get it. You can just create another node by copying it member by member. Cosmin
Yeah, but then you didn't create just one node, you created the entire trie. On Fri, Mar 6, 2015 at 1:37 PM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
I have thought of this task too. I will share with you some ideas in order to get some feedback about them.
1. Value = void specialization for trie_node which would remove pointers to value_list_head/tail, self_value_count.
+1
I have done this task. I have also done refactoring and cleanup on the whole project. I decided to change a couple of things:
1. I have removed all the methods from the trie class that created / destroyed nodes, replacing them with calls to new and delete.
2. I have removed the node_allocator / value_allocator from the trie class. I think something like a node factory that handles only creation / destruction of nodes is more appropriate if we plan to allow custom allocators.
3. I have added some methods to the trie_node class in order to share a common interface between template specializations. Operations that query or change the node's internal state (adding / removing values, no_value()) are now done through those methods.
4. Because the iterator needed to be specialized too in order to handle the trie_node specialization, I decided to delegate erasure to the iterator itself. This was the only way to handle the structural difference in members between the specializations.
5. The iterator class now has another template parameter called Enable. I needed to specialize the iterator for both void and const void, and I used boost::enable_if to achieve this with less code. The const void specialization is needed because of the erase methods in trie: there are two erase methods, erase(iterator) and erase(const_iterator). The const_iterator corresponding to iterator<Key, void> is iterator<Key, const void>. If they were the same, the erase method from trie would be redefined, resulting in compile-time errors.
6. I have added more tests for trie_set. Now all the features are tested: find, remove, insert, count_prefix, copy, iterators.
7. A friend of mine has implemented a build system based on cmake.
I have described above all the changes I needed to make in order to implement the template specialization. I also think the node and iterator are now better abstracted.
Please give me some feedback about the changes that I have made [0]. [0] https://github.com/cosminBoaca/boost.trie Thank you, Cosmin
2015-03-07 20:10 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
I have thought of this task too. I will share with you some ideas in order to get some feedback about them.
1. Value = void specialization for trie_node which would remove pointers to value_list_head/tail, self_value_count.
+1
I have done this task. Also I have done refactoring and cleanup on the whole project.
Good work.
I have decided to change a couple of things:
1. I have removed all the methods from trie class that created / destroyed nodes. I have replaced them with calls to new and delete.
It's better to restore the allocator as soon as possible. Just replace new|delete with allocate/deallocate plus explicit calls to the constructor/destructor.
2. I have removed the node_allocator / value_allocator from trie class. I think something like a node factory that would handle only creation / destruction of nodes is more appropriate if we plan to allow custom allocators.
Yep, methods create_node/destroy_node are required
3. I have added some methods to the trie_node class in order to share a common interface between template specializations. Now operations that query the node internal state (adding / removing values / no_value()) is now done using those methods.
Ok
4. Because the iterator needed to be specialized too in order to handle the trie_node specialization, I decided to delegate erasure to the iterator itself. This was the only way to handle the structural difference in members between the specializations.
That's a bad idea: iterators must know nothing about erasure. Convert those __erase_self_value_node() methods into free functions in a detail namespace. Those functions will accept an allocator and will be overloaded on node pointers.
5. The iterator class now has another template parameter called Enable. I needed to specialize the iterator for both void and const void, and I used boost::enable_if to achieve this with less code. The const void specialization is needed because of the erase methods in trie: there are two erase methods, erase(iterator) and erase(const_iterator). The const_iterator corresponding to iterator<Key, void> is iterator<Key, const void>. If they were the same, the erase method from trie would be redefined, resulting in compile-time errors.
Ok
6. I have added more tests for trie_set. Now all the features are tested, find, remove, insert, count_prefix, copy, iterators.
+1
7. A friend of mine has implemented a build system based on cmake.
+1
Why is the trie_node noncopyable ?
To avoid erroneous attempts to copy a node. It's a structure with multiple pointers and iterators that is not meant for copying.
I have described above all the changes that I needed to do in order to implement the template specialization. Also, I think now the node and iterator are abstracted better. Please give me some feedback about the changes that I have made [0].
Good work. Now let's concentrate on the struct trie_node<Key, void>. Many optimizations may be applied there:
* key_ends_here is a boolean value that could actually be hidden in one of the pointers. In all cases our nodes are at least 2-byte aligned, so parent, pred_node, and next_node are all guaranteed to have their first bit set to 0, so we can hide that `key_ends_here` bit in a pointer just like this: pointer | (key_ends_here ? 1u : 0u). This will probably require adding set_parent(), parent(), <...> functions to all the trie_nodes.
* children_type is not good. We get problems (like having an additional child_iter_of_parent in the node) because the key is not stored in the node. So the plan is the following: make trie_node usable in an intrusive set, and use the nodes directly as children. Here are some steps: use boost::intrusive::set for children_type, derive trie_node from set_base_hook and make it intrusive-set compatible, remove child_iter_of_parent, and store the key in trie_node directly. This will significantly reduce memory usage by nodes and memory allocations (unlike an intrusive set, std::map allocates memory for its nodes).
Both tasks are hard to implement, so do not hesitate to ask questions! -- Best regards, Antony Polukhin
On 8 March 2015 at 11:27, Antony Polukhin <antoshkka@gmail.com> wrote:
2015-03-07 20:10 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Yep, methods create_node/destroy_node are required
I am thinking of creating a new class, something like node_factory, that would handle this. I think it's more the OOP way.
That's a bad idea: iterators must know nothing about erasure. Convert those __erase_self_value_node() methods into a free functions in detail namespace. Those functions will be accepting allocator and will be overloaded by node pointers.
Ok. I will change this one.
Many optimizations may be applied there:
* key_ends_here is a boolean value, that actually could be hidden in one of the pointers. In all cases our nodes are at least 2 bytes aligned, so all the parent, pred_node, next_node guaranteed to have first bit set to 0. so we can hide that `key_ends_here` bit in pointer just like this: pointer | (key_ends_here ? 1u : 0u). This will probably require addition of set_parent(). parent(), <...> functions to all the trie_nodes.
Can you give me some further explanations about this? Why are the nodes at least 2 bytes aligned, and why does that imply that the first bit of the pointers is set to 0? When you say the first bit, you mean the least significant one (index 0), don't you?
* children_type is not good. We get problems (like having additional child_iter_of_parent in node), because the key is not stored in node. So the plan is following: make trie_node to be usable in intrusive set, and use the nodes directly in children. Here are some steps: use boost::intrusive::set for children_type, derive trie_node from set_base_hook and make it intrusive-set compatible, remove child_iter_of_parent and store key in trie_node directly. This will significantly reduce memory usage by nodes and memory allocations (unlike intrusive set, std::map allocates memory for nodes).
I think I will start with this. The idea of intrusive containers seems pretty interesting to me; I had never heard of anything like it until now. Thank you, Cosmin
2015-03-08 12:45 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
On 8 March 2015 at 11:27, Antony Polukhin <antoshkka@gmail.com> wrote:
Many optimizations may be applied there:
* key_ends_here is a boolean value that actually could be hidden in one of the pointers. In all cases our nodes are at least 2 bytes aligned, so all the parent, pred_node, next_node pointers are guaranteed to have their first bit set to 0, so we can hide that `key_ends_here` bit in a pointer just like this: pointer | (key_ends_here ? 1u : 0u). This will probably require the addition of set_parent(), parent(), <...> functions to all the trie_nodes.
Can you give me some further explanations about this? Why are the nodes at least 2 bytes aligned, and why does this imply that the first bit of the pointers is set to 0? When you're saying the first bit you mean the least significant one (index 0), don't you?
All allocators must return storage that is aligned appropriately for objects of type value_type. We construct trie_nodes on storage returned from the allocator; in other words, we can count on trie_nodes being aligned. trie_node::node_ptrs point to trie_node. trie_node contains many pointers, so the structure will probably have pointer alignment or bigger (this can be statically asserted). According to the definition of alignment, 2-byte-aligned addresses have modulo 2 equal to 0 (in other words, they are even addresses). That means the least significant bit is zero.
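The scheme fits in a few lines. This is a sketch with a hypothetical node type and helper names loosely following the set_parent()/parent() suggestion, not the real Boost.Trie code:

```cpp
#include <cstdint>

// Hypothetical node used to illustrate the trick; any type whose
// alignment is at least 2 works (here the node* member guarantees it).
struct tag_node {
    tag_node* parent_and_flag;
};
static_assert(alignof(tag_node) >= 2, "low pointer bit must be free");

// Pack key_ends_here into the low bit: pointer | (key_ends_here ? 1u : 0u).
inline tag_node* make_tagged(tag_node* p, bool key_ends_here) {
    return reinterpret_cast<tag_node*>(
        reinterpret_cast<std::uintptr_t>(p) | (key_ends_here ? 1u : 0u));
}

// Mask the low bit off to recover the real parent pointer.
inline tag_node* parent_of(tag_node* tagged) {
    return reinterpret_cast<tag_node*>(
        reinterpret_cast<std::uintptr_t>(tagged) & ~std::uintptr_t(1));
}

// Read the flag back out of the low bit.
inline bool key_ends_here(tag_node* tagged) {
    return (reinterpret_cast<std::uintptr_t>(tagged) & 1u) != 0;
}
```

The round trip is lossless as long as the pointer really is at least 2-byte aligned, which the static_assert checks at compile time.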
* children_type is not good. We get problems (like having additional child_iter_of_parent in node), because the key is not stored in node. So the plan is following: make trie_node to be usable in intrusive set, and use the nodes directly in children. Here are some steps: use boost::intrusive::set for children_type, derive trie_node from set_base_hook and make it intrusive-set compatible, remove child_iter_of_parent and store key in trie_node directly. This will significantly reduce memory usage by nodes and memory allocations (unlike intrusive set, std::map allocates memory for nodes).
I think I will start with this. The idea of intrusive containers seems pretty interesting to me; I had never heard of anything like it until now.
Good. That's the most important one! -- Best regards, Antony Polukhin
Hello, I have replaced std::map with an intrusive set and I have run the benchmarks. The performance dropped quite a lot versus the std::map implementation; for instance, insertion time is about 1.5x worse. I have pushed the changes into a branch of the project [0]. I think one potential improvement would be to use node references instead of node_pointers wherever possible in the whole trie implementation. Looking forward to some feedback from you. [0] Thank you, Cosmin
Hello, I have forgotten to add the link to the branch [0] [0] https://github.com/cosminBoaca/boost.trie/tree/intrusive_set_node_container Thank you, Cosmin
2015-03-11 15:14 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I have replaced std::map with an intrusive set and I have run the benchmarks. The performance dropped quite a lot versus the std::map implementation; for instance, insertion time is about 1.5x worse. I have pushed the changes into a branch of the project [0].
No need to worry. We've just started with these intrusive things. Let's try to tune our implementation:
* we do not need constant time size(), so let's provide the `optimize_size<true>` option for sets
* turn the link mode into `link_mode<normal_link>` for sets, at least when measuring performance (turn back to safe mode for debug purposes after measuring)
* the sweetest thing: trie_node does not need the `pred_node` and `next_node` pointers any more. Just use `--children_type::s_iterator_to(*this)` and `++children_type::s_iterator_to(*this)`
I think one potential improvement would be to use node references instead of node_pointers wherever possible in the whole trie implementation
This will gain you nothing: at the assembly level there's no difference between a reference and a pointer. -- Best regards, Antony Polukhin
On 11 March 2015 at 21:30, Antony Polukhin <antoshkka@gmail.com> wrote:
2015-03-11 15:14 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
No need to worry. We've just started with these intrusive things. Let's try to tune our implementation:
* we do not need constant time size(), so let's provide the `optimize_size<true>` option for sets
* turn the link mode into `link_mode<normal_link>` for sets, at least when measuring performance (turn back to safe mode for debug purposes after measuring)
* the sweetest thing: trie_node does not need the `pred_node` and `next_node` pointers any more. Just use `--children_type::s_iterator_to(*this)` and `++children_type::s_iterator_to(*this)`
I didn't think of this. I will try to make the suggested changes. So, do you think I should merge the branch into master? Thank you, Cosmin
2015-03-11 22:58 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
I didn't think of this. I will try to make the suggested changes. So, do you think I should merge the branch into master?
To be honest, I'm very surprised to see std::map outperforming intrusive containers. Make a tag `before-intrusive` on the master branch so that I can investigate the issue some time later. After that, just merge the intrusive version into master. -- Best regards, Antony Polukhin
On 11-03-2015 21:27, Antony Polukhin wrote:
To be honest, I'm very surprised to see the std::map outperforming intrusive containers. Make a tag `before-intrusive` on master branch, so that I could investigate the issue some time later. After that - just merge the intrusive version to master.
Yes, that makes no sense. Something odd must be going on. -Thorsten
Hello, Iteration works faster indeed, but all the other operations perform worse. It is also true that the current implementation is not tuned for memory locality; it's basically the same implementation used with std::map, with only the container changed. However, the differences in performance are quite big. I have also performed some benchmarking myself on std::set vs intrusive set, using variables declared in contiguous memory zones: the intrusive set performs better when compiled with -O2 or -O3, but worse when compiled without any optimization flag. Also, for a small number of elements (which is the case in our trie too), std::set performs better every time. I have tested the insert / find / erase operations (those are the most common operations involved in the trie too). Cosmin
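For reference, this is the general shape of such a micro-benchmark (a sketch only; the real measurements were done with the benchmarks in the repository, and as noted below the numbers are only meaningful at -O2/-O3):

```cpp
#include <chrono>
#include <cstddef>
#include <set>

// Time a callable in milliseconds with a steady (monotonic) clock.
template <class F>
double time_ms(F f) {
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    f();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Insert n keys into std::set and report the elapsed time; the intrusive
// variant would be timed the same way for a side-by-side comparison.
inline double bench_std_set_insert(std::size_t n) {
    std::set<int> s;
    return time_ms([&s, n] {
        for (std::size_t i = 0; i < n; ++i)
            s.insert(static_cast<int>(i));
    });
}
```

Small n matters most here, since a trie node's children set typically holds only a handful of elements.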
Could you remind me why we need an order among a trie's children? 2015-03-12 20:12 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
On 12 March 2015 at 19:57, endight . <endight@gmail.com> wrote:
Could you remember me why we need an order in trie's children?
I think this is mainly because the original design of the trie included the lower_bound / upper_bound functions. It also allows iteration over prefixes in lexicographic order. P.S. When posting to the Boost Developers mailing list, please follow the policies at http://www.boost.org/community/policy.html#quoting. Note particularly not to top-post or over-quote. Cosmin
On 12/03/2015 at 18:12, Cosmin Boaca wrote:
We can't compare without optimizations: it disables inlining, and that changes the whole picture, as Intrusive has many policy layers. In any case, if Intrusive performs worse with optimizations, then there is a problem in Boost.Intrusive that we need to solve. Maybe we need to see what's happening with few elements (and define what "a few" means). Also, as mentioned before, use set_base_hook<optimize_size<true>, link_mode<normal_link> > as the hook and set<node_type, constant_time_size<false> > as children_type. Best, Ion
2015-03-12 20:12 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
O2 and O3 are the main cases; good performance at O1 and O0 is not essential. I've taken a look at the std::map implementation in GCC 4.8. The helper data in a map node consumes the same or a larger amount of memory than intrusive::set_base_hook. std::map holds the first node by value (this removes a memory allocation for maps of size 1, which is a common case in our performance tests); that's why we did not get the x2 speedup when we moved from std::map to intrusive. Let's apply Ion's recommendations: set_base_hook<optimize_size<true>, link_mode<normal_link> > as the hook, set<node_type, constant_time_size<false> > as children_type, and measure the speed at O2. -- Best regards, Antony Polukhin
On 12 March 2015 at 21:15, Antony Polukhin <antoshkka@gmail.com> wrote:
Even with O2 and O3, the std::map-based container still performs slightly better. I can give you the results of running the benchmark on my PC. Cosmin
Hello, I have added a new specialization for trie_node. Now trie_node is specialized for a single-value node (like trie_map), a no-value node (trie_set) and a multiple-value node (trie multimap). Please give me some feedback about it [0]. The removal of pred_node / next_node is a bit more complicated than I initially thought, and I think it will slow down the iteration process a bit. I have thought a little about the container used for the children. I think a container with less insertion / find overhead would be better: maintaining a sorted intrusive list / vector of pointers would perform better if the number of children per node is small (something like < 50, maybe < 100). We may tune this further anyway, maybe using some kind of policy for the container. In the current implementation we have no memory locality of the nodes; I think this may be a reason for the intrusive set underperforming. I was thinking of preallocating the whole node path in the trie during insertion. For instance, if the key is "abcde", 5 nodes may be allocated in advance. This approach, however, has the downside that the whole path must be deallocated at once, so even when nodes are erased from the trie their memory can't be freed individually. It also implies keeping track of the allocated chains in an additional data structure and deallocating all of the data in the destructor. This was the only way to improve memory locality I could think of, but personally I don't think it is a good idea. [0] https://github.com/cosminBoaca/boost.trie/commit/61ba81b033d3a582c9f290955b9... Thank you, Cosmin
2015-03-13 23:13 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I have added a new specialization for trie_node. Now trie_node is specialized for a single-value node (like trie_map), a no-value node (trie_set) and a multiple-value node (trie multimap). Please give me some feedback about it [0].
+1
The removal of pred_node / next_node is a bit more complicated than I initially thought, and I think it will slow down the iteration process a bit.
OK. Not a problem, we're not in a rush. By the way, think of making `parent` a private variable and adding (get|set)_parent() functions. We can later try to make the node even smaller by removing the `parent` variable and getting the parent node by calling intrusive::set::container_from_iterator(iterator);
I have thought a little about the container used for the children. I think a container with less insertion / find overhead would be better: maintaining a sorted intrusive list / vector of pointers would perform better if the number of children per node is small (something like < 50, maybe < 100). We may tune this further anyway, maybe using some kind of policy for the container.
Good idea, but this must be done some time later. Currently there are more essential optimizations for nodes.
In the current implementation we have no memory locality of the nodes; I think this may be a reason for the intrusive set underperforming. I was thinking of preallocating the whole node path in the trie during insertion. For instance, if the key is "abcde", 5 nodes may be allocated in advance. This approach, however, has the downside that the whole path must be deallocated at once, so even when nodes are erased from the trie their memory can't be freed individually. It also implies keeping track of the allocated chains in an additional data structure and deallocating all of the data in the destructor. This was the only way to improve memory locality I could think of, but personally I don't think it is a good idea.
I'm thinking of a slightly different approach: we can make a specific "leaf" node that contains the remaining part of the value. For example, if we have the two values "hello world" and "hell's bells" in the trie, then the trie would look like: node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf["'s bells"]); But this must be done only after the existing code is polished enough. One of the reasons for underperforming is the big size of the node: a node does not fit in a cache line. I hope that removing pred_node/next_node will help a lot. -- Best regards, Antony Polukhin
On 14 March 2015 at 10:18, Antony Polukhin <antoshkka@gmail.com> wrote:
I'm thinking of a slightly different approach: we can make a specific "leaf" node, that contains the remaining part of the value. For example, if we have two values in trie "hello word" and "hell's bells", then the trie would look like:
node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf['''s bells']);
But this must be done only after the existing code would be polished enough.
Related to this, I was thinking of changing the data structure to a radix tree [0]. It would be more compact. Your idea is something like a mix between a trie and a radix tree. In a radix tree the two strings would be represented like: node["hell"] -> (node["o world"] | node["'s bells"]). I think there are specific cases in which it performs worse than a trie, for instance when you add every string that can be formed over an alphabet, but that is a quite rare case; in the general case it's more compact and I think it performs better. [0] http://en.wikipedia.org/wiki/Radix_tree Cosmin
Hello, I have run the benchmarks on the original implementation. It works even slightly better than the implementation based on std::map. Its insertion time in the benchmarks is about 0.4 seconds better and I can't figure out why. That implementation updates 4 additional pointers, while the current implementation does only one copy to update the value, and its node size is smaller. I don't understand how the original implementation can be more cache friendly. Cosmin
2015-03-15 19:58 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Something really strange is happening. Give me the revision number of the original version and the revision numbers of the updated versions. I'll run measurements on different compilers, different systems and different optimization flags. -- Best regards, Antony Polukhin
Hello, The original version is at [0], the intrusive_set version at [1], and the std::map version with the node optimized for a single value at [2]. [0] https://github.com/BoostGSoC13/boost.trie [1] https://github.com/cosminBoaca/boost.trie [2] https://github.com/cosminBoaca/boost.trie/tree/std_map Cosmin
Hello, After a lot of wasted time, I have discovered that the only problem was that I had run the benchmarks with b2 instead of b2 release. Now the intrusive version works better. Cosmin
2015-03-15 23:29 GMT+04:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Great! One less mystery to solve :) -- Best regards, Antony Polukhin
2015-03-16 10:42 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
On Mar 16, 2015 8:59 AM, "Antony Polukhin" <antoshkka@gmail.com> wrote:
Great! One less mystery to solve :)
Yes :) Now I can focus on optimizations.
How is it going? Need any advice? -- Best regards, Antony Polukhin
Hello, I'm sorry, but I had an insane coursework load during the last two weeks. I had about 4 programming assignments to solve, based on system programming on both Linux and Windows, and I didn't have any time to work on Boost.Trie. However, I have now finished all my assignments, and I will work on the iterator tomorrow and maybe during the weekend. I hope to have it working by Sunday. Cosmin
I know this might be a little bit late in asking, because I've been insanely busy too. Say I know that I have a finite length of my tree - just 3 bytes - that I would like to be cached. Instead of resolving the tree dynamically, I would like there to be a contiguous array, but for each array entry, have some kind of special value: either a mapping that produces a value despite length, or a concrete length-limited pointer to the result type. [ <- 32 megabytes -> ] An index into the radix tree can be performed by multiplying the first byte by 2^16 (possibly minus one). Then a pointer to some dynamically managed tree would suit our needs. I calculate that a lookup, assuming that the tree is also shallow, shouldn't take more than just a handful of instructions. I'm all about speed in this scenario, and my use case requires that insertion consume as few instructions as feasibly possible. Other operations aren't as important. On Thu, Apr 2, 2015 at 7:09 PM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
I meant to ask if such a specialization had been added or was planned... On Fri, Apr 3, 2015 at 5:05 AM, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
On 3 April 2015 at 12:05, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
I know this might be a little bit late in asking, because I've been insanely busy too.
Say I know that I have a finite length of my tree-just 3 bytes-that I would like to be cached. Instead of resolving the tree dynamically, I would like there to be a contiguous array, but for each array entry, have some kind special value-either a mapping that produces a value despite length, or a concrete length limited pointer to result type.
What do you mean by "the length of my tree is 3 bytes"? If you mean something like "keys in the trie have at most 3 bytes", then I've never thought of such a specialization. I think such an optimization would be good only for a very small number of use cases. Also, can you describe the requirements for your data structure? Maybe there are other data structures that would suit your needs better, especially if the range of the keys is bounded. Cosmin
Not my tree, each entry. Sorry, I misspoke. Well, I know that the entries will be shallow, but sparse, and extremely numerous. I will never have an entry greater than the word size of the machine. On Fri, Apr 3, 2015 at 5:27 AM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
If your keys are in such a bounded range, maybe a van Emde Boas tree would be more suitable [0]. It has O(log log M) complexity for the insert operation, and if you have many entries it should work well and be quite space efficient. If it doesn't suit your needs, maybe I have misunderstood the problem. [0] http://en.wikipedia.org/wiki/Van_Emde_Boas_tree Cosmin
This is really good, I hadn't heard about this before. Is there a van Emde Boas tree implementation in C++ anywhere in Boost? On Fri, Apr 3, 2015 at 5:36 AM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
Hello Antony, I have some problems with the iterator. I have managed to implement the ++ operator, but I'm struggling with the -- operator. Actually, I don't know if I can implement it in constant memory, because when I am on a node it's hard to decide whether I should go to its left sibling or up to its parent. One idea would be to keep track of the nodes I have iterated over so far using a stack, but this would make the iterator quite heavy. Cosmin
Hello, The main problem is the following. In the implementation of the ++ operator I am doing something like:

iterator it = ++intrusive_set::s_iterator_to(*tnode);
if (it != tnode->parent->children.end()) { /* code */ }

In the implementation of the -- operator I need to do the reverse:

iterator it = --intrusive_set::s_iterator_to(*tnode);
if (it != ??????) { /* code */ }

I can't compare it with begin(), because the comparison should work for the begin node but fail for a node that is somehow before begin(), just as end() is past the last element. I don't know how to achieve this. I have tried casting the iterator to reverse_iterator but it didn't help. Cosmin
2015-04-08 22:49 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
How about:

iterator it = intrusive_set::s_iterator_to(*tnode);
if (it != tnode->parent->children.begin()) { --it; /* code */ }

-- Best regards, Antony Polukhin
It won't work. I need to do something like what I am doing in the ++ operator implementation. Basically, testing the iterator against end() means that I have finished the nodes at the current level and need to go up in the trie. If I test against begin() I will skip the first node on each level and go straight up. Cosmin
2015-04-08 23:06 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
I still can't see a problem here: operator++(): *increment* the iterator and *compare* with end(). operator--(): *compare* with begin(), and if it is not at begin(), then *decrement*; otherwise go to the parent. Let's take a look at the different cases: if we are at begin() + 1, we'll get begin() and the user will work with begin(); if we are at begin(), we detect that we are at begin() and just go to the parent. -- Best regards, Antony Polukhin
On 9 April 2015 at 09:51, Antony Polukhin <antoshkka@gmail.com> wrote:
Hello, You were right, it worked. I have removed the pred/next_node fields from the trie_node class [0]. What would you suggest doing next? [0] https://github.com/cosminBoaca/boost.trie Thank you, Cosmin
2015-04-10 0:22 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
I have removed the pred/next_node fields from the trie_node class [0]. What would you suggest doing next?
Good! I'd recommend cleaning up boost/trie/detail/trie_node.hpp: * `value_count` and `self_value_count` - are both members used in the code? * move `key`, `children` and `parent` to a separate base class; define `operator<` and the `comparator` only for that base class * make all accesses to the `parent` member go through functions (`set_parent()`, `get_parent()`). This will simplify future node-minimization work -- Best regards, Antony Polukhin
Hello, I'm sorry for my highly delayed answers, but my coursework load is keeping me busy all the time. Good!
I'd recommend cleaning up boost/trie/detail/trie_node.hpp: * `value_count` and `self_value_count` - are both members used in the code?
Yes. self_value_count is the count of values stored in the current node, and value_count is the number of values in the subtree rooted at the current node.
* move `key`, `children` and `parent` to a separate base class; define `operator<` and the `comparator` only for that base class * make all accesses to the `parent` member go through functions (`set_parent()`, `get_parent()`). This will simplify future node-minimization work
I have one question here. Will the intrusive set that keeps the children of a node be of type intrusive_set<node_base>? If so, I would have to cast the iterator's return value to trie_node everywhere, wouldn't I? Thank you, Cosmin
2015-04-30 22:42 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I'm sorry for my highly delayed answers, but my coursework load is keeping me busy all the time.
No need to worry. I'm also slow on answers :)
* move `key`, `children` and `parent` to a separate base class; define `operator<` and the `comparator` only for that base class * make all accesses to the `parent` member go through functions (`set_parent()`, `get_parent()`). This will simplify future node-minimization work
I have one question here. Will the intrusive set that keeps the children of a node be of type intrusive_set<node_base>? If so, I would have to cast the iterator's return value to trie_node everywhere, wouldn't I?
I've missed that, so it's probably better to leave it as is for now. There are big things to do: * create a `leaf` node that contains the remaining part of the value. For example, if we have the two values "hello world" and "hell's bells" in the trie, then the trie would look like: node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf["'s bells"]); In the worst case, when all the trie values have unique beginnings, this will make the trie's performance the same as std::map's performance. * make sure that the trie works with user-specified comparators. For example, make sure that it works correctly with std::greater<> and that the keys "abba" and "anna" are sorted in reverse order ("anna", "abba") -- Best regards, Antony Polukhin
Hello,
* create a `leaf` node that contains the remaining part of the value. For example, if we have the two values "hello world" and "hell's bells" in the trie, then the trie would look like: node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf["'s bells"]);
In the worst case, when all the trie values have unique beginnings, this will make the trie's performance the same as std::map's performance.
I will start implementing this one.
* make sure that the trie works with user-specified comparators. For example, make sure that it works correctly with std::greater<> and that the keys "abba" and "anna" are sorted in reverse order ("anna", "abba")
When I started working on the trie, I removed the Compare template parameter. Should I add it back? Thank you, Cosmin
Hello, Is it really necessary to create a new class called leaf? I was thinking of storing a vector<key_type> in each node instead of a single key_type element; this would handle both cases (one element / more elements). Thank you, Cosmin
2015-05-04 17:13 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Actually, this could be even better for a first version. But keep in mind that we'll need to replace the vector with something more compact some day. Also, use boost::container::small_vector<key, 1> instead of std::vector<key>: boost::container::small_vector<key, 1> does not perform a dynamic memory allocation when storing a single element. -- Best regards, Antony Polukhin
Hello, I will start implementing this idea. What do you think about the radix tree [0]? It's even more compact than the idea with the special node. [0] http://en.wikipedia.org/wiki/Radix_tree On 5 May 2015 at 23:02, Antony Polukhin <antoshkka@gmail.com> wrote:
2015-05-05 23:10 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca@gmail.com>:
Hello,
I will start implementing this idea. What do you think about the radix tree ? It's even more compact than the idea with the special node.
Good idea. But let's eat the elephant one byte at a time: it's better to deal with the leaf nodes first and make sure that everything still works :) -- Best regards, Antony Polukhin
On 5 Mar 2015 at 12:15, Kenneth Adam Miller wrote:
Cosmin: Are you open to the idea of having the trie act as a container and sort of adopt a superset of the operations that work on traditional STL containers?
Are you open to the idea of having the trie be parameterizable with concurrency strategies? With value strategies that can even be additional data structures?
Are you open to the idea of having one specialization that is highly optimized for integer keys, with separate implementations for x32/x64?
There is one of these already at https://github.com/ned14/nedtries. I wouldn't recommend using its STL container implementation, but its C macro / C++ template implementation is plenty sound and very mature. If the concurrent unordered map GSoC goes well, it will be followed by a concurrent map GSoC (extension) which is internally based on concurrent bitwise tries. These are vastly faster under concurrent modification than red-black trees, and still provide excellent concurrency given a good key hash function. None of these is a proper full-fat trie container implementation, but then in my own code I don't need that right now; I need concurrent maps, ordered and unordered. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
Well, my requirements are that I store range-offset encodings and be able to access them for any reason as fast as possible. So what I was thinking about doing was storing a range-value encoding as a tree, but it could be any kind of self-balancing tree. The thing with red-black trees is, I know how to make them wait-free by reading and implementing a whitepaper that UT Dallas published. Does your nedtree cover the ability to store range-value encodings? Because my thinking is, with a trie, you can get the first 3 bytes of an integer without giving up too much space (it requires only slightly less than 32 megabytes, contiguously stored). Each entry in those 32 megabytes can be a pointer to a red-black tree node, and when you do an insertion you revert back to the wait-free red-black tree implementation. So you can still do range-value encodings, but it will also be far, far faster than if you didn't have this caching to accelerate it. I estimate that it requires at most about 11 instructions in the absolute worst-case scenario to do an insertion, typically 7 instructions on the usual "cache miss", and amortized 2 instructions for contiguous memory insertions. It is very highly scalable, and can be managed flexibly with various concurrency runtime patterns. On Thu, Mar 5, 2015 at 1:14 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
On 5 Mar 2015 at 13:49, Kenneth Adam Miller wrote:
Well, my requirements are that I store range-offset encodings and be able to access them for any reason as fast as possible. So, what I was thinking about doing was storing a range-value encoding as a tree, but it could be any kind of self-balancing tree. The thing with red-black trees is, I know how to make them wait-free by reading and implementing a whitepaper that UT Dallas published.
The rebalancing generates a lot of cache line invalidation traffic, and ruins concurrent modify performance. Academic theory often misses the reality.
Does your nedtries cover the ability to store range-value encodings? Because my thinking is, with a trie, you can get the first 3 bytes of an integer without giving up too much space (it requires only slightly less than 32 megabytes, contiguously stored). Each entry in those 32 megabytes can be a pointer to a red-black tree node, wherein when you do an insertion, you revert back to the wait-free red-black tree implementation. So, you can still do range-value encodings, but it will also be far, far faster than if you didn't have this caching to accelerate it.
nedtries provides size_t key index algorithms. You can build the rest from that. Bitwise tries are hugely superior to red-black trees for close and exact find lookups, but inferior for closest-find lookups. You can see some graphs at http://www.nedprod.com/programs/portable/nedtries/.
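The bit-walking idea behind a bitwise trie can be illustrated with a deliberately naive sketch. This is not nedtries' layout or API (nedtries is an in-place C macro/C++ template implementation with path compression; see its documentation for the real thing), just a minimal standard-C++ binary trie keyed on size_t, with every key walked from the most significant bit down:

```cpp
#include <cstddef>
#include <memory>

// Naive bitwise trie keyed on size_t: each level of the tree consumes one
// bit of the key, most significant bit first.  Real implementations such as
// nedtries skip shared prefixes instead of allocating a node per bit.
class BitwiseTrie {
    struct Node {
        std::unique_ptr<Node> child[2];  // child[0] = bit clear, child[1] = bit set
        bool occupied = false;
        std::size_t value = 0;
    };
    Node root_;
public:
    void insert(std::size_t key, std::size_t value) {
        Node* n = &root_;
        for (int bit = int(sizeof(std::size_t)) * 8 - 1; bit >= 0; --bit) {
            std::size_t b = (key >> bit) & 1;
            if (!n->child[b]) n->child[b].reset(new Node);
            n = n->child[b].get();
        }
        n->occupied = true;
        n->value = value;
    }
    // Exact find only: returns nullptr when the key is absent.
    const std::size_t* find(std::size_t key) const {
        const Node* n = &root_;
        for (int bit = int(sizeof(std::size_t)) * 8 - 1; bit >= 0; --bit) {
            n = n->child[(key >> bit) & 1].get();
            if (!n) return nullptr;
        }
        return n->occupied ? &n->value : nullptr;
    }
};
```

Because lookup follows key bits rather than comparing whole keys, there is no rebalancing step at all, which is where the concurrent-modification advantage over red-black trees comes from.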
I estimate that it requires at most about 11 instructions in the absolute worst-case scenario to do an insertion, typically 7 instructions on the usual "cache miss", and an amortized 2 instructions for contiguous memory insertions. It is very highly scalable, and can be managed flexibly with various concurrency runtime patterns.
I think you may need to scale your estimates by about 10x. A cache line miss alone costs about 250 CPU cycles, and any kind of tree algorithm spends most of its time waiting on cache lines. This is why bitwise tries are much faster than red-black trees on out-of-order CPUs: they execute out of order much better. On an in-order core like Intel Atom bitwise tries are extremely slow, and they're not great even on a Cortex A15 ARM either. nedtries costs about 100 CPU cycles per insert/find/remove. That is exceptionally well balanced; most algorithms are much slower at, say, inserts than finds (e.g. hash tables). If you do finding, inserting and removing equally, and your item count is less than 10,000, bitwise tries are unbeatable. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
Before we go much further, please let me explain my use case and considerations. My context is the need for a more capable and highly performant shadow memory representation, as a reduced-memory alternative to taint analysis. The lower memory requirement and higher taint identity are achieved by the range-based encoding, which consumes less than the bitvector shadow and affords more room for identity storage. The scalability requirement is rather wild: we only ever need to be able to insert. We don't really need to look things up, only test for a single special key upon insertion, which is rather trivial, requiring only a single CPU test instruction. The way I see the next-generation dynamic binary instrumentation library & runtime operating is this: rather than use a bitmap index, which requires O(N) space and has fast update times but overall poor identity storage, we can use an index and some careful optimizations to reduce the number of "cache misses" by taking advantage of the context. These optimizations actually exploit statistical characteristics of typical benign code rather than anything to do with the actual data structures. By using a pre-allocated trie, we can store the first three bytes of any given address in the lookup, and retrieve/update/insert into that table in just a single instruction. That table fits in main memory easily, and its entries are keys into any range-offset encoding data structure. Between insertions, the algorithm just compares the first three bytes. If that test succeeds, only the variables that were last updated are incremented; i.e. the range-offset encoding is updated (one more instruction). That makes 2-3 instructions. Statistically speaking, the majority of memory actually affected is rather contiguous, both as source and destination operands.
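The two-level layout described above can be sketched roughly as follows. This is a hypothetical illustration only, with invented names (`ShadowMemory`, `record`, `recorded`); it assumes 32-bit target addresses, uses a flat table indexed by the high address bits, and substitutes a `std::map` per slot for the wait-free red-black tree Kenneth mentions. The cached last-prefix pointer models the "2-3 instructions for contiguous memory" fast path:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Two-level shadow memory sketch: the top PrefixBits of a 32-bit address
// select a slot in a pre-allocated table; each slot lazily owns a map from
// low-bit offset to range length (a stand-in for a per-region tree).
template <int PrefixBits = 24>
class ShadowMemory {
    static_assert(PrefixBits > 0 && PrefixBits < 32, "must split a 32-bit address");
    std::vector<std::map<uint32_t, uint32_t>*> table_;
    uint32_t last_prefix_ = 0xffffffffu;            // sentinel: no previous access
    std::map<uint32_t, uint32_t>* last_region_ = nullptr;

    static uint32_t offset_of(uint32_t addr) {
        return addr & ((uint32_t(1) << (32 - PrefixBits)) - 1);
    }
public:
    ShadowMemory() : table_(std::size_t(1) << PrefixBits, nullptr) {}
    ShadowMemory(const ShadowMemory&) = delete;
    ShadowMemory& operator=(const ShadowMemory&) = delete;
    ~ShadowMemory() { for (auto* p : table_) delete p; }

    void record(uint32_t addr, uint32_t len) {
        uint32_t prefix = addr >> (32 - PrefixBits);
        // Fast path: the previous access hit the same region, so the cached
        // region pointer is reused without touching the big table.
        if (prefix != last_prefix_) {
            if (!table_[prefix]) table_[prefix] = new std::map<uint32_t, uint32_t>;
            last_prefix_ = prefix;
            last_region_ = table_[prefix];
        }
        (*last_region_)[offset_of(addr)] = len;      // insert/update the range
    }
    bool recorded(uint32_t addr) const {
        auto* r = table_[addr >> (32 - PrefixBits)];
        return r && r->count(offset_of(addr)) != 0;
    }
};
```

With PrefixBits = 24 the table has 2^24 slots, matching the "first three bytes" lookup in the text; the per-slot memory cost depends on the pointer width, so the 32-megabyte figure is Kenneth's estimate, not something this sketch reproduces.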
On modern hardware, we may even get away with storing multiple recent lookup contexts in the CPU registers, there being quite a few available to work with, reducing penalties even further. I'm curious about using C++'s new relaxed memory capabilities with fences and whatnot, and possibly some ring buffers to maintain actual copies of subsegments of said trie, devoting a dedicated maintainer thread to merging those upon each cache flush from each of the respective worker threads. So, total core overhead is somewhere like *N+C additional*, where N is the number of cores that the target process requires, and C is the one (or however many) additional maintainer threads that merge between the workers and the coherent data structure. Hell, if it works out well, let each worker have its own trie. But overall, this is a data structure that is very sensitive to concurrent operation, because target threads may race with themselves; and not only that, but we do not want analysis code to race either with its "brethren" or with that of the application. So, really, wait-free red-black trees were the only thing I could think of that was the fastest (in a concurrent context), would also afford correctness, and yet would also be able to store range-value encodings. But anything that will operate in a concurrent context and satisfy the requirements will do. Now, back to everything you just said: @rebalancing generates a lot of cache line invalidation traffic Do you think a lazy rebalancing stratagem, possibly with a dedicated worker thread, could be applied? @nedtries provide a size_t key index algorithms. You can build the rest from that. I looked over the part where you linked me in further depth. The graphs are good, but I'm really disconcerted about the applicability of nedtries, because the metrics aren't what I'm interested in. An insertion has to be performed once per memory instruction.
I have read and heard that for most software, memory operations tend to occur statistically about 33% of the time. So, I have to focus on the number of instructions per insertion. Update is really just insertion; nothing really is deleted, just updated. @Bitwise tries are hugely superior to red black trees for close and exact find lookups "Bitwise" - I don't want to fall back to using a bitvector. Those have already been applied in academia, and are publicly available. That's where I think everyone falls short; you might get ultimate speed there, but with good concurrent design and hardware primitive usage you can vastly reduce *perceived latency*. Also, identity diversity is a moot topic with bitvector shadow memory. @I think you may need to scale your estimates... Well, *immediate* tree rebalancing shouldn't ever occur unless the worst case occurs, which is that two target application threads contend for memory on a near-same-time basis. Also, all trees operate on colocal data: regionally colocated within approximately 255 bytes, to be exact (on 32-bit hardware). In fact, most tree insertion operations can be made very highly lazy: do not really update any of the other nodes of the tree, just continuously insert, even if numerous conflicting addresses aren't immediately resolved. I think identity storage schemes can be proven associative and commutative, greatly facilitating both concurrency and overall correctness. In this way, rebalancing is a rather infrequent operation. On Thu, Mar 5, 2015 at 7:30 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
<...>
2015-03-05 19:21 GMT+03:00 endight . <endight@gmail.com>:
Is there place for one more contributor (me)? Should I somehow prove my skills?
Try to cooperate with Cosmin Boaca. He could give you some tasks. Share tasks and work on different tasks in your own git branches/repos. Merge finished tasks into a single repo. For example:
1. discuss the tasks with Cosmin and pick one (for example, you do the auto-testing/Travis CI)
2. solve the task in your own repo
3. make a pull request with your fixes, or just push the commits to the common repo
4. goto 1.
-- Best regards, Antony Polukhin
On 5 March 2015 at 18:21, endight . <endight@gmail.com> wrote:
Is there place for one more contributor (me)? Should I somehow prove my skills?
We may cooperate. You can try to do the Travis task Antony has told you about and then just make a pull request. I think it's better to work with pull requests rather than just pushing into the same repository.
Where is the repo, so that I can start working on it? On Thu, Mar 5, 2015 at 12:51 PM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
<...>
On 5 March 2015 at 19:53, Kenneth Adam Miller <kennethadammiller@gmail.com> wrote:
Where is the repo, so that I can start working on it?
The repo is here [0]. Which of the features you mentioned earlier are you going to start working on first? [0] https://github.com/cosminBoaca/boost.trie
I'm in discussion with someone else; let me explore the code some and I will get back to you. In all likelihood it will be the integer specialization. And by integer, I mean storage in a cache, so that you can use this as a sort of low-space-requirement key store and tree prefix. So Trie trie; trie[5 % 7]; would allow, in just a few instructions, a band to be stored, from 5 to 7. There are other uses, but I think you get the idea. It can be applied to R-trees and some other stuff. Anyway, will let you know :) On Thu, Mar 5, 2015 at 12:57 PM, Cosmin Boaca <boost.cosmin.boaca@gmail.com> wrote:
<...>
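Kenneth's "band" idea above is only gestured at by the `trie[5 % 7]` snippet. One hedged reading of the interface he has in mind, with invented names (`BandStore`, `insert_band`, `contains`) and a `std::map` standing in for the trie-backed index, might be:

```cpp
#include <cstdint>
#include <map>

// Hypothetical interval ("band") store: insert_band(5, 7) records the
// inclusive range [5, 7]; contains() answers membership queries.  Nothing
// like this exists in Boost.Trie today -- it is an illustration of the
// requested feature, not its implementation.
class BandStore {
    std::map<uint64_t, uint64_t> bands_;  // band start -> band end (inclusive)
public:
    void insert_band(uint64_t lo, uint64_t hi) { bands_[lo] = hi; }

    bool contains(uint64_t key) const {
        auto it = bands_.upper_bound(key);   // first band starting strictly after key
        if (it == bands_.begin()) return false;
        --it;                                // band starting at or before key
        return key <= it->second;
    }
};
```

A trie-based index would replace the `std::map` to get the cache-friendly integer-key lookups discussed earlier in the thread; the query logic (find the band starting at or before the key, then check its upper bound) stays the same.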
participants (8)
- Antony Polukhin
- Cosmin Boaca
- Edward Diener
- endight .
- Ion Gaztañaga
- Kenneth Adam Miller
- Niall Douglas
- Thorsten Ottosen