[container] New container: why is everybody silent?
Hi boost community! Please tell me something about my container, don't be silent. I firmly intend to offer it to boost, because it's much faster than competitors (by a constant factor). It's intended to replace std::vector in cases when an application requires many insertions/deletions. It takes O(log(N)) per insertion/deletion.
Docs: http://alexkupri.github.com/array/
Code: https://github.com/alexkupri/array
Best regards, Aleksandr Kupriianov
I think that most Boost library devs are currently focused on fixing the issues with the incoming release, which is a bit difficult to get out because it's the first time there is a release with the git-ified boost sources. If you don't get any feedback right now, try again after the release is done?
I am interested in using the container on a microcontroller,
for which performance is crucial. I intend to use your container
in combination with a custom allocator.
I think folks are still working very hard on the release.
I am also very busy with normal work and the GSoC 2014
program. So could you please grant us maybe a bit
more time?
Cheers
Chris
On Monday, July 14, 2014 8:28 PM, Alexander Kuprijanov
Christopher Kormanyos
Hi Chris!
Thank you for your answer. If you have any questions regarding the container,
feel free to ask me.
By the way, if you are going to use it on a microcontroller, different template parameters L, M might be better (smaller, I think), though the defaults are good enough, I believe. I mean: btree_seq
Hi Alex,
I can provide you with my benchmark results below. Please do
not be discouraged by the potentially disappointing report, as
I suspect my benchmark might be stressing the node-linked
list in the wrong areas --- and with the wrong competition
(std::vector).
I tested the container with a floating-point benchmark
that also performs numerous sequential operations on
iterators ranging from [begin, end).
I used a desktop PC with Microsoft Visual Studio 2013
and an ARM(R) A8 microcontroller with GCC 4.8.2.
I used the container with a custom allocator having
a memory pool of 512 bytes.
Here is my report, with my interpretation of it below:
---------------------------------------------------------------
I found many warnings in both GCC and VC:
- redundant semicolons (empty statements)
- local variables shadowing class members
- non-explicit conversion to/from signed/unsigned
- signed/unsigned comparison
- unary minus applied to unsigned type
- unreferenced local variables
I stripped the implementation:
* I removed all exception handling and asserts.
I identified the container uses in the benchmark:
* Create two containers.
* One container has 4 floats, the other has 5 floats.
Do the following in a loop about 10 times:
* On each container apply std::accumulate with std::multiplies.
* On each container apply std::for_each with element incrementation.
* Do the normal floating-point operations and loop management.
I use a custom allocator with 512 byte pool:
* typedef util::ring_allocator<T> allocator_type;
I use either btree_seq or std::vector:
* typedef btree_seq container_type;
* typedef std::vector container_type;
I perform the numerical correctness analysis:
* Run on PC with VS2012. Result OK.
* Run on ARM A8 with GCC. Result OK.
I perform the size analysis on the ARM(R):
* using neither : 7344 byte
* using std::vector : 8928 byte, delta = 1584
* using btree_seq : 15968 byte, delta = 8624
I perform the runtime analysis on the ARM(R):
* using std::vector : 4.4 microseconds
* using btree_seq : 5.3 microseconds
---------------------------------------------------------------
Interpretation:
The btree_seq obtains the correct result and can be used on a PC as well as a microcontroller with a custom allocator. This is very positive!
The btree_seq creates much more code than the corresponding use of std::vector. In addition, btree_seq runs significantly slower than std::vector in this particular benchmark.
The code could be potentially cleaned up significantly. For this, build at very high warning levels and eliminate all warnings.
As mentioned earlier, some C++11 awareness might be mandatory: awareness of move construction, nullptr where appropriate, and construction/assignment with std::initializer_list. Boost has C++11 compiler switches for these and more C++11 features.
If you would like to see the code of the benchmark, it is available in the public domain.
I encourage you to keep going.
Cheers Chris
Hi, Chris!
1) Thank you for spending some time to test my container.
2) Performance. Yes, for iterative and random access the vector is slightly better (but only slightly; I'm also encouraged by the 5.3 : 4.4 score). Vector is the champion for these operations, and probably nobody can outperform it. But as soon as insertions/deletions are necessary, my container can outperform it.
3) Real app. Are you writing a real app for a microcontroller now? If you use insertions/deletions, especially random ones, my container can significantly outperform. Even if you just use push_back, my thing can slightly outperform. However, if you use iterative access, vector wins, but not by much. I suggest you try my thing in the real app.
4) Benchmark - can you provide a link?
5) C++11 - fully accepted. I'll deal with it soon.
6) Warnings - fully accepted.
Best regards, and thank you again, Aleksandr
2) Performance. Yes, for iterative and random access the vector is slightly better (but only slightly; I'm also encouraged by the 5.3 : 4.4 score). Vector is the champion for these operations, and probably nobody can outperform it. But as soon as insertions/deletions are necessary, my container can outperform it.
I agree. Then it is important to provide the users with information regarding where btree_seq is most efficient, and where it is less efficient. Some library developers might wonder how btree_seq is different from std::list. What missing niche in the STL does btree_seq intend to fill?
3) Real app. Are you writing the real app for microcontroller now? If you use insertions/deletions, especially random ones, my container can significantly outperform.
Yes. I can work out a test case.
Even if you just use push_back, my thing can slightly outperform. However, if you use iterative access, vector wins, but not by much. I suggest you try my thing in the real app.
Yes. That's a good idea.
4) Benchmark - can you provide a link?
Yes.
5) C++11 - fully accepted. I'll deal with it soon.
OK. Give me a few days for cleanup. It's all at Github. Links will follow.
6) Warnings - fully accepted.
Some very high GCC warning settings are shown below.
-Wall \
-Wextra \
-pedantic \
-Wmain \
-Wundef \
-Wsign-conversion \
-Wunused-parameter \
-Wuninitialized \
-Wshadow \
-Wunreachable-code \
-Wswitch-default \
-Wswitch-enum \
-Wcast-align \
-Wmissing-include-dirs \
-Winit-self \
-Wfloat-equal \
-Wdouble-promotion
If you have -std=c++11, then you can also use:
-Wzero-as-null-pointer-constant
I also tested btree_seq on an 8-bit Atmel(R) AVR(R)
microcontroller today. It was about 500 times slower than
the ARM(R) core, but got the right results. So now
btree_seq has seen some action on 8-bit, 32-bit, and 64-bit
systems.
-----------------------------------------------------------------
8-bit Atmel(R) AVR(R) microcontroller
Size analysis:
* using neither : 2240 byte
* using std::vector : 4448 byte, delta = 2208
* using btree_seq : 13504 byte, delta = 11264
Runtime analysis:
* using std::vector : 1500 microseconds
* using btree_seq : 2200 microseconds
-----------------------------------------------------------------
Cheers
Chris
On Wednesday, July 16, 2014 8:42 AM, Aleksandr Kupriianov
Hi, Chris!
Christopher Kormanyos
I agree. Then it is important to provide the users with information regarding where btree_seq is most efficient, and where it is less efficient.
Thank you so much for testing my thing. External testing is psychologically very important, because my own tests may be too gentle. Random insertions and deletions are where it's the champion.
Some library developers might wonder how btree_seq is different from std::list. What missing niche in the STL does btree_seq intend to fill?
In short, my thing is between list and vector. It sells everything for an O(log(N)) price. In a list you can insert something in the middle in constant time, but only if you already have an iterator; getting to the middle takes O(N) time. If you insert in the middle of a vector, it requires O(N) anyway.
Another niche is possibly porting string onto its basis, i.e., something like this:
typedef string_abstraction
8-bit Atmel(R) AVR(R) microcontroller
Size analysis: * using neither : 2240 byte * using std::vector : 4448 byte, delta = 2208 * using btree_seq : 13504 byte, delta = 11264
I'm not sure I understand your words "using neither"; that's why I asked for a link. Yes, the size for the 8-bit Atmel is a little unexpected; normally it takes roughly 2 times more memory than vector. Thank you 1e10 times for testing. Best regards, Aleksandr
On 17 July 2014 06:44, Aleksandr Kupriianov
Some library developers might wonder how btree_seq is different from std::list. What missing niche in the STL does btree_seq intend to fill?
In short, my thing is between list and vector.
Instead of yet another container, have you thought about adding another kind of index to Boost.MultiIndex? The big benefit of that is that it makes it very easy to compare different types of indices with real-world data. I've done this kind of thing with sequenced indices vs. random access indices.
Another niche is possible porting string on its basis. I.e., something like this:
typedef string_abstraction string;
typedef string_abstraction scalable_string;
Unlikely to be that useful, as the iterator invalidation and contiguity guarantees are vastly different in the two cases.
-- Nevin ":-)" Liber mailto:nevin@eviloverlord.com (847) 691-1404
8-bit Atmel(R) AVR(R) microcontroller
Size analysis: * using neither : 2240 byte * using std::vector : 4448 byte, delta = 2208 * using btree_seq : 13504 byte, delta = 11264
I'm not sure I understand your words "using neither", that's why I asked for a link. Yes, size for 8-bit Atmel is a little bit unexpected; normally it takes roughly 2 times more memory than vector.
The sizes refer to the actual size of constant program
code. This is something that most PC programmers
rarely consider in normal development.
When I write "neither", it means the size of the microcontroller
program code without the benchmark part active. In other
words, neither vector nor btree_seq is being used,
because none of the benchmark code is active.
The benchmark is linked below. Actually it is more a benchmark
of floating-point portability than of containers.
But it is well-tested code, and I have it available. So I decided
to use it for a preliminary look at btree_seq.
The benchmark is called within ugly compiler switches
when these lines are un-commented.
//#define CFG_USE_APP_BENCHMARK_FPU
// #define CFG_APP_BENCHMARK_FPU_TYPE CFG_APP_BENCHMARK_FPU_TYPE_HYPERG
https://github.com/ckormanyos/real-time-cpp/blob/master/ref_app/src/app/benc...
The code for the mathematical hypergeometric function in
the actual benchmark run is here.
https://github.com/ckormanyos/real-time-cpp/blob/master/ref_app/src/app/benc...
The function hypergeometric_pfq is being called.
I can (and maybe should) work out some simpler test
cases that more clearly expose the functionality of
btree_seq.
Cheers, Chris
On Thursday, July 17, 2014 1:45 PM, Aleksandr Kupriianov
On 14/07/2014 20:28, Alexander Kuprijanov wrote:
Hi boost community!
Please tell me something about my container, don't be silent. I firmly intend to offer it to boost, because it's much faster than competitors (by a constant factor).
Sorry, I have your previous post on my to-do list but no time to even read a bit about it.
To make it part of Boost.Container, it should support portable non-raw pointers, move semantics, C++11 allocators and other features supported by Boost.Container.
A btree-based data structure seems useful. However, in your links I see little documentation on how the data structure works, how you mapped an associative container into a sequence container, the implementation of insert, erase, operator[]... That would help a lot to receive more feedback.
Best, Ion
Ion Gaztañaga
To make it part of Boost.Container, it should support portable non-raw pointers, move semantics, C++11 allocators and other features supported by Boost.Container.
A btree based data structure seems useful. However, in your links, I see little documentation on how the data structure works, how you mapped an associative container into a sequence container, the implementation of insert,erase,operator[]... That would help a lot to receive more feedback.
Hi Ion! Thank you for the feedback. I'll take your advice and add the description first, and the C++11 features after that. By the way, it's not exactly "mapping a sequence onto an associative container"; it's an implementation of a sequence using a tree-like structure. Best regards, Aleksandr
Ion Gaztañaga
A btree based data structure seems useful. However, in your links, I see little documentation on how the data structure works, how you mapped an associative container into a sequence container, the implementation of insert,erase,operator[]... That would help a lot to receive more feedback.
Hi Ion! The documentation added: http://alexkupri.github.io/array/ C++11 features coming soon. Best regards, Aleksandr
Ion Gaztañaga
Sorry, I have your previous post on my to-do list but no time to even read a bit about it.
To make it part of Boost.Container, it should support portable non-raw pointers, move semantics, C++11 allocators and other features supported by Boost.Container.
Hi Ion! All requirements fulfilled: documentation, license, C++11, warnings suppressed. http://alexkupri.github.io/array/ Please add it to the review schedule. Best regards, Aleksandr
On Mon, Aug 11, 2014 at 8:52 AM, Aleksandr Kupriianov
All requirements fulfilled: documentation, license, c++11, warnings suppressed. http://alexkupri.github.io/array/
It does not yet support the C++11 allocator model. Some changes required:
1. Do not call .construct or .destroy directly, but only via std::allocator_traits. Note: If you want to support both C++11 and C++03 compilers, then do this when BOOST_NO_CXX11_ALLOCATOR is _not_ defined; when it is defined, use placement new for construction. Note: Do this only for construction of value_type objects; for all other construction use placement new.
2. Do not use the ::pointer member types of the allocator type directly. Use std::allocator_traits to obtain these types.
3. A C++11 allocator's 'pointer' type may not be a raw pointer, so do not make that assumption. p = a.allocate(n); - p should be the allocator's ::pointer type. If you need a raw pointer from it, std::addressof(*p) will get you that address - but preserve the value of p for as long as you keep that address around, because p is what you will give back to .deallocate().
Note: Both std::addressof and std::allocator_traits mentioned have Boost equivalents, boost::addressof and boost::allocator_traits, if you want to avoid the BOOST_NO_CXX11_* checks and conditional code for each.
Glen
Just curious if this container might be better suited to sorting (and re-sorting) than a std::vector. And, I guess, for containers of large structs that are relatively expensive to copy or swap by value? - Nigel
I'm pretty excited by this container; I'll start testing it as soon as I find an excuse for it. Did you create it as part of a thesis? If so, it would be nice to see the thesis too. If not, it would be nice to see some discussion about which/whose B-Tree you have implemented and a bibliography. I'm not intimately familiar with this literature but a question that immediately comes to mind is whether your implementation is cache-oblivious and why. My limited understanding is that a CO structure would be more general and thus work on supercomputers and microcontrollers equally well without the programmer needing to tweak the L and M parameters. I still need to read the relevant paper [1] and your implementation in detail. Cheers. Jeremy [1] http://epubs.siam.org/doi/abs/10.1137/S0097539701389956 http://www.cc.gatech.edu/~bader/COURSES/UNM/ece637-Fall2003/papers/BDF00.pdf
Jeremy Murphy
Did you create it as part of a thesis? If so, it would be nice to see the thesis too. If not, it would be nice to see some discussion about which/whose B-Tree you have implemented and a bibliography. I'm not intimately familiar with this literature but a question that immediately comes to mind is whether your implementation is cache-oblivious and why.
1) It's not part of a thesis. 2) Well, it is cache-efficient, not cache-oblivious. (In other words, you should choose L and M so that they fit the desired data structure and the size of a cache line. If it were cache-oblivious, it would work for all cache-line sizes.) A cache-oblivious implementation (like in your reference) would probably require a custom allocator and might trigger problems with fragmentation. On the other hand, many processors have small cache lines (for example, i7 has 64 bytes for L1, L2 and L3), so I don't think a cache-oblivious implementation is necessary. 3) I just combined the concept of a sequence container with a b-tree for access. I did not use any existing concepts or code; I did it from scratch. There are many implementations of similar concepts, for example, __gnu_cxx::rope, which combines a binary tree and a sequence in a similar way. There is a Wikipedia article on the rope. There is a comparison of my container with rope in my main.cpp, although it is switched off.
On 18 August 2014 19:24, Aleksandr Kupriianov
2) Well, it is cache-efficient, not cache-oblivious. (In other words, you should choose L and M so, that they fit well for a desired data structure and size of cache line. If it was cache-oblivious, it would work for all sizes of cache line.) Cache-oblivious implementation (like in your reference) would probably require custom allocator and might trigger problems with fragmentation. On the other hand, many processors have small size of cache line (for example, i7 has 64 bytes for L1, L2 and L3), so I don't think that cache-oblivious implementation is necessary.
OK. I might have missed it, but I can't see in the docs an explanation of how to choose L and M for a given data structure and cache line size. Is there a formula? Knowing the data structure size sounds reasonable, but how many people know the size of their cache line? And do I understand correctly that since it is specified as a compile-time constant, the value in the source code needs to be changed, or at least checked, when compiled on a given machine for the first time? I guess I'm concerned that these L and M values limit the ease of portability, by which I mean the additional effort required to consider the performance effects of the parameters on one machine or another is very different from the ease of use of std::vector.
3) I just combined the concept of a sequence container with a b-tree for access. I did not use any existing concepts or code; I did it from scratch. There are many implementations of similar concepts, for example, __gnu_cxx::rope. rope combines a binary tree and a sequence in a similar way. There is a Wikipedia article on the rope. There is a comparison of my container with rope in my main.cpp, although it is switched off.
Well, the point of the bibliography is primarily for anyone looking at your data structure to understand why it is designed that way. A specific reference to a textbook or paper allows us to see that and also potentially see what you missed. If your implementation is based on a Wikipedia article then cite that. Maybe I'm banging the science drum too hard, but I personally find that kind of explicit trail of theory helpful. Jeremy
Jeremy Murphy
OK. I might have missed it, but I can't see in the docs an explanation of how to choose L and M for a given data structure and cache line size. Is there a formula? ... I guess I'm concerned that these L and M values limit the ease of portability, by which I mean the additional effort required to consider the performance effects of the parameters on one machine or another is very different to the ease of use of std::vector.
Jeremy, I proposed good default values (L=30, M=60), which can however be changed. The default values are good for most PCs. I've performed experiments on my laptop and found that these values form a plateau: if you change L or M by a factor of two, the result changes by no more than 7 to 10%. However, if you decrease them strongly (L=4, M=4), you can lose 50% of performance; if you increase them strongly (M=500), insertion/deletion will be performed slowly. ___To be brief, you should use the default values.___ (Compare this with vector behavior: you use the default growth strategy, but you can change it if you want.)
If you want the best performance, you should take into account not only the machine, but also the operations your application performs, the size of your structure, the memory manager, and the typical size of the container. If the application likes to read/access, a large M is preferable. If it likes to insert/delete, a small M is preferable. The default L is good enough unless you want to split/merge containers or perform bulk insertions/deletions (in that case L should be slightly increased). A large L is slightly better for big containers. Not to mention memory allocation: if btree_seq contains 1 element, it allocates a whole leaf of M+ elements. So mobile apps can prefer smaller L and M for memory saving. ___To be brief, if you want 100% performance, you should tune L and M during the profiling stage, when the whole application is complete.___
Well, the point of the bibliography is primarily for anyone looking at your data structure to understand why it is designed that way. A specific reference to a textbook or paper allows us to see that and also potentially see what you missed. If your implementation is based on a Wikipedia article then cite that. Maybe I'm banging the science drum too hard, but I personally find that kind of explicit trail of theory helpful.
I mentioned the Wikipedia article about rope just because it helps the common understanding. It shows how to keep a sequence in a tree-like structure. The idea is quite simple: keep the number of elements in each subtree next to the pointer to that subtree. If you insert (delete) an element, you modify log(N) subtrees, so you update log(N) numbers. If you want to access an element with a given index, you use a search algorithm very similar to the tree-search algorithm, but driven by these counts instead of keys. Also, I mention rope because it is very respectable and is included in gcc under the name __gnu_cxx::rope. However, the rest is my optimizations. Firstly, I use a btree instead of a binary tree, which makes the data structure cache-efficient. Secondly, I use some algorithmic tricks which improve performance (by a constant factor). The best example is the recursion-free, one-pass update algorithm, which I described on my title page. To be brief, I used the already-known concept of keeping a sequence in a tree-like structure, and then effectively applied some known optimizations to it. Best regards, Aleksandr Kupriianov
Nigel Stewart
Just curious if this container might be better suited to sorting (and re-sorting) than a std::vector.
btree_seq is better than vector in insert/delete-intensive applications. Since sorting does not require insert/delete, btree_seq will be comparable to (but slower than) vector for sorting.
And, I guess, for containers of large structs that are relatively expensive to copy or swap by value?
I expect that after including btree_seq into boost, ptr_btree_seq will be built upon the btree_seq, like ptr_vector upon vector. Leaves of ptr_btree_seq will physically contain only pointers to the large objects, but the container will have the same interface as btree_seq. (This should not take much effort, since ptr_* implementation already exists for vector.)
On Tue, Jul 15, 2014 at 4:28 AM, Alexander Kuprijanov
Hi boost community!
Please tell me something about my container, don't be silent. I firmly intend to offer it to boost, because it's much faster than competitors (by a constant factor).
It's intended to replace std::vector in cases, when an application requires multiple insertion/deletion. It takes O(log(N)) for insertion/deletion.
Docs: http://alexkupri.github.com/array/ Code: https://github.com/alexkupri/array
Hi Alexander,
I have already made a few comments. I'd like to say a bit more about one of the previous comments.
Normally, the Boost community does NOT review projects published without the Boost license. Are you going to add the Boost license to your project?
Regards, Vadim Stadnik
Vadim Stadnik
Hi, Vadim! Thank you for your kind words. 1) License - accepted. 2) Documentation - accepted (coming soon). 3) Examples - accepted (after adding c++11 features). Best regards, Aleksandr
participants (10)
- Aleksandr Kupriianov
- Alexander Kuprijanov
- Christopher Kormanyos
- Glen Fernandes
- Ion Gaztañaga
- Jeremy Murphy
- Klaim - Joël Lamotte
- Nevin Liber
- Nigel Stewart
- Vadim Stadnik