[btree] Status report

The slides for my BoostCon "Proposed Boost B-tree Library" presentation are available: http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...

Please be aware that while the library is progressing nicely, it isn't ready for production use:

* There are outstanding bugs, particularly with variable length data.
* Names and signatures may still change.
* Documentation is barely started.
* The build is failing on Mac OS X. That probably won't get fixed until after BoostCon. (Reported by Marshall Clow)

That said, the response to the presentation was very positive and I'm actively continuing development. Anyone interested in the library is encouraged to take a look and give it a try. See http://github.com/Beman/Boost-Btree/blob/master/README for install instructions.

Feedback welcome!

Thanks,

--Beman

On 05/18/2011 08:54 PM, Beman Dawes wrote:
Feedback welcome!
Thanks,
--Beman
Hello Beman,

Looks interesting. I'm not able to find the bootstrap.sh file mentioned in the README; it seems to be missing from the repository.

I have four remarks on the current state:

1) I think having the in-memory representation be identical to the on-disk representation is a very serious limitation. Wrapping around it by overloading the output stream operators doesn't seem efficient. I'm not advocating Boost.Serialization; rather something like Asio's buffer-handling mechanism (which, in turn, can handle Boost.Serialization).

2) Using a "path" argument to open a file presumes a real underlying file. Although this is fine, maybe I would like to use an fstream?

3) It would be nice if it worked with asynchronous file-like services, e.g., Asio's RandomAccessHandleService; see http://www.boost.org/doc/libs/1_46_0/doc/html/boost_asio/reference/RandomAcc...

4) Have you looked at http://goo.gl/r8Z2m and/or http://goo.gl/lvc9A ?

Thanks,

Cheers, Rutger

On Thu, May 19, 2011 at 3:24 AM, Rutger ter Borg <rutger@terborg.net> wrote:
On 05/18/2011 08:54 PM, Beman Dawes wrote:
Feedback welcome!
Thanks,
--Beman
Hello Beman,
Looks interesting. I'm not able to find the bootstrap.sh file mentioned in the README; it seems to be missing from the repository.
It comes from the Boost repo when the install instructions are followed.
I have four remarks on the current state
1) I think having the in-memory representation be identical to the on-disk representation is a very serious limitation.
Yes, agreed. But the assumption is that it is very, very fast compared to an implementation that must translate between different external and internal representations. The limitations can be overcome by using the B-tree within the implementation of some higher level abstraction.
Wrapping around it by overloading the output stream operators doesn't seem efficient.
That's also my assumption.
I'm not advocating Boost.Serialization; rather something like Asio's buffer-handling mechanism (which, in turn, can handle Boost.Serialization).
I'm not familiar with that mechanism. Chris is here at BoostCon so I'll ask him.
2) Using a "path" argument to open a file presumes a real underlying file. Although this is fine, maybe I would like to use an fstream?
What would a real-world use case be?
3) It would be nice if it worked with asynchronous file-like services, e.g., Asio's RandomAccessHandleService; see http://www.boost.org/doc/libs/1_46_0/doc/html/boost_asio/reference/RandomAcc...
I'd need stronger motivation than "it would be nice"! What would the benefit be?
4) Have you looked at http://goo.gl/r8Z2m
Yes, quickly. It appears the proposed Boost B-tree's caching scheme and pack optimization provide similar benefits.

and/or http://goo.gl/lvc9A ?

I find that one more interesting and will study it more carefully. But the proposed containers have highly tunable performance, and were designed with many of the same concerns as that paper in mind WRT disk access versus cache lines.

Thanks,

--Beman

On 2011-05-19 20:07, Beman Dawes wrote:
Yes, agreed. But the assumption is that it is very, very fast compared to an implementation that must translate between different external and internal representations.
I always assumed that file access is orders of magnitude slower than DMA, especially when the file isn't local. Another case might be convincing here: keys that don't require translation, mapped values that do.
2) Using a "path" argument to open a file presumes a real underlying file. Although this is fine, maybe I would like to use an fstream?
What would a real-world use case be?
See below.
3) It would be nice if it worked with asynchronous file-like services, e.g., Asio's RandomAccessHandleService; see http://www.boost.org/doc/libs/1_46_0/doc/html/boost_asio/reference/RandomAcc...
I'd need stronger motivation than "it would be nice"! What would the benefit be?
There's an increased availability of network-based storage mechanisms, e.g., Amazon's S3. The combination of such a storage backend with a B-tree container delivers a NoSQL stack, which I think is a considerable benefit. To name a concrete example: using a distributed object store like Ceph's underlying RADOS (see http://ceph.newdream.net/wiki/Librados ) as a storage backend. Operations at this level don't require "expensive" file-system semantics.

Cheers, Rutger

On Fri, May 20, 2011 at 4:46 AM, Rutger ter Borg <rutger@terborg.net> wrote:
On 2011-05-19 20:07, Beman Dawes wrote:
Yes, agreed. But the assumption is that it is very, very fast compared to an implementation that must translate between different external and internal representations.
I always assumed that file access is orders of magnitude slower than DMA, especially when the file isn't local. Another case might be convincing here: keys that don't require translation, mapped values that do.
2) Using a "path" argument to open a file presumes a real underlying file. Although this is fine, maybe I would like to use an fstream?
What would a real-world use case be?
See below.
3) It would be nice if it worked with asynchronous file-like services, e.g., Asio's RandomAccessHandleService; see
http://www.boost.org/doc/libs/1_46_0/doc/html/boost_asio/reference/RandomAcc...
I'd need stronger motivation than "it would be nice"! What would the benefit be?
There's an increased availability of network-based storage mechanisms, e.g., Amazon's S3. The combination of such a storage backend with a B-tree container delivers a NoSQL stack, which I think is a considerable benefit.
Yes, of course. And such a stack might want to provide async operational functions. But they would be provided by a level above the basic B-tree package. Chris Kohlhoff and I discussed this yesterday. Neither of us can see any way for Boost B-tree to benefit from async approaches. A UB-tree (i.e. a B-tree that does multi-dimensional searches) might well benefit, but that isn't what is on the table at the moment.
To name a concrete example: using a distributed object store like Ceph's underlying RADOS (see http://ceph.newdream.net/wiki/Librados ) as a storage backend. Operations at this level don't require "expensive" file-system semantics.
I'll take a look. --Beman

On Thu, May 19, 2011 at 11:07 AM, Beman Dawes <bdawes@acm.org> wrote:
On Thu, May 19, 2011 at 3:24 AM, Rutger ter Borg <rutger@terborg.net> wrote:
3) It would be nice if it worked with asynchronous file-like services, e.g., Asio's RandomAccessHandleService; see http://www.boost.org/doc/libs/1_46_0/doc/html/boost_asio/reference/RandomAcc...
I'd need stronger motivation than "it would be nice"! What would the benefit be?
Real-world situation: I've got a fuzzy full-text search running on top of my own async b-tree code right now. Searching needs to hit the b-tree several times depending on the length of the query. Being able to launch all these b-tree queries in parallel is very important for performance, but launching one thread for each query is tremendously sub-optimal. In a server environment that processes hundreds of these full-text searches at once, it might mean asking for thousands of threads! With async I can do all of this very efficiently using only a single thread (or any number of threads of my choosing).

This said, async will significantly increase the amount of code needed to implement this library. I wouldn't want it to be delayed for it, especially when most people won't need async. If it could be designed in a way that decouples the core bits from I/O as much as possible so that it would be easy to move over later, that would be cool. I.e., implement operations with simple I/O-agnostic state machines -- instead of doing any potentially blocking operation like filling buffers directly, the operation requests the user of the state machine to fill the buffer for it and resumes processing afterward.

A blocking lower_bound() might be implemented something like:

    lower_bound_op sm(key);
    result r;
    while ((r = sm.run()) != done) {
        if (r == ...) ...
        if (r == need_block)
            fill_buffer(sm.buf, sm.buf_length);
    }

--
Cory Nelson
http://int64.org

Den 18-05-2011 20:54, Beman Dawes skrev:
The slides for my BoostCon "Proposed Boost B-tree Library" presentation are available:
http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...
// boost::btree::
template <class Key, class T, class Traits = default_endian_traits,
          class Compare = btree::less<Key> >
class btree_map;

Would it make sense to swap the Traits and Compare arguments? Isn't it more common to change the predicate?

-Thorsten

On Thu, May 19, 2011 at 5:40 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Den 18-05-2011 20:54, Beman Dawes skrev:
The slides for my BoostCon "Proposed Boost B-tree Library" presentation are available:
http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...
// boost::btree::
template <class Key, class T, class Traits = default_endian_traits,
          class Compare = btree::less<Key> >
class btree_map;
Would it make sense to swap the Traits and Compare arguments?
I've gone back and forth on that. The argument that may sway me is trying to stay as close as possible to standard library containers.
Isn't it more common to change the predicate?
I'm not sure. Some folks want the highest possible performance, don't care about portability, and are willing to accept alignment limitations. Many others want some combination of those. That implies a lot of use of the Traits parameter. But the current ordering does seem to be a bit of a lightning rod!

Thanks,

--Beman

Den 18-05-2011 20:54, Beman Dawes skrev:
The slides for my BoostCon "Proposed Boost B-tree Library" presentation are available:
http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...
One more comment. In

    for (map_type::iterator it = bt_map.begin(); it != bt_map.end(); ++it)
    {
        cout << " " << it->key() << " --> " << it->mapped_value() << '\n';
    }

I got a lot of negative feedback when I was not interface compatible with existing containers in Boost.PtrContainer. Thus, we made the syntax i->first and i->second available without using std::pair so at least generic algorithms would work. Boost.Bimap does something similar. Would it not be possible to do something similar, even though the value is fetched lazily?

-Thorsten

On Thu, May 19, 2011 at 5:49 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Den 18-05-2011 20:54, Beman Dawes skrev:
The slides for my BoostCon "Proposed Boost B-tree Library" presentation are available:
http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...
One more comment. In
for (map_type::iterator it = bt_map.begin(); it != bt_map.end(); ++it)
{
    cout << " " << it->key() << " --> " << it->mapped_value() << '\n';
}
I got a lot of negative feedback when I was not interface compatible with existing containers in Boost.PtrContainer. Thus, we made the syntax i->first and i->second available without using std::pair so at least generic algorithms would work.
Interesting! This is the interface design choice I'm most worried about, and I'm willing to make even painful changes to the current approach if convinced a different approach is better. The feedback I got from the BoostCon audience was that they really like the explicit names, although several people thought that "mapped_value" should be renamed to "value".

Iterators currently contain two pointers. I've worried about any approach that would increase the size. But I had not considered the impact on algorithms that expect the "first" and "second" names; that further complicates the decision.
Boost.Bimap does something similar. Would it not be possible to do something similar, even though the value is fetched lazily?
I'll look at Bimap. Thanks for the pointer.

--Beman

Hi Beman,

On Thu, May 19, 2011 at 7:25 PM, Beman Dawes <bdawes@acm.org> wrote:
But I had not considered the impact on algorithms that expect the "first" and "second" names; that further complicates the decision.
If it is possible, I think it is a good idea to at least offer the standard "first" and "second" names as an alternative option.
On Thu, May 19, 2011 at 5:49 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Boost.Bimap does something similar. Would it not be possible to do something similar, even though the value is fetched lazily?
I'll look at Bimap. Thanks for the pointer.
Here is a direct pointer to the rationale for Boost.Bimap's design decisions: http://tinyurl.com/bimap-rationale

Best regards,
Matias

Den 21-05-2011 12:28, Matias Capeletto skrev:
Hi Beman,
On Thu, May 19, 2011 at 5:49 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Boost.Bimap does something similar. Would it not be possible to do something similar, even though the value is fetched lazily?
I'll look at Bimap. Thanks for the pointer.
In PtrContainer we cache the underlying pair to adjust the interface:

    template< class F, class S >
    struct ref_pair
    {
        typedef F first_type;
        typedef S second_type;

        const F& first;
        S second;

        template< class F2, class S2 >
        ref_pair( const std::pair<F2,S2>& p )
          : first(p.first), second(static_cast<S>(p.second))
        { }

        template< class RP >
        ref_pair( const RP* rp )
          : first(rp->first), second(rp->second)
        { }

        const ref_pair* const operator->() const
        {
            return this;
        }
        ...
    };

Alternatively, it might be slightly more efficient to provide conversion operations for dummy objects called first/second. But I doubt it will have any significant overhead in the btree case, as I suspect that iteration is already slower than for normal in-memory data structures.

-Thorsten
participants (5)
- Beman Dawes
- Cory Nelson
- Matias Capeletto
- Rutger ter Borg
- Thorsten Ottosen