Re: [boost] [GSoC] Heaps & Queues (draft proposal)

First of all, thank you, Andrew and Luke, for pointing out my mistakes and misunderstanding in my draft project which I sent several days ago. After that, I have read some source code in the STL and do some more research on this topic. Now, I work out my draft proposal again. Please feel free to point out my stupidity. I will be very grateful for this. Besides, I still have some doubt on this project: What we need to do in this project is to build a new library as the result of decoupling boost/pending from the BGL. So the compatibility with the new library and the BGL need to be concerned. However, I have read the document in the BGL. It’s said that we can convert existing graphs to the BGL by implementing some functions. Dose it mean the APIs of the new library does not effect the compatibility of the new library and the BGL since we can change it properly. Is that right? ==== draft proposal v0.2 ==== Project Proposal * Please provide a description of your proposed work. ** Background There are a number of data structures in boost/pending that implement different kinds of heaps and queues that are used in a number of different Boost Graph (BGL) algorithms. These can be cleaned up, decoupled from the BGL and redeveloped into a useful library for advanced data structures. Why don’t we use them directly? We can’t use boost/pending or libs/pri_queue as the new library directly for several reasons: 1. Heaps don’t parameterize for container type. They use vector as default. In the STL, there is very clear abstraction between containers and algorithms (they use iterator to achieve this abstraction). Since abstraction is a goal of the project, it’s necessary to modify the data structure in boost/pending rather than use it directly. 2. The APIs designed in the boost/pending is not user-friendly. In my opinion, the heaps in boost/pending act as the container of the priority queue rather than as a data structure. So some operations are omitted since they are useless in priority queue, but useful in heap and maybe necessary for other adaptors. They lack some operations such as delete one key (not just pop the minimum or maximum key), update the key in heap. This leads to the necessity of redeveloping a new library. ** Benefit for Boost Heap is widely used in development such as heap sort, priority queue and other graph algorithm. It also can be the container for advanced data structures. But the heaps in boost/pending and libs/pri_queue is not extendable or highly abstract. This leads to the limited usage of this library. To redevelop a new library about heaps and priority queue more abstract and more user-friendly not only can broaden the user group of Boost, but also can fix the design deficiency in boost/pending and make it more conforms to the philosophy of Boost. ** Solution In my opinion, most of work is to rewrite and reuse the existing data structures in boost/pending and libs/pri_queue. The most important thing I should consider is the APIs design of the new library. It should be highly abstract and user-friendly. The compatibility with the new library and the BGL needs to be concerned. *In order to be extendible, I would like to add iterator to the base class, which is more frequently used in programming, especially generic programming, than pointer. In some libraries, such as the STL, iterator is a bridge for container and algorithm. Since one of the goals of this project is to redevelop a new library for advance data structures and adaptors, I consider iterator to be necessary component to the library. Maybe it’ll take a little bit more memory, but in modern PCs, I think it acceptable.(am I wrong about the usage of the iterator?)* * * Besides the heaps required, I would like to add d-ary heap into the new library. Because d-ary heap is a variation of binary heap (so I can use binary heap as a special case for d-ary) and has faster running time. If time permits, I want to add splay heap (because it has the attractive property that recently accessed elements are quick to access again. I think it will be very useful in application software programming) and other heaps mentioned in libs/pri_queue into the new library. Below is my design to the library APIs. template <class RandomAccessIterator, // iterator of the container class TreeNode, // key type class Comp = std::less<TreeNode>> class heap{ public: class iterator{ // implement iterator so that this class can be use as container // for other adaptors and algorithms public: TreeNode const& operator* () const; TreeNode const* operator->() const; iterator& operator++(); iterator operator++(int); bool operator== (iterator const& it); bool operator!= (iterator const& it); }; public: /* param: new_inserted points to the bottom of the container * where the new value has been inserted but not heapify. */ void insert(RandomAccessIterator new_inserted); void delete(RandomAccessIterator delete_node); void update(RandomAccessIterator update_node, TreeNode& new_node); // make the heap mutable RandomAccessIterator merge(RandomAccessIterator other_root); RandomAccessIterator getRoot(); RandomAccessIterator find(TreeNode& node); bool empty() const; size_type size() const; iterator begin() const; iterator end() const; void print() const; private: void print_recur() const; protected: RandomAccessIterator _root; }; NOTICE: operations print and print_recur is optional. But I prefer to include them because it'll be helpful when debugging the programs using this library. The class mentioned above is the base class for all the heap structures in the new library. The classes such as Fibonacci heap, relaxed heap, binomial heap and d-ary heap should inherit from this abstract class and add additional operations if needed. As the adaptor of heaps, the priority queue API may like: template< class RandomAccessIterator, class TreeNode, class Comp = std::less<TreeNode> class Container = fibonacci_heap<RandomAccessIterator, TreeNode, Comp>> class priority_queue{ public: /* param: new_inserted points to the place * where the new value has been inserted */ void push(RandomAccessIterator new_inserted); void pop(); // get the first value without popping it const TreeNode& front() const; TreeNode& front() ; RandomAccessIterator front() const; RandomAccessIterator find(TreeNode& node); void update(RandomAccessIterator update_node, TreeNode& new_node); // make the priority queue mutable RandomAccessIterator merge(RandomAccessIterator other_head); bool empty() const; size_type size() const; void print() const; private: void print_recur() const; protected: RandomAccessIterator _head; }; Thank you for reading

Besides, I still have some doubt on this project: What we need to do in this project is to build a new library as the result of decoupling boost/pending from the BGL. So the compatibility with the new library and the BGL need to be concerned. However, I have read the document in the BGL. It’s said that we can convert existing graphs to the BGL by implementing some functions. Dose it mean the APIs of the new library does not effect the compatibility of the new library and the BGL since we can change it properly. Is that right?
I wouldn't worry about the BGL. I would just focus on the heap/priority queue interfaces.. The proposal seems to be in good shape. Andrew Sutton andrew.n.sutton@gmail.com

On Wed, 7 Apr 2010, iaml wrote:
Besides the heaps required, I would like to add d-ary heap into the new library. Because d-ary heap is a variation of binary heap (so I can use binary heap as a special case for d-ary) and has faster running time. If time permits, I want to add splay heap (because it has the attractive property that recently accessed elements are quick to access again. I think it will be very useful in application software programming) and other heaps mentioned in libs/pri_queue into the new library.
FYI, there is a d-ary help in BGL already. It's in <boost/graph/detail/d_ary_heap.hpp> and has a customizable container type and compare operator. -- Jeremiah Willcock
participants (3)
-
Andrew Sutton
-
iaml
-
Jeremiah Willcock