Re: [boost] Re: Re: Re: Re: full functoids

26 Feb 2004

      On Wed, Feb 25, 2004 at 12:24:40PM -0500, Brian McNamara wrote:
...
Objects are not anathema to FP.  (But _mutable_ objects are anathema in
_pure_ FP.)  Both paradigms (OOP and FP) deal with both data and
behavior.  In OOP, the data (objects) are the focus.  In FP, the
behaviors (functions) are the focus.
A lot of this discussion may be "too abstract" to be useful.  To make
things more concrete, in a little while I will post a short example
illustrating some differences of how the paradigms deal with data and
behaviors.
Ok, as promised...

Here's an example that shows how higher-order functions are useful,
especially when you want to do lots of different kinds of computations
on a particular object/datatype.  Hope you'll bear with me through the
example, even if it's a little trying at various points.  (I'm trying to
show both the "OOP" and the "FP" ways to approach a problem.)

Say you have a binary tree data structure:

   template <class T>
   struct Tree {
      typedef T value_type;
      T data;
      shared_ptr<Tree> left, right;
      Tree( const T& x, shared_ptr<Tree> l, shared_ptr<Tree> r )
         : data(x), left(l), right(r) {}
      Tree( const T& x ) : data(x), left(), right() {}
   };

There are various things you might want to do with the data in the tree.
For example, we might want to collect all the data into a "inorder
traversal" sequence.  One straightforward way to do this is: 

   // helper function, main function is below
   template <class T>
   void oop_tree_inorder( shared_ptr<Tree<T> > t, vector<T>& v ) {
      if(t) {
         oop_tree_inorder( t->left, v );
         v.push_back( t->data );
         oop_tree_inorder( t->right, v );
      }
   }
   template <class T>
   vector<T> oop_tree_inorder( shared_ptr<Tree<T> > t ) {
      vector<T> v;
      oop_tree_inorder( t, v );
      return v;
   }

and now we can call "oop_tree_inorder(a_tree)" and get back a vector
with all the data.

Another thing we might need to do with a binary tree is compute its
height.  So we may write:

   template <class T>
   int oop_tree_height( shared_ptr<Tree<T> > t ) {
      if( !t )
         return 0;
      else
         return 1+max( oop_tree_height(t->left), oop_tree_height(t->right) );
   }

Easy enough.  There are lots more things to do with trees.  If the Ts
are LessThanComparables, we might want to find the minimum element in
the tree.  Since the tree may be empty, a "tree_min" function should
return a maybe<T> (where a maybe<T> is either "nothing" or "just t"--
maybe is very similar to boost::optional).  As a convenience, I am
assuming that there already exists a function min_maybe:
   T min_maybe( T, maybe<T> );
which computes the minimum of a value and a maybe-value.  Anyway:

   template <class T>
   maybe<T> oop_tree_min( shared_ptr<Tree<T> > t ) {
      if( !t )
         return NOTHING;
      else
         return just( min_maybe( min_maybe( t->data, oop_tree_min(t->left) )
                               , oop_tree_min(t->right) ) );
   }

Or, if T is some arithmetic-like type, we may want to create a new tree,
where all the data have been incremented by a certain value.  Assuming
the existence of "mk_tree":
   shared_ptr<Tree<T> > 
   mk_tree( T, shared_ptr<Tree<T> >, shared_ptr<Tree<T> > );
then we could write:

   template <class T>
   shared_ptr<Tree<T> > oop_tree_add( shared_ptr<Tree<T> > t, const T& x ) {
      if( !t )
         return shared_ptr<Tree<T> >();
      else
         return mk_tree( t->data + x, oop_tree_add(t->left,x),
                                      oop_tree_add(t->right,x) );
   }

And there are tons of other things we might want to do with trees (sum
all the data in the tree, get a reverse-post-order traversal, ...).  We
could spend the afternoon writing tons of tree functions if we liked.
(But we probably wouldn't; the Tree class doesn't deserve such a wide
interface, filled with tons of only-occasionally-useful methods.)

The FP programmer, being enamored with higher-order functions (HOFs),
notices that all these tree functions have a common pattern.  Rather
than write tons of miscellaneous tree functions, he instead writes one
function to do them all:

   // using FC++
   struct FoldTree {
      template <class L, class N, class SPT> 
         struct sig : public fun_type<L> {};
      template <class L, class N, class T>
      L operator()( const L& leaf, const N& node, 
                    shared_ptr<Tree<T> > t ) const {
         if( !t )
            return leaf;
         else
            return node( t->data, (*this)(leaf,node,t->left), 
                                  (*this)(leaf,node,t->right) );
      }
   };
   typedef full3<FoldTree> fold_tree_type;
   fold_tree_type fold_tree;

fold_tree() captures the common essence of all of the tree functions we
already wrote.  These algorithms all handle the empty tree case by
returning some value, and for non-empty trees, the algorithms recurse on
the left and right children, and combine the results with the data in
the current node.  In fold_tree(), "leaf" is the value to be returned
for empty trees, and "node" is a ternary function for combining the
data with the recursive calls on the left and right children.

With fold_tree(), it is easy to express new tree functions on the fly.  
For example, here is a function to get an inorder travseral:

   fold_tree( fcpp::list<T>(), lambda(D,L,R)[ cat[ L, cons[D,R] ] ] )

If the tree is empty, we return an empty list.  Otherwise, we cons the
current node data (D) onto the right-recursive call (R), and then
concatenate the left-recursive call (L) onto the front.  What if we want
the height of a tree?

   fold_tree( 0, lambda(D,L,R)[ plus[ 1, max[L,R] ] ] )

The empty tree has height 0.  For non-empty trees, we add 1 to the max
of the recursive calls.  What about min?

   fold_tree( maybe<int>(),
              lambda(D,L,R)[ just[ min_maybe[min_maybe[D,L],R] ] ] )

Etc.  If we want new tree functions, fold_tree() lets us create them
on-the-fly as needed.  Of course, if you are going to be using a
specific function more than once, you would probably bind it to a name.  
For example, if you are going to be repeatedly asking the heights of
trees-of-characters, you would probably write

   boost::function< int(shared_ptr<Tree<char> >) > hgt = 
      fold_tree( 0, lambda(D,L,R)[ plus[ 1, max[L,R] ] ] )

and then just call "hgt(my_tree)" as necessary.

In fact, this is pretty similar to what already happens with iterators.
Back when I was writing all the simple-minded oop_tree_xxx() functions,
you were probably thinking to yourself that if _you_ were writing all
these algorithms to do stuff with tree data, you'd follow the lead of
the STL and create iterators for the tree class.  Then you could
decouple the algorithms from the data structure, and with just a few
different kinds of iterators (for pre/in/post-order), you could easily
use STL algorithms or iterator adapters to compose solutions to various
tree computations on-the-fly.

In this sense, HOFs and iterators (or iterator adapters) are very
similar.  The FPers will create _functions_ (like fold_tree) to abstract
away in order to create generalized algorithms, whereas the OOPers will
create _objects_ (like iterators) for the same general ends.  Both
strategies have the same general flavor, just taken with a different
tack.

Anyway, that's just one tiny (and thus a little contrived) example of
how HOFs can be useful when applied to OO data structures.  You are, of
course, probably already familiar with HOFs in some limited STL
domains: sort() takes a comparator, and transform() and for_each() take
functions as arguments.  When you have a decent framework for easily
creating little functions on-the-fly (like boost::lambda or FC++),
using HOFs becomes an increasingly attractive way to design general
algorithm interfaces.

-- 
-Brian McNamara (lorgon@cc.gatech.edu)