Re: [boost] Interest check: memoization

25 Jan 2009

      ----- Original Message ----- 
From: "James Porter" <porterj@alum.rit.edu>
To: <boost@lists.boost.org>
Sent: Sunday, January 25, 2009 3:39 AM
Subject: [boost] Interest check: memoization
...
After hacking around with some simple dynamic programming, it occurred 
to me that it should be possible to write a simple class template to 
handle memoization of functions. With the changes in C++0x that help 
solve the forwarding problem, writing a generic memoizer should be 
fairly straightforward. (Or so I thought!)
The interface is straightforward, so I'll let the following code snippet 
speak for itself:
memoizer<int (int,int)> memo(my_func);
int ret = memo(i,j);
As you'd expect, this creates a memoizer that caches results of calls to 
my_func. Pretty simple, right? Well, not exactly. :)
(If you'd rather not read about the implementation, you can skip ahead 
to the end, where there is a link to download the source.)
There are a number of issues that make this a fairly complicated 
problem, even with the improved generic programming facilities in C++0x. 
Chief among these is copying. With the naive implementation of the 
memoizer, I found that calculating the result of an unmemoized value 
created 3 copies (one for creating a tuple to search a map for, and two 
to insert the tuple into the map), and returning a memoized value 
created 1 copy (to create a tuple to search the map).
To make a long story short, I set about trying to reduce the number of 
copies to the expected amount: 1 copy for unmemoized values and 0 for 
memoized values. I'm happy to say that this was a success! The trick was 
creating a tuple that holds references to each argument and can be 
upgraded to hold the arguments by value. In this way, we can compare by 
reference and store by value. This is simple on the surface, but 
requires some care when dealing with rvalue references.
There are also issues with passing convertible types to the memoizer 
(e.g. passing floats when the function expects ints). For complicated 
types, you certainly wouldn't want to copy them more than once, and 
you'd want to use move constructors where possible. This is resolved 
with some creative manipulation of template types (lines 52-58 of 
memoizer.hpp).
There are still a number of things missing in this code, such as the 
ability to use function objects with the memoizer, or the ability to 
specify the map type, but I believe that the most important issues are 
resolved:
* Minimizes copying of arguments
* Supports move constructors wherever possible
* Safely converts arguments to the appropriate types
The full source code for this is available here (requires gcc 4.3 or a 
compiler that supports variadic templates and rvalue refs): 
http://www.teamboxel.com/misc/memoizer.tar.gz
Comments and criticisms are welcome. If there's interest, I will - of 
course - work on a version that supports the current C++ standard.
Hi,

The implementation is very elegant and the use of the C++0x facilities reduce the code to the minimum.

I remember that something like this was requested to the Flyweigth library. The following is extracted from the documentation: "
Example 5: flyweight-based memoization
Memoization is an optimization technique consisting in caching the results of a computation for later reuse; this can dramatically improve performance when calculating recursive numerical functions, for instance. Key-value flyweights can be used to implement memoization for a numerical function f by modeling a memoized invocation of the function as a value of type flyweight<key_value<int,compute_f> >, where compute_f is a type that does the computation of f(n) at its compute_f::compute_f(int) constructor. For instance, the Fibonacci numbers can be computed with memoization like this: 

struct compute_fibonacci;
typedef flyweight<key_value<int,compute_fibonacci>,no_tracking> fibonacci;

struct compute_fibonacci{
  compute_fibonacci(int n):
    result(n==0?0:n==1?1:fibonacci(n-2).get()+fibonacci(n-1).get())
  {}

  operator int()const{return result;}
  int result;
};

  for(int n=0;n<40;++n){
    std::cout<<"F("<<n<<")="<<fibonacci(n)<<std::endl;
  }

The no_tracking policy is used so that the memoized computations persist for future use throughout the program. "

How both solutions compares? 
    int fibonacci_impl(int n);
    memoizer<int (int)> fibonacci(fibonacci_impl);

    int fibonacci_impl(int n) {
        return (n==0?0:n==1?1:fibonacci(n-2)+fibonacci(n-1))
    }

    for(int n=0;n<40;++n){
        std::cout<<"F("<<n<<")="<<fibonacci(n)<<std::endl;
    }

We can memoize only unary functions with flyweight and have flyweight values with memoize(identity function). So which one perform better.

If I understand memoizer can be used only with pure functions, isn't it? Is there a way to check for a pure function in C++0X? Is there a way to constraint the template parameter to only pure functions?
Should the following compile?
    memoizer<int&& (int,int)> memo(my_func);  // move result on calling
    int ret = memo(i,j);

Should the following compile?
    memoizer<int (int&,int)> memo(my_func); // allows to modify the first argument
    int ret = memo(i,j);

Some minor changes in 
    typedef const Ret & result_reference;

will be needed to return references.
    memoizer<int& (int,int)> memo(my_func);
    int& ret = memo(i,j);

Best,
Vicente

Re: [boost] Interest check: memoization

vicente.botet