[ATT:Jim Apple] Memoization Library status?

In December last year, Jim Apple asked if there was interest in a memoization library he was working on. [1] It seems there were several people interested in this, but I have seen no more posts about it. Was this project discontinued? Are there other boosters working on a library like this? Dave Abrahams pointed out (in [2]) that a library such as this may need the ability to invalidate parts of the cache, but was met with confusion and I don't think the motivation for the need was widely understood. I have had increasing need for a memoization library such as this in our current project, so I thought I'd try to summarize it as a kind of use case. The project is a medical diagnosis application using in the order of a hundred image processing step for diagnosis of digitized images. Some of these steps can take a very significant amount of time, and many of them may need to use intermediate results that were already computed in other steps. I am currently obtaining the result image recursively through a very rudimentary non-generic memoization cache class, where recursive calls are explicitly routed through the cache object (by a string->funptr mapping). Thanks to the caching I am now probably cutting away a good 90% of processing time. Although all processing-step functions now (of necessity) have the same signature, many of them use various globally set parameters related to the algorithm, such as threshold settings for neural-network outputs. Some parameters are not used until late in the process, whereas other parameters affect early steps. To optimize the parameter values we must perform a lot of searches through the parameter space using various parameter optimization techniques. Currently, with every parameter change we must re-run the entire process, even if the parameter actually only affects one of the very last steps. With a memoization library that knows about parameter dependencies, we would be able to try many different values of this parameter and only the dependent parts of the cache would be invalidated, which could save hours if not days of processing time. /Mattias [1] http://lists.boost.org/MailArchives/boost/msg56780.php [2] http://lists.boost.org/MailArchives/boost/msg56913.php

Mattias Flodin wrote:
In December last year, Jim Apple asked if there was interest in a memoization library he was working on. [1] It seems there were several people interested in this, but I have seen no more posts about it. Was this project discontinued? Are there other boosters working on a library like this?
I cut short work when I faced what I considered a very serious thread-safety issue. Thread safety not only becomes an issue in the tricky cases (clearing the cache after every top-level call, clearing the cache manually), but also in the simplest case. I could lose substantial time saving benefits if the would-be-saved calculations are on a different threads and can't see the work that each is doing. I should have, at this point asked a couple of new questions: 1. Are there any thread gurus who want to work with me on this? 2. Is there any interest in a memoization library that is known to be thread-UNsafe? 3. How do lazy languages (Clean, Haskell) deal with this issue?
Dave Abrahams pointed out (in [2]) that a library such as this may need the ability to invalidate parts of the cache, but was met with confusion and I don't think the motivation for the need was widely understood.
I have developed methods for clearing the cache automatically when a global variable or passed parameter changes. Partial invalidation is not something I have considered.
With a memoization library that knows about parameter dependencies, we would be able to try many different values of this parameter and only the dependent parts of the cache would be invalidated, which could save hours if not days of processing time.
A suggestion: image intro_step(const image &) image final_step(const image &) Assume intro_step depends on no globals, but final_step does. Give them different caches, then manually clear final_step's every time you change a global. Am I missing the point? Jim
participants (2)
-
Jim Apple
-
Mattias Flodin