
AMDG

Jesse Perla wrote:
Thanks as always, Steven (who I still refuse to accept is either a single individual or human). Sorry to all if this is getting so long and specific; the project is justified later.
Your comments generated a lot of thought about what my main problems are. I think I distracted myself with the generation of derivatives here before I had worked out some other issues. But your implementation looks very elegant. For now, let's assume that people supply the gradient/jacobian directly, or from their own finite-difference/auto-differentiation code. So here are some more basic questions.
Question 1: Based on your template, how do I forward a function to boost::function's operator() in this wrapper? Why might we want to do this? Because we are trying to implement a concept with operator() and .gradient(...) in a function wrapper here. Library writers may want to write generic code that operates on anything modelling the concept, not just objects wrapped in a math_function. Then we may end up with true inlining of operator() and .gradient(...) when a full differentiable function object is supplied, instead of always going through an indirection pointer in boost::function.
We can assume that anything assigned to a math_function is a pure function, right?

The easiest way is to inherit from boost::function. To make this relatively safe we can use private inheritance:

    template<class Signature>
    class math_function : private boost::function<Signature>
    {
    public:
        using boost::function<Signature>::operator();
    };
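To make the later examples concrete, the rest of the wrapper might look something like this (a rough sketch only; the member names, the (value, gradient) constructor, and the decision to store the gradient as a second boost::function over the same signature are all assumptions, not a settled design):

    #include <boost/function.hpp>

    template<class Signature>
    class math_function : private boost::function<Signature>
    {
        typedef boost::function<Signature> base_type;
    public:
        typedef typename base_type::result_type result_type;

        // store the value and its gradient as two callables over the same signature
        template<class F, class G>
        math_function(F f, G g) : base_type(f), gradient_(g) {}

        // forward the call operator straight through to boost::function
        using boost::function<Signature>::operator();

        // gradient with respect to the first argument; only the one- and
        // two-argument cases are spelled out in this sketch
        template<class A1>
        result_type gradient(const A1& a1) const { return gradient_(a1); }

        template<class A1, class A2>
        result_type gradient(const A1& a1, const A2& a2) const
        { return gradient_(a1, a2); }

    private:
        boost::function<Signature> gradient_;
    };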
Question 2: How can this wrapping interact with binding?

    void test_mathfunctions_binding()
    {
        using lambda::_1;
        using lambda::_2;

        // But you might also want a constant thrown in there... Why a separate
        // parameter? Because you don't take the derivative over the other stuff,
        // and you want to keep your functions pure so you can bind them rather
        // than forcing them to be stateful, etc.
        math_function<double (double, double)>
            myfunc(_1*_1 + _2, 2*_1); // simple quadratic translated by a constant

        // But how would you work with bind? Would the following work immediately?
        boost::function<double (double)> myfunc_bound =
            bind(myfunc, _1, 2.0);
        boost::function<double (double)> myfuncderiv_bound =
            bind(&math_function<double (double, double)>::gradient,
                 myfunc, _1, 2.0); // or something like this... I always forget
                                   // member function notation.
Yes. These should just work.
        // Any way to bind that entire structure back to a math_function? Very
        // useful, since we would bind parameters before calling optimizers, etc.
        // With dynamic programming we usually have a grid of values and call the
        // optimizer binding over the entire grid.
        math_function<double (double)> myfunc_math_bound =
            math_bind(myfunc, _1, 2.0);
        cout << myfunc_math_bound.gradient(1.0); // the binding can now use the gradient
    }
It would take a bit of work, but it could be done. To make it really efficient, you would have to duplicate most of Boost.Bind with appropriate modifications.
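For what it's worth, here is a rough sketch (all names here are hypothetical) of just the "bind the second argument" case, assuming a math_function along the lines of the sketch above with a (value, gradient) constructor: the value and the gradient are bound separately and the pair is re-wrapped. Generalizing this over arbitrary arities and placeholder positions is where duplicating much of Boost.Bind would come in.

    #include <boost/bind.hpp>

    // small adapter so the gradient member can be bound without
    // member-function-pointer syntax
    struct gradient_caller
    {
        typedef double result_type;
        double operator()(const math_function<double (double, double)>& f,
                          double a1, double a2) const
        {
            return f.gradient(a1, a2);
        }
    };

    inline math_function<double (double)>
    math_bind_second(const math_function<double (double, double)>& f, double a2)
    {
        return math_function<double (double)>(
            boost::bind<double>(f, _1, a2),                      // bound value
            boost::bind<double>(gradient_caller(), f, _1, a2));  // bound gradient
    }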
Question 3: How would we adapt a structure implementing operator() and gradient() to be stored here, just like boost::function accepting any function object?

    struct square
    {
        double operator()(double x) { return x * x; }
        double gradient(double x)   { return 2 * x; }
    };

    square sq;
    math_function<double (double)> myfunc(sq);

Given the previous code, would we write a constructor like this?

    template<class F>
    math_function(const F& f)
        : f_(f), g_(bind(&F::gradient, f, _1 /*...?*/)) {}

Here I don't see how to bind to the arbitrary signature, extracting the inputs from the Signature, which may be more than one.
Easily. I guess the question is whether you want to detect the presence of F::gradient and fall back to some default numeric algorithm if it isn't available.
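One way to do that detection in C++03 (a sketch; has_gradient and the fixed double(double) signature are assumptions) is the usual sizeof/SFINAE member-detection idiom; math_function could then dispatch on has_gradient<F>::value and fall back to a finite-difference gradient when it is false:

    template<class F>
    class has_gradient
    {
        typedef char yes;
        struct no { char dummy[2]; };

        // well-formed only if &U::gradient is a member with this exact signature
        template<class U, double (U::*)(double)> struct check;

        template<class U> static yes test(check<U, &U::gradient>*);
        template<class U> static no  test(...);
    public:
        static const bool value = sizeof(test<F>(0)) == sizeof(yes);
    };

    // e.g. has_gradient<square>::value is true; for a functor without a
    // gradient member it is false, and a numeric fallback could be chosen
    // with boost::mpl::if_c or simple tag dispatching.

A production version would also have to accept gradient members with other signatures.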
Question 4: What are the thoughts on whether/how this is extensible to higher dimensions? Let's start on the gradient with functions mapping f : R^N -> R. It is perfectly reasonable for these types of problems for the first parameter of the function to be the only one the derivative is taken over. The dimensionality of the derivative/gradient/jacobian would come out of a combination of the result_type from the signature and the dimensionality of the domain. For example, something like:

    template<class Signature, std::size_t DomainDim>
    class math_function;
Sure. But wouldn't you want DomainDim to be the arity of the function (at least by default)?
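A small sketch of that suggestion (declaration only; the extra template parameter is taken from the question above): default DomainDim to the arity of the call signature using boost::function_traits.

    #include <cstddef>
    #include <boost/type_traits/function_traits.hpp>

    template<class Signature,
             std::size_t DomainDim = boost::function_traits<Signature>::arity>
    class math_function;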
Question 5: Now let's say that what we really want is to manage f : R^N -> R^M functions.
I don't see any particular reason that this should present any extra challenges. From the standpoint of designing the concepts for math functions, a vector is not all that different from a simple double.
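To make that concrete, a function object modelling an f : R^2 -> R^2 version of the concept might look like the following (a sketch; the jacobian() member name and the use of uBLAS containers are assumptions):

    #include <cmath>
    #include <boost/numeric/ublas/vector.hpp>
    #include <boost/numeric/ublas/matrix.hpp>

    namespace ublas = boost::numeric::ublas;

    // (r, theta) -> (r cos theta, r sin theta)
    struct polar_to_cartesian
    {
        typedef ublas::vector<double> result_type;

        ublas::vector<double> operator()(const ublas::vector<double>& x) const
        {
            ublas::vector<double> y(2);
            y(0) = x(0) * std::cos(x(1));
            y(1) = x(0) * std::sin(x(1));
            return y;
        }

        // Jacobian: rows index outputs, columns index inputs
        ublas::matrix<double> jacobian(const ublas::vector<double>& x) const
        {
            ublas::matrix<double> J(2, 2);
            J(0, 0) = std::cos(x(1));  J(0, 1) = -x(0) * std::sin(x(1));
            J(1, 0) = std::sin(x(1));  J(1, 1) =  x(0) * std::cos(x(1));
            return J;
        }
    };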
Question 6: I want to also associate information about the nature of the function: whether it is convex/concave, increasing, strictly increasing, etc. These tend to have an inheritance structure, but it isn't absolutely necessary. I was thinking about using trait classes, but I assume that we would lose all those traits when we use type erasure? Are we better off storing a bunch of enums in this data structure? Increasing the size of the class shouldn't be an issue.
If you need to know this information at compile time then, yes, you lose it when you apply type erasure. However, it is fairly easy to give math_function members that return these traits at runtime.
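For example, something along these lines (a minimal sketch; the member and accessor names are assumptions) would carry the qualitative information through type erasure as runtime data:

    struct function_properties
    {
        bool convex;
        bool concave;
        bool increasing;
        bool strictly_increasing;
    };

    template<class Signature>
    class math_function
    {
    public:
        // ... call operator and gradient as before ...

        bool is_convex() const     { return properties_.convex; }
        bool is_concave() const    { return properties_.concave; }
        bool is_increasing() const { return properties_.increasing; }

    private:
        function_properties properties_;
    };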
Let me justify why this is all very useful and a general problem. Right now, everyone writing non-linear algorithms in C++ has their own structures for managing mathematical functions that carry derivatives, jacobians, and hessians. I am running into this right now because I am trying to wrap a bunch of other people's algorithms for my own work. A few comments:

1) Everyone is using dynamic polymorphism for these, overriding a virtual operator() and some type of gradient/jacobian. I assume this is because there is no agreed-upon concept for function objects with gradients. In reality, none of these seem to require dynamic polymorphism. Why shouldn't a Newton's method optimizer have inlining of the operator() and derivative through static polymorphism where possible?
It depends on how complex the function is. Are the ones in question sufficiently complex that the inlining wouldn't help much anyway?
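As a small illustration of the static-polymorphism point: a Newton step for root finding, templated on the function type, only needs the value and the gradient, and both calls can be inlined when F is a concrete function object rather than a type-erased wrapper (a sketch; newton_step is a hypothetical name):

    // one Newton-Raphson update for solving f(x) = 0
    template<class F>
    double newton_step(const F& f, double x)
    {
        return x - f(x) / f.gradient(x);
    }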
2) Most of these libraries have conflated functionality/settings specific to their optimizers/solvers with the evaluation of the function/gradient. This makes generic code difficult without wrappers/adapters.

3) Many use their own data structures for arrays, matrices, and multidimensional results. This may be inevitable, but it means you need to have these types parameterized in any mathematical callback routine.

4) While a lot could be gained from having people agree on a static concept for mathematical applications, akin to standardizing on operator() for functors, it is also very useful to have a generic callback wrapper akin to boost::function. The main reason for this is that you will tend to write parameterized, pure functions where you want to bind parameters. Worst case, people could write their own functions against this mathematical callback structure, and adapters could be written for the different libraries.
If you have an appropriate concept, you can always either a) non-intrusively adapt any class that provides the correct functionality to model the concept, or b) create a wrapper that adapts classes to model the concept. I would say that the most important thing is the concept; math_function can then be defined as being able to hold any object that models the concept.
5) Not to disparage these superb libraries, but a few examples to see what I mean:
http://quantlib.org/reference/class_quant_lib_1_1_cost_function.html
http://trilinos.sandia.gov/packages/docs/r6.0/packages/didasko/doc/html/nox_...
http://www.coin-or.org/CppAD/Doc/ipopt_cppad_simple.cpp.xml
If anyone is interested in working with me on this very interesting and useful problem, please send me an email. I think it has a lot of potential for general use.
In Christ, Steven Watanabe