
Jesse Perla wrote:
A couple of other questions on this beautiful distributions library - which I see as a great starting point for a lot of operations treating them as measures:
* Is there any planned or existing support in the library for multi-dimensional distributions? In particular, bivariate normals with a covariance matrix passed in? Will the library design and notation support these sorts of extensions without too many problems?
Good question, it's something we've been asked for before so you're not alone, but I haven't had a chance to implement anything. Interface-wise, we can define the distribution constructor to accept a covariance matrix - provided as any type that looks like a matrix. The main decision is how to store the covariance matrix internally, because that largely determines the interface for the accessors that return the matrix. For the pdf/cdf functions we could either accept multiple x values in a list:

    double p = cdf(my_bivariate_distribution, x1, x2);

or in a vector:

    double p = cdf(my_bivariate_distribution, my_vector);

where my_vector is basically any subscriptable type. I haven't thought about cdf-complements, and quantiles I assume are undefined?
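To make the two calling conventions concrete, here is a minimal sketch of how such an interface might look. Everything in it is an assumption for illustration: the class name bivariate_normal, its mean() and covariance() accessors, and the decision to store the covariance as three scalars are hypothetical and not part of the library. Only the pdf is shown, since it has a closed form; the bivariate cdf needs numerical integration and is left out.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical class, for illustration only: not part of the library.
class bivariate_normal
{
public:
    // Mean and Cov may be any subscriptable types: mean[i] and cov[i][j].
    template <class Mean, class Cov>
    bivariate_normal(const Mean& mean, const Cov& cov)
        : m0_(mean[0]), m1_(mean[1]),
          s00_(cov[0][0]), s01_(cov[0][1]), s11_(cov[1][1])
    {
        if (cov[0][1] != cov[1][0])
            throw std::domain_error("covariance matrix must be symmetric");
        if (!(s00_ > 0) || !(s00_ * s11_ - s01_ * s01_ > 0))
            throw std::domain_error("covariance matrix must be positive definite");
    }

    // Accessors return scalars, so the internal storage stays private.
    double mean(int i) const { return i == 0 ? m0_ : m1_; }
    double covariance(int i, int j) const
    {
        return i != j ? s01_ : (i == 0 ? s00_ : s11_);
    }

private:
    double m0_, m1_;        // mean vector
    double s00_, s01_, s11_; // symmetric 2x2 covariance matrix
};

// Vector form: x is any subscriptable type with x[0] and x[1].
template <class Vector>
double pdf(const bivariate_normal& d, const Vector& x)
{
    const double pi = 3.14159265358979323846;
    const double a = d.covariance(0, 0);
    const double b = d.covariance(0, 1);
    const double c = d.covariance(1, 1);
    const double det = a * c - b * b;
    const double d0 = x[0] - d.mean(0);
    const double d1 = x[1] - d.mean(1);
    // Quadratic form (x - mu)^T Sigma^-1 (x - mu), using the explicit 2x2 inverse.
    const double q = (c * d0 * d0 - 2 * b * d0 * d1 + a * d1 * d1) / det;
    return std::exp(-0.5 * q) / (2 * pi * std::sqrt(det));
}

// List form: forwards to the vector form.
inline double pdf(const bivariate_normal& d, double x1, double x2)
{
    const double x[2] = { x1, x2 };
    return pdf(d, x);
}
```

With this shape, pdf(dist, x1, x2) and pdf(dist, my_vector) give the same result, and a plain double[2][2], a nested std::vector, or a uBLAS-style matrix adaptor exposing cov[i][j] could all be passed to the constructor.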
* Is there any way in the current library to pass in a user-created discrete-valued probability measure and have it as a "distribution" here? If not, I think I may write it, but as I am not a very good library programmer I would love to exploit any other work out there. I would need this to work for at least one- and two-dimensional measures.
Not yet, but again this is something (along with interpolation between points) that has been asked for before. Throw in numeric integration/differentiation and you have a rather powerful tool.
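As a rough idea of what a user-supplied discrete measure could look like in one dimension, here is a sketch that follows the same non-member pdf/cdf style. The class name discrete_measure and its points() and probabilities() accessors are hypothetical, and requiring sorted support points is just an assumption that keeps the lookups simple.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical class, for illustration only: not part of the library.
class discrete_measure
{
public:
    // points must be strictly increasing; weights must be non-negative and sum to 1.
    discrete_measure(const std::vector<double>& points,
                     const std::vector<double>& weights)
        : x_(points), p_(weights)
    {
        if (x_.empty() || x_.size() != p_.size())
            throw std::domain_error("points and weights must be non-empty and equal in size");
        double total = 0;
        for (std::size_t i = 0; i < x_.size(); ++i)
        {
            if (p_[i] < 0)
                throw std::domain_error("weights must be non-negative");
            if (i > 0 && !(x_[i - 1] < x_[i]))
                throw std::domain_error("points must be strictly increasing");
            total += p_[i];
        }
        if (std::fabs(total - 1.0) > 1e-9)
            throw std::domain_error("weights must sum to 1");
    }

    const std::vector<double>& points() const { return x_; }
    const std::vector<double>& probabilities() const { return p_; }

private:
    std::vector<double> x_; // support points
    std::vector<double> p_; // probability mass at each support point
};

// Probability mass at x: zero unless x is exactly one of the support points.
inline double pdf(const discrete_measure& d, double x)
{
    const std::vector<double>& pts = d.points();
    std::vector<double>::const_iterator it =
        std::lower_bound(pts.begin(), pts.end(), x);
    if (it == pts.end() || *it != x)
        return 0;
    return d.probabilities()[it - pts.begin()];
}

// P(X <= x): sum of the mass at every support point not exceeding x.
inline double cdf(const discrete_measure& d, double x)
{
    const std::vector<double>& pts = d.points();
    const std::size_t n =
        std::upper_bound(pts.begin(), pts.end(), x) - pts.begin();
    double sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += d.probabilities()[i];
    return sum;
}
```

A two-dimensional version would need a decision on how to store the joint mass table, which is the same storage question as for the covariance matrix.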
* If we were to use a function to bundle a discrete valued function, is there a nice class/pattern out there that already combines the x-values and y-values into a single class? (I also want this later for implementing different interpolation algorithms returning functors after passing in a discrete function in the constructor).
I'm not sure what you're asking for there? If you mean storage for the 2-D matrix of values, then you could either use a pair of std::vectors, a boost::multi_array, or one of the uBLAS matrices.
* Then I would write a specialization of this (2nd and higher dimension) for a Markov measure.
* Then, I want to write a general operator for unconditional expectation, passing in a function object, over a generic distribution as a measure from this library (though I will use it for the discrete measures to begin with). Specializing for different types of distributions, this could use exact analytical calculations, quadrature, or Monte Carlo methods.
* Last, I want to write a conditional expectation operator that takes in a function object mapping reals^2 to reals, a distribution (would have to be 2 dimensional), the conditional (which would be a real for now, but could be generic), and what dimension the condition applies to.
* Eventually, we may be able to tie this operator together with boost::lambda and boost::bind and end up with a very elegant way to write out functions that involve conditional expectations. It may not have the performance of hand-tweaked Fortran loops, but it should be a good enough place to start and infinitely preferable to MATLAB.
Afraid you've lost me again now :-( Might just be stretching my stats knowledge to breaking point too ;-) Cheers, John.