[math] Empirical distribution function

Hello, would there be an interest in adding the Empirical distribution function ( http://en.wikipedia.org/wiki/Empirical_distribution_function) to Boost.Math? The good statistical software projects implement it. For instance, in R, we have the ECDF ( http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ecdf.html), and Python has something like that as well ( http://statsmodels.sourceforge.net/generated/scikits.statsmodels.tools.ECDF.... ). Best Denis

On Tue, Jun 14, 2011 at 7:57 PM, Denis Arnaud <denis.arnaud_boost@m4x.org> wrote:
Hello,
would there be an interest in adding the Empirical distribution function ( http://en.wikipedia.org/wiki/Empirical_distribution_function) to Boost.Math?
The good statistical software projects implement it. For instance, in R, we have the ECDF ( http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ecdf.html), and Python has something like that as well ( http://statsmodels.sourceforge.net/generated/scikits.statsmodels.tools.ECDF.... ).
+1 Usually I use R, but it would be great to have it in Boost. Thanks, -- Marco

On 6/14/11 3:41 PM, sguazt wrote:
On Tue, Jun 14, 2011 at 7:57 PM, Denis Arnaud <denis.arnaud_boost@m4x.org> wrote:
Hello,
would there be an interest in adding the Empirical distribution function ( http://en.wikipedia.org/wiki/Empirical_distribution_function) to Boost.Math?
The good statistical software projects implement it. For instance, in R, we have the ECDF ( http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ecdf.html), and Python has something like that as well ( http://statsmodels.sourceforge.net/generated/scikits.statsmodels.tools.ECDF.... ).
+1
Usually I use R, but it would be great to have it in Boost.
Thanks,
-- Marco _______________________________________________ Unsubscribe& other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
I guess you mean the ECDF under the assumption of independent trials? I developed one as a Boost.Accumulator here: https://svn.boost.org/svn/boost/sandbox/statistics/non_parametric/boost/stat... Although I haven't touched these directories in a while, I'd be happy to do some maintenance and add proper docs and a test suite.

I guess you mean the ECDF under assumption of independent trials? I developed one as a Boost.Accumulator here:
https://svn.boost.org/svn/boost/sandbox/statistics/non_parametric/boost/stat...
Although I haven't touched these directories in a while, I'd be happy to do some maintenance and add proper docs and a test suite.
Nod, this looks more like an accumulator than a distribution - in the sense that you can only calculate properties from a set of values? John.

On 6/15/11 4:30 AM, John Maddock wrote:
I guess you mean the ECDF under assumption of independent trials? I developed one as a Boost.Accumulator here:
https://svn.boost.org/svn/boost/sandbox/statistics/non_parametric/boost/stat...
Although I haven't touched these directories in a while, I'd be happy to do some maintenance and add proper docs and a test suite.
Nod, this looks more like an accumulator than a distribution - in the sense that you can only calculate properties from a set of values?
John.
It is the ECDF assuming independent sampling (I think identical distribution need not even be assumed), plus derived quantities (such as the Kolmogorov-Smirnov statistic): F(x) = (number of samples <= x) / n, where n is the number of samples seen so far. The fact that it's an accumulator is just a convenience: the distribution is updated each time a sample is passed to the accumulator. I'm doing maintenance work right now, whether or not this matches Denis's needs. It shouldn't take too much time (days).
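
To make the accumulator idea above concrete, here is a minimal sketch in plain C++ (standard library only). It is not the code in the sandbox directory linked above: the class name ecdf_accumulator and all of its members are hypothetical, and it assumes the ECDF is normalized as F_n(x) = (number of samples <= x) / n. Each call to operator() inserts one sample into a sorted vector, so the estimate is updated as samples arrive, and the one-sample Kolmogorov-Smirnov statistic is computed from the same state.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration; NOT the interface of the sandbox accumulator.
class ecdf_accumulator {
public:
    // Feed one sample. The vector stays sorted, so later queries are O(log n).
    void operator()(double x) {
        samples_.insert(
            std::upper_bound(samples_.begin(), samples_.end(), x), x);
    }

    std::size_t count() const { return samples_.size(); }

    // F_n(x) = (number of samples <= x) / n
    double cdf(double x) const {
        assert(!samples_.empty());
        const std::size_t below_or_equal = static_cast<std::size_t>(
            std::upper_bound(samples_.begin(), samples_.end(), x) -
            samples_.begin());
        return static_cast<double>(below_or_equal) /
               static_cast<double>(samples_.size());
    }

    // One-sample Kolmogorov-Smirnov statistic sup_x |F_n(x) - F(x)|,
    // where `reference` is a callable returning the reference CDF F(x).
    template <class Cdf>
    double kolmogorov_smirnov(Cdf reference) const {
        assert(!samples_.empty());
        const double n = static_cast<double>(samples_.size());
        double d = 0.0;
        for (std::size_t i = 0; i < samples_.size(); ++i) {
            const double f = reference(samples_[i]);
            // The supremum is attained just before or at a jump of F_n.
            const double above = static_cast<double>(i + 1) / n - f;
            const double below = f - static_cast<double>(i) / n;
            d = std::max(d, std::max(std::fabs(above), std::fabs(below)));
        }
        return d;
    }

private:
    std::vector<double> samples_;  // invariant: sorted ascending
};
```

For example, after feeding the samples 3, 1, 4, 2, cdf(2.5) returns 0.5, and the Kolmogorov-Smirnov distance to the uniform CDF x/4 on [0, 4] is 0.25. Inserting into a sorted vector costs O(n) per sample; for long streams, a tree-based container or sorting lazily at query time would be a reasonable alternative.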

On 6/15/11 6:19 AM, er wrote:
On 6/15/11 4:30 AM, John Maddock wrote:
I guess you mean the ECDF under assumption of independent trials? I developed one as a Boost.Accumulator here:
https://svn.boost.org/svn/boost/sandbox/statistics/non_parametric/boost/stat...
Although I haven't touched these directories in a while, I'd be happy to do some maintenance and add proper docs and a test suite.
Nod, this looks more like an accumulator than a distribution - in the sense that you can only calculate properties from a set of values?
John.
It is the ECDF assuming independent sampling (I think identical distribution need not even be assumed), plus derived quantities (such as the Kolmogorov-Smirnov statistic):
F(x) = (number of samples <= x) / n, where n is the number of samples seen so far
The fact that it's an accumulator is just a convenience: the distribution is updated each time a sample is passed to the accumulator.
I'm doing maintenance work right now, whether or not this matches Denis's needs. It shouldn't take too much time (days).
Perhaps you are thinking about offering an interface identical to that of Boost's statistical distributions (not an accumulator), as [math] would suggest. Just a thought: it shouldn't be too hard to specify a wrapper around an accumulator with this interface, but perhaps something like that already exists (I haven't checked in a while)...
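
As a rough sketch of what such a wrapper might look like: Boost.Math distributions are queried through non-member functions such as cdf(dist, x) and quantile(dist, p), and the code below mimics that calling convention for a fixed set of samples. Everything in namespace sketch (empirical_distribution, sorted_samples, and the three overloads) is hypothetical and not part of Boost. It uses only the standard library and assumes the samples have already been collected, for instance by an accumulator.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

namespace sketch {  // hypothetical namespace, for illustration only

// Holds a sorted copy of the observed samples.
class empirical_distribution {
public:
    template <class Iterator>
    empirical_distribution(Iterator first, Iterator last)
        : sorted_(first, last) {
        assert(!sorted_.empty());
        std::sort(sorted_.begin(), sorted_.end());
    }

    const std::vector<double>& sorted_samples() const { return sorted_; }

private:
    std::vector<double> sorted_;
};

// F_n(x) = (number of samples <= x) / n
inline double cdf(const empirical_distribution& dist, double x) {
    const std::vector<double>& s = dist.sorted_samples();
    const std::size_t k = static_cast<std::size_t>(
        std::upper_bound(s.begin(), s.end(), x) - s.begin());
    return static_cast<double>(k) / static_cast<double>(s.size());
}

// Generalized inverse: the smallest sample x with F_n(x) >= p, for 0 < p <= 1.
// Note: p * n is evaluated in floating point, so values of p that land
// very close to a multiple of 1/n may round to the neighbouring sample.
inline double quantile(const empirical_distribution& dist, double p) {
    assert(p > 0.0 && p <= 1.0);
    const std::vector<double>& s = dist.sorted_samples();
    const std::size_t k = static_cast<std::size_t>(
        std::ceil(p * static_cast<double>(s.size())));
    return s[std::max<std::size_t>(k, 1) - 1];
}

// Mean of the empirical distribution, i.e. the sample mean.
inline double mean(const empirical_distribution& dist) {
    const std::vector<double>& s = dist.sorted_samples();
    return std::accumulate(s.begin(), s.end(), 0.0) /
           static_cast<double>(s.size());
}

}  // namespace sketch
```

With the samples {3, 1, 4, 2}, sketch::cdf(d, 2.5) gives 0.5 and sketch::quantile(d, 0.5) gives 2. Keeping the queries as free functions means generic code written against the cdf(dist, x) convention could accept this type unchanged, which is the main argument for a wrapper over exposing the accumulator directly.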

Am 14.06.2011 19:57, schrieb Denis Arnaud:
Hello,
would there be an interest in adding the Empirical distribution function ( http://en.wikipedia.org/wiki/Empirical_distribution_function) to Boost.Math?
You can implement the ECDF function on top of Boost.Accumulators by Eric Niebler. This should work.
Regards, Kim
The good statistical software projects implement it. For instance, in R, we have the ECDF ( http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ecdf.html), and Python has something like that as well ( http://statsmodels.sourceforge.net/generated/scikits.statsmodels.tools.ECDF.... ).
Best
Denis
participants (5)
- Denis Arnaud
- er
- John Maddock
- Kim Kuen Tang
- sguazt