This Maple worksheet accompanies the paper:
Di Nardo E. (2010) A new approach to Sheppard's corrections. In press, Mathematical Methods in Statistics. (http://arxiv.org/abs/1004.4989)
A new approach to Sheppard's corrections
E. Di Nardo*
elvira.dinardo@unibas.it
http://www.unibas.it/utenti/dinardo/home.html
Tel: +39 0971205890, Fax: +39 0971205896
G. Guarino**
giuseppe.guarino@rete.basilicata.it
* Dipartimento di Matematica e Informatica, Università degli Studi della Basilicata,
Viale dell'Ateneo Lucano n.10, 85100 Potenza, Italy
** Medical School, Università del Sacro Cuore (Rome branch),
Largo Agostino Gemelli n.8, 00168 Roma, Italy
Introduction
Abstract: in the real world, continuous variables are observed and recorded in finite precision through a rounding or coarsening operation, i.e. a grouping rule. A compromise between the desire to know and the cost of knowing is then a necessary consequence.
Attention has been paid in the literature to the computation of moments when data are grouped into classes. The moments computed from the resulting grouped frequency distribution are regarded as a first approximation to the moments of the parent distribution, but they suffer from the error committed in grouping. Sheppard's corrections provide a good correction procedure and are still in use today. They are usually stated for continuous parent distributions. But grouping also includes censoring or splitting data into categories during collection or publication, so it does not involve continuous variables only.
A very simple closed-form formula for Sheppard's corrections has been recovered by the classical umbral calculus (see [5]), as well as a more general closed-form formula for discrete parent distributions (see [2]). No attention has been paid in the literature to multivariate generalizations of Sheppard's corrections, probably because of the complexity of the resulting formulae (see [1]). Via the umbral calculus, the generalization to the multivariate case turns out to be straightforward.
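For orientation, the classical univariate corrections for a continuous parent distribution and class width $h$ can be written as below. This is the standard textbook form, not a formula quoted from [5]; the symbols $\mu_n$ (raw moment), $\tilde\mu_n$ (grouped moment) and $B_k$ (Bernoulli numbers) are notation chosen here for illustration.

```latex
\mu_n = \sum_{k=0}^{n} \binom{n}{k} \left(2^{1-k} - 1\right) B_k \, h^k \, \tilde\mu_{n-k},
\qquad
\mu_2 = \tilde\mu_2 - \frac{h^2}{12}, \quad
\mu_3 = \tilde\mu_3 - \frac{h^2}{4}\,\tilde\mu_1, \quad
\mu_4 = \tilde\mu_4 - \frac{h^2}{2}\,\tilde\mu_2 + \frac{7 h^4}{240}.
```

The factor $2^{1-k} - 1$ vanishes for $k = 1$, so only even powers of $h$ appear.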
All these new formulae are particularly suited to implementation in Maple. The theoretical background of these formulae can be found in Di Nardo E. (2010) (see [3]).
Application Areas/Subject: combinatorics & algebraic methods in statistics.
Keywords: raw moment, grouped moment, Sheppard's correction, umbral calculus.
See Also: background on umbral calculus in [4]
Initialization
raw2grp
Suppose $X = (X_1, X_2, \ldots)$ is a multivariate random vector.
The raw multivariate moment of $X$ of order $(i_1, i_2, \ldots)$ is denoted by $r_{i_1, i_2, \ldots}$.
The moments calculated from the grouped frequencies are denoted by $g_{i_1, i_2, \ldots}$.
Assume $h_1, h_2, \ldots$ are the nonzero window widths for each component and $m_1, m_2, \ldots$ the numbers of consecutive values grouped in a frequency class of width $h_1, h_2, \ldots$.
The procedure raw2grp gives raw moments $r_{i_1, i_2, \ldots}$ in terms of grouped moments $g_{i_1, i_2, \ldots}$, by using formula (31) of the paper [3].
In particular, set the variable t = 0 when Sheppard's corrections are required for a continuous parent distribution.
Note that sequence f1 in the procedure refers to formula (14) and sequence f2 refers to formula (17).
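The Maple source of raw2grp is not reproduced on this page. As a rough illustration of the kind of computation involved, the following Python sketch builds the coefficients of a raw moment in terms of grouped moments. It assumes (this is not taken from [3]) that grouping acts like adding an independent error to each component: continuous uniform on (-h/2, h/2), or discrete uniform on m equally spaced points spanning a class of width h. The names `error_moment`, `inverse_sequence` and `raw_from_grouped` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def error_moment(k, h, m=None):
    """k-th moment of the assumed grouping error U for a class of width h.

    m is None: U is continuous uniform on (-h/2, h/2).
    m an int:  U is discrete uniform on m equally spaced points
               (step h/m) centred at 0.  Both are assumptions of this sketch.
    """
    h = Fraction(h)
    if m is None:
        return Fraction(0) if k % 2 else (h / 2) ** k / (k + 1)
    step = h / m
    points = [step * Fraction(2 * j - m + 1, 2) for j in range(m)]
    return sum(p ** k for p in points) / m


def inverse_sequence(n, h, m=None):
    """Numbers a_0..a_n with sum_k C(j,k) a_k u_{j-k} = 1 if j == 0 else 0."""
    u = [error_moment(k, h, m) for k in range(n + 1)]
    a = [Fraction(1)]
    for j in range(1, n + 1):
        a.append(-sum(comb(j, k) * a[k] * u[j - k] for k in range(j)))
    return a


def raw_from_grouped(order, widths, counts=None):
    """Map {index: coefficient} with r[order] = sum coefficient * g[index]."""
    counts = counts or [None] * len(order)
    inv = [inverse_sequence(n, h, m) for n, h, m in zip(order, widths, counts)]
    coeffs = {}
    for ks in product(*(range(n + 1) for n in order)):
        c = Fraction(1)
        for n, k, a in zip(order, ks, inv):
            c *= comb(n, k) * a[k]
        if c:  # drop terms whose coefficient vanishes
            coeffs[tuple(n - k for n, k in zip(order, ks))] = c
    return coeffs


# Order (2, 2) with unit widths, continuous case.
print(raw_from_grouped((2, 2), (1, 1)))
```

With unit widths the coefficients of g[2,2], g[2,0], g[0,2] and g[0,0] = 1 come out as 1, -1/12, -1/12 and 1/144, which is the pattern of output (3.1.1). Exact rational arithmetic is used so that results can be compared term by term with symbolic output.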
Examples
continuous parent distributions
$$g_{2,2} - \tfrac{1}{12} h_2^2\, g_{2,0} - \tfrac{1}{12} h_1^2\, g_{0,2} + \tfrac{1}{144} h_1^2 h_2^2 \tag{3.1.1}$$
discrete parent distributions
$$\frac{h_1^2\, g_{0,2}}{12\, m_1^2} + g_{2,2} - \frac{h_1^2 h_2^2}{144\, m_1^2} - \cdots \tag{3.1.2}$$
(remaining terms truncated in the source)
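The $1/m_i^2$ terms in the discrete output above are consistent with treating the grouping error on each axis as a discrete uniform variable $U_i$ on $m_i$ equally spaced points spanning a class of width $h_i$. This model is an assumption made here for illustration, not a statement quoted from [2] or [3]. Under it, the second moment of the error is:

```latex
E\!\left[U_i^2\right]
  = \left(\frac{h_i}{m_i}\right)^{2} \frac{m_i^2 - 1}{12}
  = \frac{h_i^2}{12}\left(1 - \frac{1}{m_i^2}\right)
  \;\longrightarrow\; \frac{h_i^2}{12}
  \quad (m_i \to \infty).
```

In the limit the coefficient reduces to the $h_i^2/12$ that appears in the continuous case.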
grp2raw
Suppose $X = (X_1, X_2, \ldots)$ is a multivariate random vector.
The raw multivariate moment of $X$ of order $(i_1, i_2, \ldots)$ is denoted by $r_{i_1, i_2, \ldots}$.
The moments calculated from the grouped frequencies are denoted by $g_{i_1, i_2, \ldots}$.
Assume $h_1, h_2, \ldots$ are the nonzero window widths for each component and $m_1, m_2, \ldots$ the numbers of consecutive values grouped in a frequency class of width $h_1, h_2, \ldots$.
The procedure grp2raw gives grouped moments $g_{i_1, i_2, \ldots}$ in terms of raw moments $r_{i_1, i_2, \ldots}$, by using formula (32) of the paper [3].
In particular, set the variable t = 0 when Sheppard's corrections are required for a continuous parent distribution.
Note that sequence f1 in the procedure refers to formula (14) and sequence f2 refers to formula (17).
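The direction computed by grp2raw can be sketched in the same spirit. The Python fragment below is not the worksheet's Maple code; it assumes that a grouped moment is the moment of X plus an independent uniform grouping error on each component, and expands the product binomially. The function names `error_moment` and `grouped_from_raw` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def error_moment(k, h, m=None):
    """k-th moment of the assumed grouping error for a class of width h.

    m is None: continuous uniform on (-h/2, h/2).
    m an int:  discrete uniform on m equally spaced points, step h/m,
               centred at 0 (an assumption of this sketch).
    """
    h = Fraction(h)
    if m is None:
        return Fraction(0) if k % 2 else (h / 2) ** k / (k + 1)
    step = h / m
    points = [step * Fraction(2 * j - m + 1, 2) for j in range(m)]
    return sum(p ** k for p in points) / m


def grouped_from_raw(order, widths, counts=None):
    """Map {index: coefficient} with g[order] = sum coefficient * r[index]."""
    counts = counts or [None] * len(order)
    coeffs = {}
    for ks in product(*(range(n + 1) for n in order)):
        c = Fraction(1)
        for n, k, h, m in zip(order, ks, widths, counts):
            c *= comb(n, k) * error_moment(k, h, m)
        if c:  # odd error moments vanish, so those terms are dropped
            coeffs[tuple(n - k for n, k in zip(order, ks))] = c
    return coeffs


# Order (2, 2) with unit widths, continuous case.
print(grouped_from_raw((2, 2), (1, 1)))
```

For unit widths the coefficients of r[2,2], r[2,0], r[0,2] and r[0,0] = 1 are 1, 1/12, 1/12 and 1/144, the same pattern as output (4.1.1). Passing `counts=(m1, m2)` switches to the discrete error model.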
Examples
continuous parent distributions
$$r_{2,2} + \tfrac{1}{12} h_2^2\, r_{2,0} + \tfrac{1}{12} h_1^2\, r_{0,2} + \tfrac{1}{144} h_1^2 h_2^2 \tag{4.1.1}$$
discrete parent distributions
$$-\frac{h_1^2 h_2^2}{144\, m_1^2} - \frac{h_1^2\, r_{0,2}}{12\, m_1^2} - \cdots \tag{4.1.2}$$
(remaining terms truncated in the source)
Tests
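The procedure raw2grp gives raw moments $r_{i_1, i_2, \ldots}$ in terms of grouped moments $g_{i_1, i_2, \ldots}$. If each grouped moment in the output is then replaced by the corresponding output of grp2raw, you obtain the raw moment again.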
continuous parent distributions
$$g_{2,2} - \tfrac{1}{12} h_2^2\, g_{2,0} - \tfrac{1}{12} h_1^2\, g_{0,2} + \tfrac{1}{144} h_1^2 h_2^2 \tag{5.1}$$
$$r_{2,2} \tag{5.2}$$
discrete parent distributions
$$g_{2,2} + \frac{h_2^2\, g_{2,0}}{12\, m_2^2} - \frac{h_1^2 h_2^2}{144\, m_1^2} - \cdots \tag{5.3}$$
(remaining terms truncated in the source)
$$r_{2,2} \tag{5.4}$$
The procedure grp2raw gives grouped moments $g_{i_1, i_2, \ldots}$ in terms of raw moments $r_{i_1, i_2, \ldots}$. If each raw moment in the output is then replaced by the corresponding output of raw2grp, you obtain the grouped moment again.
continuous parent distributions
$$r_{2,2} + \tfrac{1}{12} h_2^2\, r_{2,0} + \tfrac{1}{12} h_1^2\, r_{0,2} + \tfrac{1}{144} h_1^2 h_2^2 \tag{5.5}$$
$$g_{2,2} \tag{5.6}$$
discrete parent distributions
$$-\frac{h_1^2 h_2^2}{144\, m_1^2} + \tfrac{1}{144} h_1^2 h_2^2 + \tfrac{1}{12} h_2^2\, r_{2,0} + \cdots \tag{5.7}$$
(remaining terms truncated in the source)
$$g_{2,2} \tag{5.8}$$
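The round-trip checks of this section can be mimicked outside Maple. The Python sketch below is not the worksheet's code: it assumes the independent uniform error model (continuous, or discrete on m equally spaced points), builds the raw-in-terms-of-grouped expansion, substitutes the grouped-in-terms-of-raw expansion into it, and collects terms. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def error_moments(n, h, m=None):
    """Moments u_0..u_n of the assumed uniform grouping error of width h."""
    h = Fraction(h)
    if m is None:  # continuous uniform on (-h/2, h/2)
        return [Fraction(0) if k % 2 else (h / 2) ** k / (k + 1)
                for k in range(n + 1)]
    # discrete uniform on m equally spaced points, step h/m, centred at 0
    pts = [h / m * Fraction(2 * j - m + 1, 2) for j in range(m)]
    return [sum(p ** k for p in pts) / m for k in range(n + 1)]


def inverse(u):
    """Sequence a with sum_k C(j,k) a_k u_{j-k} = 1 if j == 0 else 0."""
    a = [Fraction(1)]
    for j in range(1, len(u)):
        a.append(-sum(comb(j, k) * a[k] * u[j - k] for k in range(j)))
    return a


def expand(order, seqs):
    """Coefficients of sum_k prod_i C(n_i,k_i) seqs[i][k_i] * x[order - k]."""
    out = {}
    for ks in product(*(range(n + 1) for n in order)):
        c = Fraction(1)
        for n, k, s in zip(order, ks, seqs):
            c *= comb(n, k) * s[k]
        out[tuple(n - k for n, k in zip(order, ks))] = c
    return out


def round_trip(order, widths, counts=None):
    """Express r[order] via grouped moments, then each of those via raw ones."""
    counts = counts or [None] * len(order)
    fwd = [error_moments(n, h, m) for n, h, m in zip(order, widths, counts)]
    inv = [inverse(u) for u in fwd]
    total = {}
    for idx, c in expand(order, inv).items():       # r in terms of g
        for idx2, c2 in expand(idx, fwd).items():   # each g in terms of r
            total[idx2] = total.get(idx2, Fraction(0)) + c * c2
    return {i: c for i, c in total.items() if c}    # keep nonzero terms only


print(round_trip((2, 2), (1, 1)))
```

Every correction term cancels and only the moment of the requested order survives with coefficient 1, mirroring outputs (5.2) and (5.4); the same cancellation holds in the opposite direction because the two coefficient sequences are inverse to each other under binomial convolution.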
Conclusions
We have shown how the corrections of moments resulting from grouping into classes may be summarized in a few closed-form formulae.
Once more, this algorithm shows how the classical umbral calculus should be taken into account for managing sequences of numbers related to random variables, since many calculations are simplified. For example, the reader interested in recovering corrections for cumulants and factorial moments by using the classical umbral calculus can refer to [4].
References
[1] Baten, W.D. (1931) Correction for the Moments of a Frequency Distribution in Two Variables. Ann. Math. Stat. 2, No. 3, 309-319.
[2] Craig, C.C. (1936) Sheppard's corrections for a discrete variable. Ann. Math. Stat. 7, No. 2, 55-61.
[3] Di Nardo, E. (2010) A new approach to Sheppard's corrections. Math. Meth. Stat., in press. (http://arxiv.org/abs/1004.4989)
[4] Di Nardo, E., Senato, D. (2006) An umbral setting for cumulants and factorial moments. European J. Combin. 27, No. 3, 394-413. (http://www.arxiv.org/abs/math/0412052)
[5] Di Nardo, E., Guarino, G., Senato, D. (2008) A unifying framework for k-statistics, polykays and their multivariate generalizations. Bernoulli 14, No. 2, 440-468. (http://www.unibas.it/utenti/dinardo/BEJ6163290408.pdf)
Legal Notice: The copyright for this application is owned by the authors. Neither Maplesoft nor the authors are responsible for any errors contained within and are not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact the authors for permission if you wish to use this application in for-profit activities.