in Linear Regression
As we have discussed before, the MLE approach is only valid under the
assumption that the error component is distributed according to a
prespecified probability distribution function. Most of the time the
normal distribution is implicitly assumed.
In order to check the error component for normality, some diagnostics
can be valuable aids: histograms, and measures of kurtosis and
skewness.
A histogram is a graph consisting of a set of rectangles. Each rectangle
has its base on the x axis, and the length of each base is equal to the
class interval size. The area of each rectangle is proportional to the
class frequency. With some practice it should not be hard to see at
one glance whether the histogram "fits" the normal curve.
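
The construction described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function name `histogram`, the choice of 10 classes, and the artificial residuals (evenly spaced normal quantiles standing in for real regression residuals) are hypothetical and do not come from the text.

```python
from statistics import NormalDist, mean, pstdev

def histogram(xs, n_classes):
    """Return (left, right, frequency, height) for equal class intervals."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for x in xs:
        # The largest value falls on the upper edge; put it in the last class.
        k = min(int((x - lo) / width), n_classes - 1)
        counts[k] += 1
    n = len(xs)
    # height = frequency / (n * width), so each area equals frequency / n
    return [(lo + k * width, lo + (k + 1) * width, c, c / (n * width))
            for k, c in enumerate(counts)]

# Hypothetical residuals: deterministic normal quantiles, not real data.
residuals = [NormalDist(0, 1).inv_cdf((i + 0.5) / 500) for i in range(500)]
fitted = NormalDist(mean(residuals), pstdev(residuals))

bars = histogram(residuals, 10)
for left, right, freq, height in bars:
    mid = (left + right) / 2
    # Compare the bar height with the fitted normal density at the midpoint.
    print(f"[{left:6.2f}, {right:6.2f})  freq={freq:3d}  "
          f"height={height:.3f}  normal={fitted.pdf(mid):.3f}")
```

Scaling the heights by `n * width` makes the rectangle areas sum to one, so the bars can be compared directly with a normal density curve.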
The kurtosis of a distribution, which is said to be a measure of
peakedness, can for instance be measured by

$$\frac{m_4}{m_2^2}$$

which is the fourth moment about the mean ($m_4$) divided by the square of the
second moment about the mean ($m_2$). It can be proved that a normally
distributed variable has a kurtosis of 3 (according to eq.
(III.I.4-1)). Therefore a distribution can be called
"leptokurtic" if the measure is larger than 3 and
"platykurtic" if it is smaller than 3.
This coefficient of kurtosis should only be used with care! The
statistics literature gives many examples where this measure of
kurtosis fails (for small samples). Also, it is possible for a variable
to have a kurtosis of 3 without being normally distributed.
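
The kurtosis measure discussed above (fourth moment about the mean divided by the squared second moment) can be sketched in Python as follows. The helper names and the three artificial samples (evenly spaced quantiles of a normal, a uniform, and a Laplace distribution) are illustrative assumptions, not part of the original text.

```python
import math
from statistics import NormalDist

def central_moment(xs, k):
    """k-th moment about the mean."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** k for x in xs) / len(xs)

def kurtosis(xs):
    """Fourth central moment divided by the squared second central moment."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

n = 10_000
probs = [(i + 0.5) / n for i in range(n)]

# Artificial samples built from quantiles (assumed stand-ins for data).
normal_like = [NormalDist().inv_cdf(p) for p in probs]
uniform_like = probs
laplace_like = [math.log(2 * p) if p < 0.5 else -math.log(2 * (1 - p))
                for p in probs]

for name, xs in [("normal", normal_like),
                 ("uniform", uniform_like),
                 ("Laplace", laplace_like)]:
    k = kurtosis(xs)
    label = "leptokurtic" if k > 3 else "platykurtic"
    print(f"{name:8s} kurtosis = {k:.3f}  ({label})")
```

Because the normal-like sample is a finite set of quantiles, its computed value falls slightly short of 3, which is one small illustration of why the coefficient should be read with care.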
The skewness of a distribution, which is a measure of asymmetry, can
for example be measured by
The second coefficient of skewness is defined by
The skewness of a distribution can also be defined by means of moments as in
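
As a hedged sketch of these three measures: the code below assumes that the first and second coefficients are Pearson's coefficients of skewness, (mean - mode) / standard deviation and 3 (mean - median) / standard deviation, and that the moment-based measure is the third moment about the mean divided by the second moment raised to the power 3/2. These definitions are standard, but they are an assumption here, as are the function names and the two small data sets.

```python
from statistics import mean, median, mode, pstdev

def pearson_first(xs):
    """Assumed definition: (mean - mode) / standard deviation."""
    return (mean(xs) - mode(xs)) / pstdev(xs)

def pearson_second(xs):
    """Assumed definition: 3 * (mean - median) / standard deviation."""
    return 3 * (mean(xs) - median(xs)) / pstdev(xs)

def moment_skewness(xs):
    """Assumed definition: m3 / m2**1.5, with moments about the mean."""
    mu = mean(xs)
    m2 = sum((x - mu) ** 2 for x in xs) / len(xs)
    m3 = sum((x - mu) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

# Hypothetical data: one sample with a long right tail, one symmetric sample.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 7, 9]
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, xs in [("right-skewed", right_skewed), ("symmetric", symmetric)]:
    print(f"{name:13s} first={pearson_first(xs):.3f}  "
          f"second={pearson_second(xs):.3f}  "
          f"moments={moment_skewness(xs):.3f}")
```

All three measures are positive for the right-skewed sample and zero for the symmetric one, although their magnitudes are not directly comparable with each other.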
The distribution of a stochastic variable can also be tested with
respect to any given theoretical distribution. The main idea behind
these tests is that the differences between the histogram frequencies of
the variable and the theoretical frequencies are computed. It is then
tested whether these differences are "large enough" to reject the null
hypothesis that the variable is distributed according to the
prespecified theoretical distribution. Most of these tests are based on
the Chi-square distribution. An excellent review (with illustrations) of
these tests (including graphical methods such as "root-display") can be
found in Mills 1990.
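
One common instance of this idea is Pearson's Chi-square goodness-of-fit statistic, sketched below for the normal distribution. Several choices are assumptions made for illustration only: the function name, the use of 8 classes with equal probability under the fitted normal, the conventional degrees of freedom (classes minus 1 minus the 2 estimated parameters), and the artificial samples. The 5% critical value of 11.070 for 5 degrees of freedom is taken from standard Chi-square tables.

```python
import math
from statistics import NormalDist, mean, pstdev

def chi_square_normality(xs, n_classes=8):
    """Return (statistic, degrees of freedom) for a fitted normal."""
    fitted = NormalDist(mean(xs), pstdev(xs))
    # Interior class boundaries: equal probability under the fitted normal.
    edges = [fitted.inv_cdf(k / n_classes) for k in range(1, n_classes)]
    observed = [0] * n_classes
    for x in xs:
        k = sum(1 for e in edges if x >= e)  # index of the class holding x
        observed[k] += 1
    expected = len(xs) / n_classes           # same theoretical count per class
    stat = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_classes - 1 - 2                  # mean and std. dev. were estimated
    return stat, dof

CRITICAL_5PCT_5DF = 11.070  # tabulated Chi-square value, 5 d.f., 5% level

n = 1000
probs = [(i + 0.5) / n for i in range(n)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]  # artificial sample
skewed_like = [-math.log(1 - p) for p in probs]         # exponential quantiles

stat_normal, dof = chi_square_normality(normal_like)
stat_skewed, _ = chi_square_normality(skewed_like)
print(f"normal-like: stat={stat_normal:.2f}, d.f.={dof}")
print(f"skewed:      stat={stat_skewed:.2f}, d.f.={dof}")
print("reject normality for skewed sample:", stat_skewed > CRITICAL_5PCT_5DF)
```

Equal-probability classes keep every theoretical frequency the same, which avoids sparse classes in the tails; other class layouts are equally possible.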
The suspended rootogram is based on the same principle, except that the
deviations between the normal and actual frequencies are computed using