A Comparison of the Naïve Histogram and the Cool Non-Parametric Density Estimate:

  Here is a video of a histogram generated from a single dataset (the first day of returns of the 100 stocks, i.e. 100 data points).  The 'binwidth' is allowed to vary, i.e. the width of the bars is allowed to increase, so that each bar includes more observations.  Which binwidth is ideal?  The origins are also moving, that is to say, the center point of each bar is shifting.  Which center point is ideal?  Finally, with what confidence can we say that the estimated probability is the 'true' probability of the 'true' density?  These are all questions that non-parametric density estimation with smooth kernels addresses.
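The effect of the two histogram choices can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the video: the `returns` array is a synthetic stand-in for the 100 first-day returns, and `histogram_density`, `binwidth` and `origin` are names chosen here for illustration. The origin is taken as a bar edge, so shifting it moves every bar center by the same amount.

```python
import numpy as np

# Synthetic stand-in for the 100 first-day returns (the real data are not
# reproduced here); the two-component mixture is purely an assumption.
rng = np.random.default_rng(0)
returns = np.concatenate([rng.normal(-0.02, 0.01, 50),
                          rng.normal(0.02, 0.01, 50)])


def histogram_density(data, binwidth, origin=0.0):
    """Histogram density estimate with bar edges at origin + k * binwidth."""
    data = np.asarray(data, dtype=float)
    # Index of the first and one-past-the-last bar needed to cover the data.
    k_lo = np.floor((data.min() - origin) / binwidth)
    k_hi = np.floor((data.max() - origin) / binwidth) + 1
    edges = origin + binwidth * np.arange(k_lo, k_hi + 1)
    counts, _ = np.histogram(data, bins=edges)
    # Dividing by n * binwidth makes the bar areas sum to one.
    density = counts / (data.size * binwidth)
    return edges, density


# Same data, different binwidths and origins: the picture changes each time.
for binwidth in (0.005, 0.01, 0.02):
    for origin in (0.0, binwidth / 2):
        edges, density = histogram_density(returns, binwidth, origin)
        print(f"binwidth={binwidth:.3f} origin={origin:.4f} "
              f"bars={density.size} tallest={density.max():.1f}")
```

Neither loop changes the data; only the two arbitrary choices differ, which is exactly the ambiguity the questions above point at.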

Left: Histogram with binwidth and origins changing (Histogram with changing h.avi, 317K bytes).
Right: Non-parametric density estimate with changing 'h' (Effect of Varying h.avi, 405K bytes).


    On the right is a clip of the same data set, this time with the density estimated with a Normal kernel.  The width, or 'h', of the normal curve that is used to weight the adjacent observations is varied too.  In fact 'h' and the binwidth play the same role; the difference is that the kernel for the histogram is a uniform density.  One can see that the bimodal nature of the distribution is more clearly and consistently illustrated with the non-parametric estimator.
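A Normal-kernel estimate of this kind can be sketched as follows. Again this is only an illustration under stated assumptions: the `returns` array is synthetic, and `normal_kernel_density` and `count_modes` are hypothetical helper names, not the routines used to produce the clip. Each observation contributes a normal curve of width `h`, and the estimate is the average of those curves.

```python
import numpy as np

# Synthetic stand-in for the 100 first-day returns (an assumption, as the
# original data are not available here).
rng = np.random.default_rng(0)
returns = np.concatenate([rng.normal(-0.02, 0.01, 50),
                          rng.normal(0.02, 0.01, 50)])


def normal_kernel_density(data, grid, h):
    """Kernel density estimate on `grid` using a standard Normal kernel."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every observation.
    u = (grid[:, None] - data[None, :]) / h
    weights = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return weights.sum(axis=1) / (data.size * h)


def count_modes(density):
    """Count strict interior local maxima of a sampled curve."""
    mid = density[1:-1]
    return int(np.sum((mid > density[:-2]) & (mid > density[2:])))


grid = np.linspace(-0.1, 0.1, 2001)
# Small h gives a spiky estimate, large h smooths the structure away.
for h in (0.002, 0.005, 0.01, 0.03):
    density = normal_kernel_density(returns, grid, h)
    print(f"h={h:.3f} modes={count_modes(density)}")
```

Unlike the histogram, there is no origin to choose here: the only tuning parameter left is `h`.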
