Friday, April 4, 2008

[MT 1] How to combine x- and y-step widths for the calculation of the kurtosis


  • We have a list of two-dimensional points (x_i, y_i), describing a particle trajectory.
  • This defines a list of steps (dx_i, dy_i) = (x_i - x_{i-1}, y_i - y_{i-1}).
  • The steps have a certain 2D statistical distribution P(dx_i, dy_i).
  • We are interested in "The kurtosis of the step width distribution".

Method 1: Merging dx- and dy-lists

Some people proceed as follows:
  • Show that the dx_i and dy_j are statistically independent.
  • Merge the dx_i and dy_i into a single list (ds_i) = (dx_1, dx_2,..., dx_N, dy_1, dy_2,..., dy_N).
  • Compute the kurtosis of the combined list (ds_i).
A simple example demonstrates that this procedure is incorrect in general:
  • Assume that P(dx) = Gaussian[mean=0,var=sigmaX].
  • Assume that P(dy) = Gaussian[mean=0,var=sigmaY].
  • Assume that sigmaX >> sigmaY, so that the 2D distribution is elliptically stretched along the x-axis.
  • The distribution P(ds) of the combined list entries (ds_i) looks as follows: A narrow Gaussian peak riding on top of a broad one.
  • This pronounced central peak with broad wings results in a positive kurtosis.
  • But in reality the (excess) kurtosis of uncorrelated Gaussian random numbers is zero, so the positive value is purely an artifact of merging two lists of different widths.
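
The counter-example above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function names, the variances (100 and 1), the sample size and the seed are all assumptions chosen for illustration. It uses the excess-kurtosis convention (fourth central moment divided by the squared variance, minus 3), which is what the statement "the kurtosis of Gaussian random numbers is zero" implies. For an equal-weight mix of two zero-mean Gaussians with variances a and b, the excess kurtosis works out to 3(a-b)^2/(a+b)^2, which approaches 3 for a >> b.

```python
import numpy as np

def excess_kurtosis(samples):
    """Fourth central moment / variance^2 - 3 (zero for a Gaussian)."""
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean()
    variance = np.mean(centered**2)
    return np.mean(centered**4) / variance**2 - 3.0

def mixture_excess_kurtosis(var_x, var_y):
    """Analytic excess kurtosis of an equal-weight mix of two zero-mean Gaussians."""
    return 3.0 * (var_x - var_y) ** 2 / (var_x + var_y) ** 2

def merged_kurtosis_demo(var_x=100.0, var_y=1.0, n_steps=200_000, seed=0):
    """Return the excess kurtosis of dx, of dy, and of the merged list ds."""
    rng = np.random.default_rng(seed)
    # Independent Gaussian steps; var_x and var_y are illustrative values only.
    dx = rng.normal(0.0, np.sqrt(var_x), n_steps)
    dy = rng.normal(0.0, np.sqrt(var_y), n_steps)
    # Method 1: merge both step lists into a single list ds.
    ds = np.concatenate([dx, dy])
    return excess_kurtosis(dx), excess_kurtosis(dy), excess_kurtosis(ds)

if __name__ == "__main__":
    k_dx, k_dy, k_ds = merged_kurtosis_demo()
    print("dx:", k_dx, "dy:", k_dy, "merged:", k_ds)
    print("analytic value for the merged list:", mixture_excess_kurtosis(100.0, 1.0))
```

Each list on its own should give a value close to zero, while the merged list gives a clearly positive value, although no individual step is non-Gaussian.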


Method 2: Separate kurtosis

The easiest solution is to give up the idea of a combined kurtosis and to determine two separate kurtosis values, one for the dx-data and one for the dy-data. Finally, the average of the two may be taken, or not.
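
A minimal sketch of this approach, assuming the trajectory is available as two NumPy arrays x and y of equal length. The function names are hypothetical, and the excess-kurtosis convention (zero for a Gaussian) is an assumption consistent with the Gaussian example in Method 1.

```python
import numpy as np

def excess_kurtosis(samples):
    """Fourth central moment / variance^2 - 3 (zero for a Gaussian)."""
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean()
    variance = np.mean(centered**2)
    return np.mean(centered**4) / variance**2 - 3.0

def separate_step_kurtosis(x, y):
    """Return (kurtosis of dx, kurtosis of dy, average of the two)."""
    # Steps between consecutive trajectory points: dx_i = x_i - x_{i-1}.
    dx = np.diff(np.asarray(x, dtype=float))
    dy = np.diff(np.asarray(y, dtype=float))
    k_dx = excess_kurtosis(dx)
    k_dy = excess_kurtosis(dy)
    # The average is optional; the two values can also be reported separately.
    return k_dx, k_dy, 0.5 * (k_dx + k_dy)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic anisotropic random walk (illustrative standard deviations 10 and 1).
    x = np.cumsum(rng.normal(0.0, 10.0, 100_000))
    y = np.cumsum(rng.normal(0.0, 1.0, 100_000))
    print(separate_step_kurtosis(x, y))
```

Because each coordinate is normalized by its own variance, the different widths in x and y no longer produce a spurious positive value.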


Method 3: Using Euclidean distance

The width of step i is defined as

\Delta r_i=\sqrt{\Delta x_i^2+\Delta y_i^2}

It appears natural to use the distribution P(dr) for computing "The step width kurtosis". However, in a 2D plane points at radius r around the center are geometrically over-represented by a factor of r,

d\Delta x \; d\Delta y = 2\pi \;\Delta r \; d\Delta r ,

and so a naive numerical histogram P(dr) will be distorted by this geometrical factor. The operation P(dr) --> P(dr)/dr fixes the problem in principle, but amplifies the noise at small dr, where the bins contain few counts. This is especially critical for computing the kurtosis afterwards.
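
The geometric correction can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name, the number of bins and the cut-off r_max are hypothetical choices, and the histogram is divided by the full factor 2*pi*dr rather than by dr alone, so that the result is directly comparable to a 2D density.

```python
import numpy as np

def radial_density(dx, dy, n_bins=50, r_max=5.0):
    """Histogram the step widths dr and remove the geometric factor 2*pi*dr."""
    dr = np.hypot(dx, dy)  # Euclidean step width sqrt(dx^2 + dy^2)
    counts, edges = np.histogram(dr, bins=n_bins, range=(0.0, r_max))
    centers = 0.5 * (edges[:-1] + edges[1:])
    widths = np.diff(edges)
    # Naive normalized histogram P(dr); it vanishes at dr -> 0 for geometric reasons.
    p_dr = counts / (counts.sum() * widths)
    # Corrected 2D density per unit area at radius dr.
    density_2d = p_dr / (2.0 * np.pi * centers)
    return centers, counts, p_dr, density_2d

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Isotropic Gaussian steps with unit variance (illustrative test data).
    dx = rng.normal(0.0, 1.0, 200_000)
    dy = rng.normal(0.0, 1.0, 200_000)
    centers, counts, p_dr, density_2d = radial_density(dx, dy)
    # The relative Poisson error ~ 1/sqrt(counts) is largest in the innermost bins.
    for r, c, d in zip(centers[:5], counts[:5], density_2d[:5]):
        print(f"r={r:.2f}  counts={c}  density={d:.4f}  rel. error={1 / np.sqrt(c):.3f}")
```

For isotropic Gaussian steps with unit variance, the corrected density should follow exp(-dr^2/2)/(2*pi). The innermost bins hold the fewest counts, so their relative statistical error is the largest, and the division by a small dr does not reduce it.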

