Running means and covariances

Welfords algorithmPermalink

The mean of a string of numbers is

μn=(x1+x2+xn)/n

and this gives a simple update equation for a running mean, when updating online.

μ0=0 and μn=μn1n1n+xn1n

The similar formlua for variance is by accumulating the sum of squares Sn=i=1nxixiT=Sn1+xnxnT, S0=0 and then compute

Σn=1ni=1n(xiμn)(xiμn)T=1n(i=1nxixiT)μn2=Snnμn2

The algorithm is not great, since we compute the answer as a difference between two large quantities, and may lose numeric precision.

We want to find an accumulator that is “closer” to the answer, not requireing a subtraction at the end. Examine the difference of sum of central squares, Mn=nΣn. We obtain

Σn=Mn/nM0=0Mn=Mn1+n1n(xnμn1)(xnμn1)T

The proof of this formula is quite simple. Some steps use the recursion μn=μn1+xnμn1n. Others use i=1n1xi=(n1)μn1. The rest uses straightforward algebra. the computation below is for scalar data, but the generalization to vector valued data is straightforward since I won’t use commutativity. Just replace multiplication with outer products.

MnMn1=i=1n(xiμn)2i=1n1(xiμn1)2=(xnμn)2+i=1n1[(xiμn)2(xiμn1)2]=(xnμn)2+i=1n1[(μnxi)μn+(μn1μn)xi+(xiμn1)μn1]=(xnμn)2+(n1)(μnμn1)2=(n1)2n2(xnμn1)2+n1n2(xnμn1)2=n1n(xnμn1)2=(xnμn1)2(xnμn1)2/n

The paper Comparison of Several Algorithms for Computing Sample Means and Variances by Robert F. Ling (1974) doi attributes the final equation to Welford (1962), but also posits some alternative algorithms and compare them. While writing this article, I found that others have written this or very similar blog post before me, e.g. Joni Salonen, Jakob Ludewig and John D Cook. The formulas reported in their posts may differ slightly from mine, and might be better. Do your own work if you want to be sure which one is the best for your data. You can also read about this kind of algorithms on Wikipedia.

Updated: