Sunday, February 08, 2009

Junk statistics on Wikipedia

Wiki’s Kriging bugs me just as much today as did textbooks on geostatistics in the 1990s. What bugs me most of all is that Wiki’s keepers of Kriging didn’t give the set of measured values. That brought back many bad memories of Matheron’s magnum opus. His 1954 Formule des Minerais Connexes is peppered with formulas and symbols, gives but a few derived statistics, and no sets of measured values. This paper is posted as Note géostatistique No 1 but is itself marked Note Statistique No 1. Somebody played a silly game in predating the birthday of Matheron’s new science of geostatistics. Wiki's link to Matheron's seminal work went dead on December 12, 2008. Click Centre de Geosciences, and go to Ressources Documentaires & Logiciels. Next, click Bibliothèque Géostat (en ligne) and take a long look at Matheron’s past. This link is still hot, and I’m tickled pink. I’ve got to get this new link on some of my old blogs.

Wiki's Kriging keepers made me think of Matheron's statistically challenged disciples, and of all their tangled thoughts. Why do formulas and symbols run rampant where sets of measured values are as scarce as hen’s teeth? Little odds and ends of geostat speak such as “…a system of linear equations which is obtained by assuming that ƒ is a sample path of a random process F(x)…” make me cringe. Sounds a bit like Matheron’s take on Brownian motion. Why did degrees of freedom fail to inspire the wardens of Wiki’s Kriging? Of course, it would explain why they just keep on kriging for life!

I had asked for but never got the set of measured values that underpins Figure 1. So, I derived the same measured values in scale units. Fisher’s F-test proved that the ordered set of scale units does not display a significant degree of spatial dependence. The guardians of Wiki’s Kriging pointed out, “From the geological point of view, the practice of kriging is based on assuming continued mineralization between measured values”. What a way to practice kriging! Stanford’s Journel espoused the same sort of assumed nonsense in 1992. I never took him seriously, but he may well have thought he was serious. I did what Journel didn't do in 1978. Several years before the Bre-X fraud I derived variances of density- and length-weighted average lead and silver grades of core samples. I worked with weighting factors since the set of measured values in Figure 1 is unevenly spaced.
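The post does not spell out the arithmetic behind the F-test, so the following is only a rough sketch of one common way to test an ordered set for spatial dependence: compare the variance of the set with the first variance term of the ordered set (half the mean squared difference between successive values). The function names and the five values are hypothetical, not the scale units derived from Figure 1, and the sketch ignores the weighting factors that unevenly spaced values call for.

```python
from statistics import variance


def first_variance_term(ordered):
    """Half the mean squared difference between successive ordered values."""
    squared_diffs = [(b - a) ** 2 for a, b in zip(ordered, ordered[1:])]
    return sum(squared_diffs) / (2 * len(squared_diffs))


def f_ratio(ordered):
    """Return (ratio, variance of the set, first variance term)."""
    var_set = variance(ordered)  # sample variance, n - 1 in the denominator
    var_first = first_variance_term(ordered)
    return var_set / var_first, var_set, var_first


# Hypothetical, evenly spaced values; NOT the measurements behind Figure 1.
values = [10.0, 12.0, 11.0, 13.0, 12.0]
ratio, var_set, var_first = f_ratio(values)
print(f"var(set)={var_set:.3f}  var(first term)={var_first:.3f}  F={ratio:.3f}")
```

A ratio close to 1 suggests that ordering the values adds no information, which is to say no significant spatial dependence. Whether a larger ratio is significant depends on the tabulated F-value for the degrees of freedom assigned to each variance, and that choice is an assumption the sketch leaves open.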

I’m caught between real statistics and hardcore kriging. I interpolated by kriging between each pair of measured Y- and X-values. The spreadsheet template shows that the first pair gives a Y-value of 103.0 scale units and an X-value of 25.8 scale units, the second pair gives a Y-value of 96.0 scale units and an X-value of 45.5 scale units, and so on for a set of seventeen (17) pairs. The following chart shows why interpolation by kriging does so much more with less. All it takes is to rig the rules of statistics.

False 95% confidence intervals

Now here’s the clincher. Fisher’s F-test cannot be applied to an ordered set of seventeen (17) values, each of which is either measured or kriged. The problem is that sets of measured values do give degrees of freedom whereas kriged values give none. A simple rule of thumb is that measured values do give degrees of freedom whereas kriged values give nothing but headaches. Unless, of course, one grasps the irrefutable fact that each kriged estimate does have its own variance just as much as do central values such as arithmetic means and all sorts of weighted averages.


Listed above are 95% confidence limits for central values of nine (9) measured values only, and of seventeen (17) measured and kriged values. Interpolation by kriging between measured values seems to give a higher degree of precision than do measured values alone. It’s not so much that Krige knew how to work miracles with a few measured values but that Matheron’s disciples have rigged the rules of real statistics. To put it simply, a kriged estimate has its own variance since all functionally dependent values do. A reliable rule of thumb is that kriged estimates give big problems whereas measured values give degrees of freedom.
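The effect can be reproduced in a few lines. The sketch below is an illustration under loud assumptions: the nine "measured" values are hypothetical, and simple midpoint interpolation between neighbours stands in for kriging. It then does what real statistics forbids, counting all seventeen values as if each one gave a degree of freedom.

```python
from math import sqrt
from statistics import stdev

# Tabulated two-sided 95% values of Student's t for 8 and 16 degrees of freedom.
T_95 = {8: 2.306, 16: 2.120}


def half_width(values):
    """Half-width of a 95% confidence interval for the mean,
    treating every value as an independent measurement."""
    n = len(values)
    return T_95[n - 1] * stdev(values) / sqrt(n)


# Hypothetical measured values; NOT the set behind Figure 1.
measured = [10.0, 14.0, 9.0, 15.0, 11.0, 16.0, 8.0, 13.0, 12.0]

# Stand-in for kriging: one interpolated value midway between each pair.
interpolated = [(a + b) / 2 for a, b in zip(measured, measured[1:])]

# Nine measured plus eight interpolated values, wrongly pooled as seventeen.
combined = measured + interpolated

print(f"9 measured values only:   +/- {half_width(measured):.2f}")
print(f"17 measured + interpolated: +/- {half_width(combined):.2f}")
```

The second interval comes out narrower even though no new measurement was made. Every interpolated value is a function of its two measured neighbours, so the honest count of degrees of freedom is still eight, not sixteen.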

One would expect Wiki’s Kriging squad to show how “95% confidence intervals” in Figure 1 were derived. Surely, the squad was joking when it put forward, “Assuming prior knowledge encapsulates how minerals co-occur as a function of space. Then, given an ordered set of measured grades, interpolation by kriging predicts mineral concentrations at unobserved points”. How about that? Sounds like Wikipedians live in Wonderland. Krige himself couldn’t have cooked up such drivel.
