Tuesday, July 01, 2008

Sorting out Matheron's junk statistics

Matheron claimed in the Rectificative to his Note Statistique No 1 that he had derived the length-weighted average lead and silver grades of core samples with variable lengths. I couldn’t verify whether he did or not because primary data and weighted average grades were missing. Matheron didn’t derive unbiased confidence limits for weighted average grades. Here’s what he should have done but didn’t do. He should have derived the variances of length-weighted average lead and silver grades. He should have tested for spatial dependence between metal grades of ordered core samples by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. He should have used the lowest variance to derive unbiased confidence limits for weighted average grades. He should have taken into account the variable lengths of core samples. It was beyond his grasp to count degrees of freedom either for the set of core samples, or for the ordered set. It is safe to assume Matheron did know how to count core samples.

Matheron’s Note Statistique No 1 proved he was well on his way to becoming a self-made wizard of odd statistics. Matheron worked by himself and made but few references to other authors when he was stacking the odds against classical statistics. He didn’t have what it took to grasp “la statistique classique.”

Just the same, he wrote 85 papers between 1954 and 1965. Rapport N-96 was a 1965 paper by Matheron and Formery. I took an instant liking to its rich title! It might shed light on Matheron’s work between 1954 and 1965. Did he add a touch of Visman’s sampling theory or a dash of Volk’s applied statistics to his search for structure and randomness in that new science of geostatistics? Not so fast!

Matheron and Formery brought up that De Wijs, Krige and Sichel worked with geometric concepts unknown in “la statistique classique” and to its practitioners. Yet, those authors did refer to classical statistics in their own work. Matheron and his coauthor did agree statistics had a role to play in quality control of manufactured products. Just the same, they prattled a lot about all that’s wrong with classical statistics. Here’s but one line I’ve struggled to convert into English prose, “The properties of classical statistics are often transposed in a rather rough manner.” I’ll say! And here’s more drivel, “(Classical statistics) resulted sometimes in naivety or even silliness.” Don’t take my word for it but do read that rather rough and silly paper.

Matheron's structure and randomness

Matheron and his coauthor set out to study structure and randomness at regular intervals. They did so with the aid of ordered and randomly distributed integers. Readers were told to put a pragmatic spin on structure and randomness, and to infer integers are in fact grades. My son and I worked with genuine gold grades of ordered rounds in a drift. We derived Riemann sums and proved a significant degree of spatial dependence between ordered grades by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. It was that simple! Yet, geostatistical minds are taught to infer grades between coordinates.

Riemann’s method is precisely what Matheron and his coauthor should have applied in 1965. Riemann sums would have given the jth variance term of an ordered set (Matheron’s structured set) as follows: $\mathrm{var}_j(x) = \sum (x_i - x_{i+j})^2 \div [2(n-j)]$. The first variance term of the ordered set is $\mathrm{var}_1(x) = 0.50$, and the variance of the set is $\mathrm{var}(x) = 2.82$. The observed value of $F = 2.82/0.50 = 5.64$ exceeds the tabulated value of $F_{0.05;10;20} = 2.35$ at 95% probability and with applicable numbers of degrees of freedom. Hence, the ordered set displays a statistically significant degree of spatial dependence. And don’t take my stats at face value! Set up a spreadsheet template and figure out what I did!
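For those who would rather script it than build a spreadsheet, here is a minimal Python sketch of the same test. The eleven values are hypothetical and are not the integers from Rapport N-96 or the numbers quoted above. The function name `ordered_variance_term` is made up for illustration. Eleven values were chosen only so that the degrees of freedom come out at 10 for the set and 20 for the ordered set, which is an assumption read off the subscripts of the tabulated F-value.

```python
from statistics import variance


def ordered_variance_term(x, j=1):
    """j-th variance term of an ordered set:
    sum((x[i] - x[i+j])**2) / (2 * (n - j))."""
    n = len(x)
    if not 1 <= j < n:
        raise ValueError("lag j must satisfy 1 <= j < n")
    squared_diffs = [(x[i] - x[i + j]) ** 2 for i in range(n - j)]
    return sum(squared_diffs) / (2 * (n - j))


# Hypothetical ordered values (NOT the data behind the figures in the post).
grades = [2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0, 9.0]

var_set = variance(grades)                 # variance of the set, n - 1 in the divisor
var_1 = ordered_variance_term(grades, 1)   # first variance term of the ordered set
f_observed = var_set / var_1

# Tabulated value quoted in the post for 10 and 20 degrees of freedom.
F_TABULATED = 2.35

print("variance of the set:", round(var_set, 4))
print("first variance term:", round(var_1, 4))
print("observed F:", round(f_observed, 2))
print("significant spatial dependence:", f_observed > F_TABULATED)
```

Swap in your own ordered grades to repeat the check. With a different number of values, look up the tabulated F-value again, because 2.35 applies only to 10 and 20 degrees of freedom.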

Riemann sums also underpin sampling variograms. A sampling variogram is a graph that shows where orderliness in a sample space or a sampling unit dissipates into randomness. Matheron and Formery mentioned variograms but didn’t explain how to derive lags that underscore where orderliness disperses into randomness. Matheron’s search for structure and randomness made him march in place to the beat of kriging drums. Matheron knew he ought to do something but never knew what Visman had done already. He babbled gibberish when contemplating what to do next. Matheron’s problem was he didn’t have the foggiest notion what Sir Ronald A Fisher had been doing across the Channel ever since the storm with Pearson about degrees of freedom. Matheron and his disciples didn’t have a clue how they got into junk statistics.
