Saturday, July 12, 2008

Hooked on junk statistics

Our parents told us not to put all our eggs in one basket. This lesson has stood the test of time ever since the Easter Bunny got to working with real eggs. The world’s mining industry has filled its basket with junk statistics and got egg on its façade. Junk statistics does not give unbiased confidence limits for grades and contents of mineral reserves and resources. Annual reports, unlike opinion polls, do not sport 95% confidence intervals and ranges as a measure for the risks mining investors encounter. Many years ago I put classical statistics in my own basket. I thought I couldn’t go wrong because Sir Ronald A Fisher was knighted in 1953. But was I wrong? Matheron, who is called the creator of geostatistics, knew very little about variances, and even less about the properties of variances.

Matheron deserved some credit because he didn’t put all core samples of a single borehole in one basket. He would have lost all his degrees of freedom but wouldn't have missed them anyway. He did derive the length-weighted average grade of a set of grades determined in core samples of variable lengths. What he didn’t derive was the variance of this length-weighted average. Matheron wrote a Synopsis for Gy’s 1967 Minerals sampling. Gy, in turn, referred to Visman’s 1947 thesis on the sampling of coal, and to his 1962 Towards a common basis for the sampling of materials. Visman bridged the gap between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. Matheron never knew there was a gap.
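
To make the point about the length-weighted average concrete, here is a minimal sketch in Python. It assumes the grades are independent and that each core sample comes with its own variance estimate, so the ordinary propagation-of-variance rule applies: the variance of the weighted average is the sum of the squared weights times the individual variances. The function name and the input numbers are hypothetical and only serve as illustration.

```python
def length_weighted_mean_and_variance(lengths, grades, grade_variances):
    """Return the length-weighted average grade and its variance.

    Assumption (not from the original post): grades are independent,
    so var(mean) = sum(w_i**2 * var_i) with w_i = length_i / total length.
    """
    total_length = sum(lengths)
    weights = [length / total_length for length in lengths]
    mean = sum(w * g for w, g in zip(weights, grades))
    variance = sum(w * w * v for w, v in zip(weights, grade_variances))
    return mean, variance


# Hypothetical core samples: lengths in metres, grades in percent.
mean, variance = length_weighted_mean_and_variance(
    lengths=[1.0, 2.0, 1.0],
    grades=[2.0, 4.0, 6.0],
    grade_variances=[0.04, 0.04, 0.04],
)
```

Without the second return value there is nothing from which to compute confidence limits for the weighted average.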

Should a set of primary increments be put in one basket? Or should it be partitioned into a pair of subsets? Gy proposed in his 1977 Sampling of Particulate Matter that a set of primary increments be treated as a single primary sample. He claimed the variance of a primary sample mass derives from the average mass and number of primary increments in a set, the properties of the binomial distribution, and some kind of sampling constant. I explained in Sampling in Mineral Processing why Gy’s sampling theory and his sampling constant should be consumed with a few grains of salt.


When I met G G Gould for the first time at the Port of Rotterdam in the mid-1960s, he told me how Visman’s sampling theory impacted his work on ASTM D2234, Collection of a Gross Sample of Coal. Visman’s sampling experiment is described in this ASTM Standard Method. Visman’s 1947 thesis taught me that the sampling variance is the sum of the composition variance and the distribution variance. I got to know Jan Visman in person here in Canada. I treasure my copy of his thesis. I enjoyed his sense of humor when we were griping about those who try to play games with the rules of classical statistics.

On-stream data for slurries and solids taught me all I needed to grasp about spatial dependence in sampling units and sample spaces. Fisher’s F-test is applied to test for spatial dependence, to chart a sampling variogram, and to optimize a sampling protocol.
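
A rough sketch of how such a test might look in Python follows. It rests on an assumption that is not spelled out in this post: the variance of the set is compared with the first variance term of the ordered set, taken here as the sum of squared differences between successive values divided by 2(n-1). The function names are hypothetical. The critical F-value is not computed; it would have to be looked up in a table for the applicable degrees of freedom.

```python
def variance_of_set(values):
    """Ordinary sample variance with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def first_variance_term(values):
    """First variance term of the ordered set (assumed definition):
    sum of squared successive differences divided by 2 * (n - 1)."""
    n = len(values)
    squared_steps = sum((values[i + 1] - values[i]) ** 2 for i in range(n - 1))
    return squared_steps / (2 * (n - 1))


def f_ratio(values):
    """Observed F-ratio; compare it with a tabulated critical value."""
    return variance_of_set(values) / first_variance_term(values)


# Hypothetical ordered measurements with a steady trend.
observed_f = f_ratio([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

A ratio well above the tabulated value suggests that the ordered set displays spatial dependence, whereas a ratio near one suggests that the order carries no information.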


Selecting interleaved primary samples by partitioning the set of primary increments into odd- and even-numbered subsets is described in several ISO standards. A pair of A- and B-primary samples gives a single degree of freedom but putting all primary increments in one basket gives none. Shipments of bulk solids are often divided into sets of lots so that lower t-values than t0.05;1 = 12.706 apply. Those who do not respect degrees of freedom as much as statisticians do may cling to the notion that the cost of preparing and testing a second test sample is too high a price for some invisible degree of freedom. They just don't grasp why confidence limits and degrees of freedom belong together as much as do ducks and eggs.
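
The arithmetic behind a single pair of interleaved samples can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of any particular ISO standard: the increments are hypothetical test results, both subsets have the same number of increments, and the function name is made up. The only number taken from the text is the t-value of 12.706 for one degree of freedom.

```python
import math

T_95_ONE_DF = 12.706  # t-value at 95% confidence for one degree of freedom


def interleaved_confidence_interval(increments):
    """Split increments into odd- and even-numbered subsets (A and B)
    and return the grand mean with its 95% confidence limits.

    Assumes an even number of increments so A and B carry equal weight.
    """
    sample_a = increments[0::2]  # 1st, 3rd, 5th, ... increments
    sample_b = increments[1::2]  # 2nd, 4th, 6th, ... increments
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    grand_mean = (mean_a + mean_b) / 2
    # Variance between two values with one degree of freedom is d**2 / 2;
    # the variance of their mean is half of that.
    variance_of_mean = (mean_a - mean_b) ** 2 / 4
    half_width = T_95_ONE_DF * math.sqrt(variance_of_mean)
    return grand_mean, grand_mean - half_width, grand_mean + half_width


# Hypothetical grades for four primary increments.
grand_mean, lower, upper = interleaved_confidence_interval([10.0, 10.2, 10.0, 10.2])
```

The interval is wide because one degree of freedom is all a single pair can offer. Dividing a shipment into several lots, each with its own A- and B-sample, adds degrees of freedom and brings the t-value down.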

The interleaved sampling protocol gives a reliable estimate for the total variance at the lowest possible cost. It takes into account var2(x), the second variance term of the ordered set. It makes sense to take interleaved bulk samples in mineral exploration because they give realistic estimates for intrinsic variances in sample spaces. Both Visman and Volk, the author of Applied Statistics for Engineers, were conversant with classical statistics. The geostatistical fraternity made up some new rules and fumbled a few others. They got hooked on a basket of junk statistics and are doomed to end up with egg on their faces.
