Tuesday, July 29, 2008

Going GIGO with CRIRSCO

Snappy acronyms add spice to the way we blog and talk. GIGO has been tagging along with computing science without losing its punch. CRIRSCO is but one tongue-twisting tour de force for Combined Reserves International Reporting Standards Committee. Its Chairman is Niall Weatherstone of Rio Tinto. Larry Smith of Vale Inco asked Weatherstone about Setting International Standards. Weatherstone said CRIRSCO was set up in 1993 but its website says it was 1994. CRIRSCO's website makes a tough read because of its dreadfully long lines. So what have Weatherstone and his Crirsconians been doing during all those years?

Smith should have but didn’t ask what CRIRSCO has accomplished. It would seem some sort of semi-international reporting template has been set up. The problem is the Russian Federation has a code of its own, and China’s is sort of similar. As it stands, Crirsconians have yet to develop valuation codes for mineral properties. At the present pace, valuation codes that give unbiased confidence limits for contents and grades of reserves and resources might be ready in 2020, the year of perfect vision. They had better be based on classical statistics!

Here’s what was happening in my life when CRIRSCO came about either in 1993 or in 1994. I talked to CIM Members in Vancouver, BC, about the use and abuse of statistics in ore reserve estimation. Bre-X Minerals raised money to acquire the Busang property. Clark wanted me to go from Zero to Kriging in 30 Hours at the Mackay School of Mines. I didn’t go because her semi-variograms are rubbish. The international forum on Geostatistics for the Next Century at McGill University didn’t want to hear about The Properties of Variances. David S Robertson, PhD, PEng, CIM President, failed to, “… find support for your desire to debate.” What irked me was Jean-Michel Rendu’s 1994 Jackling Lecture on Mining geostatistics - Forty years passed. What lies ahead? He rambled on about, “…an endless list of other ‘kriging’ methods…” and prophesied geostatistics, “… is here to stay with all its strengths and weaknesses.” At that time, Rendu knew about infinite sets of kriged estimates and zero kriging variances.

Rendu’s lecture stood in sharp contrast to A Geostatistical Monograph of The Mining and Metallurgical Society of America. Robert Shurtz, a mining engineer and a friend of mine, wrote The Geostatistics Machine and the Drill Core Paradox. Harry Parker, a Stanford-bred geostat sage, was to find fault in Shurtz’s work. This great debate got nowhere because neither grasped the properties of variances. Otherwise, both of them could have put in plain words why kriging variances drop off. A few of Parker’s geostat pals had already found out why in 1989.

Figure 2 is rather odd in the sense that, “The kriging variance rises up to a maximum and then drops off.” That’s precisely what Armstrong and Champigny wrote in A Study of Kriging Small Blocks published in CIM Bulletin of March 1989. What I saw kriging variances do is what real variances never do. Armstrong and Champigny alleged kriging variances drop off because mine planners over-smooth small blocks. More research brought to light that kriged block estimates and actual grades were “uncorrelated.” That would make a random number generator of sorts for kriged block grades. It was David himself who approved that blatant nonsense for publication in CIM Bulletin.

Figure 2 gives kriging variances as a function of variogram ranges. As such, it was more telling than Parker’s. Neither Shurtz nor Parker scrutinized Armstrong and Champigny’s 1989 A Study of Kriging Small Blocks. Otherwise, Shurtz might have pointed out Parker’s kriging variances looked a touch over-smoothed. Neither did Parker confess he does over-smooth the odd time.

Corrected and uncorrected sampling variograms for Bre-X’s bonanza grade borehole BSSE198 show where spatial dependence between bogus gold grades of crushed, salted and ordered core samples from this borehole dissipates into randomness. The adjective “corrected” implies that the variance of selecting a test portion of a crushed and salted core sample, and the variance of analyzing such a test portion, are extraneous to the in situ variance of gold in Bre-X’s Busang resource. Subtracting the sum of extraneous variances gives an unbiased estimate for the intrinsic variance of bogus gold in Busang’s phantom gold resource. Fisher’s F-test proved this intrinsic variance to be statistically identical to zero.

Harry Parker and Jean-Michel Rendu appear to speak for the Society for Mining, Metallurgy and Exploration (SME) in the USA. What it takes to cook up ballpark reserves and resources are soothsayers who know how to failingly infer mineralization between boreholes, hardcore krigers and cocksure smoothers. What CRIRSCO ought to have done after the Bre-X fraud is set up an ISO Technical Committee on reserve and resource estimation. It’s never too late to do it! GIGO may be a bit dated but Garbage In does stand the test of time. Nowadays, Good Graphics Bad Statistics Out is a much more likely outcome. What a pity that GIGGBSO lacks GIGO’s punch!

Saturday, July 12, 2008

Hooked on junk statistics

Our parents told us not to put all our eggs in one basket. This lesson has passed the test of time ever since the Easter Bunny got to working with real eggs. The world’s mining industry has filled its basket with junk statistics and got egg on its façade. Junk statistics does not give unbiased confidence limits for grades and contents of mineral reserves and resources. Annual reports, unlike opinion polls, do not sport 95% confidence intervals and ranges as a measure for the risks mining investors encounter. Many years ago I put classical statistics in my own basket. I thought I couldn’t go wrong because Sir Ronald A Fisher was knighted in 1953. But was I wrong? Matheron, who is called The creator of geostatistics, knew very little about variances, and even less about the properties of variances.

Matheron deserved some credit because he didn’t put all core samples of a single borehole in one basket. He would have lost all his degrees of freedom but wouldn't have missed them anyway. He did derive the length-weighted average grade of a set of grades determined in core samples of variable lengths. What he didn’t derive was the variance of this length-weighted average. Matheron wrote a Synopsis for Gy’s 1967 Minerals sampling. Gy, in turn, referred to Visman’s 1947 thesis on the sampling of coal, and to his 1962 Towards a common basis for the sampling of materials. Visman bridged the gap between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. Matheron never knew there was a gap.

Should a set of primary increments be put in one basket? Or should it be partitioned into a pair of subsets? Gy proposed in his 1977 Sampling of Particulate Matter that a set of primary increments be treated as a single primary sample. He claimed the variance of a primary sample mass derives from the average mass and number of primary increments in a set, the properties of the binomial distribution, and some kind of sampling constant. I explained in Sampling in Mineral Processing why Gy’s sampling theory and his sampling constant should be consumed with a few grains of salt.


When I met G G Gould for the first time at the Port of Rotterdam in the mid 1960s, he told me how Visman’s sampling theory impacted his work on ASTM D2234-Collection of a Gross Sample of Coal. Visman’s sampling experiment is described in this ASTM Standard Method. Visman’s 1947 thesis taught me that the sampling variance is the sum of the composition variance and the distribution variance. I got to know Jan Visman in person here in Canada. I treasure my copy of his thesis. I enjoyed his sense of humor when we were griping about those who try to play games with the rules of classical statistics.

On-stream data for slurries and solids taught me all I needed to grasp about spatial dependence in sampling units and sample spaces. Fisher’s F-test is applied to test for spatial dependence, to chart a sampling variogram, and to optimize a sampling protocol.


Selecting interleaved primary samples by partitioning the set of primary increments into odd- and even-numbered subsets is described in several ISO standards. A pair of A- and B-primary samples gives a single degree of freedom but putting all primary increments in one basket gives none. Shipments of bulk solids are often divided into sets of lots so that lower t-values than t0.05;1 = 12.706 apply. Those who do not respect degrees of freedom as much as statisticians do may cling to the notion that the cost for preparing and testing a second test sample is too high a price for some invisible degree of freedom. They just don't grasp why confidence limits and degrees of freedom belong together as much as do ducks and eggs.
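
As a rough illustration of why that single degree of freedom matters, the Python sketch below partitions a run of primary increments into odd- and even-numbered subsets and derives 95% confidence limits from one pair of A- and B-results. The increment numbers, the two test results and the function names are all hypothetical; only the t-value of 12.706 for one degree of freedom comes from the text.

```python
import math


def interleave(increments):
    """Split ordered increments into A (odd-numbered) and B (even-numbered)."""
    return increments[0::2], increments[1::2]


def pair_statistics(result_a, result_b):
    """Mean, variance of a single result (1 degree of freedom) and
    variance of the mean for one pair of interleaved results."""
    mean = (result_a + result_b) / 2
    var_single = (result_a - result_b) ** 2 / 2
    var_mean = var_single / 2
    return mean, var_single, var_mean


# Hypothetical increment numbers in the order they were taken.
increments = list(range(1, 13))
subset_a, subset_b = interleave(increments)

# Hypothetical test results for the A- and B-primary samples (e.g. % Fe).
result_a, result_b = 61.8, 62.4
mean, var_single, var_mean = pair_statistics(result_a, result_b)

T_ONE_DF = 12.706  # t-value at 95% probability with one degree of freedom
half_width = T_ONE_DF * math.sqrt(var_mean)

print("A increments:", subset_a)
print("B increments:", subset_b)
print(f"mean = {mean:.2f}, 95% limits = {mean - half_width:.2f} to {mean + half_width:.2f}")
```

With all increments in one basket there is no second result, no variance and no confidence limit at all. Dividing a shipment into several lots would add degrees of freedom and bring the t-value down.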

The interleaved sampling protocol gives a reliable estimate for the total variance at the lowest possible cost. It takes into account var2(x), the second variance term of the ordered set. It makes sense to take interleaved bulk samples in mineral exploration because they give realistic estimates for intrinsic variances in sample spaces. Both Visman and Volk, the author of Applied Statistics for Engineers, were conversant with classical statistics. The geostatistical fraternity made up some new rules and fumbled a few others. They got hooked on a basket of junk statistics and are doomed to end up with egg on their faces.

Tuesday, July 01, 2008

Sorting out Matheron's junk statistics

Matheron claimed in the Rectificative to his Note Statistique No 1 that he had derived the length-weighted average lead and silver grades of core samples with variable lengths. I couldn’t verify whether he did or not because primary data and weighted average grades were missing. Matheron didn’t derive unbiased confidence limits for weighted average grades. Here’s what he should have done but didn’t do. He should have derived the variances of length-weighted average lead and silver grades. He should have tested for spatial dependence between metal grades of ordered core samples by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. He should have used the lowest variance to derive unbiased confidence limits for weighted average grades. He should have taken into account the variable lengths of core samples. It was beyond his grasp to count degrees of freedom either for the set of core samples, or for the ordered set. It is safe to assume Matheron did know how to count core samples.
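
The first of those steps can be sketched in a few lines of Python. The grades and core lengths below are made up for illustration and the function names are hypothetical. The variance formula rests on a stated assumption: grades are treated as independent with a common variance, so that the variance of the length-weighted average is the variance of the set times the sum of squared weighting factors. The t-value must come from a table for the applicable degrees of freedom; the figure used here is the tabulated 2.776 for four degrees of freedom.

```python
import math


def length_weighted_average(grades, lengths):
    """Average grade weighted by core length."""
    total = sum(lengths)
    return sum(g * l for g, l in zip(grades, lengths)) / total


def variance_of_weighted_average(grades, lengths):
    """Variance of the set times the sum of squared weighting factors.
    Assumption: independent grades with a common variance."""
    n = len(grades)
    mean = sum(grades) / n
    var_set = sum((g - mean) ** 2 for g in grades) / (n - 1)
    total = sum(lengths)
    sum_squared_weights = sum((l / total) ** 2 for l in lengths)
    return var_set * sum_squared_weights


# Hypothetical core samples: grades in % Pb, lengths in metres.
grades = [2.1, 3.4, 2.8, 4.0, 3.1]
lengths = [1.0, 1.5, 0.8, 1.2, 1.0]

T_FOUR_DF = 2.776  # tabulated t-value at 95% probability, 4 degrees of freedom

average = length_weighted_average(grades, lengths)
var_average = variance_of_weighted_average(grades, lengths)
half_width = T_FOUR_DF * math.sqrt(var_average)

print(f"weighted average = {average:.2f}")
print(f"95% limits = {average - half_width:.2f} to {average + half_width:.2f}")
```

With equal core lengths the sum of squared weighting factors reduces to 1/n. A fuller version would swap in the first variance term of the ordered set whenever the F-test points to spatial dependence.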

Matheron’s Note Statistique No 1 proved he was well on his way to becoming a self-made wizard of odd statistics. Matheron worked by himself and made but few references to other authors when he was stacking the odds against classical statistics. He didn’t have what it took to grasp “la statistique classique.”

Just the same, he wrote 85 papers between 1954 and 1965. Rapport N-96 was a 1965 paper by Matheron and Formery. I took an instant liking to its rich title! It might shed light on Matheron’s work between 1954 and 1965. Did he add a touch of Visman’s sampling theory or a dash of Volk’s applied statistics to his search for structure and randomness in that new science of geostatistics? Not so fast!

Matheron and Formery brought up that De Wijs, Krige and Sichel worked with geometric concepts unknown in “la statistique classique” and to its practitioners. Yet, those authors did refer to classical statistics in their own work. Matheron and his coauthor did agree statistics had a role to play in quality control of manufactured products. Just the same, they prattled a lot about all that’s wrong with classical statistics. Here’s but one line I’ve struggled to convert into English prose, “The properties of classical statistics are often transposed in a rather rough manner.” I’ll say! And here’s more drivel, “(Classical statistics) resulted sometimes in naivety or even silliness.” Don’t take my word for it but do read that rather rough and silly paper.

Matheron's structure and randomness

Matheron and his coauthor set out to study structure and randomness at regular intervals. They did so with the aid of ordered and randomly distributed integers. Readers were told to put a pragmatic spin on structure and randomness, and to infer integers are in fact grades. My son and I worked with genuine gold grades of ordered rounds in a drift. We derived Riemann sums and proved a significant degree of spatial dependence between ordered grades by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. It was that simple! Yet, geostatistical minds are taught to infer grades between coordinates.

Riemann's method is precisely what Matheron and his coauthor should have applied in 1965. Riemann sums would have given the jth variance term of an ordered set (Matheron’s structured set) as follows: $\mathrm{var}_j(x) = \sum (x_i - x_{i+j})^2 \div [2(n - j)]$. The first variance term of the ordered set is $\mathrm{var}_1(x) = 0.50$, and the variance of the set is $\mathrm{var}(x) = 2.82$. The observed value of $F = 2.82/0.50 = 5.64$ exceeds the tabulated value of $F_{0.05;10;20} = 2.35$ at 95% probability and with applicable numbers of degrees of freedom. Hence, the ordered set displays a statistically significant degree of spatial dependence. And don't take my stats at face value! Set up a spreadsheet template and figure out what I did!
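
For readers who would rather script it than set up a spreadsheet, here is a minimal Python sketch of the same test. The data set is an assumption: a hypothetical ordered set of eleven integers that happens to reproduce the figures quoted above, not necessarily the set Matheron and Formery worked with. The function names are illustrative, and the critical value of 2.35 is simply the tabulated figure given in the text.

```python
def variance(x):
    """Variance of the set, with n - 1 degrees of freedom."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** 2 for xi in x) / (n - 1)


def variance_term(x, j):
    """jth variance term of the ordered set: sum of squared differences
    between values j steps apart, divided by 2(n - j)."""
    n = len(x)
    return sum((x[i] - x[i + j]) ** 2 for i in range(n - j)) / (2 * (n - j))


# Hypothetical ordered set (assumption, chosen to match the quoted figures).
ordered_set = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]

var_set = variance(ordered_set)
var_first = variance_term(ordered_set, 1)
f_observed = var_set / var_first

F_TABULATED = 2.35  # tabulated F-value for 10 and 20 degrees of freedom, as quoted

print(f"variance of the set = {var_set:.2f}")
print(f"first variance term = {var_first:.2f}")
print(f"observed F          = {f_observed:.2f}")
if f_observed > F_TABULATED:
    print("significant degree of spatial dependence")
else:
    print("no significant spatial dependence")
```

Shuffling the same integers into random order would push the first variance term up toward the variance of the set and the observed F-value down toward one.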

Riemann sums also underpin sampling variograms. A sampling variogram is a graph that shows where orderliness in a sample space or a sampling unit dissipates into randomness. Matheron and Formery mentioned variograms but didn’t explain how to derive lags that underscore where orderliness disperses into randomness. Matheron’s search for structure and randomness made him march in place to the beat of kriging drums. Matheron knew he ought to do something but never knew what Visman had done already. He babbled gibberish when contemplating what to do next. Matheron’s problem was he didn’t have the foggiest notion what Sir Ronald A Fisher had been doing across the Channel ever since the storm with Pearson about degrees of freedom. Matheron and his disciples didn’t have a clue how they got into junk statistics.
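
A sampling variogram can be sketched in Python by computing the variance term for each lag and setting it against the variance of the set. The ordered set below is hypothetical, the function names are illustrative, and cutting off at half the number of values is an arbitrary choice for this sketch. A proper variogram would also chart tabulated F-values for each lag, which are left out here.

```python
def variance(x):
    """Variance of the set, with n - 1 degrees of freedom."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** 2 for xi in x) / (n - 1)


def variance_term(x, j):
    """jth variance term of the ordered set."""
    n = len(x)
    return sum((x[i] - x[i + j]) ** 2 for i in range(n - j)) / (2 * (n - j))


def sampling_variogram(x, max_lag=None):
    """List of (lag, variance term) pairs for lags 1 to max_lag."""
    if max_lag is None:
        max_lag = len(x) // 2  # arbitrary cut-off for this sketch
    return [(j, variance_term(x, j)) for j in range(1, max_lag + 1)]


# Hypothetical ordered set (assumption, for illustration only).
ordered_set = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]

var_set = variance(ordered_set)
variogram = sampling_variogram(ordered_set)

for lag, term in variogram:
    position = "below" if term < var_set else "at or above"
    print(f"lag {lag}: variance term = {term:.2f} ({position} variance of the set = {var_set:.2f})")
```

On a crude reading without confidence limits, the first two variance terms of this made-up set sit below the variance of the set and the third rises above it, so orderliness would be taken to dissipate into randomness between lags 2 and 3.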