Thursday, April 21, 2011

Pros and cons of Practical Geostatistics

Dr Isobel Clark is the author of Practical Geostatistics. What she did in this 1979 textbook took me by surprise. It would have baffled many a thinker who has never bothered to read works of others. She deserved praise because she had derived the variance of the distance-weighted average. Agterberg in 1974, David in 1977, and Journel in 1978 never took the trouble to derive this variance. It’s a bad omen that the distance-weighted average morphed into a kriged estimate on Matheron’s watch. It was Matheron who had failed to derive the variance of this kriged estimate long before geostatistics was hailed far and wide as a new science.

Dr IC pointed out on the jacket of her textbook, “Geostatistics is the popular name for the application of statistical methods to problems in mining and geology”. She confessed in her Preface that Journel and others at Fontainebleau, France had taught her all she knows about the Theory of Regionalized Variables. Now that’s what got me worried. Matheron’s way of drawing strings of symbols on a blackboard may well have driven his most gifted disciple to routinely assume spatial dependence. All it took was to assume, krige, and smooth, plus a leap of faith and a lot of luck!

Author 1974 Geomathematics
ex NRCan Emeritus Scientist

Dr Frederik P Agterberg, too, took a leap from applied statistics to geostatistics. He went from serial correlations in 1958 to functions without variances in 1974. In contrast, Dr Isobel Clark derived not only the distance-weighted average grade of a set of measured values but also the variance of the set and the variance of its central value. Much of it is detailed in Figure 4.1 and Table 4.1 of her textbook and a few pages about her data set. She fretted a bit on page 72 whether or not the Central Limit Theorem would hold. But it always does! Surely, David himself in 1977 would not have written about it if it were false.

Dr IC’s set of hypothetical uranium concentrations seems to have been measured in hypothetical samples selected at positions with real coordinates in a two-dimensional sample space. I wish she had given the mass of each of her hypothetical samples. She reports that her 95% confidence range for a distance-weighted average hypothetical uranium grade of 400 ppm has a lower limit of 95% CRL=350 ppm and an upper limit of 95% CRU=450 ppm. It doesn’t look too bad, does it? Of course, t0.95;4=2.776 from the t-distribution rather than z0.95=1.96 from the normal distribution should have been used to derive the 95% confidence interval for xbar=400 ppm. But who would split hairs at this stage? Sir R A Fisher was already counting degrees of freedom for small data sets when Dr IC was a cute tot.
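Here is a rough sketch of what swapping z for t does to that range. It assumes that the reported limits of 350 ppm and 450 ppm were derived as xbar plus or minus 1.96 standard errors and then rounded, so the standard error is back-calculated rather than taken from the textbook. The variable names are mine.

```python
# Sketch: recompute the 95% confidence limits with Student's t instead of z.
# Assumption: the reported limits (350 and 450 ppm) are xbar +/- 1.96 * se,
# rounded, so the standard error below is implied, not quoted from the book.

xbar = 400.0       # distance-weighted average grade in ppm (from the text)
upper_z = 450.0    # reported upper 95% limit in ppm (from the text)
z_95 = 1.96        # z0.95 from the normal distribution
t_95_df4 = 2.776   # t0.95;4 from the t-distribution, 4 degrees of freedom

se = (upper_z - xbar) / z_95       # implied standard error in ppm
half_width_t = t_95_df4 * se       # half-width of the t-based range

lower_t = xbar - half_width_t
upper_t = xbar + half_width_t

print(f"implied standard error: {se:.1f} ppm")                 # 25.5 ppm
print(f"t-based 95% range: {lower_t:.0f} to {upper_t:.0f} ppm")  # 329 to 471 ppm
```

Under that assumption the range widens from 350 to 450 ppm into roughly 329 to 471 ppm, which is the price of having only five measured values.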

Assumes with poise!

The question is not so much whether this distance-weighted average hypothetical grade of 400 ppm is an unbiased estimate for the unknown true grade at 1,244 m Easting and 713 m Northing. Of course, I wish I knew her sampling protocol. The question is whether it is an unbiased estimate for the sample space defined by this set of five (5) hypothetical uranium concentrations. The most effective method to test for spatial dependence is to do what door-to-door sales people have always done. Travel from point to point such that each point is called on but once and the shortest distance is traveled. The next step is to derive the variance of the set and the first variance term of the ordered set. Finally, Fisher’s F-test is applied to the variance of the set and this first variance term. The observed F-value of 2.07 between var(x)=4,480 for the set and var1(x)=2,161 for the ordered set is below the tabulated F-value of 4.53 at 95% probability. Hence, her set of measured values does not display a significant degree of spatial dependence. By implication, the distance-weighted average grade of 400 ppm is not an unbiased estimate for this sample space. Dr IC did what Journel had taught her to do. All she did was assume spatial dependence between measured values in the ordered set. Here’s what happens when spatial dependence is assumed and functionally dependent kriged estimates are added to measured values.
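The test described above can be sketched in a few lines. This is a minimal illustration, not Dr IC's data: the five coordinates and grades are made up, and I assume that the first variance term is half the mean squared difference between successive values in the ordered set. The shortest route is found by brute force, which is only practical for small sets. The function names are hypothetical.

```python
from itertools import permutations
from math import dist
from statistics import variance


def shortest_route(points):
    """Index order of the shortest open path that calls on each point once.

    Brute force over all permutations, so only usable for small sets.
    """
    def length(order):
        return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))

    return min(permutations(range(len(points))), key=length)


def first_variance_term(ordered_values):
    """Assumed definition: half the mean squared successive difference."""
    squared = [(b - a) ** 2 for a, b in zip(ordered_values, ordered_values[1:])]
    return sum(squared) / (2 * (len(ordered_values) - 1))


def spatial_dependence_f(points, values):
    """Return var(x), var1(x) and the observed F-value var(x)/var1(x)."""
    order = shortest_route(points)
    ordered = [values[i] for i in order]
    var_set = variance(values)
    var_1 = first_variance_term(ordered)
    return var_set, var_1, var_set / var_1


# Made-up coordinates (m) and grades (ppm), for illustration only.
points = [(0, 0), (10, 5), (20, 0), (30, 5), (40, 0)]
values = [300.0, 350.0, 400.0, 450.0, 500.0]

var_set, var_1, f_observed = spatial_dependence_f(points, values)
print(f"var(x)={var_set:.0f}, var1(x)={var_1:.0f}, F={f_observed:.2f}")

# The figures quoted in the text: 4,480 / 2,161 against a tabulated 4.53.
print(f"F from the text: {4480 / 2161:.2f}")
```

The observed F-value is then compared with the tabulated value for the degrees of freedom at hand. An observed value below the tabulated one, such as 2.07 against 4.53, means no significant spatial dependence.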

Assume, krige, smooth, and be happy!

Dr Isobel Clark is due to teach a 3-day course at Global InfoMine in Vancouver, BC on May 3-5, 2011. She will take students from "no knowledge of statistics or geostatistics to understanding the mysteries of ordinary kriging and its variants in 30 hours (or less)". I wish I were there to shed light on that shameless practice of converting bogus grades and barren rock into a phantom gold resource. But then, why rush the good stuff!

Friday, April 01, 2011

ISO to tackle trueness

ISO has set the stage to tackle trueness. It did so by issuing an NWIP (New Work Item Proposal). I am pleased that ISO/TC69/SC6/WG1 has been entrusted with the task. Accuracy (trueness and precision) is way wide of the mark. Trueness (accuracy and precision) reads but a bit better. I would rather work with either Precision and accuracy or Precision and bias. Trueness makes more sense in a court of law. I have worked on a number of ISO Committees since 1974. I would find it a senseless task to keep track of trueness. Testing for bias always makes sense. Student’s t-test not only shows what’s biased and what’s not but also gives intuitive measures for statistical risks. I am pleased that ISO never took to kriging and smoothing, unlike ASTM, which went along with geostatistical thinking in 1994. ASTM was set up in 1898 by chemists and engineers. A precursor of CIMMP was set up in 1898. CIMMP's geostatistical peer review has been biased since the 1990s.

ISO/TC183 has set up a standard method to derive metal contents and grades of mineral concentrates and ores. It would have been just as simple to set up an ISO Technical Committee to derive metal contents and grades of mineral reserves and resources. But the COSMO brains behind the mining industry were too keen to assume, krige, smooth, and rig the rules of applied statistics. That’s why Bre-X’s bogus grades and Busang’s barren rock morphed so smoothly into a massive phantom gold resource. And that’s when Barrick Gold asked me to assist in testing for bias between paired gold assays determined by cyanide leaching and by fire assaying.
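Testing for bias between paired assays boils down to Student's t-test on the differences. Here is a minimal sketch with made-up gold grades; the assay values, the function name, and the variable names are all hypothetical and have nothing to do with Barrick's data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(method_a, method_b):
    """Student's t-test for bias between paired results.

    Returns the mean difference, the standard deviation of the differences,
    the observed t-value, and the degrees of freedom.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    d_bar = mean(diffs)             # mean difference, i.e. the apparent bias
    s_d = stdev(diffs)              # standard deviation of the differences
    t_observed = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, t_observed, n - 1


# Made-up paired gold assays in g/t, for illustration only.
fire_assay = [1.2, 2.4, 3.1, 4.3, 5.0]
cyanide_leach = [1.0, 2.0, 3.0, 4.0, 5.0]

d_bar, s_d, t_observed, df = paired_t(fire_assay, cyanide_leach)
print(f"mean difference={d_bar:.3f} g/t, t={t_observed:.3f}, df={df}")
```

The observed t-value is compared with tabulated values of the t-distribution at the same degrees of freedom. Once it exceeds the tabulated value at the chosen probability, the mean difference is a statistically significant bias.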

Student's t-test for bias

The observed t-value of 11.258 is statistically significant at 99.9% probability. It’s impossible to salt drill core with placer gold. In contrast, salting crushed core with placer gold is a cinch. Early in 1997 I derived confidence limits for the mass of gold for one of Barrick’s many deposits. My report was deemed worth its weight in gold. Here’s what Richard Rohmer wrote in Golden Phoenix, “For Munk and others affected by Bre-X and Busang, the strange news about de Guzman was the first hint that the find might well not be what Walsh and Felderhof were claiming”. He also pointed out, “Peter Munk was appalled when he read the Strathcona report. The damage inflicted on Canada’s national and international mining industry was beyond what anyone would have thought possible a few short weeks before”. Now that’s what crying out late is all about! I took a look in my crystal ball. It seems as if a Munk Debate for and against geostatistics has been put on ice!

Here's what Peter Munk himself pontificated on the jacket of his Golden Phoenix:
"You have to be courageous; you have to learn to take advantages of change. Be non-conventional; don't fritter your energies - be focused; remember to share. Most important, use the biggest weapon of all weapons, the least appreciated yet the most important tool for success, and this is moral integrity; and don't be afraid to dream and don't be afraid to dream big."

Never mind Munk's moral integrity! I dream about scientific integrity in mineral exploration and mining. It’s a piece of cake to derive borehole statistics with spreadsheet software. I like to call such stats the fingerprint of a borehole. Those who have tried like it a lot! Following is a synopsis derived from the fingerprint of one mind-boggling borehole.

Fingerprint of monster borehole

It shows some 5.5 million ounces of gold in 78 million metric tons of ore. SME published in Transactions 2000 a paper on Borehole Statistics with Spreadsheet Software. One reviewer called it “an excellent paper” and the other thought “it would stir up a hornet’s nest”. But the hornets stayed put!

It makes no sense to assume spatial dependence in sample spaces. Fisher’s F-test ought to be applied to the variance of the set of measured values and the first variance term of the ordered set. Interpolation between measured values gives a false positive for spatial dependence. And that’s where geostatistocrats are taking not only mineral reserve and resource estimation but also the study of climate change. Or climate dynamics as I like to call it! I found Loehle’s 2000-year global temperature reconstruction by accident. Loehle and McCulloch pointed out in 2008 that data sets were smoothed with a 30-year running mean. It does create a pretty smooth picture. But here’s what happens with variances when measured values are enriched with variance-deprived kriged estimates.

Test for homogeneity of variances

I partitioned Loehle’s 2000-year set into twenty (20) sets of no more than 100 years each, derived the first variance term of each set, and applied the chi-square test to this set of twenty (20) variances. These variances constitute too homogeneous a set. The probability that this inference is true exceeds 99.9%. The probability that it is false is less than 0.1%.
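The mechanics of such a test can be sketched as follows. I assume here that the chi-square test is Bartlett's test for homogeneity of variances, that the first variance term is half the mean squared difference between successive values, and that each segment of n values carries n-1 degrees of freedom. The short series is made up; it is not Loehle's reconstruction. All function names are hypothetical.

```python
from math import log


def first_variance_term(ordered_values):
    """Assumed definition: half the mean squared successive difference."""
    squared = [(b - a) ** 2 for a, b in zip(ordered_values, ordered_values[1:])]
    return sum(squared) / (2 * (len(ordered_values) - 1))


def segment_variances(series, size):
    """Split a series into consecutive segments and return each segment's
    first variance term with its assumed degrees of freedom (n - 1)."""
    segments = [series[i:i + size] for i in range(0, len(series), size)]
    segments = [s for s in segments if len(s) >= 2]
    return ([first_variance_term(s) for s in segments],
            [len(s) - 1 for s in segments])


def bartlett_statistic(variances, dfs):
    """Bartlett's chi-square statistic with k - 1 degrees of freedom.

    Every variance must be greater than zero, else the logarithm fails.
    """
    k = len(variances)
    df_total = sum(dfs)
    pooled = sum(df * var for df, var in zip(dfs, variances)) / df_total
    numerator = df_total * log(pooled) - sum(
        df * log(var) for df, var in zip(dfs, variances))
    correction = 1 + (sum(1 / df for df in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction


# Made-up series, for illustration only.
series = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.2, 0.1, 0.4, 0.3, 0.5, 0.7]
variances, dfs = segment_variances(series, size=4)
chi_square = bartlett_statistic(variances, dfs)
print(f"chi-square={chi_square:.3f} with {len(variances) - 1} degrees of freedom")
```

A statistic above the upper tabulated chi-square value points to heterogeneous variances. A statistic below the lower tabulated value points to variances that are too much alike, which is what one would expect of a heavily smoothed set.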

I trust that ISO/TC69/SC6/WG1 will never assume spatial dependence between measured values in ordered sets. ISO/TC17 on Steel and ISO/TC34 on Food Products are interested in this New Work Item Proposal. I shall continue to write whatever needs to be put in writing to ensure that applied statistics prevails not only in ISO standards but also in mineral reserve and resource estimation.