geostatscam

Monday, October 27, 2008

How to lie with geostatistics

Here’s how, in a nutshell. The most brazen lie of all was to deny that weighted averages do have variances. The stage for this lie was set at the French Geological Survey in Algeria on November 25, 1954. It came about when a novice in geology with a knack for probability theory put together his very first research paper. The author had called his paper Formule des Minerais Connexes. He had set out to prove associative dependence between lead and silver in lead ore. He worked with symbols on the first four pages. Handwritten on page 5 are arithmetic mean grades of 0.45% lead and 100 g/t silver, variances of 1.82 for lead and 1.46 for silver, and a correlation coefficient of 0.85. He omitted his set of primary data. Neither did he refer to any of his peers. Those were peculiar practices that would remain this author's modus operandi for life.

This budding author was to be the renowned Professor Dr Georges Matheron, the founder of spatial statistics and the creator of geostatistics. What young Matheron had derived in his 1954 paper were arithmetic mean lead and silver grades of drill core samples. But he had not taken into account that his core samples varied in lengths. So he did derive length-weighted average lead and silver grades and appended a correction to his 1954 paper on January 13, 1955. What he had not done is derive the variances of his length-weighted average lead and silver grades. Neither did he test for, or even talk about, spatial dependence between metal grades of ordered core samples. Matheron’s first paper showed that testing for spatial dependence was beyond his grasp in 1954.
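What Matheron left undone can be sketched in a few lines of Python. This is a minimal sketch, not Matheron's data or my Excel template: the grades and lengths are hypothetical, and the variance of the weighted average assumes that the measured values are independent and share the variance of the set.

```python
from statistics import variance


def length_weighted_average(grades, lengths):
    """Return the length-weighted average grade and the weights used."""
    total = sum(lengths)
    weights = [length / total for length in lengths]  # weights sum to 1
    average = sum(w * g for w, g in zip(weights, grades))
    return average, weights


def variance_of_weighted_average(grades, weights):
    """Variance of a weighted average.

    Assumption of this sketch: the grades are independent and share the
    variance of the set, so var(average) = sum(w_i^2) * var(x).
    """
    var_set = variance(grades)  # sample variance, n - 1 degrees of freedom
    return sum(w * w for w in weights) * var_set


# Hypothetical core samples: lead grades in % and core lengths in m.
grades = [0.30, 0.50, 0.40, 0.60]
lengths = [1.0, 2.0, 1.0, 1.0]

average, weights = length_weighted_average(grades, lengths)
var_average = variance_of_weighted_average(grades, weights)
print(round(average, 3), var_average)
```

With equal lengths the weights become 1/n and the function gives var(x)/n, the familiar variance of the arithmetic mean.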

Why was Formule des Minerais Connexes marked Note statistique No 1? Matheron had not derived variances to compute confidence limits for arithmetic mean lead and silver grades but applied correlation-regression analysis. Statisticians do know that the central limit theorem underpins sampling theory and practice. So why didn’t young Matheron derive confidence limits? Surely, he was familiar with this theorem, wasn't he? Or was it because he thought he was some sort of genius at probability theory? That would explain why he worked mostly with symbols and rarely with real data. Had he worked with real data, he would still have cooked up odd statistics because the variances of his central values went missing. That’s why he was but a self-made wizard of odd statistics. It was Matheron who called the weighted average a kriged estimate as a tribute to the first mining engineer who took to working with weighted averages. Matheron never bothered to differentiate area-, count-, density-, distance-, length-, mass- and volume-weighted averages. But then, neither did any of his disciples.

Matheron’s followers, unlike real statisticians, didn’t take to counting degrees of freedom. Statisticians do know why and when degrees of freedom should be counted. Geostatisticians don’t know much about degrees of freedom but they do know how to blame others when good grades go bad. They always blame mine planners, grade control engineers, or assayers whenever predicted grades fail to pan out. They claim over-smoothing causes kriging variances of kriged estimates to rise and fall. Kriging variances rise and fall because they are pseudo variances that have but squared dimensions in common with true variances. Of course, Matheron’s odd new science is never to blame for bad grades or bad statistics.

It is a fact that Matheron fumbled the variance of his length-weighted average in 1954. Several years before the Bre-X fraud I derived the variance of a length- and density-weighted average metal grade. The following example is based on core samples from an ore deposit in Canada. The mine itself is no longer as Canadian as it once was. The Excel template with the set of primary data and its derived statistics are posted on a popular but wicked website.

My website was set up early in the Millennium. I loved to send emails with links to my reviews of Matheron’s new science of geostatistics. Students at the Centre de Géostatistique (CDG) in Fontainebleau, France, ranked high on my list of those who ought to pass Statistics 101. I was pleased when PDF files of Matheron’s work were posted with CDG’s online library. But I was surprised to find out that Matheron’s first paper was no longer listed as Note statistique No 1 in the column marked Reference but as Note géostatistique No 1. Just the same, the PDF file of this paper and its appended correction are still marked Note statistique No 1. On October 27, 2008, five out of six of Matheron's 1954 papers were still marked Note statistique Nos 2 to 6.

What was going on? Was the birth date of Matheron’s new science of geostatistics under review? Who reviewed it? And why? Why not retype the whole paper? Why not add the variances of length-weighted average lead and silver grades? And how about testing for spatial dependence between metal grades of ordered core samples? Where have all of Matheron’s sets of primary data gone? And what has happened to his old Underwood typewriter? I have so many questions but hear nothing but silence!

Matheron himself moved from odd statistics to geostatistics in 1959 when he went without a glitch from Note statistique no 19 to Note géostatistique no 20. Check it out before geostat revisionists strike again. I admit to having paraphrased Darrell Huff’s How to lie with statistics. But I couldn’t have made up that this delightful little work was published for the first time in 1954. That’s precisely when young Matheron was setting the stage for his new science of geostatistics in North Africa. Matheron, the creator of geostatistics, never read Huff’s work. But then Huff didn't read Matheron’s first paper either. Thank goodness Darrell Huff’s How to lie with statistics is still in print!

Monday, September 29, 2008

Going green and gone nuts

Is our world going green? It may be a long while before we know. That’s because scores of geoscientists have gone nuts and work with junk statistics. In Canada, too, geoscientists would rather infer than test for spatial dependence in sampling units and sample spaces. The more so since it’s all in The Inspector’s Field Sampling Manual. Nobody should have to read it. Not even EC’s own inspectors. I had to in the early 2000s because Environment Canada had taken a client of mine to court. It was about my statistical analysis of test results determined in interleaved primary samples. So I worked my way through EC’s manual and found all sorts of sampling methods. What I didn’t find was the interleaved sampling method. I had put this method on my list of smart statistics long before global warming got hot.

Here’s what I did find out when I struggled with EC’s manual. Inspectors are taught, “Systematic samples taken at regular time intervals can be used for geostatistical data analysis, to produce site maps showing analyte locations and concentrations. Geostatistical data analysis is a repetitive process, showing how patterns of analytes change or remain stable over distances or time spans.”

Geostatistics already rubbed me the wrong way long before it converted Bre-X’s bogus grades and Busang’s barren rock into a massive phantom gold resource. In fact, Matheron’s new science of geostatistics has been a thorn in my side for some twenty years. That sort of junk statistics still runs rampant in the Journal for Mathematical Sciences. Just the same, EC’s field inspectors read under Systematic (Stratified) Sampling, “1) shellfish samples taken at 1-km intervals along a shore, 2) water samples taken from varying depths in the water column.” Numerical examples are missing as much in A Sampling Manual and Reference Guide for Environment Canada Inspectors as they were throughout Matheron’s seminal work. Not all of EC’s geoscientists know as little about testing for spatial dependence in sampling units and sample spaces as do those who cooked up The Inspector’s Field Sampling Manual.

In his letter of October 15, 1992, to Dr R Ehlich, Editor, Journal for Mathematical Geology, Stanford's Professor Dr A G Journel claimed, “The very reason for geostatistics or spatial statistics in general is the acceptance (a decision rather) that spatially distributed data should be considered a priori as dependent one to another, unless proven otherwise.” He believed that my anger “arises fro [sic] a misreading of geostatistical theory, or a reading too encumbered by classical ‘Fischerian’ [sic] statistics.” JMG’s Editor advised me in his letter of October 26, 1992, “Your feeling that geostatistics is invalid might be correct.”

Each and every geoscientist on this planet ought to know how to test for spatial dependence and how to chart sampling variograms that show where spatial dependence in our own sample space of time dissipates into randomness. Following is an Excel spreadsheet template that shows how to apply Fisher’s F-test. Geoscientists should figure out why Excel's FINV-function requires the number of degrees of freedom both for the set and for the ordered set.
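For those who would rather script than work in Excel, the arithmetic of the test can be sketched as follows. This is a minimal sketch, not the spreadsheet template itself: the data are hypothetical, and the count of 2(n − 1) degrees of freedom for the ordered set is an assumption of this sketch. The observed F-value is then compared with a tabulated value such as the one Excel's FINV(probability, df1, df2) returns.

```python
from statistics import variance


def first_variance_term(ordered):
    """First variance term of an ordered set: half the mean squared
    difference between consecutive measured values."""
    n = len(ordered)
    squared = [(b - a) ** 2 for a, b in zip(ordered, ordered[1:])]
    return sum(squared) / (2 * (n - 1))


def observed_f(ordered):
    """Return (F, df_numerator, df_denominator) for the set variance
    versus the first variance term of the ordered set."""
    n = len(ordered)
    var_set = variance(ordered)            # n - 1 degrees of freedom
    var_first = first_variance_term(ordered)
    df_set = n - 1
    df_ordered = 2 * (n - 1)               # assumption of this sketch
    if min(var_set, var_first) == 0:
        raise ValueError("a zero variance leaves F undefined")
    # The larger variance goes in the numerator so that F >= 1.
    if var_set >= var_first:
        return var_set / var_first, df_set, df_ordered
    return var_first / var_set, df_ordered, df_set


# Hypothetical ordered set with an obvious trend.
data = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0]
f_value, df1, df2 = observed_f(data)
print(round(f_value, 2), df1, df2)
```

An observed F-value that exceeds the tabulated value at the chosen probability level points to a significant degree of spatial dependence in the ordered set.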

Of course, it’s easy to become a geostatistically smart geoscientist. All it takes is to infer spatial dependence between measured values, interpolate by kriging, select the least biased subset of some infinite set of kriged estimates, smooth its kriging variance to perfection, and rig the rules of real statistics with impunity. All but a few of those who have gone nuts and work with junk statistics have written books about geostatistics!

Friday, September 19, 2008

Metrology in mining and metallurgy

A poster in my office reads, “Metrology, the Science of Measurement." It’s a bit faded because I’ve had it for so long. Standards Council of Canada had it printed for educational purposes. I got my poster with a set of slides about international units of measure. Most of them have since been redefined. The famous platinum-iridium artifact that has so long defined the International Unit of Mass is about to bite the dust. A sphere of pure silicon will take its place. The famous Central Limit Theorem has stood the test of time since Abraham de Moivre (1667-1754) brought to the world The Doctrine of Chances. De Moivre’s work underpins both sampling theory and sampling practice. His work is bound to stand the test of time until our planet runs out of it.

The science of measurement has always played a key role in my work. That’s why I put together Sampling and Weighing of Bulk Solids after I had completed my assignment with Cominco Ltd. I was pleased to see it in print in 1985. What pleased me even more was that ISO Technical Committee 183–Copper, lead, zinc and nickel ores and concentrates approved an ISO standard method based on deriving confidence intervals and ranges for metal contents of concentrate and ore shipments.

Several years later I got a slim paperback the cover of which I didn’t recognize. What I did recognize inside of it were my own charts and graphs embedded between Chinese characters. A friend of mine told me it was a Mandarin translation printed on rice paper. My book is protected by copyright but I have yet to be paid a single yuan. Teaching innovative sampling practices and sound statistical methods ranks much higher on my list of things to do than becoming a small c capitalist.

Sampling and Weighing of Bulk Solids
Mandarin translation, November 1989

My son and I were pleased when Precision Estimates for Ore Reserves was praised by Erzmetall and published in its October 1991 issue. The more so since peer reviewers in Canada, the USA and Britain did reject that very paper. One of CIM Bulletin’s reviewers spotted a lack of references to geostatistical literature. The other was ticked off because we were not “...relying on the abundant geostatistical literature...” We had found out that geostatisticians do not explain how to derive confidence intervals and ranges for metal contents of in-situ ore. So we did in our paper and submitted it to CIM Bulletin on September 28, 1989.

Both of us had taken statistics courses at the same university but at different times. Ed leads the Eclipse Modeling Framework project and co-leads the Eclipse Modeling project. He is a coauthor of the authoritative book “EMF: Eclipse Modeling Framework” which is nearing completion of a second edition. He is an elected member of the Eclipse Foundation Board of Directors and has been recognized by the Eclipse Community Awards as Top Ambassador and Top Committer. Ed is currently interested in all aspects of Eclipse modeling and its application and is well recognized for his dedication to the Eclipse community, posting literally thousands of newsgroup answers each year. He spent 16 years at IBM, achieving the level of Senior Technical Staff Member after completing his Ph.D. at Simon Fraser University. He has started his own small company, Macro Modeling, is a partner of itemis AG, and serves on Skyway Software’s Board of Advisors. His experience in modeling technology spans 25 years.

I was proud to have his pre-IBM credentials printed on the backside of Part 1– Precision and Bias for Mass Measurement Techniques. I shall convert all Lotus 1-2-3 files into Excel files and post them on my website. Some time ago Dr W E Sharp, the Editor-in-Chief for what was recently renamed the Journal of Mathematical Geosciences, wanted Dr Ed Merks to review papers on computer applications. Sharp asked me to write a paper on testing for spatial dependence by applying Fisher’s F-test. I did but we couldn’t agree on degrees of freedom for ordered sets.

Metrology in Mining and Metallurgy

First part but also the last

After Part 1 was completed in 1992 I went to work on Part 2– Precision and Bias for Ore Reserves. It was coming along nicely until Barrick Gold asked me in December 1996 to look at Bre-X’s test results for gold in crushed core and Lakefield’s test results for gold in library core. The hypothesis that 2.9 m crushed core and 0.1 m library core were once part of the same 3.0 m whole core proved to be highly improbable. CIM’s statistically dysfunctional but otherwise qualified persons were not at all keen to know how Bre-X’s salting scam could have been avoided altogether. Surely, life after Bre-X couldn’t have been any more bizarre. But that’s another story altogether!

The ISO copyright office in Geneva, Switzerland, suggests that it holds the copyright to ISO/FDIS 12745:2007(E)–Precision and bias of mass measurement techniques. Yet, this ISO standard is a verbatim copy of Part 1–Precision and bias for mass measurement techniques. Part 1 is supposed to be protected by Canadian copyright. So what gives? Didn’t ISO have to ask permission to reprint? What’s this world coming to when ISO violated Canadian copyright in 2007 just as much as China did in 1992?

What Ed and I have decided to do is put together a paper on Metrology in Mineral Exploration. I want to present it at APCOM 2009 in Vancouver, BC. Home sweet home! Maybe I’ll talk Ed into coming home for a while. I’ll have to post an abstract before the deadline. By the way, APCOM stands for Applications of Computers and Operations Research in the Mineral Industry. Acronym talk does make a lot of sense, doesn’t it?

Monday, September 01, 2008

Lord Kelvin cool to assumptions

Lord Kelvin (William Thomson, 1824-1907) was a brilliant scientist and an innovative engineer. His honorific name is forever linked to the absolute temperature of zero degrees Kelvin. His work often called for all sorts of variables to be measured. Here's what he once said, “…when you can measure what you are speaking about, and express it in numbers, you know something about it, but when you cannot express it in numbers your knowledge is of the meagre and unsatisfactory kind…” Lord Kelvin’s view struck a chord with me because of the Dutch truism, “Meten is weten.” It translates into something like, “To measure is to know.” It may have messed up a perfect rhyme but didn’t impact good sense. And it's a leitmotif in my life!

Lord Kelvin knew all about degrees Kelvin and degrees Celsius. But he couldn’t have been conversant with degrees of freedom because Sir Ronald A Fisher (1890-1962) was hardly his contemporary. Lord Kelvin might have wondered why today's geoscientists would rather assume spatial dependence than measure it. Sir Ronald A Fisher could have verified spatial dependence by applying his ubiquitous F-test to the variance of a set of measured values and the first variance term of the ordered set. He may not have had time to apply that variant of his F-test because of his conflict with Karl Pearson (1857-1936). It was Fisher in 1928 who added degrees of freedom to Pearson’s chi-square distribution.

Not all students need to know as much about Fisher's F-test as do those who study geosciences. The question is why geostatistically gifted geoscientists would rather assume spatial dependence than measure it. How do they figure out where orderliness in our own sample space of time dissipates into randomness? Sampling variograms, unlike semi-variograms, cannot be derived without counting degrees of freedom. So much concern about climate change and global warming. So little concern about sound sampling practices and proven statistical methods!

I derived sampling variograms for the set that underpins A 2000-Year Global Temperature Reconstruction based on Non-Tree Ring Proxies. I downloaded the data that covers Year 16 to Year 1980, and derived corrected and uncorrected sampling variograms. The corrected sampling variogram takes into account the loss of degrees of freedom during reiteration. I transmitted both to Dr Craig Loehle, the author of this fascinating study. Excel spreadsheet templates on my website show how to derive uncorrected and corrected sampling variograms.
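The uncorrected variance terms behind such a chart can be sketched in Python. This is a minimal sketch with hypothetical values rather than the temperature data, and it leaves out the correction for the loss of degrees of freedom, which is not spelled out here. Each term is taken as half the mean squared difference between values that are a given lag apart.

```python
from statistics import variance


def sampling_variogram(ordered, max_lag):
    """Return the uncorrected variance terms for lags 1..max_lag."""
    n = len(ordered)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be between 1 and n - 1")
    terms = []
    for lag in range(1, max_lag + 1):
        pairs = n - lag  # number of differences at this lag
        squares = sum((ordered[i + lag] - ordered[i]) ** 2 for i in range(pairs))
        terms.append(squares / (2 * pairs))
    return terms


# Hypothetical ordered set, for example yearly temperature anomalies.
temps = [0.1, 0.2, 0.4, 0.3, 0.6, 0.8, 0.7, 0.9]
terms = sampling_variogram(temps, 4)
var_set = variance(temps)

for lag, term in enumerate(terms, start=1):
    position = "below" if term < var_set else "at or above"
    print(lag, round(term, 4), position, "the variance of the set")
```

Plotting the terms against the lag, together with the variance of the set as a horizontal line, gives the kind of chart from which the lag where orderliness gives way to randomness can be read.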

Uncorrected sampling variogram

Spatial dependence in this uncorrected sampling variogram dissipates into randomness at a lag of 394 years. The variance of the set gives 95% CI = +/-1 centigrade between consecutive years. The first variance term of the ordered set gives 95% CI = +/-0.1 centigrade between consecutive years.

Corrected sampling variogram

Spatial dependence in the corrected sampling variogram dissipates into randomness at a lag of 294 years. It is possible to derive 95% confidence intervals anywhere within this lag.

Sampling variograms are part of my story about the junk statistics behind what was once called Matheron's new science of geostatistics. I want to explain its role not only in mineral reserve and resource estimation in the mining industry but even more so in measuring climate change and global warming. Classical statistics turned into junk statistics under the guidance of Professor Dr Georges Matheron (1930-2000), a French probabilist who turned into a self-made wizard of odd statistics. A brief history of Matheronian geostatistics is posted on my blog. My 20-year campaign against the geostatocracy and its army of degrees of freedom fighters is chronicled on my website. Agterberg ranked Matheron on a par with giants of mathematical statistics such as Sir Ronald A Fisher (1890-1962) and Professor Dr J W Tukey (1915-2000). Agterberg was wrong! Matheron fumbled the variance of the length-weighted average grade of core samples of variable lengths in 1954. Agterberg himself fumbled the variance of his own distance-weighted average point grade in his 1970 Autocorrelation Functions in Geology and again in his 1974 Geomathematics.

Agterberg seems to believe it's too late to reunite his distance-weighted average point grade and its long-lost variance. I disagree because it's never too late to right a wrong. What he did do was change the International Association for Mathematical Geology into the International Association for Mathematical Geosciences. Of course, geoscientists do bring in more dollars and cents than did geologists alone. I have made a clear and concise case that sound sampling practices and proven statistical methods ought to be taught at all universities on this planet. Time will tell whether or not such institutions of higher learning agree that functions do have variances, and that Agterberg's distance-weighted average point grade is no exception!

Wednesday, August 27, 2008

To have or not to have variances

Not a word from CRIRSCO’s Chairman. I just want to know whether or not functions do have variances at Rio Tinto’s operations. Surely, Weatherstone wouldn’t toss a coin to make up his mind, would he? My functions do have variances. I work with central values such as arithmetic means and all sorts of weighted averages. It would be off the wall if the variance were stripped off any of those functions. But that’s exactly what had come to pass in Agterberg’s work. I’ve tried to find out what fate befell the variance of the distance-weighted average. I did find out who lost what and when. And it was not pretty in the early 1990s. After Matheron's seminal work was posted on the web it became bizarre. The geostatistocrats turned silent, and resolved to protect their turf and evade the question. They do know what’s true and what's false. And I know scientific truth always prevails in the end.

Agterberg talked about his distance-weighted average point grade for the first time during a geostatistics colloquium on campus at The University of Kansas in June 1970. He did so in his paper on Autocorrelation functions in geology. The caption under Figure 1 states; “Geologic prediction problem: values are known for five irregularly spaced Points P₁ –P₅. Value at P₀ is unknown and to be predicted from five unknown values.”

Agterberg’s 1970 Figure 1 and 1974 Figure 64

Agterberg’s 1970 sample space became Figure 64 in Chapter 10. Stationary Random Variables and Kriging of his 1974 Geomathematics. Now his caption states, “Typical kriging problem, values are known at five points. Problem is to estimate value at point P₀ from the known values at P₁ –P₅”. Agterberg seemed to imply his 1970 geologic prediction problem and his 1974 typical kriging problem do differ in some way. Yet, he applied the same function to derive his predicted value as well as his estimated value. His symbols suggest a matrix notation in both his paper and textbook.

The following function sums the products of weighting factors and measured values to obtain Agterberg’s distance-weighted average point grade.

Agterberg’s distance-weighted average
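In symbols, a distance-weighted average of this kind can be written as a sum of products. This is a sketch in my own notation, not Agterberg's matrix notation: x_i stands for the measured value at point P_i, w_i for its weighting factor, and n = 5 for the sample space in his figure. How the weighting factors are derived from the distances to P₀ is left open here.

```latex
\bar{x} = \sum_{i=1}^{n} w_i \, x_i ,
\qquad \sum_{i=1}^{n} w_i = 1
```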

Agterberg’s distance-weighted average point grade is a function of his set of measured values. That’s why the central value of this set of measured values does have a variance in classical statistics. Agterberg did work with the Central Limit Theorem in a few chapters of his 1974 Geomathematics. Why then is this theorem nowhere to be found in Chapter 10 Stationary Random Variables and Kriging? All the more so because this theorem can be brought back to the work of Abraham de Moivre (1667-1754).

David mentioned the “famous" Central Limit Theorem in his 1977 Geostatistical Ore Reserve Estimation. He didn’t deem it quite famous enough to either work with it or to list it in his Index. Neither did he grasp why the central limit theorem is the quintessence of sampling theory and practice. Agterberg may well have fumbled the variance of the distance-weighted average point grade because he fell in with the self-made masters of junk statistics. What a pity he didn’t talk with Dr Jan Visman before completing his 1974 opus.

The next function gives the variance of Agterberg’s distance-weighted average point grade. As such it defines the Central Limit Theorem as it applies to Agterberg’s central value. I should point out that this central value is in fact the zero-dimensional point grade for Agterberg’s selected position P₀.

Agterberg’s long-lost variance
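For a weighted average with weighting factors w_i, and under the assumption that the measured values x_i are independent, the variance can be sketched as follows. The symbols are mine, not Agterberg's. With equal weights and a common variance var(x), the expression reduces to the variance of the arithmetic mean.

```latex
\operatorname{var}(\bar{x}) = \sum_{i=1}^{n} w_i^{2} \, \operatorname{var}(x_i) ,
\qquad
w_i = \tfrac{1}{n} \ \text{and} \ \operatorname{var}(x_i) = \operatorname{var}(x)
\;\Rightarrow\;
\operatorname{var}(\bar{x}) = \frac{\operatorname{var}(x)}{n}
```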

Agterberg worked with symbols rather than measured values. Otherwise, Fisher’s F-test could have been applied to test for spatial dependence in the sample space defined by his set. This test verifies whether var(x), the variance of a set, and var1(x), the first variance term of the ordered set, are statistically identical or differ significantly. The above function shows the first variance term of the ordered set. In Section 12.2 Conditional Simulation of his 1977 work, David brought up some infinite set of simulated values. What he talked about was Agterberg’s infinite set of zero-dimensional, distance-weighted average point grades. I yearn for some ISO Standard on Mineral Reserve and Resource Estimation where a word means what it says, and where text, context and symbols make for an unambiguous read.

But I digress as we tend to do in our family. Do CRIRSCO’s Chairman and his Crirsconians know that our sun will have bloated to a red giant and scorched Van Gogh’s Sunflowers to a crisp long before Agterberg’s infinite set of zero-dimensional point grades is tallied? And I don’t want to get going on the immeasurable odds of selecting the least biased subset of some infinite set. Weatherstone should contact the International Association of Mathematical Geosciences and ask its President to bring back together his distance-weighted average and its long-lost variance. That’s all. At least for now!

Thursday, August 07, 2008

Fighting factoids with facts

Niall Weatherstone of Rio Tinto and Larry Smith of Vale Inco have been asked to study a geostatistical factoid and a statistical fact. I asked them to do so by email on July 8, 2008. Next time they chat I want them to discuss whether or not geostatistics is an invalid variant of classical statistics. I’ve asked Weatherstone to transmit my question to all members of his team. CRIRSCO’s Chairman has yet to confirm whether he did or not. I just want to bring to the attention of his Crirsconians my ironclad case against the junk science of geostatistics.

Not all Crirsconians assume, krige, and smooth quite as much as do Parker and Rendu. The problem is nobody grasps how to derive unbiased confidence intervals and ranges for contents and grades of reserves and resources. Otherwise, Weatherstone would have blown his horn when he talked to Smith. A few geostatistical authors referred per chance to statistical facts. Nobody has responded to my questions about geostatistical factoids. The great debate between Shurtz and Parker got nowhere because the question of why kriging variances “drop off” was never raised. So I’ll take my turn at explaining the rise and fall of kriging variances.

In the 1990s I didn’t geostat speak quite as well as did those who assume, krige and smooth. I did assume Matheron knew what he was writing about but he didn’t. Bre-X proved it makes no sense to infer gold mineralization between salted boreholes. The Bre-X fraud taught me more about assuming, kriging, and smoothing than I wanted to know. And I wasn't taught to blather with confidence about confidence without limits. It reminds me of another story I’ll have to blog about some other day. It’s easy to take off on a tangent because I have so many factoids and facts to pick and choose from.

Functions have variances is a statistical fact I’ve quoted to Weatherstone and Smith. Not all functions have variances I cited as a geostatistical factoid. Factoid and fact are mutually exclusive but not equiprobable. One-to-one correspondence between functions and variances is a condition sine qua non in classical statistics. Therefore, factoid and fact have as much in common as do a stuffed dodo and a soaring eagle. My opinion on the role of classical statistics in reserve and resource estimation is necessarily biased.

The very function that should never have been stripped of its variance is the distance-weighted average. For this central value is in fact a zero-dimensional point grade. All the same, its variance was stripped off twice on Agterberg’s watch. David did refer to “the famous central limit theorem.” What he didn’t mention is the central limit theorem defines not only the variance of the arithmetic mean of a set of measured values with equal weights but also the variance of the weighted average of a set of measured values with variable weights. It doesn’t matter that a weighted average is called an honorific kriged estimate. What does matter is that the kriged estimate had been stripped of its variance.

Two or more test results for samples taken at positions with different coordinates in a finite sample space give an infinite set of distance-weighted average point grades. The catch is that not a single distance-weighted average point grade in an infinite set has its own variance. So, Matheron’s disciples had no choice but to contrive the surreal kriging variance of some subset of an infinite set of kriged estimates. That set the stage for a mad scramble to write the very first textbook on a fatally flawed variant of classical statistics.
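How two test results turn into as many point grades as one cares to compute can be sketched in Python. This is a minimal sketch with hypothetical coordinates and grades. Inverse-distance weighting factors are an assumption of this sketch, chosen as one simple way to weight by distance.

```python
import math


def distance_weighted_average(samples, point):
    """Inverse-distance weighted average at a selected point.

    samples: list of ((x, y), grade) pairs
    point:   (x, y) coordinates of the selected position
    """
    weighted = []
    for (x, y), grade in samples:
        distance = math.hypot(x - point[0], y - point[1])
        if distance == 0.0:
            return grade  # the point coincides with a sample
        weighted.append((1.0 / distance, grade))
    total = sum(w for w, _ in weighted)
    return sum(w * g for w, g in weighted) / total


# Two hypothetical test results, 10 m apart.
samples = [((0.0, 0.0), 1.0), ((10.0, 0.0), 3.0)]

# Every selected position yields another point grade; three are shown.
for x in (2.5, 5.0, 7.5):
    print(x, distance_weighted_average(samples, (x, 0.0)))
```

Each of the three point grades is a function of the same two measured values. The loop could run over any number of positions without adding a single measured value to the set.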

Step-out drilling at Busang’s South East Zone produced nine (9) salted holes on SEZ-44 and eleven (11) salted holes on SEZ-49. Interpolation by kriging gave three (3) lines with nine (9) kriged holes each. Following is the YX plot for Bre-X’s salted and kriged holes.

Fisher’s F-test is applied to verify spatial dependence. The test is based on comparing the observed F-value between the variance of a set and the first variance of the ordered set with tabulated F-values at different probability levels and with applicable degrees of freedom. Neither set of salted holes displays a significant degree of spatial dependence. By contrast, the observed F-values for sets of kriged holes seem to imply a high degree of spatial dependence.

If I didn’t know kriged holes were functions of salted holes, then I would infer a high degree of spatial dependence between kriged holes but randomness between salted holes. Surely, it’s divine to create order where chaos rules! But do Crirsconians ever wonder about Excel functions such as CHIINV, FINV, and TINV? Wouldn’t Weatherstone want to have a metallurgist with a good grasp of classical statistics on his team?

High variances give low degrees of precision. I like to work with confidence intervals in relative percentages because it is easy to compare precision estimates at a glance. SEZ-44 gives 95% CI= ±23.5%rel whereas SEZ-49 gives 95% CI= ±26.4%rel. By contrast, low variances give high degrees of precision. Three (3) lines of kriged holes give confidence intervals of 95% CI= ±0.8%rel to 95% CI= ±1.6%rel. Crirsconians should know not only how to verify spatial dependence by applying Fisher’s F-test but also how to count degrees of freedom. Kriging variances just cannot help but go up and down like yoyos!
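A confidence interval in relative percent can be sketched in Python as follows. This is a minimal sketch with hypothetical grades, not the Busang data. It takes the interval for the arithmetic mean and uses the normal approximation for the 95% factor. A t-value for n − 1 degrees of freedom, such as Excel's TINV returns, would give a somewhat wider interval for small sets.

```python
from statistics import NormalDist, mean, variance


def relative_ci_95(values):
    """95% confidence interval for the arithmetic mean, in % relative."""
    n = len(values)
    z = NormalDist().inv_cdf(0.975)      # about 1.96, normal approximation
    var_mean = variance(values) / n      # variance of the arithmetic mean
    return 100.0 * z * var_mean ** 0.5 / mean(values)


# Hypothetical sets with the same mean but different variances.
scattered = [1.0, 3.0, 1.0, 3.0]
tight = [1.9, 2.1, 1.9, 2.1]

print(round(relative_ci_95(scattered), 1))
print(round(relative_ci_95(tight), 1))
```

The set with the higher variance gives the wider interval, which is the point of comparing precision estimates in relative percentages.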

Tuesday, July 29, 2008

Going GIGO with CRIRSCO

Snappy acronyms add spice to the way we blog and talk. GIGO has been tagging along with computing science without losing its punch. CRIRSCO is but one tongue-twisting tour de force for Combined Reserves International Reporting Standards Committee. Its Chairman is Niall Weatherstone of Rio Tinto. Larry Smith of Vale Inco asked Weatherstone about Setting International Standards. Weatherstone said CRIRSCO was set up in 1993 but its website says it was 1994. CRIRSCO's website makes a tough read because of its dreadfully long lines. So what have Weatherstone and his Crirsconians been doing during all those years?

Smith should have but didn’t ask what CRIRSCO has accomplished. It would seem some sort of semi-international reporting template has been set up. The problem is the Russian Federation has a code of its own, and China’s is sort of similar. As it stands, Crirsconians have yet to develop valuation codes for mineral properties. At the present pace, valuation codes that give unbiased confidence limits for contents and grades of reserves and resources might be ready in 2020, the year of perfect vision. It had better be based on classical statistics!

Here’s what was happening in my life when CRIRSCO came about either in 1993 or in 1994. I talked to CIM Members in Vancouver, BC, about the use and abuse of statistics in ore reserve estimation. Bre-X Minerals raised money to acquire the Busang property. Clark wanted me to go from Zero to Kriging in 30 Hours at the Mackay School of Mines. I didn’t go because her semi-variograms are rubbish. The international forum on Geostatistics for the Next Century at McGill University didn’t want to hear about The Properties of Variances. David S Robertson, PhD, PEng, CIM President, failed to, “… find support for your desire to debate.” What irked me was Jean-Michel Rendu’s 1994 Jackling Lecture on Mining geostatistics - Forty years passed. What lies ahead? He rambled on about, “…an endless list of other ‘kriging’ methods…” and prophesied geostatistics, “… is here to stay with all its strengths and weaknesses.” At that time, Rendu knew about infinite sets of kriged estimates and zero kriging variances.

Rendu’s lecture stood in sharp contrast to A Geostatistical Monograph of The Mining and Metallurgical Society of America. Robert Shurtz, a mining engineer and a friend of mine, wrote The Geostatistics Machine and the Drill Core Paradox. Harry Parker, a Stanford-bred geostat sage, was to find fault in Shurtz’s work. This great debate got nowhere because neither grasped the properties of variances. Otherwise, both of them could have put in plain words why kriging variances drop off. A few of Parker’s geostat pals had already found out why in 1989.

Figure 2 is rather odd in the sense that, “The kriging variance rises up to a maximum and then drops off.” That’s precisely what Armstrong and Champigny wrote in A Study of Kriging Small Blocks published in CIM Bulletin of March 1989. What I saw kriging variances do is what real variances never do. Armstrong and Champigny alleged kriging variances drop off because mine planners over-smooth small blocks. More research brought to light that kriged block estimates and actual grades were “uncorrelated.” That would make a random number generator of sorts for kriged block grades. It was David himself who approved that blatant nonsense for publication in CIM Bulletin.

Figure 2 gives kriging variances as a function of variogram ranges. As such, it was more telling than Parker’s. Neither Shurtz nor Parker scrutinized Armstrong and Champigny’s 1989 A Study of Kriging Small Blocks. Otherwise, Shurtz might have pointed out Parker’s kriging variances looked a touch over-smoothed. Neither did Parker confess he does over-smooth the odd time.

Corrected and uncorrected sampling variograms for Bre-X’s bonanza grade borehole BSSE198 show where spatial dependence between bogus gold grades of crushed, salted and ordered core samples from this borehole dissipates into randomness. The adjective “corrected” implies that the variance of selecting a test portion of a crushed and salted core sample, and the variance of analyzing such a test portion, are extraneous to the in situ variance of gold in Bre-X’s Busang resource. Subtracting the sum of extraneous variances gives an unbiased estimate for the intrinsic variance of bogus gold in Busang’s phantom gold resource. Fisher’s F-test proved this intrinsic variance to be statistically identical to zero.
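The correction described above is plain arithmetic, and a minimal sketch may make it easier to follow. The function names and all figures below are hypothetical and are not Bre-X data. The sketch assumes the extraneous variances are independent and additive, and the tabulated F-value has to be looked up separately and passed in.

```python
def intrinsic_variance(total_variance, extraneous_variances):
    """Total variance minus the sum of the extraneous variances."""
    return total_variance - sum(extraneous_variances)


def observed_f(total_variance, extraneous_variances):
    """Ratio of the total variance to the sum of the extraneous variances."""
    return total_variance / sum(extraneous_variances)


def intrinsic_is_zero(total_variance, extraneous_variances, f_critical):
    """True if the observed F-value stays below the tabulated F-value.

    f_critical is an assumption supplied by the caller from an F-table
    for the applicable degrees of freedom.
    """
    return observed_f(total_variance, extraneous_variances) < f_critical


# Hypothetical figures for illustration only
total = 10.0                 # variance of the ordered set of assays
selection, analysis = 6.0, 3.0  # test portion selection and analysis
print(intrinsic_variance(total, [selection, analysis]))
print(intrinsic_is_zero(total, [selection, analysis], f_critical=2.0))
```

When the observed ratio stays below the tabulated value, whatever is left after subtraction cannot be told apart from zero.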

Harry Parker and Jean-Michel Rendu appear to speak for the Society for Mining, Metallurgy and Exploration (SME) in the USA. What it takes to cook up ballpark reserves and resources are soothsayers who know how to failingly infer mineralization between boreholes, hardcore krigers and cocksure smoothers. What CRIRSCO ought to have done after the Bre-X fraud is set up an ISO Technical Committee on reserve and resource estimation. It’s never too late to do it! GIGO may be a bit dated but Garbage In does stand the test of time. Nowadays, Good Graphics Bad Statistics Out is a much more likely outcome. What a pity that GIGGBSO lacks GIGO’s punch!

Saturday, July 12, 2008

Hooked on junk statistics

Our parents told us not to put all our eggs in one basket. This lesson has passed the test of time ever since the Easter Bunny got to working with real eggs. The world’s mining industry has put all its eggs in a basket full of junk statistics and got egg on its façade. Junk statistics does not give unbiased confidence limits for grades and contents of mineral reserves and resources. Annual reports, unlike opinion polls, do not sport 95% confidence intervals and ranges as a measure for the risks mining investors encounter. Many years ago I put classical statistics in my own basket. I thought I couldn’t go wrong because Sir Ronald A Fisher was knighted in 1953. But was I wrong? Matheron, who is called The creator of geostatistics, knew very little about variances, and even less about the properties of variances.

Matheron deserved some credit because he didn’t put all core samples of a single borehole in one basket. He would have lost all his degrees of freedom but wouldn't have missed them anyway. He did derive the length-weighted average grade of a set of grades determined in core samples of variable lengths. What he didn’t derive was the variance of this length-weighted average. Matheron wrote a Synopsis for Gy’s 1967 Minerals sampling. Gy, in turn, referred to Visman’s 1947 thesis on the sampling of coal, and to his 1962 Towards a common basis for the sampling of materials. Visman bridged the gap between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. Matheron never knew there was a gap.
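Here is a minimal sketch of what deriving both the length-weighted average and its variance could look like. It assumes the grades are independent and share one variance, estimated from the set itself, so that the variance of the weighted average is that variance multiplied by the sum of squared weights. The function names and the grades and lengths are hypothetical and are not Matheron's data.

```python
from statistics import variance


def length_weights(lengths):
    """Weighting factors proportional to core length; they sum to one."""
    total = sum(lengths)
    return [length / total for length in lengths]


def length_weighted_average(grades, lengths):
    """Length-weighted average grade of a set of core samples."""
    weights = length_weights(lengths)
    return sum(w * g for w, g in zip(weights, grades))


def variance_of_weighted_average(grades, lengths):
    """Variance of the length-weighted average.

    Assumes independent grades with a common variance var(x), so that
    var(weighted average) = var(x) * sum(w_i ** 2).
    """
    weights = length_weights(lengths)
    return variance(grades) * sum(w * w for w in weights)


# Hypothetical grades (%) and core lengths (m)
grades = [1.0, 2.0, 3.0, 4.0]
lengths = [1.0, 2.0, 3.0, 4.0]
print(length_weighted_average(grades, lengths))
print(variance_of_weighted_average(grades, lengths))
```

With equal lengths every weight is 1/n, and the result collapses to the familiar var(x)/n of the arithmetic mean.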

Should a set of primary increments be put in one basket? Or should it be partitioned into a pair of subsets? Gy proposed in his 1977 Sampling of Particulate Matter that a set of primary increments be treated as a single primary sample. He claimed the variance of a primary sample mass derives from the average mass and number of primary increments in a set, the properties of the binomial distribution, and some kind of sampling constant. I explained in Sampling in Mineral Processing why Gy’s sampling theory and his sampling constant should be consumed with a few grains of salt.

When I met G G Gould for the first time at the Port of Rotterdam in the mid 1960s, he told me how Visman’s sampling theory impacted his work on ASTM D2234-Collection of a Gross Sample of Coal. Visman’s sampling experiment is described in this ASTM Standard Method. Visman’s 1947 thesis taught me that the sampling variance is the sum of the composition variance and the distribution variance. I got to know Jan Visman in person here in Canada. I treasure my copy of his thesis. I enjoyed his sense of humor when we were griping about those who try to play games with the rules of classical statistics.

On-stream data for slurries and solids taught me all I needed to grasp about spatial dependence in sampling units and sample spaces. Fisher’s F-test is applied to test for spatial dependence, to chart a sampling variogram, and to optimize a sampling protocol.
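A minimal sketch of how the variance terms of an ordered set, a sampling variogram and the observed F-value might be computed follows. It assumes equally spaced measurements and defines the variance term at a given lag as half the mean squared difference between values that far apart. The function names and data are hypothetical, and the tabulated F-value for the applicable degrees of freedom must still be looked up separately.

```python
from statistics import variance


def variance_term(values, lag):
    """Variance term of the ordered set at the given lag (lag 1 = first term)."""
    n = len(values)
    if not 1 <= lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    squared = sum((values[i + lag] - values[i]) ** 2 for i in range(n - lag))
    return squared / (2 * (n - lag))


def sampling_variogram(values, max_lag):
    """Variance terms of the ordered set for lags 1 through max_lag."""
    return [variance_term(values, lag) for lag in range(1, max_lag + 1)]


def observed_f(values):
    """Ratio of the variance of the set to the first variance term.

    A ratio above the tabulated F-value points to spatial dependence
    in the ordered set.
    """
    return variance(values) / variance_term(values, 1)


# Hypothetical ordered on-stream measurements
ordered = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(sampling_variogram(ordered, 3))
print(observed_f(ordered))
```

A steadily trending set such as this one gives a first variance term far below the variance of the set, which is exactly what a high degree of spatial dependence looks like.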

Selecting interleaved primary samples by partitioning the set of primary increments into odd- and even-numbered subsets is described in several ISO standards. A pair of A- and B-primary samples gives a single degree of freedom but putting all primary increments in one basket gives none. Shipments of bulk solids are often divided into sets of lots so that lower t-values than t0.05;1 = 12.706 apply. Those who do not respect degrees of freedom as much as statisticians do may cling to the notion that the cost for preparing and testing a second test sample is too high a price for some invisible degree of freedom. They just don't grasp why confidence limits and degrees of freedom belong together as much as do ducks and eggs.
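The odd-and-even partition and the single degree of freedom it buys can be sketched in a few lines. The sketch assumes increments of equal mass, so that the mean of each subset stands in for the test result of that primary sample. The function names and increment values are hypothetical, and the t-value of 12.706 is the one quoted above for one degree of freedom.

```python
from math import sqrt

T_95_ONE_DF = 12.706  # t0.05;1 as quoted in the text


def interleave(increments):
    """Split an ordered set into odd-numbered (A) and even-numbered (B) subsets."""
    return increments[0::2], increments[1::2]


def interleaved_confidence_limits(increments, t_value=T_95_ONE_DF):
    """95% confidence limits for the mean of a pair of interleaved samples.

    Assumes equal-mass increments. Returns (lower, mean, upper).
    """
    a, b = interleave(increments)
    if not a or not b:
        raise ValueError("need at least two increments")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    mean = (mean_a + mean_b) / 2
    # Variance of a single primary sample from one pair: one degree of freedom
    var_single = (mean_a - mean_b) ** 2 / 2
    # Variance of the mean of the two primary samples
    var_mean = var_single / 2
    half_width = t_value * sqrt(var_mean)
    return mean - half_width, mean, mean + half_width


# Hypothetical ordered increments
print(interleaved_confidence_limits([10.0, 12.0, 11.0, 13.0]))
```

The wide limits are the price of having only one degree of freedom, which is why dividing a shipment into several lots pays off.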

The interleaved sampling protocol gives a reliable estimate for the total variance at the lowest possible cost. It takes into account var2(x), the second variance term of the ordered set. It makes sense to take interleaved bulk samples in mineral exploration because they give realistic estimates for intrinsic variances in sample spaces. Both Visman and Volk, the author of Applied Statistics for Engineers, were conversant with classical statistics. The geostatistical fraternity made up some new rules and fumbled a few others. They got hooked on a basket of junk statistics and are doomed to end up with egg on their faces.