Monday, September 29, 2008

Going green and gone nuts

Is our world going green? It may be a long while before we know. That’s because scores of geoscientists have gone nuts and work with junk statistics. In Canada, too, geoscientists would rather infer than test for spatial dependence in sampling units and sample spaces. The more so since it’s all in The Inspector’s Field Sampling Manual. Nobody should have to read it. Not even EC’s own inspectors. I had to in the early 2000s because Environment Canada had taken a client of mine to court. It was about my statistical analysis of test results determined in interleaved primary samples. So I worked my way through EC’s manual and found all sorts of sampling methods. What I didn’t find was the interleaved sampling method. I had put this method on my list of smart statistics long before global warming got hot.


Here’s what I did find out when I struggled with EC’s manual. Inspectors are taught, “Systematic samples taken at regular time intervals can be used for geostatistical data analysis, to produce site maps showing analyte locations and concentrations. Geostatistical data analysis is a repetitive process, showing how patterns of analytes change or remain stable over distances or time spans.”


Geostatistics already rubbed me the wrong way long before it converted Bre-X’s bogus grades and Busang’s barren rock into a massive phantom gold resource. In fact, Matheron’s new science of geostatistics has been a thorn in my side for some twenty years. That sort of junk statistics still runs rampant in the Journal for Mathematical Geosciences. Just the same, EC’s field inspectors read under Systematic (Stratified) Sampling, “1) shellfish samples taken at 1-km intervals along a shore, 2) water samples taken from varying depths in the water column.” Numerical examples are missing as much in A Sampling Manual and Reference Guide for Environment Canada Inspectors as they were throughout Matheron’s seminal work. Not all of EC’s geoscientists know as little about testing for spatial dependence in sampling units and sample spaces as do those who cooked up The Inspector’s Field Sampling Manual.


In his letter of October 15, 1992, to Dr R Ehlich, Editor, Journal for Mathematical Geology, Stanford's Professor Dr A G Journel claimed, “The very reason for geostatistics or spatial statistics in general is the acceptance (a decision rather) that spatially distributed data should be considered a priori as dependent one to another, unless proven otherwise.” He believed that my anger “arises fro [sic] a misreading of geostatistical theory, or a reading too encumbered by classical ‘Fischerian’ [sic] statistics.” JMG’s Editor advised me in his letter of October 26, 1992, “Your feeling that geostatistics is invalid might be correct.”


Each and every geoscientist on this planet ought to know how to test for spatial dependence and how to chart sampling variograms that show where spatial dependence in our own sample space of time dissipates into randomness. Following is an Excel spreadsheet template that shows how to apply Fisher’s F-test. Geoscientists should figure out why Excel's FINV-function requires the number of degrees of freedom both for the set and for the ordered set.
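For readers without Excel at hand, here is a minimal Python sketch of the same kind of test. It compares the variance of a set of measured values with the first variance term of the ordered set, computed from squared differences between consecutive values. The function names and the sample values are hypothetical, and taking 2(n - 1) degrees of freedom for the ordered set is an assumption made for illustration; the text above leaves that count for the reader to work out.

```python
from statistics import variance


def first_variance_term(ordered):
    """Variance estimated from squared differences between neighbours.

    Each squared difference between consecutive values estimates twice
    the variance, hence the factor 2 in the divisor.
    """
    squared_diffs = [(b - a) ** 2 for a, b in zip(ordered, ordered[1:])]
    return sum(squared_diffs) / (2 * len(squared_diffs))


def f_ratio_for_dependence(ordered):
    """Return (F ratio, df of the set, assumed df of the ordered set)."""
    n = len(ordered)
    var_set = variance(ordered)          # variance of the set, n - 1 df
    var_first = first_variance_term(ordered)
    df_set = n - 1
    df_ordered = 2 * (n - 1)             # assumption, not taken from the source
    return var_set / var_first, df_set, df_ordered


# Hypothetical measured values, listed in their order of sampling.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
ratio, df_set, df_ordered = f_ratio_for_dependence(values)
print(ratio, df_set, df_ordered)
```

The observed ratio would then be compared with a tabulated F value, for instance the result of Excel's FINV(0.05, df_set, df_ordered). A ratio above that value suggests that the ordered values are not randomly distributed.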



Of course, it’s easy to become a geostatistically smart geoscientist. All it takes is to infer spatial dependence between measured values, interpolate by kriging, select the least biased subset of some infinite set of kriged estimates, smooth its kriging variance to perfection, and rig the rules of real statistics with impunity. All but a few of those who have gone nuts and work with junk statistics have written books about geostatistics!

Friday, September 19, 2008

Metrology in mining and metallurgy

A poster in my office reads, “Metrology, the Science of Measurement.” It’s a bit faded because I’ve had it for so long. Standards Council of Canada had it printed for educational purposes. I got my poster with a set of slides about international units of measure. Most of them have since been redefined. The famous platinum-iridium artifact that has so long defined the International Unit of Mass is about to bite the dust. A sphere of pure silicon will take its place. The famous Central Limit Theorem has stood the test of time since Abraham de Moivre (1667-1754) brought to the world The Doctrine of Chances. De Moivre’s work underpins both sampling theory and sampling practice. His work is bound to stand the test of time until our planet runs out of it.

The science of measurement has always played a key role in my work. That’s why I put together Sampling and Weighing of Bulk Solids after I had completed my assignment with Cominco Ltd. I was pleased to see it in print in 1985. What pleased me even more was that ISO Technical Committee 183–Copper, lead, zinc and nickel ores and concentrates approved an ISO standard method based on deriving confidence intervals and ranges for metal contents of concentrate and ore shipments.

Several years later I got a slim paperback the cover of which I didn’t recognize. What I did recognize inside of it were my own charts and graphs embedded between Chinese characters. A friend of mine told me it was a Mandarin translation printed on rice paper. My book is protected by copyright but I have yet to be paid a single yuan. Teaching innovative sampling practices and sound statistical methods ranks much higher on my list of things to do than becoming a small-c capitalist.


Sampling and Weighing of Bulk Solids
Mandarin translation, November 1989

My son and I were pleased when Precision Estimates for Ore Reserves was praised by Erzmetall and published in its October 1991 issue. The more so since peer reviewers in Canada, the USA and Britain did reject that very paper. One of CIM Bulletin’s reviewers spotted a lack of references to geostatistical literature. The other was ticked off because we were not “...relying on the abundant geostatistical literature...” We had found out that geostatisticians do not explain how to derive confidence intervals and ranges for metal contents of in-situ ore. So we did in our paper and submitted it to CIM Bulletin on September 28, 1989.

Both of us had taken statistics courses at the same university but at different times. Ed leads the Eclipse Modeling Framework project and co-leads the Eclipse Modeling project. He is a coauthor of the authoritative book EMF: Eclipse Modeling Framework, which is nearing completion of a second edition. He is an elected member of the Eclipse Foundation Board of Directors and has been recognized by the Eclipse Community Awards as Top Ambassador and Top Committer. Ed is currently interested in all aspects of Eclipse modeling and its application and is well recognized for his dedication to the Eclipse community, posting literally thousands of newsgroup answers each year. He spent 16 years at IBM, achieving the level of Senior Technical Staff Member after completing his Ph.D. at Simon Fraser University. He has started his own small company, Macro Modeling, is a partner of itemis AG, and serves on Skyway Software’s Board of Advisors. His experience in modeling technology spans 25 years.

I was proud to have his pre-IBM credentials printed on the backside of Part 1– Precision and Bias for Mass Measurement Techniques. I shall convert all Lotus 1-2-3 files into Excel files and post them on my website. Some time ago Dr W E Sharp, the Editor-in-Chief for what was recently renamed the Journal of Mathematical Geosciences, wanted Dr Ed Merks to review papers on computer applications. Sharp asked me to write a paper on testing for spatial dependence by applying Fisher’s F-test. I did but we couldn’t agree on degrees of freedom for ordered sets.


Metrology in Mining and Metallurgy
First part but also the last

After Part 1 was completed in 1992 I went to work on Part 2– Precision and Bias for Ore Reserves. It was coming along nicely until Barrick Gold asked me in December 1996 to look at Bre-X’s test results for gold in crushed core and Lakefield’s test results for gold in library core. The hypothesis that 2.9 m crushed core and 0.1 m library core were once part of the same 3.0 m whole core proved to be highly improbable. CIM’s statistically dysfunctional but otherwise qualified persons were not at all keen to know how Bre-X’s salting scam could have been avoided altogether. Surely, life after Bre-X couldn’t have been any more bizarre. But that’s another story altogether!

The ISO copyright office in Geneva, Switzerland, suggests that it holds the copyright to ISO/FDIS 12745:2007(E)–Precision and bias of mass measurement techniques. Yet, this ISO standard is a verbatim copy of Part 1–Precision and bias for mass measurement techniques. Part 1 is supposed to be protected by Canadian copyright. So what gives? Didn’t ISO have to ask permission to reprint? What’s this world coming to when ISO violated Canadian copyright in 2007 just as much as China did in 1992?

What Ed and I have decided to do is put together a paper on Metrology in Mineral Exploration. I want to present it at APCOM 2009 in Vancouver, BC. Home sweet home! Maybe I’ll talk Ed into coming home for a while. I’ll have to post an abstract before the deadline. By the way, APCOM stands for Applications of Computers and Operations Research in the Mineral Industry. Acronym talk does make a lot of sense, doesn’t it?

Monday, September 01, 2008

Lord Kelvin cool to assumptions

Lord Kelvin (William Thomson, 1824-1907) was a brilliant scientist and an innovative engineer. His honorific name is forever linked to the absolute temperature of zero degrees Kelvin. His work often called for all sorts of variables to be measured. Here's what he once said, “…when you can measure what you are speaking about, and express it in numbers, you know something about it, but when you cannot express it in numbers your knowledge is of the meagre and unsatisfactory kind…” Lord Kelvin’s view struck a chord with me because of the Dutch truism, “Meten is weten.” It translates into something like, “To measure is to know.” It may have messed up a perfect rhyme but didn’t impact good sense. And it's a leitmotif in my life!

Lord Kelvin knew all about degrees Kelvin and degrees Celsius. But he couldn’t have been conversant with degrees of freedom because Sir Ronald A Fisher (1890-1962) was hardly his contemporary. Lord Kelvin might have wondered why today's geoscientists would rather assume spatial dependence than measure it. Sir Ronald A Fisher could have verified spatial dependence by applying his ubiquitous F-test to the variance of a set of measured values and the first variance term of the ordered set. He may not have had time to apply that variant of his F-test because of his conflict with Karl Pearson (1857-1936). It was Fisher in 1928 who added degrees of freedom to Pearson’s chi-square distribution.

Not all students need to know as much about Fisher's F-test as do those who study geosciences. The question is why geostatistically gifted geoscientists would rather assume spatial dependence than measure it. How do they figure out where orderliness in our own sample space of time dissipates into randomness? Sampling variograms, unlike semi-variograms, cannot be derived without counting degrees of freedom. So much concern about climate change and global warming. So little concern about sound sampling practices and proven statistical methods!

I derived sampling variograms for the set that underpins A 2000-Year Global Temperature Reconstruction based on Non-Tree Ring Proxies. I downloaded the data that covers Year 16 to Year 1980, and derived corrected and uncorrected sampling variograms. The corrected sampling variogram takes into account the loss of degrees of freedom during reiteration. I transmitted both to Dr Craig Loehle, the author of this fascinating study. Excel spreadsheet templates on my website show how to derive uncorrected and corrected sampling variograms.
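What follows is a rough Python sketch of how an uncorrected sampling variogram might be computed. It is not the Excel template itself. It assumes that the variance term at each lag is the sum of squared differences between values that many positions apart, divided by twice the number of pairs. It also uses a crude stand-in for the point where spatial dependence dissipates: the first lag at which a variance term reaches the variance of the set. The function names and the short series are hypothetical, and the correction for lost degrees of freedom is not attempted.

```python
from statistics import variance


def variance_term(ordered, lag):
    """Variance term at a given lag, from squared differences of pairs."""
    squared_diffs = [(b - a) ** 2 for a, b in zip(ordered, ordered[lag:])]
    return sum(squared_diffs) / (2 * len(squared_diffs))


def uncorrected_variogram(ordered, max_lag):
    """List of (lag, variance term) for lags 1 through max_lag."""
    return [(lag, variance_term(ordered, lag)) for lag in range(1, max_lag + 1)]


def first_lag_reaching_set_variance(ordered, max_lag):
    """First lag whose variance term is at least the variance of the set."""
    var_set = variance(ordered)
    for lag, term in uncorrected_variogram(ordered, max_lag):
        if term >= var_set:
            return lag
    return None  # no such lag within max_lag


# Hypothetical ordered series, standing in for yearly temperature values.
series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
for lag, term in uncorrected_variogram(series, 3):
    print(lag, round(term, 3))
print(first_lag_reaching_set_variance(series, 3))
```

A fuller treatment would compare each variance term with the variance of the set by means of an F-test rather than a plain inequality.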

Uncorrected sampling variogram

Spatial dependence in this uncorrected sampling variogram dissipates into randomness at a lag of 394 years. The variance of the set gives 95% CI = +/-1 centigrade between consecutive years. The first variance term of the ordered set gives 95% CI = +/-0.1 centigrade between consecutive years.

Corrected sampling variogram

Spatial dependence in the corrected sampling variogram dissipates into randomness at a lag of 294 years. It is possible to derive 95% confidence intervals anywhere within this lag.

Sampling variograms are part of my story about the junk statistics behind what was once called Matheron's new science of geostatistics. I want to explain its role not only in mineral reserve and resource estimation in the mining industry but even more so in measuring climate change and global warming. Classical statistics turned into junk statistics under the guidance of Professor Dr Georges Matheron (1930-2000), a French probabilist who turned into a self-made wizard of odd statistics. A brief history of Matheronian geostatistics is posted on my blog. My 20-year campaign against the geostatocracy and its army of degrees of freedom fighters is chronicled on my website. Agterberg ranked Matheron on a par with giants of mathematical statistics such as Sir Ronald A Fisher (1890-1962) and Professor Dr J W Tukey (1915-2000). Agterberg was wrong! Matheron fumbled the variance of the length-weighted average grade of core samples of variable lengths in 1954. Agterberg himself fumbled the variance of his own distance-weighted average point grade in his 1970 Autocorrelation Functions in Geology and again in his 1974 Geomathematics.
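To make the point about weighted averages concrete, here is a small Python sketch of the textbook variance of a weighted average. It assumes that the measured values are independent and that each has a known variance, which is a simplification made for illustration only. The formula is the standard one for a linear combination of independent variables and is not quoted from any of the works named above. The core lengths, grades and variances are hypothetical.

```python
def weighted_average(values, weights):
    """Weighted average with weights normalized to sum to one."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def variance_of_weighted_average(variances, weights):
    """Variance of a weighted average of independent values.

    Each value's variance is multiplied by the square of its
    normalized weight, and the products are summed.
    """
    total = sum(weights)
    return sum((w / total) ** 2 * var for w, var in zip(weights, variances))


# Hypothetical core samples: lengths serve as weights.
lengths = [1.0, 1.0, 2.0]
grades = [2.0, 4.0, 3.0]
grade_variances = [0.4, 0.4, 0.4]

print(weighted_average(grades, lengths))
print(variance_of_weighted_average(grade_variances, lengths))
```

The same function applies to distance-weighted averages if the weights are derived from distances instead of lengths.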

Agterberg seems to believe it's too late to reunite his distance-weighted average point grade and its long-lost variance. I disagree because it's never too late to right a wrong. What he did do was change the International Association for Mathematical Geology into the International Association for Mathematical Geosciences. Of course, geoscientists do bring in more dollars and cents than did geologists alone. I have made a clear and concise case that sound sampling practices and proven statistical methods ought to be taught at all universities on this planet. Time will tell whether or not such institutions of higher learning agree that functions do have variances, and that Agterberg's distance-weighted average point grade is no exception!