Friday, April 13, 2012

When to work with Markov chains

Once upon a time a keen geologist measured the degree of associative dependence between lead and silver in lead ore. Next, he put on paper Formule des minerais connexes and called it Note statistique No 1. He didn’t report his primary data but did correct an error. In time, he became famous. So much so that he set up the Centre de Géosciences/Géostatistique at Fontainebleau, France. Professor Dr G Matheron will be remembered either as the creator of geostatistics or as the founder of spatial statistics. As fate would have it, never in his life did he test for spatial dependence between measured values in ordered sets by applying Fisher’s F-test.


Professor Dr Georges Matheron (1930-2000)
Creator of geostatistics
Founder of spatial statistics
Abuser of applied statistics
Matheron’s most gifted disciple was Dr A G Journel. He put forward on October 15, 1992, “The very reason for geostatistics or spatial statistics in general is the acceptance (a decision rather) that spatially distributed data should be considered a priori as dependent one to another, unless proven otherwise”. It was a prima facie case of circular logic. He did respond to a request from Professor Dr Robert Ehrlich, Editor, Mathematical Geology. Stanford’s Journel also deemed my reading too encumbered with classical “Fischerian” statistics. So the coauthor of Mining Geostatistics put forward, “In presence of dependence the classical notion of degrees of freedom vanishes: n spatially dependent data do not provide n degrees of freedom”.
Now that’s where I didn't see eye to eye with Professor Dr A G Journel. A set of n measured values always gives df=n-1 degrees of freedom whereas an ordered set of n measured values gives dfo=2(n-1) degrees of freedom for the first variance term. Degrees of freedom are positive integers for sets of measured values with the same weight but positive real numbers, generally not integers, for sets of measured values with variable weights. The concept of degrees of freedom has left little space for ifs and buts!
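The counting above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes that the first variance term of an ordered set is the sum of squared differences between successive values divided by dfo=2(n-1), which the post does not spell out. The function names are hypothetical. The observed F-ratio would still have to be compared with a tabulated critical value for df and dfo degrees of freedom at a chosen probability level.

```python
from statistics import mean


def variance_of_set(values):
    """Return (variance, df) for a set of n equally weighted values, df = n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measured values")
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (n - 1), n - 1


def first_variance_term(values):
    """Return (first variance term, dfo) for the ordered set, dfo = 2(n - 1).

    Assumption: the term is the sum of squared differences between
    successive values in the given order, divided by 2(n - 1).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measured values")
    ss = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    return ss / (2 * (n - 1)), 2 * (n - 1)


def f_ratio(values):
    """Return (observed F, df, dfo) for the variance of the set over the first term."""
    var_set, df = variance_of_set(values)
    var_first, dfo = first_variance_term(values)
    return var_set / var_first, df, dfo


if __name__ == "__main__":
    ordered = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values in sampling order
    print(f_ratio(ordered))
```

A steadily rising sequence such as the made-up one above gives a large ratio because successive values differ far less than the set as a whole. A randomly ordered set would tend to give a ratio near one.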
Index A. Geostatistical Concepts in my copy of the 1978 Mining Geostatistics does not list Degrees of freedom between Deconvolution and Discontinuity at the origin of a sampling variogram. Neither does it list Markov chains above Massive deposits. What went missing on Matheron’s watch was the variance of the distance-weighted average AKA kriged estimate. Incredibly, Matheron’s students never told him that too much was lost. On the contrary, the zero kriging variance of an infinite set of kriged estimates took on a silly life of its own. Stanford’s Professor Dr A G Journel may not have been as hot on Markov chains as McGill’s Professor Dr R Dimitrakopoulos is on their role in stochastic mine planning. Unbiased confidence limits for metal contents and grades of mineral deposits can only be derived with applied statistics.
Andrey Markov (1856-1922)
A Markov chain is a mathematical system that moves from one state to another among a finite or countably infinite number of possible states, with the next state depending only on the current one. One ought to peruse the properties of variances before toiling with Markov chains. Study what McGill’s Dr RD didn’t want to know about the properties of variances when Geostatistics for the Next Century came about at the McGill Conference Office on June 3-5, 1993. What a pity that deriving unbiased confidence limits for metal grades and contents of ore reserves is still beyond Dr RD’s grasp.
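For readers who have not met one, a Markov chain can be sketched as a table of transition probabilities. The two states and all probabilities below are invented purely for illustration and are not taken from any mine plan or publication. The function names are hypothetical.

```python
import random

# Hypothetical two-state chain; each row must sum to one.
TRANSITIONS = {
    "ore": {"ore": 0.7, "waste": 0.3},
    "waste": {"ore": 0.2, "waste": 0.8},
}


def step_distribution(dist, transitions):
    """Propagate a probability distribution over the states by one transition."""
    nxt = {state: 0.0 for state in transitions}
    for state, p in dist.items():
        for target, q in transitions[state].items():
            nxt[target] += p * q
    return nxt


def simulate(start, transitions, steps, rng):
    """Draw one path; each next state depends only on the current state."""
    path = [start]
    for _ in range(steps):
        row = transitions[path[-1]]
        path.append(rng.choices(list(row), weights=list(row.values()))[0])
    return path


if __name__ == "__main__":
    print(step_distribution({"ore": 1.0, "waste": 0.0}, TRANSITIONS))
    print(simulate("ore", TRANSITIONS, 10, random.Random(2012)))
```

Nothing in such a simulation yields a variance or confidence limits for a grade or a metal content. It only generates sequences of states from the assumed probabilities.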
Count Leo Tolstoy (1828-1910)
“I know that most men, including those at ease with problems of the greatest complexity, can seldom accept even the simplest and most obvious truth if it be such as would oblige them to admit the falsity of conclusions which they have delighted in explaining to colleagues, which they have proudly taught to others, and which they have woven, thread by thread, into the fabric of their lives”.
Sir Ronald A Fisher (1890-1962) and Karl Pearson (1857-1936)
For quite a while these statisticians feuded about the chi-square distribution. Pearson worked with large data sets whereas Fisher worked with small data sets. Fisher was right! That's why the chi-square distribution takes degrees of freedom for small data sets into account. Take a long look at David’s 1977 Geostatistical Ore Reserve Estimation, Table 1.IV, Copper grade Prince Lyell. How about that? So why not reunite the distance-weighted average and its lost variance? Mining investors are bound to like it! In fact, Barrick Gold liked it before Bre-X's boss salter passed away.
Geostatistocrats such as Professor Dr Michel David (1945-2000) and UBC Emeritus Professor Dr Alastair J Sinclair, PEng, PGeo never got into counting degrees of freedom. Why is it that one-to-one correspondence between functions and variances is a sine qua non in applied statistics but irrelevant in geostatistics? Dr Michel David was once listed as a Deceased Fellow with the Royal Society of Canada. He is no longer listed but I still do not know why!
Some institutions of higher learning such as COSMO McGill Mining and Stanford University work with Markov chains to derive stochastic mining plans. What they cannot possibly derive are unbiased confidence limits for metal contents and grades of ore reserves. Geostatisticians stripped the variance off the distance-weighted average AKA kriged estimate. That’s how real functions got surreal variances!
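Reuniting the distance-weighted average with its variance can be sketched as follows. This is a minimal illustration under loud assumptions: the measured values are taken to be independent and to share one variance, the weights are inverse-distance weights that sum to one, and the t-value is looked up by the reader for the appropriate degrees of freedom. The function names and the power of 2 are hypothetical choices, not anything prescribed in the post.

```python
import math


def inverse_distance_weights(distances, power=2.0):
    """Return weights proportional to 1 / distance**power, scaled to sum to one."""
    raw = [1.0 / d ** power for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_average(values, weights):
    """Distance-weighted average of the measured values."""
    return sum(w * x for w, x in zip(weights, values))


def variance_of_weighted_average(weights, variance):
    """Variance of the weighted average: var(x) times the sum of squared weights.

    Assumption: the values are independent with a common variance.
    """
    return variance * sum(w * w for w in weights)


def confidence_limits(average, variance, t_value):
    """Lower and upper limits as average -/+ t * sqrt(variance)."""
    half = t_value * math.sqrt(variance)
    return average - half, average + half


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 4.0]      # made-up grades
    distances = [1.0, 1.0, 1.0, 1.0]   # made-up distances to the estimated point
    w = inverse_distance_weights(distances)
    avg = weighted_average(values, w)
    var_avg = variance_of_weighted_average(w, 4.0)  # 4.0 is an assumed var(x)
    print(avg, var_avg, confidence_limits(avg, var_avg, 2.0))
```

With equal distances the weights collapse to 1/n and the result is the familiar variance of the arithmetic mean, var(x)/n. If the test for spatial dependence shows that the ordered set is not independent, the independence assumption in this sketch no longer holds.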
