Wednesday, December 5, 2012

Are there better indicators for causality than correlations?

In a recent post concerning the use of analytics to do science, I hypothesized that the notion of cause and effect is an ill-suited tool for describing the effects of input parameters on complex systems. In other words, cause and effect are ideas tied to models, and many complex systems do not readily admit description by models. Now, to give fair warning, I have had no formal training in complexity or dynamical systems analysis, so most of what I write on these topics is an exploration of the relevant concepts to further my understanding: a sort of self-teaching, if you will.

I was therefore pleased to read the current thesis by Mark Buchanan in Nature Physics about work from dynamical systems theory concerning ways of determining causation in a complex system. He references an article from this year's Science journal (Science 338, 469-500; 2012) that contains an example of a model for the interaction of two species. The model consists of two equations, one for the population of each species, that contain coupling parameters linking the two populations. Even though the population of each species affects the other, the populations are uncorrelated in the long run because the model passes through periods of correlation, anti-correlation, and no correlation.
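To make the idea concrete, here is a minimal sketch of what such a two-species model might look like. It assumes a pair of coupled logistic maps; the functional form, the parameter values (rx, ry, bxy, byx), the starting populations, and the window length are all illustrative assumptions of mine, not values taken from Buchanan's piece. The code measures the correlation between the two populations in consecutive windows so that one can inspect how it drifts over time.

```python
def simulate(steps=1000, rx=3.8, ry=3.5, bxy=0.02, byx=0.1, x0=0.4, y0=0.2):
    """Iterate two coupled logistic maps (assumed form, illustrative parameters).

    bxy scales the effect of species y on species x; byx the reverse.
    """
    xs, ys = [x0], [y0]
    for _ in range(steps - 1):
        x, y = xs[-1], ys[-1]
        xs.append(x * (rx - rx * x - bxy * y))
        ys.append(y * (ry - ry * y - byx * x))
    return xs, ys


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    denom = (var_a * var_b) ** 0.5
    # Guard against a constant window, where the correlation is undefined.
    return cov / denom if denom > 0 else 0.0


def windowed_correlations(xs, ys, window=50):
    """Correlation of xs and ys in consecutive, non-overlapping windows."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(0, len(xs) - window + 1, window)
    ]


xs, ys = simulate()
corrs = windowed_correlations(xs, ys)
print("correlation per window:", [round(c, 2) for c in corrs])
print("correlation over the full run:", round(pearson(xs, ys), 2))
```

Comparing the per-window values with the full-run value is the point of the exercise: the two populations are coupled by construction, yet whether a correlation shows up depends on which stretch of time one looks at.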

This is an example of the maxim "causation does not imply correlation." (Of course, many of us with scientific training have been chided endlessly about the maxim's well-known converse.) On the face of it, this example seems to support my idea about causation.

However, the main focus of Buchanan's article is on finding descriptors for causation beyond correlations. As he states:
"Correlation alone isn't informative, nor is the lack of it. Might there be more subtle patterns, beyond correlations, that do really signify causal influence?"
He mentions two major works, one old and one new, that address this question. The old one, introduced by Clive Granger in 1969, states that two variables are causally linked if including one in a predictive scheme improves prediction of the other. The new work addresses problems with this idea and is somewhat technical, but it is capable of solving the problem of the two populations mentioned above, as well as one outstanding problem in ecology concerning the populations of two species of fish.
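Granger's idea can be sketched in a few lines. The example below is a simplified illustration, not a full Granger test: it assumes a single lag and ordinary least squares, and it compares the residual sum of squares when predicting one series from its own past alone against the residual when the other series' past is included. The synthetic data, the coefficients 0.5 and 0.8, the noise level, and the helper names are all made up for the demonstration.

```python
import random


def _center(values):
    """Subtract the mean, which stands in for fitting an intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def rss_restricted(target, own_lag):
    """Residual sum of squares when predicting target from its own lag only."""
    t, a = _center(target), _center(own_lag)
    coef = sum(u * v for u, v in zip(a, t)) / sum(u * u for u in a)
    return sum((v - coef * u) ** 2 for u, v in zip(a, t))


def rss_full(target, own_lag, other_lag):
    """Residual sum of squares when the other variable's lag is also used."""
    t, a, b = _center(target), _center(own_lag), _center(other_lag)
    saa = sum(u * u for u in a)
    sbb = sum(w * w for w in b)
    sab = sum(u * w for u, w in zip(a, b))
    sat = sum(u * v for u, v in zip(a, t))
    sbt = sum(w * v for w, v in zip(b, t))
    # Solve the 2x2 normal equations for the two regression coefficients.
    det = saa * sbb - sab * sab
    ca = (sat * sbb - sab * sbt) / det
    cb = (saa * sbt - sab * sat) / det
    return sum((v - ca * u - cb * w) ** 2 for u, w, v in zip(a, b, t))


def granger_ratio(cause, effect):
    """RSS(full) / RSS(restricted); values well below 1 mean that the past of
    `cause` improves one-step prediction of `effect`."""
    target, own, other = effect[1:], effect[:-1], cause[:-1]
    return rss_full(target, own, other) / rss_restricted(target, own)


# Synthetic data in which x drives y with a one-step delay (illustrative only).
random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [0.0]
for t in range(1, 500):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * random.gauss(0, 1))

print("x -> y ratio:", granger_ratio(x, y))
print("y -> x ratio:", granger_ratio(y, x))
```

A proper Granger analysis would use several lags and a statistical test rather than a raw ratio, so this should be read only as an illustration of the "does including it improve prediction?" criterion.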

I think now that my earlier conclusion about causality was wrong. I thought that a correlation needed to exist for there to be a causal link between two system parameters. Though I was paying heed to "correlation does not imply causation," I was ignorant of its converse, "causation does not imply correlation." Thus, causation can be an important concept for complex systems; we may only have to find better indicators than correlations.