From the linked article:
> The method, in the form of an algorithm, takes in data that have been collected over time, such as the changing populations of different species in a marine environment. From those data, the method measures the interactions between every variable in a system and estimates the degree to which a change in one variable (say, the number of sardines in a region over time) can predict the state of another (such as the population of anchovy in the same region).
I also read the introduction of the paper. Maybe I misunderstood something about causal inference, but I thought that from data alone one could only infer correlations or associations (in general). To talk about "causal" links, I thought you need either to assume a particular model of the data-generating process, or to perform interventions on the system, in order to decide the direction of the arrows in those "links".
I'm not saying that the paper is wrong or anything, it looks super useful! It's just that one should be careful when writing/reading the word "causal".
That's the first order truth. If you don't have any knowledge about the system, you can't infer causality with observations alone.
With some generic assumptions, or prior knowledge about the system, you can do causal discovery.
For example, just the assumption that there is additive random noise enables discovering causal arrows just by observing the system.
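For the curious, here's a toy sketch of that additive-noise idea. It assumes a made-up nonlinear system Y = X³ + noise, and uses a crude dependence proxy (correlation of squared residuals with the squared input) as a stand-in for a proper independence test like HSIC, so take it as an illustration rather than a real causal-discovery method:

```python
# Additive-noise-model (ANM) intuition: if Y = f(X) + noise with the
# noise independent of X, then regressing in the causal direction
# leaves residuals that look independent of the input, while the
# anti-causal direction generally does not.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 4000)
y = x**3 + 0.1 * rng.normal(size=4000)   # ground truth: X -> Y

def residual_dependence(inp, out):
    """Fit a cubic polynomial out ~ inp; return |corr(res^2, inp^2)|,
    a crude proxy for residual/input dependence."""
    coeffs = np.polyfit(inp, out, deg=3)
    res = out - np.polyval(coeffs, inp)
    return abs(np.corrcoef(res**2, inp**2)[0, 1])

dep_xy = residual_dependence(x, y)   # causal direction: low dependence
dep_yx = residual_dependence(y, x)   # anti-causal: structured residuals
```

Comparing `dep_xy` against `dep_yx` picks out the true direction here, because no cubic polynomial of Y can absorb the cube-root shape near zero, so the backward residuals inherit structure from Y.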
I was not aware of the additive noise part. I will have to look into that, thanks for the info!
Any citation on the additive random noise?
Correlation also assumes a model of the data-generating process, but you are correct in thinking that talking about causal links imposes even stronger assumptions on the model and data structure for making inference. Further, you have to take a very narrow and convenient interpretation of what causality means (e.g. it can't be at the level of individual samples, can't manifest through cycles or loops in the variables, etc.), which is an even more vexing philosophical question than the thorny questions in classical statistical inference.
You are correct, as far as I know. I'm wondering if there's some sense in which one can infer such a model from conditional correlations.
You can always restrict the meaning of "causality".
E.g. Granger causality means that A is typically detected before B and not the other way around (so not mere correlation). It's a moby useful concept.
"collected over time" is the operational phrase. You should be able to determine causality, at least partially, if you know that the changes in correlated variables occur at different times.
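As a rough illustration of that lagged-prediction idea, here's a bare-bones Granger-style check on synthetic data (just residual-variance comparison via least squares, not a proper F-test; the system is invented, with A driving B at lag 1):

```python
# Granger-style check: does the past of A reduce the prediction error
# of B beyond what B's own past already achieves -- and not vice versa?
import numpy as np

rng = np.random.default_rng(1)
n = 3000
a = rng.normal(size=n)                       # A: independent driver
b = np.zeros(n)
for t in range(1, n):                        # B depends on lagged A
    b[t] = 0.3 * b[t-1] + 0.8 * a[t-1] + 0.2 * rng.normal()

def residual_var(target, *predictors):
    """Least-squares fit of target on predictors (plus intercept);
    return the variance of the residuals."""
    X = np.column_stack([np.ones(len(target))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.var(target - X @ beta)

# Forward direction: past A should help predict B.
restricted = residual_var(b[1:], b[:-1])
full = residual_var(b[1:], b[:-1], a[:-1])

# Reverse direction: past B should NOT help predict A.
restricted_r = residual_var(a[1:], a[:-1])
full_r = residual_var(a[1:], a[:-1], b[:-1])
```

Here `full` drops well below `restricted`, while `full_r` barely moves, which is exactly the asymmetry Granger causality formalizes.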
Nature Communications paper link: https://www.nature.com/articles/s41467-024-53373-4
GitHub link: https://github.com/Computational-Turbulence-Group/SURD
> Decomposition of causality: It decomposes causal interactions into redundant, unique, and synergistic contributions.
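To make those three terms concrete, here's a toy decomposition for two binary sources and a binary target, using the crude Williams–Beer-style redundancy proxy min_i I(S_i; T). This is not SURD's actual estimator, just the bookkeeping of how redundant/unique/synergistic pieces are supposed to add up to the total information:

```python
# Partial-information-style bookkeeping on an XOR system, where all
# the information is synergistic: neither source alone tells you
# anything about the target, but both together determine it.
import numpy as np
from itertools import product

def mutual_info(joint):
    """I(X;Y) in bits from a 2-D joint probability table."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# XOR system: T = S1 ^ S2, all four source pairs equally likely.
p = np.zeros((2, 2, 2))                      # axes: (s1, s2, t)
for s1, s2 in product([0, 1], repeat=2):
    p[s1, s2, s1 ^ s2] = 0.25

i1 = mutual_info(p.sum(axis=1))              # I(S1; T)
i2 = mutual_info(p.sum(axis=0))              # I(S2; T)
i12 = mutual_info(p.reshape(4, 2))           # I((S1,S2); T)

redundant = min(i1, i2)                      # crude redundancy proxy
unique1 = i1 - redundant
unique2 = i2 - redundant
synergy = i12 - unique1 - unique2 - redundant
```

For XOR, `redundant`, `unique1`, and `unique2` all come out to zero and `synergy` is the full 1 bit, which is the canonical example of information that only exists in the joint observation.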
Seen elsewhere: https://github.com/BCG-X-Official/facet, which uses SHAP attributions as inputs:
> The SHAP implementation is used to estimate the shapley vectors which FACET then decomposes into synergy, redundancy, and independence vectors.
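For reference, the Shapley values underlying all of this can be computed exactly for tiny games by enumerating coalitions. This is a generic sketch with a made-up value function containing an interaction term, not FACET's or SHAP's API:

```python
# Exact Shapley values by brute-force coalition enumeration
# (feasible only for small n; SHAP exists because this is exponential).
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n):
    """Return the exact Shapley value of each of n players."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for coal in combinations(others, size):
                # weight of this coalition in the Shapley average
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                s = set(coal)
                phi[i] += w * (value_fn(s | {i}) - value_fn(s))
    return phi

# Toy value function: each player contributes its weight, plus a bonus
# when players 0 and 1 cooperate (an interaction/"synergy" term).
weights = [1.0, 2.0, 3.0]
def v(S):
    bonus = 1.0 if {0, 1} <= S else 0.0
    return sum(weights[i] for i in S) + bonus

phi = shapley_values(v, 3)
```

The interaction bonus gets split equally between players 0 and 1 (phi = [1.5, 2.5, 3.0]), and the values sum to v of the grand coalition; decompositions like FACET's try to pull such interaction shares back apart.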
But FACET is still about sorting things out in the 'correlation world'.
To get back to SURD: IMHO, when talking about causality one should incorporate some kind of precedence, or order; one thing is the cause of another. Here in SURD they sort of introduce it in a roundabout way via time's ordering:
> requiring only pairs of past and future events for analysis
But maybe we could have had fully-fledged custom DAGs, like from here https://github.com/nathanwang000/Shapley-Flow (which doesn't yet have the redundant/unique/synergistic decomposition)
Also, how do we deal with the undetectable "post hoc ergo propter hoc" fallacy (mistaking temporal order for causal order)? How do we deal with confounding? Custom DAGs would have been great.
I'm longing for a SURD/SHAP/FACET/Shapleyflow integration paper. We're so close to it.
Thanks. This should be the actual submission instead of the marketing-speak blah-blah
I'm in your camp, but I've found with my own submissions here, marketing-speak gets more engagement than the papers.
So my compromise is to post the PR, but give the paper link in the first comment
https://arxiv.org/pdf/2405.12411
The Nature Comm paper is open access too.
https://www.youtube.com/watch?v=kxh2X6NjuhY
Reminds me of Granger causality, which had a lot of hype at the time but not a lot of staying power. (I only read the main article, which was very high-level, not the scientific paper.)
Granger causality is a very restrictive and incomplete view of causality. Pearl's counterfactual system with do-calculus is a more general way to think about it. This SURD appears to be a souped-up version of Granger.
And the potential outcomes framework (Neyman-Rubin) is even more general :)
Either way, Holland's 'Statistics and Causal Inference' paper (1986) is a nice read on the different frameworks for causality, especially in regards to Granger (&friends) versus do-calculus/Neyman-Rubin.
> Holland's 'Statistics and Causal Inference' paper (1986)
In case anyone else wants to take a look:
https://www.jstor.org/stable/2289064
https://doi.org/10.2307/2289064
Page 2 discusses the relationship to, and precedent of, Granger causality... Honestly worth a read; this is definitely one of the most interesting papers I've come across in the last few years.
I think convergent cross mapping came out after Granger causality. Did that ever go anywhere either?