So in the previous module, we talked a little bit about the mathematical details behind dynamic causal modeling, or DCM. In this module we're going to do the same thing for Granger causality. Granger causality is a technique that was originally developed in the economics literature, but has more recently been applied to neuroimaging as well. Granger causality is a little bit different from dynamic causal modeling: it doesn't rely on the a priori specification of a structural model. Rather, it's an approach for quantifying the usefulness of past values of various brain regions in predicting current values in other regions. So, let x and y be two time courses of length T extracted from two separate brain regions. Each time course can be modeled using a linear autoregressive model of order m. So here, for example, we say that the value of x at time t depends on its past m values through a linear equation: a1 is the weight for the value of x at time t minus 1, a2 is the weight for x at time t minus 2, and so on, and then we have an error term, epsilon, as well. Similarly, we model y using its own m past values. This is what's called an autoregressive model of order m; basically, all we're doing is modeling x at a certain time point using its past values. Here, epsilon x and epsilon y are both white noise; for simplicity, we can just assume they're IID normal errors. The next step in Granger causality is to expand each model using the autoregressive terms from the other signal. So we still use the past m values of x in modeling the current value of x, but we also use the past m values of y. And similarly for y: we use both the past m values of y and the past m values of x. Okay? So now the current value of each time course depends both on its own past m values and on the past m values of the other time course. So now we have four equations.
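To make the autoregressive model concrete, here's a minimal sketch of fitting an order-m model by ordinary least squares on simulated data rather than real fMRI time courses. The function name `fit_ar` and the toy AR(2) weights are illustrative choices, not anything specified in the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ar(x, m):
    """Fit an order-m autoregressive model
    x[t] = a1*x[t-1] + ... + am*x[t-m] + eps
    by ordinary least squares; return the weights and residuals."""
    T = len(x)
    # Design matrix: column j holds x lagged by j+1 time steps.
    X = np.column_stack([x[m - 1 - j : T - 1 - j] for j in range(m)])
    y = x[m:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ a
    return a, resid

# Simulate a simple AR(2) process and try to recover its weights.
T, true_a = 2000, np.array([0.5, -0.3])
x = np.zeros(T)
for t in range(2, T):
    x[t] = true_a[0] * x[t - 1] + true_a[1] * x[t - 2] + rng.standard_normal()

a_hat, resid = fit_ar(x, 2)
print(a_hat)  # should land close to [0.5, -0.3]
```

With enough time points, the least-squares weights recover the true autoregressive coefficients up to sampling noise.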
We have: how x depends on its own past, how y depends on its own past, how x depends on its own past and the past of y, and how y depends on its own past and the past of x. Using these models, we can test whether the history of x has any predictive value for the current value of y, and vice versa. If the model fit is significantly improved by the inclusion of these cross-autoregressive terms, then this provides evidence that the history of one of the time courses can be used to predict the current value of the other, and so we infer what's called a Granger-causal relationship between the two. So how do we measure this? Well, Geweke proposed a measure of linear dependence, F of x,y, between the two time courses x and y, which implements Granger causality in terms of vector autoregressive models. Here, F of x,y is a measure of the total linear dependence between x and y. If nothing about the current value of x can be explained by a model containing all the values of y, and likewise nothing about the current value of y can be explained by a model containing all the values of x, then F of x,y will be equal to zero. The term F of x,y can be decomposed into the sum of three components: F of x,y equals F of x to y, plus F of y to x, plus F of x dot y. F of x to y and F of y to x are measures of the linear directed influence from x to y and from y to x, respectively. So if past values of x improve the prediction of the current value of y, then F of x to y is greater than zero, and a similar interpretation holds for F of y to x. The term F of x dot y is a measure of the undirected instantaneous influence between the two series: the improvement in the prediction of the current value of x by including the current value of y in a linear model already containing the past values of x and y. So how do we compute these three components?
Well, this is what Geweke proposed, basically using these vector autoregressive models. There are a lot of equations here, and we don't have to dig too deeply into them, but the key thing I want to show you is the variance components. If we have a model of x that depends only on the past values of x, then we have a variance-covariance matrix, sigma 1. Similarly for y, depending only on its own past values, we have a variance-covariance matrix, T1. Now, if we start including the cross terms, the variance-covariance matrices change a little bit: sigma 2 is the variance-covariance matrix of x when we include past values of both x and y, and T2, similarly, is the variance-covariance matrix of y when we include past values of both x and y. Using these variance components, the total linear dependence between x and y can be written as follows. Again, F of x,y is equal to F of x to y, plus F of y to x, plus F of x dot y, and each of these terms can be expressed in terms of the variance components from the previous models. So, for example, F of x to y is the natural logarithm of the determinant of T1 divided by the determinant of T2. Now, what does this represent? Well, T1 is the variance component when we only have y in the model, so the current value of y depends only on past values of y. T2 is when we also include x. So if x explains a lot about y, then T2 should actually be much smaller than T1, and so F of x to y should increase and become big. That's the general idea behind these equations. If you're mathematically interested, you can dig further into them, but I just want to give you a brief taste of what they look like. So, if past values of x improve the prediction of the current value of y, then the idea here is that F of x to y should be large.
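These variance components are easy to play with numerically. Below is a minimal sketch, assuming ordinary least squares fits and simulated data in which past values of x drive y but not the reverse. With scalar time courses, the determinants reduce to plain residual variances, so each F term is just a log ratio of variances; the helper names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: past values of x drive y, but not the other way round.
T, m = 5000, 2
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.4 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

def lagged(s, m):
    """Past m values of s for each time point (one lag per column)."""
    return np.column_stack([s[m - 1 - j : len(s) - 1 - j] for j in range(m)])

def residuals(target, predictors):
    b, *_ = np.linalg.lstsq(predictors, target, rcond=None)
    return target - predictors @ b

Xp, Yp = lagged(x, m), lagged(y, m)
both = np.hstack([Xp, Yp])
xt, yt = x[m:], y[m:]

S1 = residuals(xt, Xp).var()     # x from its own past (sigma 1)
T1 = residuals(yt, Yp).var()     # y from its own past (T1)
ex = residuals(xt, both)         # x from the past of both series
ey = residuals(yt, both)         # y from the past of both series
S2, T2 = ex.var(), ey.var()      # sigma 2 and T2
Sigma = np.cov(ex, ey)           # joint residual covariance of the full model

F_x_to_y = np.log(T1 / T2)
F_y_to_x = np.log(S1 / S2)
F_inst = np.log(S2 * T2 / np.linalg.det(Sigma))   # instantaneous term F_{x.y}
F_total = np.log(S1 * T1 / np.linalg.det(Sigma))  # total dependence F_{x,y}

# The decomposition F_total = F_x_to_y + F_y_to_x + F_inst holds exactly,
# and since x drives y here, F_x_to_y is clearly positive while
# F_y_to_x sits near zero.
print(F_x_to_y, F_y_to_x, F_inst)
```

Note that the decomposition is an algebraic identity: the intermediate variances cancel when the three terms are summed, leaving the total-dependence expression.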
A similar interpretation, but in the opposite direction, holds for F of y to x. The difference between these two terms can be used to infer which region's history is more influential on the other, and this difference is referred to as Granger causality. A Granger causal map is computed with respect to a single selected reference region, such as a seed region. So this is very similar to the seed analysis that we talked about in functional connectivity: we take a seed region, extract its time course, and call that x. Then we let every other voxel in the brain be the y that we test for a Granger-causal relationship with x. So basically, it's sort of like a Granger-causal seed analysis. It maps both sources of influence to the reference region and targets of influence from the reference region over the whole brain. Here's an example from work by Roebroeck showing regions where the reference-to-voxel influence is bigger than the voxel-to-reference influence, and vice versa. The reference region, the seed region, is the region shown in the crosshairs here. So, just to end with a few brief comments: from the definition of Granger causality, it's clear that the idea of temporal precedence is used to identify the direction and strength of causality using information in the data. If the past of y explains a lot about the current value of x, then we infer this Granger-causal relationship, so this is all about temporal precedence. And while it's reasonable that temporal precedence is a necessary condition for causation, it's not really a sufficient condition: things can happen before another event without causing that event. Therefore, to directly equate Granger causality with causality as most people mean it requires a leap of faith. We'll talk about this a little bit more in the next module. Okay, so that's the end of this module.
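Before moving on, the seed-based mapping idea can be sketched in a toy form: a simulated seed time course plays the role of x and a handful of simulated "voxels" play the role of y. The `gc_difference` helper is a hypothetical name, and real Granger causality mapping involves much more (hemodynamics, statistical thresholding, multiple comparisons) than this sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

def lagged(s, m):
    """Past m values of s for each time point (one lag per column)."""
    return np.column_stack([s[m - 1 - j : len(s) - 1 - j] for j in range(m)])

def resid_var(target, predictors):
    b, *_ = np.linalg.lstsq(predictors, target, rcond=None)
    return (target - predictors @ b).var()

def gc_difference(x, y, m=2):
    """Difference F_{x->y} - F_{y->x}: positive when the seed x's history
    is more influential on y than y's history is on x."""
    Xp, Yp = lagged(x, m), lagged(y, m)
    xt, yt = x[m:], y[m:]
    F_xy = np.log(resid_var(yt, Yp) / resid_var(yt, np.hstack([Yp, Xp])))
    F_yx = np.log(resid_var(xt, Xp) / resid_var(xt, np.hstack([Xp, Yp])))
    return F_xy - F_yx

# Toy "brain": one seed time course and three voxel time courses.
T = 4000
seed = rng.standard_normal(T)
voxels = rng.standard_normal((3, T))
voxels[0, 1:] += 0.8 * seed[:-1]   # voxel 0 is a target of the seed's past
# (voxels 1 and 2 are unrelated noise, so their map values sit near zero)

gc_map = np.array([gc_difference(seed, v) for v in voxels])
print(gc_map)   # clearly positive at index 0, near zero elsewhere
```

Computing this difference at every voxel, with the seed fixed, is what produces the whole-brain Granger causal map described above.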
Both this module and the previous one on DCM were a little bit more mathematical; I just wanted to give you a little bit of a look under the hood at how these methods work. Okay, in the next module we will finish up talking about effective connectivity. I'll see you then. Bye.