On linear vs statistical independence

It is often thought that linear independence and statistical independence are unrelated concepts from different branches of mathematics. In this short note, I beg to differ. Both concepts are encountered in the world of statistics, often in close proximity (e.g., one frequently sees the claim that “covariance is a measure of linear dependence”). Furthermore, linear independence has clear mathematical implications for statistical independence. The links and differences between the two are a frequent source of confusion, so I think it is worth clarifying them.

Consider a simple scenario in which you have two non-zero, non-constant, n-dimensional data vectors \mathbf{X} and \mathbf{Y}.

They are linearly independent if there is no non-zero scalar \alpha such that

\alpha \mathbf{X} - \mathbf{Y} = \mathbf{0}

In other words, there is no non-zero multiplicative constant \alpha that will transform \mathbf{X} into \mathbf{Y}. Geometrically, this means that the vectors \mathbf{X} and \mathbf{Y} do not lie on the same line through the origin.
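To make the check concrete, here is a minimal numpy sketch (the example vectors are arbitrary, purely illustrative choices): two non-zero vectors are linearly dependent exactly when the 2 × n matrix formed by stacking them has rank 1.

```python
import numpy as np

# Hypothetical example vectors; any non-zero, non-constant choices would do.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 1.0, 0.0, -1.0])

# Two non-zero vectors are linearly dependent exactly when the 2 x n matrix
# obtained by stacking them has rank 1 (one is a scalar multiple of the other).
rank = np.linalg.matrix_rank(np.vstack([X, Y]))
print("linearly independent" if rank == 2 else "linearly dependent")

# Sanity check: 3 * X is, by construction, linearly dependent on X.
print(np.linalg.matrix_rank(np.vstack([X, 3 * X])))  # prints 1
```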

The two vectors \mathbf{X} and \mathbf{Y} are statistically independent if and only if (treating them now as random vectors) their joint probability density is the product of their marginal probability densities, i.e.,

f_{X,Y}(\mathbf{X}, \mathbf{Y}) = f_X(\mathbf{X}) \cdot f_Y(\mathbf{Y})

This implies

\text{cov}(\mathbf{X}, \mathbf{Y}) = 0

(though the reverse implication does not hold in general).
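The failure of the reverse implication is easy to demonstrate with the classic counterexample: take X symmetric about zero and let Y = X^2, so that Y is a deterministic function of X and yet the covariance is exactly zero. A minimal numpy sketch, with arbitrary illustrative values:

```python
import numpy as np

# Hypothetical example: X symmetric about zero, Y a deterministic function of X.
X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
Y = X ** 2

# Sample covariance: the off-diagonal entry of the 2 x 2 covariance matrix.
print(np.cov(X, Y)[0, 1])  # 0.0

# The covariance is exactly zero, yet X and Y are certainly not statistically
# independent: knowing X determines Y completely.
```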

The two concepts are linked in the following way: if the two vectors are linearly dependent, they cannot be statistically independent. For example, if for some non-zero scalar \alpha we have

\alpha \mathbf{X} = \mathbf{Y}

then

\text{cov}(\mathbf{X}, \mathbf{Y}) = \text{cov}(\frac{1}{\alpha}\mathbf{Y}, \mathbf{Y}) = \frac{1}{\alpha} \text{var}(\mathbf{Y}) \ne 0

since \mathbf{Y} is non-constant and therefore \text{var}(\mathbf{Y}) > 0.
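As a quick numerical sanity check of this identity (the data vector and the value of \alpha below are arbitrary, illustrative choices):

```python
import numpy as np

alpha = 3.0                              # arbitrary non-zero scalar
X = np.array([0.5, 1.5, -2.0, 4.0])      # hypothetical non-constant data vector
Y = alpha * X                            # Y is a scalar multiple of X

# cov(X, Y) should equal (1/alpha) * var(Y), i.e. alpha * var(X),
# and it is non-zero because X (and hence Y) is non-constant.
print(np.cov(X, Y)[0, 1])                # alpha * var(X), non-zero
print(np.var(Y, ddof=1) / alpha)         # the same value (up to rounding)
```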

However, linear independence of \mathbf{X} and \mathbf{Y} does not guarantee statistical independence. It is perfectly possible to have \text{cov}(\mathbf{X}, \mathbf{Y}) \ne 0 even when \mathbf{X} and \mathbf{Y} are linearly independent. The covariance is zero only when the linear independence takes a particular form, namely when the two vectors, after each has been centered at its mean, are orthogonal. Therefore one could say that covariance is a measure of ‘non-orthogonality’ of the mean-centered vectors (rather than a measure of linear dependence).
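To make the ‘non-orthogonality’ reading concrete: the sample covariance is, up to a factor of 1/(n-1), the inner product of the mean-centered vectors, so it can be computed directly as such. A small numpy sketch with arbitrary example vectors:

```python
import numpy as np

# Hypothetical example vectors: linearly independent, but not orthogonal
# after centering, so their covariance is non-zero.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([1.0, 1.0, 2.0, 5.0])

# The sample covariance is, up to a 1/(n-1) factor, the inner product of the
# mean-centered vectors, so it vanishes exactly when those centered vectors
# are orthogonal -- not merely when X and Y are linearly independent.
Xc, Yc = X - X.mean(), Y - Y.mean()
n = len(X)
print(np.cov(X, Y)[0, 1])   # non-zero despite linear independence
print(Xc @ Yc / (n - 1))    # the same value, computed directly

# A pair whose centered versions are orthogonal does have zero covariance:
U = np.array([1.0, -1.0, 1.0, -1.0])
V = np.array([1.0, 1.0, -1.0, -1.0])
print(np.cov(U, V)[0, 1])   # 0.0
```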