limix.qc.normalise_covariance(K, out=None)[source]ΒΆ

Variance rescaling of covariance matrix 𝙺.

Let n be the number of rows (or columns) of 𝙺 and let mα΅’ be the average of the values in the i-th column. Gower rescaling is defined as

\[𝙺(n - 1)/(πšπš›πšŠπšŒπšŽ(𝙺) - βˆ‘mα΅’).\]


The reasoning of the scaling is as follows. Let 𝐠 be a vector of n independent samples and let 𝙲 be the Gower’s centering matrix. The unbiased variance estimator is

\[v = βˆ‘ (gα΅’-αΈ‘)Β²/(n-1) = πšπš›πšŠπšŒπšŽ((𝐠-ḑ𝟏)α΅€(𝐠-ḑ𝟏))/(n-1) = πšπš›πšŠπšŒπšŽ(𝙲𝐠𝐠ᡀ𝙲)/(n-1)\]

Let 𝙺 be the covariance matrix of 𝐠. The expectation of the unbiased variance estimator is

\[𝐄[v] = πšπš›πšŠπšŒπšŽ(𝙲𝐄[𝐠𝐠ᡀ]𝙲)/(n-1) = πšπš›πšŠπšŒπšŽ(𝙲𝙺𝙲)/(n-1),\]

assuming that 𝐄[gα΅’]=0. We thus divide 𝙺 by 𝐄[v] to achieve an unbiased normalisation on the random variable gα΅’.

  • K (array_like) – Covariance matrix to be normalised.

  • out (array_like, optional) – Result destination. Defaults to None.


>>> from numpy import dot, mean, zeros
>>> from numpy.random import RandomState
>>> from limix.qc import normalise_covariance
>>> random = RandomState(0)
>>> X = random.randn(10, 10)
>>> K = dot(X, X.T)
>>> Z = random.multivariate_normal(zeros(10), K, 500)
>>> print("%.3f" % mean(Z.var(1, ddof=1)))
>>> Kn = normalise_covariance(K)
>>> Zn = random.multivariate_normal(zeros(10), Kn, 500)
>>> print("%.3f" % mean(Zn.var(1, ddof=1)))