Covariance

class astropy.nddata.Covariance(array=None, data_shape=None, assume_symmetric=False, unit=None)

Bases: NDUncertainty

A general utility for storing, manipulating, reading, and writing covariance matrices.

Covariance matrices are symmetric by definition, \(\Sigma_{ij} = \Sigma_{ji}\). The object therefore only stores the upper triangle of the matrix using a scipy.sparse.csr_matrix. By default, instantiation will check for symmetry and issue a warning if the matrix is not symmetric. This check can be skipped using the assume_symmetric keyword. However, by virtue of how the data is stored, symmetry is always imposed on the matrix. That is, if a non-symmetric matrix is used to instantiate a Covariance object, the stored data will yield a matrix that is different from the original input.
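The effect of storing only the upper triangle can be illustrated with plain NumPy. The sketch below does not use the Covariance class; it assumes, for illustration only, that the full matrix is rebuilt by mirroring the stored upper triangle across the diagonal.

```python
import numpy as np

# A deliberately non-symmetric input matrix.
a = np.array([[1.0, 0.2, 0.0],
              [0.5, 2.0, 0.3],
              [0.0, 0.1, 3.0]])

# Keep only the upper triangle (including the diagonal).
upper = np.triu(a)

# Rebuild a full matrix by mirroring the strict upper triangle.
full = upper + np.triu(a, k=1).T

is_symmetric = np.allclose(full, full.T)   # symmetric by construction
matches_input = np.allclose(full, a)       # lower-triangle values were discarded
```

The rebuilt matrix is symmetric, but its lower triangle no longer matches the non-symmetric input.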

Covariance matrices of higher dimensional arrays are always assumed to be stored following row-major indexing. For example, the covariance value \(\Sigma_{ij}\) for an image of size \((N_x,N_y)\) is the covariance between image pixels \(I_{x_i,y_i}\) and \(I_{x_j,y_j}\), where \(i = x_i N_y + y_i\) and, similarly, \(j = x_j N_y + y_j\).
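As a quick check of the row-major convention, NumPy's ravel_multi_index computes the flattened index of a pixel. This is a sketch in plain NumPy; the shape and pixel coordinates are arbitrary illustrative values.

```python
import numpy as np

nx, ny = 3, 4                      # illustrative data shape (nx, ny)
x, y = 2, 1                        # illustrative pixel coordinates

# Flattened row-major (C-order) index; the last axis varies fastest.
i = int(np.ravel_multi_index((x, y), (nx, ny)))

# Explicit form of the same mapping for a 2D array.
i_explicit = x * ny + y

# The covariance matrix for data of this shape has shape (nx*ny, nx*ny).
cov_shape = (nx * ny, nx * ny)
```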

See Covariance for additional documentation and examples.

Parameters:
array : array_like, csr_matrix

Covariance matrix to store. If the array is not a csr_matrix instance, it must be convertible to one. To match the calling sequence for NDUncertainty, array has a default value of None, but the array must be provided for this Covariance object.

data_shape : tuple, optional

The covariance data is for a higher dimensional array with this shape. For example, if the covariance data is for a 2D image with shape (nx,ny), set data_shape=(nx,ny); the shape of the covariance array must then be (nx*ny, nx*ny). If None, any higher dimensionality is ignored.

assume_symmetric : bool, optional

Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.

unit : unit-like, optional

Unit for the covariance values.

Raises:
TypeError

Raised if the input array is not a csr_matrix object and cannot be converted to one.

ValueError

Raised if array is None, or if data_shape is provided and the input covariance matrix array does not have the expected shape.

Attributes Summary

data_index_map

An array mapping the index along each axis of the covariance matrix to the shape of the associated data array.

data_shape

The expected shape of the data array associated with this covariance array.

nnz

The number of non-zero (NNZ) elements in the full covariance matrix, including both the upper and lower triangles.

quantity

The covariance matrix as a dense Quantity object.

shape

Tuple with the shape of the covariance matrix.

stored_nnz

The number of non-zero elements stored by the object, which only counts the non-zero elements in the upper triangle.

uncertainty_type

"cov": Covariance implements a covariance matrix.

variance

The diagonal of the covariance matrix.

Methods Summary

apply_new_variance(var)

Using the same correlation coefficients, return a new Covariance object with the provided variance.

coordinate_data([reshape])

Construct data arrays with the non-zero covariance components in coordinate format.

copy()

Return a copy of this Covariance object.

covariance_to_data_indices(i, j)

Given indices along the two axes of the covariance matrix, return the relevant indices in the data array.

data_to_covariance_indices(i, j)

Given indices of elements in the source data array, return the matrix coordinates with the associated covariance.

find([correlation])

Find the non-zero values in the full covariance matrix (not just the upper triangle).

from_array(covar[, cov_tol, rho_tol])

Define a covariance object from an array.

from_matrix_multiplication(T, covar, **kwargs)

Construct the covariance matrix that results from a matrix multiplication.

from_samples(samples[, cov_tol, rho_tol])

Build a covariance object using discrete samples.

from_table(triu_covar)

Construct the covariance matrix from a table with the non-zero elements of the upper triangle of the covariance matrix in coordinate format.

from_variance(variance, **kwargs)

Construct a diagonal covariance matrix using the provided variance.

match_to_data_slice(data_slice)

Return a new Covariance instance that is matched to a slice of its parent data array.

revert_correlation(var, rho[, assume_symmetric])

Revert a variance vector and correlation matrix into a covariance matrix.

to_correlation(cov[, assume_symmetric])

Convert a covariance matrix into a correlation matrix by dividing each element by the product of the associated standard deviations.

to_dense([correlation])

Return the full covariance matrix as a numpy.ndarray object (a "dense" array).

to_sparse([correlation])

Return the full covariance matrix as a csr_matrix object.

to_table()

Return the covariance data in a Table using coordinate format.

Attributes Documentation

data_index_map

An array mapping the index along each axis of the covariance matrix to the shape of the associated data array.

data_shape

The expected shape of the data array associated with this covariance array.

nnz

The number of non-zero (NNZ) elements in the full covariance matrix, including both the upper and lower triangles.

quantity

The covariance matrix as a dense Quantity object.

shape

Tuple with the shape of the covariance matrix.

stored_nnz

The number of non-zero elements stored by the object, which only counts the non-zero elements in the upper triangle.

uncertainty_type

"cov": Covariance implements a covariance matrix.

variance

The diagonal of the covariance matrix.

Methods Documentation

apply_new_variance(var)

Using the same correlation coefficients, return a new Covariance object with the provided variance.

Parameters:
var : array_like

Variance vector. Its length must match the shape of this Covariance instance; e.g., if this instance is cov, the length of var must be cov.shape[0]. Note that, if the covariance is for higher dimensional data, this variance array must be flattened to 1D.

Returns:
Covariance

A covariance matrix with the same shape and correlation coefficients as this object, but with the provided variance.

Raises:
ValueError

Raised if the length of the variance vector is incorrect.
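The arithmetic behind this operation can be sketched in plain NumPy using the relation \(C_{ij} = \rho_{ij} \sigma_i \sigma_j\). This illustrates the idea only and is not the class's implementation; the helper name rescale_covariance is hypothetical, and the sketch assumes all variances are non-zero.

```python
import numpy as np

def rescale_covariance(cov, new_var):
    """Hypothetical helper: keep the correlation coefficients, swap the variance."""
    cov = np.asarray(cov, dtype=float)
    new_var = np.asarray(new_var, dtype=float)
    if new_var.shape != (cov.shape[0],):
        raise ValueError("new_var must be a 1D vector matching the covariance shape")
    sigma = np.sqrt(np.diag(cov))               # current standard deviations
    rho = cov / np.outer(sigma, sigma)          # correlation matrix
    new_sigma = np.sqrt(new_var)                # requested standard deviations
    return rho * np.outer(new_sigma, new_sigma)

cov = np.array([[4.0, 1.0],
                [1.0, 9.0]])
new_cov = rescale_covariance(cov, [1.0, 4.0])
```

The diagonal of new_cov equals the requested variance, while the off-diagonal correlation coefficient (1/6 here) is unchanged.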

coordinate_data(reshape=False)

Construct data arrays with the non-zero covariance components in coordinate format.

In coordinate format, the covariance matrix data is given as three arrays: the covariance values \(\Sigma_{ij}\) and the (0-indexed) matrix coordinates \(i,j\).

This procedure is primarily used when constructing the data arrays for storage. Matching the class convention, the returned data only includes the upper triangle.

Parameters:
reshape : bool, optional

If reshape is True and data_shape is defined, the \(i,j\) indices are converted to the expected data-array indices; see covariance_to_data_indices(). These can be reverted to the coordinates in the covariance matrix using data_to_covariance_indices().

Returns:
i, j : tuple, numpy.ndarray

The row and column indices, \(i,j\), of the covariance matrix. If reshaping, these are tuples with the index arrays along each of the reshaped axes.

cij : numpy.ndarray

The covariance, \(\Sigma_{ij}\), between array elements at indices \(i\) and \(j\).

Raises:
ValueError

Raised if reshape is True but data_shape is undefined.
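Coordinate format for the upper triangle can be sketched with plain NumPy as follows. The matrix values are illustrative, and the sketch works on a dense array rather than on the sparse storage used by the class.

```python
import numpy as np

cov = np.array([[1.0, 0.5, 0.0, 0.0],
                [0.5, 2.0, 0.0, 0.0],
                [0.0, 0.0, 3.0, 0.25],
                [0.0, 0.0, 0.25, 4.0]])

# Row and column indices of the non-zero elements in the upper triangle.
i, j = np.nonzero(np.triu(cov))

# Covariance values at those coordinates.
cij = cov[i, j]
```

Only elements with \(i \leq j\) appear in the result, matching the convention that the lower triangle is not stored.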

copy()

Return a copy of this Covariance object.

Returns:
Covariance

A copy of the current covariance matrix.

covariance_to_data_indices(i, j)

Given indices along the two axes of the covariance matrix, return the relevant indices in the data array. This is the inverse of data_to_covariance_indices().

Parameters:
i : ndarray

1D array with the index along the first axis of the covariance matrix. Must be in the range \(0...n-1\), where \(n\) is the length of the covariance-matrix axes.

j : ndarray

1D array with the index along the second axis of the covariance matrix. Must be in the range \(0...n-1\), where \(n\) is the length of the covariance-matrix axes.

Returns:
i_data, j_data : tuple, numpy.ndarray

If data_shape is not defined, the input arrays are simply returned (and not copied). Otherwise, the code uses unravel_index to calculate the relevant data-array indices; each element in the two-tuple is itself a tuple of \(N_{\rm dim}\) arrays, one array per dimension of the data array.

Raises:
ValueError

Raised if the provided indices fall outside the range of the covariance matrix.
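The index conversion described above can be sketched directly with numpy.unravel_index. The data shape and indices below are illustrative values, and the sketch does not call the class itself.

```python
import numpy as np

data_shape = (2, 3)                 # illustrative shape of the data array
i = np.array([0, 4])                # indices along the first covariance axis
j = np.array([1, 5])                # indices along the second covariance axis

# Each result is a tuple with one index array per data dimension.
i_data = np.unravel_index(i, data_shape)
j_data = np.unravel_index(j, data_shape)
```

Here covariance index 4 maps to data element (1, 1), and covariance index 5 maps to data element (1, 2).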

data_to_covariance_indices(i, j)

Given indices of elements in the source data array, return the matrix coordinates with the associated covariance. This is the inverse of covariance_to_data_indices().

Parameters:
i : array_like, tuple

A tuple of \(N_{\rm dim}\) array-like objects providing the indices of elements in the N-dimensional data array. This can be an array-like object if data_shape is undefined, in which case the values must be in the range \(0...n-1\), where \(n\) is the length of the data array.

j : array_like, tuple

The same as i, but providing a second set of coordinates at which to access the covariance.

Returns:
i_covar, j_covar : numpy.ndarray

Arrays providing the indices in the covariance matrix associated with the provided data array coordinates. If data_shape is not defined, the input arrays are simply returned (and not copied). Otherwise, the code uses ravel_multi_index to calculate the relevant covariance indices.

Raises:
ValueError

Raised if the provided indices fall outside the range of the data array, or if the length of the i or j tuples is not \(N_{\rm dim}\).
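A minimal NumPy sketch of this direction of the mapping uses numpy.ravel_multi_index, which also raises a ValueError for out-of-range coordinates. The shape and indices are illustrative, and the sketch does not call the class itself.

```python
import numpy as np

data_shape = (2, 3)
i_data = (np.array([0, 1]), np.array([0, 1]))   # data elements (0, 0) and (1, 1)
j_data = (np.array([0, 1]), np.array([1, 2]))   # data elements (0, 1) and (1, 2)

# Flattened (row-major) indices into the covariance matrix.
i_covar = np.ravel_multi_index(i_data, data_shape)
j_covar = np.ravel_multi_index(j_data, data_shape)

# Row index 2 does not exist for a first axis of length 2.
try:
    np.ravel_multi_index((np.array([2]), np.array([0])), data_shape)
    out_of_range_raised = False
except ValueError:
    out_of_range_raised = True
```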

find(correlation=False)

Find the non-zero values in the full covariance matrix (not just the upper triangle).

This is a simple wrapper for to_sparse() and scipy.sparse.find.

Parameters:
correlation : bool, optional

Flag to return the correlation data, instead of the covariance data. Note that setting this to True does not also return the variance vector.

Returns:
i, j : numpy.ndarray

Arrays containing the index coordinates of the non-zero values in the covariance (or correlation) matrix.

c : numpy.ndarray

The non-zero covariance (or correlation) matrix values located at the provided i,j coordinates.

classmethod from_array(covar, cov_tol=None, rho_tol=None, **kwargs)

Define a covariance object from an array.

Note

The only difference between this method and the direct instantiation method (i.e., Covariance(array=covar)) is that it can be used to impose tolerances on the covariance value and/or correlation coefficients.

Parameters:
covar : array_like

Array with the covariance data. The object must be either a csr_matrix or an object that can be converted to one. It must also be 2-dimensional and square.

cov_tol : float, optional

The absolute value of any covariance matrix entry less than this is assumed to be equivalent to (and set to) 0.

rho_tol : float, optional

The absolute value of any correlation coefficient less than this is assumed to be equivalent to (and set to) 0.

**kwargs : dict, optional

Passed directly to main instantiation method.

Returns:
Covariance

The covariance matrix built using the provided array.
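The effect of the two tolerances can be sketched in plain NumPy. This is an illustration of the description above rather than the library's implementation: the helper name apply_tolerances is hypothetical, and the use of a strict inequality and the order in which the two tolerances are applied are assumptions.

```python
import numpy as np

def apply_tolerances(cov, cov_tol=None, rho_tol=None):
    """Hypothetical helper: zero small covariance and/or correlation entries."""
    cov = np.array(cov, dtype=float)            # work on a copy
    if cov_tol is not None:
        # Zero entries whose absolute covariance is below the tolerance.
        cov[np.abs(cov) < cov_tol] = 0.0
    if rho_tol is not None:
        # Zero entries whose absolute correlation coefficient is below the tolerance.
        sigma = np.sqrt(np.diag(cov))
        rho = cov / np.outer(sigma, sigma)
        cov[np.abs(rho) < rho_tol] = 0.0
    return cov

cov = np.array([[1.0, 1e-4, 0.3],
                [1e-4, 1.0, 0.02],
                [0.3, 0.02, 4.0]])
trimmed = apply_tolerances(cov, cov_tol=1e-3, rho_tol=0.05)
```

In this example the 1e-4 entries fall below cov_tol, and the 0.02 entries correspond to a correlation coefficient of 0.01, which falls below rho_tol; the 0.3 entries (correlation 0.15) are kept.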

classmethod from_matrix_multiplication(T, covar, **kwargs)

Construct the covariance matrix that results from a matrix multiplication.

Linear operations on a dataset (e.g., binning or smoothing) can be written as matrix multiplications of the form

\[{\mathbf y} = {\mathbf T}\ {\mathbf x},\]

where \({\mathbf T}\) is a transfer matrix of size \(N_y\times N_x\), \({\mathbf x}\) is a vector of length \(N_x\), and \({\mathbf y}\) is a vector of length \({N_y}\) that results from the multiplication. If \({\mathbf \Sigma}_x\) is the covariance matrix for \({\mathbf x}\), then the covariance matrix for \({\mathbf y}\) is

\[{\mathbf \Sigma}_y = {\mathbf T}\ {\mathbf \Sigma}_x\ {\mathbf T}^\top.\]

If covar is provided as a vector of length \(N_x\), it is assumed that the elements of \({\mathbf x}\) are independent and the provided vector gives the variance in each element; i.e., the provided data represent the diagonal of \({\mathbf \Sigma}_x\).

Parameters:
T : csr_matrix, ndarray

Transfer matrix. See above.

covar : csr_matrix, ndarray

Covariance matrix. See above.

**kwargs : dict, optional

Passed directly to main instantiation method.

Returns:
Covariance

The covariance matrix resulting from the matrix multiplication.

Raises:
ValueError

Raised if the provided arrays are not two dimensional or if there is a shape mismatch.
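The propagation formula \({\mathbf \Sigma}_y = {\mathbf T}\ {\mathbf \Sigma}_x\ {\mathbf T}^\top\) can be worked through with dense NumPy arrays. The transfer matrix and variances below are illustrative values chosen so that the two outputs share an input element; the sketch does not call the class itself.

```python
import numpy as np

# Transfer matrix for a two-element running mean over three inputs.
T = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])

# Independent inputs: the variance vector is the diagonal of Sigma_x.
var_x = np.array([1.0, 1.0, 1.0])
sigma_x = np.diag(var_x)

# Covariance of y = T x.
sigma_y = T @ sigma_x @ T.T
```

Although the inputs are independent, the outputs are correlated because both depend on the middle input element: sigma_y has 0.5 on the diagonal and 0.25 off the diagonal.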

classmethod from_samples(samples, cov_tol=None, rho_tol=None, **kwargs)

Build a covariance object using discrete samples.

The covariance is generated using numpy.cov for a set of discretely sampled data for an \(N\)-dimensional parameter space.

Parameters:
samples : ndarray

Array with samples drawn from an \(N\)-dimensional parameter space. The shape of the input array must be \(N_{\rm par}\times N_{\rm samples}\).

cov_tol : float, optional

The absolute value of any covariance matrix entry less than this is assumed to be equivalent to (and set to) 0.

rho_tol : float, optional

The absolute value of any correlation coefficient less than this is assumed to be equivalent to (and set to) 0.

**kwargs : dict, optional

Passed directly to main instantiation method.

Returns:
Covariance

An \(N_{\rm par}\times N_{\rm par}\) covariance matrix built using the provided samples.

Raises:
ValueError

Raised if the input array is not 2D or if the number of samples (length of the second axis) is less than 2.
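The expected sample layout can be illustrated with numpy.cov, which by default treats each row as one variable. The sample values below are random and purely illustrative; the sketch does not call the class itself.

```python
import numpy as np

rng = np.random.default_rng(42)
n_par, n_samples = 3, 500

# Samples are arranged as (n_par, n_samples): each row holds one parameter.
samples = rng.normal(size=(n_par, n_samples))

# numpy.cov uses rowvar=True by default, so rows are variables.
cov = np.cov(samples)
```

The result is an (n_par, n_par) symmetric matrix whose diagonal holds the sample variance of each parameter.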

classmethod from_table(triu_covar)

Construct the covariance matrix from a table with the non-zero elements of the upper triangle of the covariance matrix in coordinate format.

This is the inverse operation of to_table(). The class can read covariance data written by other programs as long as they have a commensurate format; see to_table().

Parameters:
triu_covar : Table

The non-zero elements of the upper triangle of the covariance matrix in coordinate format; see to_table().

Returns:
Covariance

The covariance matrix constructed from the tabulated data.

Raises:
ValueError

Raised if triu_covar.meta is None, if the provided variance array does not have the correct size, or if the data is multidimensional and the table columns do not have the right shape.

classmethod from_variance(variance, **kwargs)

Construct a diagonal covariance matrix using the provided variance.

Parameters:
variance : ndarray

The variance vector.

**kwargs : dict, optional

Passed directly to main instantiation method.

Returns:
Covariance

The diagonal covariance matrix.

match_to_data_slice(data_slice)

Return a new Covariance instance that is matched to a slice of its parent data array.

Parameters:
data_slice : slice, array_like

Anything that can be used to slice a numpy.ndarray. To generate a slice using syntax that mimics accessing numpy array elements, use numpy.s_.

Returns:
Covariance

A new covariance object for the sliced data array.
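The idea of matching a covariance matrix to a data slice can be sketched with plain NumPy: build an array of flattened indices with the data shape, slice it, and use the surviving indices to select rows and columns of the covariance matrix. This is one possible approach shown for illustration, not necessarily how the class implements it, and all values are illustrative.

```python
import numpy as np

data_shape = (3, 4)
n = int(np.prod(data_shape))

# Illustrative diagonal covariance for the flattened data (variances 1..12).
cov = np.diag(np.arange(1.0, n + 1))

# Slice built with numpy.s_: rows 1 and 2, every other column.
data_slice = np.s_[1:, ::2]

# Map each data element to its row-major index in the covariance matrix.
index_map = np.arange(n).reshape(data_shape)
kept = index_map[data_slice]

# Covariance and data shape for the sliced data.
keep = kept.ravel()
sub_cov = cov[np.ix_(keep, keep)]
sub_shape = kept.shape
```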

static revert_correlation(var, rho, assume_symmetric=False)

Revert a variance vector and correlation matrix into a covariance matrix.

This is the inverse operation of to_correlation().

Parameters:
var : ndarray

Variance vector. Length must match the diagonal of rho.

rho : ndarray, csr_matrix

Correlation matrix. Diagonal must have the same length as var.

assume_symmetric : bool, optional

Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.

Returns:
csr_matrix

Covariance matrix.

static to_correlation(cov, assume_symmetric=False)

Convert a covariance matrix into a correlation matrix by dividing each element by the product of the associated standard deviations.

Specifically, extract var (\(V_i = C_{ii} \equiv \sigma^2_i\)) and convert cov from a covariance matrix with elements \(C_{ij}\) to a correlation matrix with \(\rho_{ij}\) such that

\[C_{ij} \equiv \rho_{ij} \sigma_i \sigma_j.\]

To revert a variance vector and correlation matrix back to a covariance matrix, use revert_correlation().

Parameters:
cov : array_like

Covariance matrix to convert. Must be a csr_matrix instance or convertible to one.

assume_symmetric : bool, optional

Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.

Returns:
var : numpy.ndarray

Variance vector

rho : csr_matrix

Correlation matrix

Raises:
ValueError

Raised if the input array is not 2D or is not square.
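The relation \(C_{ij} \equiv \rho_{ij} \sigma_i \sigma_j\) and its inverse can be checked with a small dense example in plain NumPy. The values are illustrative, and the sketch uses dense arrays rather than the sparse matrices returned by the methods.

```python
import numpy as np

cov = np.array([[4.0, 1.2],
                [1.2, 9.0]])

# Variance vector and standard deviations.
var = np.diag(cov).copy()
sigma = np.sqrt(var)

# Correlation matrix: rho_ij = C_ij / (sigma_i * sigma_j).
rho = cov / np.outer(sigma, sigma)

# Reverting the variance and correlation recovers the covariance.
cov_back = rho * np.outer(sigma, sigma)
```

Here the off-diagonal correlation coefficient is 1.2 / (2 * 3) = 0.2, and the diagonal of rho is unity.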

to_dense(correlation=False)

Return the full covariance matrix as a numpy.ndarray object (a “dense” array).

Parameters:
correlation : bool, optional

Flag to return the correlation matrix, instead of the covariance matrix. Note that setting this to True does not also return the variance vector.

Returns:
ndarray

Dense array with the full covariance matrix.

to_sparse(correlation=False)

Return the full covariance matrix as a csr_matrix object.

This method is essentially equivalent to to_dense except that it returns a sparse array.

Parameters:
correlation : bool, optional

Return the correlation matrix. If False, return the covariance matrix.

Returns:
csr_matrix

The sparse matrix with both the upper and lower triangles filled (with symmetric information).

to_table()

Return the covariance data in a Table using coordinate format.

In coordinate format, the covariance matrix data is given in three columns: the covariance values \(\Sigma_{ij}\) and the (0-indexed) matrix coordinates \(i,j\).

The output table has three columns:

  • 'INDXI': The row index in the covariance matrix.

  • 'INDXJ': The column index in the covariance matrix.

  • 'COVARIJ': The covariance at the relevant \(i,j\) coordinate.

The table also contains the following metadata:

  • 'COVSHAPE': The shape of the covariance matrix

  • 'BUNIT': (If unit is defined) The string representation of the covariance units.

  • 'COVDSHP': (If data_shape is defined) The shape of the associated data array.

If data_shape is set, the covariance matrix indices are reformatted to match the coordinates in the N-dimensional array.

Warning

Recall that the storage of covariance matrices for higher dimensional data always assumes a row-major storage order.

The table returned by this method can be used to re-instantiate the Covariance object using from_table().

Returns:
Table

Table with the covariance matrix in coordinate format and the relevant metadata.