- Open Access
- Authors : Ridawarni P. , Yenny Hermiana A. , Saprilina G.
- Paper ID : IJERTV3IS060820
- Volume & Issue : Volume 03, Issue 06 (June 2014)
- Published (First Online): 02-07-2014
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
A Modified Decomposition Covariance Matrix Estimation for Undirected Gaussian Graphical Model
Ridawarni P., Yenny Hermiana A., Saprilina G.
Graduate School of Mathematics, University of Sumatera Utara, Medan, Indonesia
Abstract – An undirected graph associated with the multivariate probability distribution of a given random vector has one vertex for each component of the vector. Graphical models are a framework for representing conditional independence structures within a distribution using a graph G. This paper discusses a distribution estimation for determining a decomposable covariance matrix in an undirected Gaussian graphical model, related to the sparsity of the inverse covariance (concentration) matrix. The decomposable covariance estimation is shown to have lower computational complexity. The results show the correlation between the different components of a given random vector, determined from the decomposed covariance matrix estimate.
Keywords – conditional independence, covariance decomposition, Gaussian graphical model, concentration graph
INTRODUCTION
Graphical models are a framework for determining conditional independence structures within distributions represented by a graph. An undirected graph is a common model for describing distribution problems in high dimensions. Previous research on undirected graphical models has successfully determined conditional independence structures within multivariate distributions, together with computational techniques implemented on a graph to handle the complexity of high dimensional data [1].

Covariance is an estimate of the joint variability of two variables, x and y, in a data sample of size n. This estimate is used to determine the variance and linear correlation of multivariate or multi-dimensional data. Estimation of covariance in a Gaussian distribution is a common problem in statistical signal processing, arising in speech recognition [2], [3], image processing [4], [5], sensor networks [6], computer networks [7] and other fields related to statistical graphical models. Efficient Bayesian inference in Gaussian graphical models is well established [8]-[10]. Conditional independence among random variables corresponds to sparsity of the estimated inverse covariance. Covariance estimation for high dimensional data can be classified into two categories: estimation based on an ordering of the variables, and estimation based on estimators that are invariant under permutation of the variable indices. Undirected graphical models have been studied and developed to determine conditional independence structures in multivariate distributions, and extended with computational methods using graph representations, especially for problems of increasing dimension and complexity; one such result expresses the covariance of two variables as a sum of weights over all paths connecting them in an undirected independence graph [11].

The estimation of a decomposed covariance matrix is the main focus of this paper. The estimation is focused on an undirected Gaussian graphical model, and yields as a result the correlation between the variables at each pair of vertices of the Gaussian graph.
REVIEWS
The Gaussian distribution is also referred to as the normal distribution, or the bell curve distribution for its bell-shaped density curve. The formula for a d-dimensional Gaussian probability distribution is

p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\frac{(x-\mu)^T \Sigma^{-1} (x-\mu)}{2} \right)   (2.1)

where x is a d-element column vector of variables along each dimension and \mu is the mean vector, calculated by

\mu = E[x] = \int x \, p(x) \, dx   (2.2)

and \Sigma is the d \times d covariance matrix, calculated by

\Sigma = E[(x-\mu)(x-\mu)^T] = \int (x-\mu)(x-\mu)^T p(x) \, dx   (2.3)

with the following form

\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1d} \\ \sigma_{21} & \cdots & \sigma_{2d} \\ \vdots & \ddots & \vdots \\ \sigma_{d1} & \cdots & \sigma_{dd} \end{pmatrix}   (2.4)

The covariance matrix is always symmetric and positive semidefinite, where positive semidefinite means that x^T \Sigma x \ge 0 for all non-zero x \in \mathbb{R}^d. We normally only deal with covariance matrices that are positive definite, where x^T \Sigma x > 0 for all non-zero x \in \mathbb{R}^d, so that the determinant |\Sigma| is strictly positive. The diagonal elements \sigma_{ii} are the variances of the respective x_i, i.e., \sigma_i^2, and the off-diagonal elements \sigma_{ij} are the covariances of x_i and x_j. If the variables along each dimension are statistically independent, then \sigma_{ij} = 0, and we have a diagonal covariance matrix

\Sigma = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_d^2 \end{pmatrix}   (2.5)

If the variances along each dimension are the same, we have an identity matrix multiplied by a scalar,

\Sigma = \sigma^2 I   (2.6)

By Eq. (2.6), the determinant of \Sigma becomes

|\Sigma| = \sigma^{2d}   (2.7)

and the inverse of \Sigma becomes

\Sigma^{-1} = \frac{1}{\sigma^2} I   (2.8)

For a 2-d Gaussian, where d = 2, x = [x_1 \; x_2]^T and |\Sigma| = \sigma^4, the formulation becomes

p(x_1, x_2) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x_1^2 + x_2^2}{2\sigma^2} \right)   (2.9)

We often denote the Gaussian distribution of Eq. (2.1) as p(x) \sim N(\mu, \Sigma).
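As a numerical illustration of Eqs. (2.1) and (2.9), the following sketch evaluates the density with numpy and checks that the isotropic case reduces to Eq. (2.9); the function name `gaussian_pdf` and the test values are ours, not the paper's.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate the d-dimensional Gaussian density of Eq. (2.1)."""
    d = x.shape[0]
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.linalg.det(sigma) ** 0.5
    expo = -0.5 * diff @ np.linalg.inv(sigma) @ diff
    return np.exp(expo) / norm

# For sigma = s^2 * I with d = 2, Eq. (2.1) reduces to Eq. (2.9).
s = 1.5
x = np.array([0.3, -0.7])
isotropic = gaussian_pdf(x, np.zeros(2), s**2 * np.eye(2))
by_eq_2_9 = np.exp(-(x @ x) / (2 * s**2)) / (2 * np.pi * s**2)
assert np.isclose(isotropic, by_eq_2_9)
```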
An undirected graph G = (V, E) is a set of nodes V = \{1, \ldots, |V|\} connected by undirected edges E = \{(i_1, j_1), \ldots, (i_{|E|}, j_{|E|})\}, where each node is connected to itself, i.e., (i, i) \in E for all i \in V. Let x = [x_1, \ldots, x_p]^T be a zero-mean random vector of length p = |V| whose elements are indexed by the nodes in V. The vector x satisfies the Markov property with respect to G if, for any pair of nonadjacent nodes, the corresponding pair of elements in x is conditionally independent of the remaining elements; i.e., x_i and x_j are conditionally independent given x_r, for any \{i, j\} \notin E and r = V \setminus \{i, j\}, where

p(x_i, x_j \mid x_r) = p(x_i \mid x_r) \, p(x_j \mid x_r)   (2.10)

Therefore, the joint distribution satisfies the following factorization:

p(x_i, x_j, x_r) = \frac{p(x_i, x_r) \, p(x_j, x_r)}{p(x_r)}   (2.11)

In the Gaussian setting, this factorization leads to sparsity in the concentration (inverse covariance) matrix. The multivariate Gaussian distribution is defined as

p(x; K) = c \, |K|^{1/2} e^{-\frac{1}{2} x^T K x}   (2.12)

where c is an appropriate constant and the marginal concentration matrix is

K_r = ([K^{-1}]_{r,r})^{-1}   (2.13)

Together with Eq. (2.11), this implies that

K = [K_{ir}]^0 + [K_{jr}]^0 - [K_r]^0   (2.14)

|K| = \frac{|K_{ir}| \, |K_{jr}|}{|K_r|}   (2.15)

where K_{ir}, K_{jr} and K_r are the marginal concentrations of \{x_i, x_r\}, \{x_j, x_r\} and \{x_r\}, respectively, defined in a manner similar to Eq. (2.13), and [\cdot]^0 denotes zero padding to the full dimension. All the matrices on the right hand side of Eq. (2.14) have a zero value in the \{i, j\}-th position, and therefore [K]_{i,j} = 0 for all \{i, j\} \notin E.

This property is the core of Gaussian graphical models: the concentration matrix K has a sparsity pattern which represents the topology of the conditional independence graph.

Decomposable models are a special type of graphical model in which the conditional independence graphs satisfy an appealing structure. A decomposable graph can be divided into an ordered sequence of fully connected subgraphs known as cliques, denoted by C_1, \ldots, C_k.
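The sparsity implied by Eq. (2.14) is easy to see numerically. The sketch below builds a small chain graph 1 - 2 - 3 (our own toy construction, not an example from the paper), in which nodes 1 and 3 are nonadjacent, and checks that the corresponding entry of the concentration matrix vanishes.

```python
import numpy as np

# Chain graph 1 - 2 - 3: x1 ~ N(0,1), x2 = 0.8*x1 + e2, x3 = 0.5*x2 + e3.
a, b = 0.8, 0.5
s11 = 1.0
s22 = a**2 * s11 + 1.0
s33 = b**2 * s22 + 1.0
sigma = np.array([
    [s11,         a * s11, b * a * s11],
    [a * s11,     s22,     b * s22],
    [b * a * s11, b * s22, s33],
])

K = np.linalg.inv(sigma)
# Nodes 1 and 3 are nonadjacent, so [K]_{1,3} = 0, as Eq. (2.14) predicts.
assert abs(K[0, 2]) < 1e-12
print(np.round(K, 3))
```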
DECOMPOSITION COVARIANCE MATRIX
In our estimation, we use the following notation to determine the covariance matrix in a Gaussian graphical model. Assume X = (X_v)_{v \in V_p}, where V_p = \{1, \ldots, p\}, is a random vector with a p-dimensional multivariate normal distribution N_p(0, K^{-1}). Set a graph G = (V_p, E) where each vertex i \in V_p is paired with a variable X_i and E \subset V_p \times V_p is undirected; that is, (i, j) \in E if and only if (j, i) \in E. A Gaussian graphical model with conditional independence graph G is determined by constraining the off-diagonal elements of K that are not paired with any edge in G: if (i, j) \notin E, then X_i and X_j are conditionally independent given the remaining random variables. The concentration matrix K = (K_{ij})_{1 \le i, j \le p} is restricted to be a symmetric positive definite matrix with entries K_{ij} = 0 for all (i, j) \notin E. We use the G-Wishart distribution Wis_G(\delta, D) with density

p(K \mid \delta, D) = \frac{1}{I_G(\delta, D)} (\det K)^{(\delta-2)/2} \exp\left\{ -\frac{1}{2} \operatorname{tr}(KD) \right\}   (3.1)

with respect to Lebesgue measure, where I_G(\delta, D) is the normalizing constant (see [12]; [13]; [14]). If G is a complete graph, E = (V_p \times V_p), then Wis_G(\delta, D) is the ordinary Wishart distribution. We then use the Cholesky decomposition of a matrix K \in P_G, K = \Phi^T \Phi, where \Phi = \Psi T is an upper triangular matrix and D^{-1} = T^T T is the Cholesky decomposition of D^{-1}.
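For a complete graph the G-Wishart prior of Eq. (3.1) reduces to an ordinary Wishart distribution, which is available in scipy; the sketch below draws a positive definite concentration matrix this way. The parameter mapping shown is one common convention and an assumption on our part; a general Wis_G sampler for an incomplete graph is not part of scipy.

```python
import numpy as np
from scipy.stats import wishart

p = 3
D = np.eye(p)      # prior scale matrix
delta = 3          # G-Wishart shape parameter
# For a complete graph, Wis_G(delta, D) corresponds to a Wishart draw;
# here we take df = delta + p - 1 and scale D^{-1} (an assumption,
# conventions for the correspondence vary across the literature).
K = wishart.rvs(df=delta + p - 1, scale=np.linalg.inv(D), random_state=0)

# K is symmetric positive definite, as required for a concentration matrix.
assert np.all(np.linalg.eigvalsh(K) > 0)
```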
Estimation of the covariance matrix is a basic problem of multivariate statistics, related to signal processing, financial mathematics, pattern recognition and convex geometry computation. Take a sample of n independent points, X_1, \ldots, X_n, from the distribution. We then have the sample covariance matrix

\hat{\Sigma}_n = \frac{1}{n} \sum_{k=1}^{n} X_k X_k^T = \frac{1}{n} A^T A   (3.2)

where A is the random matrix whose rows are the X_k^T. We then estimate the covariance matrix to a rate of accuracy \varepsilon = 0.01, represented by the norm condition

\| \hat{\Sigma}_n - \Sigma \| \le \varepsilon   (3.3)

Assume X = (X_1, X_2, \ldots, X_n) is drawn from a multivariate Gaussian distribution N(0, \Sigma), where \Sigma is a regular (nonsingular) matrix. Then the likelihood function is

L(K) = (\det K)^{n/2} \operatorname{etr}\left( -\frac{1}{2} K w \right)   (3.4)

where w = \sum_{i=1}^{n} X_i X_i^T is the result of the sum and matrix multiplication. Thus, the likelihood equations of the Gaussian matrix can be formulated as

E[W] = n \Sigma   (3.5)

\hat{K}^{-1} = \frac{w}{n}   (3.6)

which follow from

\log L(K) = \frac{n}{2} \log(\det K) - \frac{1}{2} \operatorname{tr}(Kw)   (3.7)

\frac{\partial}{\partial k_{ij}} \log L(K) = \frac{n}{2} \frac{\partial}{\partial k_{ij}} \log(\det K) - \frac{1}{2} w_{ij}   (3.8)

\frac{\partial}{\partial K} \log(\det K) = K^{-1}   (3.9)

so that setting the derivative in Eq. (3.8) to zero yields Eq. (3.6).
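A small numerical sanity check of Eqs. (3.4)-(3.9), under the assumption that the maximizer is the sample covariance w/n: we compare the log-likelihood of Eq. (3.7) at K = (w/n)^{-1} against nearby symmetric perturbations. The variable names and the test covariance are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3
sigma_true = np.array([[2.0, 0.5, 0.0],
                       [0.5, 1.0, 0.3],
                       [0.0, 0.3, 1.5]])
X = rng.multivariate_normal(np.zeros(p), sigma_true, size=n)

w = X.T @ X                    # w = sum_i X_i X_i^T

def log_lik(K):
    """log L(K) = (n/2) log det K - (1/2) tr(K w), Eq. (3.7)."""
    return 0.5 * n * np.log(np.linalg.det(K)) - 0.5 * np.trace(K @ w)

K_hat = np.linalg.inv(w / n)   # Eq. (3.6): K_hat^{-1} = w / n

# No symmetric positive definite perturbation should raise the likelihood.
for _ in range(100):
    E = rng.normal(size=(p, p)) * 1e-2
    K_pert = K_hat + (E + E.T) / 2
    if np.all(np.linalg.eigvalsh(K_pert) > 0):
        assert log_lik(K_pert) <= log_lik(K_hat) + 1e-9
```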
DECOMPOSITION COVARIANCE MATRIX FOR UNDIRECTED GAUSSIAN GRAPH

Adopting the basic structure of variance component and time series problems, we suggest a linear covariance model represented as

\Sigma = a_1 U_1 + \cdots + a_q U_q   (4.1)

where U_1, \ldots, U_q are symmetric matrices and a_1, \ldots, a_q are unknown parameters, required to be such that the matrix \Sigma is positive definite. Eq. (4.1) represents a common formulation of time-series covariance models, mixed-linear models and graphical models. Specifically, for high dimension q, any covariance matrix can be denoted as

\Sigma = (\sigma_{ij}) = \sum_{i=1}^{p} \sum_{j=1}^{q} \sigma_{ij} U_{ij}   (4.2)

where U_{ij} is the matrix of dimension p \times q with element 1 at position (i, j) and 0 otherwise. For each column and row, we can set up the variance matrix with the following steps (a worked sketch follows Eq. (4.3) below).

Step 1. Set matrix x as the deviation form of X, where x = X - \frac{1}{n} \mathbf{1}\mathbf{1}' X.

Step 2. Calculate x'x, the result of the sum and matrix multiplication, of dimension p \times p, from matrix x.

Step 3. Divide each element of x'x by n. Thus, we determine the variance-covariance matrix of matrix x,

V = \frac{1}{n} x'x

where

\mathbf{1} : column vector of ones, of dimension n \times 1
x : deviation matrix of dimension n \times p
X : data matrix of dimension n \times p
V : covariance matrix of dimension p \times p
x'x : result of the sum and matrix multiplication
n : number of trials (rows) of matrix X

Thus, from the decomposed covariance matrix we determine the partial correlation of two random variables by

\rho_{ij} = \frac{k_{ij}}{\sqrt{k_{ii} k_{jj}}}   (4.3)
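A minimal numpy sketch of Steps 1-3 and Eq. (4.3); the function names and the random test data are ours. Note that numpy retains signs, while the tables in the results section list magnitudes.

```python
import numpy as np

def variance_covariance(X):
    """Steps 1-3: deviation matrix, x'x, then division by n."""
    n = X.shape[0]
    ones = np.ones((n, 1))
    x = X - (ones @ ones.T @ X) / n   # Step 1: x = X - (1/n) 1 1' X
    return (x.T @ x) / n              # Steps 2 and 3: V = x'x / n

def partial_correlations(V):
    """Eq. (4.3) applied to the concentration matrix K = V^{-1}."""
    K = np.linalg.inv(V)
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)         # rho_ij = k_ij / sqrt(k_ii k_jj)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))           # n = 6 trials, p = 3 variables
print(np.round(partial_correlations(variance_covariance(X)), 3))
```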
RESULTS

For illustration, we use given matrix data of dimension m \times n.

A. Matrix m \times m

Assume that there exists a 3 \times 3 matrix

A_{3 \times 3} = \begin{pmatrix} 1 & 4 & 1 \\ 2 & 1 & 4 \\ 3 & 3 & 1 \end{pmatrix}

Then, the covariance matrix of matrix A is determined by the definition x = X - \frac{1}{n}\mathbf{1}\mathbf{1}'X. Thus,

\operatorname{cov}(A) = \begin{pmatrix} 5 & 4 & 5 \\ 4 & 7 & 2 \\ 3 & 5 & 5 \end{pmatrix}

Because n = 3, for each element of the covariance matrix we have

\frac{1}{3} \operatorname{cov}(A) = \begin{pmatrix} 1.667 & 1.333 & 1.667 \\ 1.333 & 2.333 & 0.667 \\ 1.000 & 1.667 & 1.667 \end{pmatrix}

with the decomposition of the inverse matrix

\operatorname{inv}(A) = \begin{pmatrix} 1.172 & 0.234 & 1.266 \\ 0.656 & 0.469 & 0.469 \\ 0.047 & 0.609 & 0.890 \end{pmatrix}

and for each correlation of the two random variables, we determine

\rho_{12 \cdot 3} = \frac{0.2344}{\sqrt{1.1719 \times 0.4688}} = \frac{0.2344}{0.7412} = 0.3162

\rho_{13 \cdot 3} = \frac{0.0469}{\sqrt{1.1719 \times 0.8906}} = \frac{0.0469}{1.0216} = 0.0459

\rho_{23 \cdot 3} = \frac{0.4688}{\sqrt{0.4688 \times 0.8906}} = \frac{0.4688}{0.6461} = 0.7255
B. Matrix m \times n

Assume that there exists a 5 \times 3 matrix

A_{5 \times 3} = \begin{pmatrix} 4.0 & 2.0 & 0.60 \\ 4.2 & 2.1 & 0.59 \\ 3.9 & 2.0 & 0.58 \\ 4.3 & 2.1 & 0.62 \\ 4.1 & 2.2 & 0.63 \end{pmatrix}

for which the deviation matrix is determined by the definition x = X - \frac{1}{n}\mathbf{1}\mathbf{1}'X. Thus,

x = \begin{pmatrix} 0.1 & 0.08 & 0.004 \\ 0.1 & 0.02 & 0.014 \\ 0.2 & 0.08 & 0.024 \\ 0.2 & 0.02 & 0.016 \\ 0.0 & 0.12 & 0.026 \end{pmatrix}

Because n = 5, each element of x'x is divided by 5, and we have

\operatorname{cov}(A) = \frac{1}{5} x'x = \begin{pmatrix} 0.02 & 0.006 & 0.0014 \\ 0.006 & 0.0056 & 0.0011 \\ 0.0014 & 0.0011 & 0.0003 \end{pmatrix}

with the decomposition of the inverse matrix

\operatorname{inv}(A) = \begin{pmatrix} 76.1 & 55.3 & 136.2 \\ 55.3 & 492.8 & 1322.1 \\ 136.2 & 1322.1 & 7612.2 \end{pmatrix}

and for each correlation of the two random variables, we determine

\rho_{12 \cdot 3} = \frac{55.3}{\sqrt{76.1 \times 492.8}} = \frac{55.3}{193.65} = 0.285

\rho_{13 \cdot 3} = \frac{136.2}{\sqrt{76.1 \times 7612.2}} = \frac{136.2}{761.109} = 0.179

\rho_{23 \cdot 3} = \frac{1322.1}{\sqrt{492.8 \times 7612.2}} = \frac{1322.1}{1936.825} = 0.682
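The 5 \times 3 example can be re-checked in a few lines of numpy, following the steps of Section IV; numpy keeps the signs that the tables above omit, and the magnitudes agree up to the rounding used here.

```python
import numpy as np

A = np.array([[4.0, 2.0, 0.60],
              [4.2, 2.1, 0.59],
              [3.9, 2.0, 0.58],
              [4.3, 2.1, 0.62],
              [4.1, 2.2, 0.63]])
n = A.shape[0]

x = A - A.mean(axis=0)        # Step 1: deviations from the column means
V = (x.T @ x) / n             # Steps 2 and 3: cf. cov(A) above
K = np.linalg.inv(V)          # cf. inv(A) above
rho = K / np.sqrt(np.outer(np.diag(K), np.diag(K)))  # Eq. (4.3)

print(np.round(V, 4))
print(np.round(K, 1))
print(np.round(rho, 3))
```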
CONCLUSIONS
In this paper, we studied the decomposition of the covariance matrix for an undirected Gaussian graph. Estimation of covariance in a Gaussian distribution is a common problem in statistics. A Gaussian graphical model is a method that can be used to represent the independence structure among different random variables in a graph. This paper has examined the results of covariance estimation for an undirected Gaussian graph using a likelihood function, in order to obtain the concentration of each random variable. The resulting matrix decomposition can be used in solving problems related to signal transmission, pattern recognition and other concentration matrix problems.
REFERENCES
[1] Dobra, A., Hans, C., Jones, B., Nevins, J.R., Yao, G. and West, M. Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis, special issue on Multivariate Methods in Genomic Data Analysis, Vol. 90 (2004), pp. 196-212.
[2] Bilmes, J.A. Factored sparse inverse covariance matrices. In Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP) (2000), pp. 1009-1012.
[3] Bilmes, J.A. and Bartels, C. Graphical model architectures for speech recognition. IEEE Signal Processing Mag., Vol. 22 (2005), pp. 89-100.
[4] Willsky, A. Multiresolution Markov models for signal and image processing. Proc. IEEE, Vol. 90 (2002), pp. 1396-1458.
[5] Choi, M.J. and Willsky, A. Multiscale Gaussian graphical models and algorithms for large-scale inference. In Proc. IEEE 14th Workshop on Statistical Signal Processing (SSP) (2007), pp. 229-233.
[6] Cetin, M., Chen, L., Fisher, J.W., Ihler, A.T., Moses, R.L., Wainwright, M.J. and Willsky, A.S. Distributed fusion in sensor networks: a graphical models perspective. IEEE Signal Process. Mag., Vol. 23 (2006), pp. 42-55.
[7] Wiesel, A. and Hero, A.O. Decomposable principal component analysis. IEEE Trans. Signal Process., Vol. 57 (2009), pp. 4369-4377.
[8] Sudderth, E.B., Wainwright, M.J. and Willsky, A.S. Embedded trees: estimation of Gaussian processes on graphs with cycles. IEEE Trans. Signal Process., Vol. 52 (2004), pp. 3136-3150.
[9] Malioutov, D.M., Johnson, J.K. and Willsky, A.S. Walk-sums and belief propagation in Gaussian graphical models. J. Mach. Learn. Res., Vol. 7 (2006), pp. 2031-2064.
[10] Chandrasekaran, V., Johnson, J.K. and Willsky, A.S. Estimation in Gaussian graphical models using tractable subgraphs: a walk-sum analysis. IEEE Trans. Signal Process., Vol. 56 (2008), pp. 1916-1930.
[11] Wiesel, A. and Hero III, A.O. Distributed covariance estimation in Gaussian graphical models. IEEE Trans. Signal Process., Vol. 60 (2012), pp. 211-220.
[12] Roverato, A. Hyper inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scandinavian Journal of Statistics, Vol. 29 (2002), pp. 391-411.
[13] Atay-Kayis, A. and Massam, H. A Monte Carlo method for computing the marginal likelihood in nondecomposable Gaussian graphical models. Biometrika, Vol. 92 (2005), pp. 317-335.
[14] Letac, G. and Massam, H. Wishart distributions for decomposable graphs. The Annals of Statistics, Vol. 35 (2007), pp. 1278-1323.