Entropy and Information Theory by Robert M. Gray


Author: Robert M. Gray



Corollary 7.3.1: Given a stationary source p, suppose that for some K there exists a K-step Markov source m with distributions $M_{X^n} \gg P_{X^n}$, $n = 1, 2, \cdots$. Then for all k ≥ K, (7.8)–(7.10) hold.

Proof: If m is a K-step Markov source with the property $M_{X^n} \gg P_{X^n}$, $n = 1, 2, \cdots$, then it is also a k-step Markov source with this property for all k ≥ K. The corollary then follows from the theorem. $\Box$

Comment: The corollary implies that if any K-step Markov source dominates p on its finite-dimensional distributions, then for all k ≥ K the k-step Markov approximations $p^{(k)}$ also dominate p on its finite-dimensional distributions.

The following variational corollary follows from Theorem 7.3.1.

Corollary 7.3.2: For a fixed k let $\mathcal{M}_k$ denote the set of all k-step Markov distributions. Then $\inf_{M \in \mathcal{M}_k} D(P_{X^n}\|M)$ is attained by $P^{(k)}$, and
$$\inf_{M \in \mathcal{M}_k} D(P_{X^n}\|M) = D(P_{X^n}\|P^{(k)}_{X^n}) = \sum_{l=k}^{n-1} I_p(X_l; X^{l-k}|X^k_{l-k}).$$

Since the divergence can be thought of as a distance between probability distributions, the corollary justifies considering the k-step Markov process with the same kth order distributions as the original process to be the k-step Markov approximation or model of that process: it is the minimum divergence distribution meeting the k-step Markov requirement.
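As a concrete numerical illustration (a hypothetical sketch, not from the text), the following Python fragment checks the identity of Corollary 7.3.2 on a binary alphabet with n = 3 and k = 1: the divergence between an arbitrary joint distribution $P_{X^3}$ and its first-order Markov approximation equals $I_p(X_2;X_0|X_1)$, the only nonzero term of the sum in that case.

import itertools, math, random

# Hypothetical check of Corollary 7.3.2 (binary alphabet, n = 3, k = 1).
# Draw an arbitrary joint pmf P for (X0, X1, X2); its first-order Markov
# approximation is P1(x0, x1, x2) = P(x0, x1) P(x1, x2) / P(x1).
random.seed(0)
weights = {x: random.random() for x in itertools.product((0, 1), repeat=3)}
total = sum(weights.values())
P = {x: w / total for x, w in weights.items()}

def marg(pmf, idx):
    # Marginal pmf of the coordinates listed in idx.
    out = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

P01, P12, P1 = marg(P, [0, 1]), marg(P, [1, 2]), marg(P, [1])
P_markov = {x: P01[x[:2]] * P12[x[1:]] / P1[(x[1],)] for x in P}

# Left side: D(P_{X^3} || P^(1)_{X^3}).
D = sum(p * math.log(p / P_markov[x]) for x, p in P.items() if p > 0)

# Right side: the l = 1 term vanishes, leaving I_p(X2; X0 | X1).
I = sum(p * math.log(p * P1[(x[1],)] / (P01[x[:2]] * P12[x[1:]]))
        for x, p in P.items() if p > 0)

print(D, I)  # the two numbers agree up to floating-point rounding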

7.4 Stationary Processes

Several of the previous results simplify when the processes m and p are both stationary. We can consider the processes to be two-sided since, given a stationary one-sided process, there is always a stationary two-sided process with the same probabilities on all positive time events. When both processes are stationary, the densities $f_{X^n}$ and $f_{X_m^n}$ satisfy
$$f_{X_m^n} = \frac{dP_{X_m^n}}{dM_{X_m^n}} = \frac{dP_{X^n}}{dM_{X^n}}\,T^m = f_{X^n}T^m$$
and have the same expectation for any integer m. Similarly, the conditional densities $f_{X_n|X^n}$, $f_{X_k|X_{k-n}^n}$, and $f_{X_0|X_{-1},X_{-2},\cdots,X_{-n}}$ satisfy
$$f_{X_n|X^n} = f_{X_k|X_{k-n}^n}\,T^{n-k} = f_{X_0|X_{-1},X_{-2},\cdots,X_{-n}}\,T^n \qquad (7.20)$$

for any k and have the same expectation. Thus, combining this with the chain rule $H_{p\|m}(X^n) = \sum_{i=0}^{n-1} H_{p\|m}(X_i|X^i)$,
$$\frac{1}{n}H_{p\|m}(X^n) = \frac{1}{n}\sum_{i=0}^{n-1} H_{p\|m}(X_0|X_{-1},\cdots,X_{-i}). \qquad (7.21)$$
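To make (7.21) concrete, the following hypothetical Python sketch takes p to be a stationary binary Markov chain and m an i.i.d. reference measure with a strictly positive marginal (so the densities exist) and verifies the identity for n = 3; the transition matrix and marginals are invented for the example.

import itertools, math

# Hypothetical check of (7.21) for n = 3: p is a stationary binary Markov
# chain with transition probabilities A and stationary distribution pi,
# while m is i.i.d. with marginal q.
A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
pi = {0: 2 / 3, 1: 1 / 3}          # stationary: pi A = pi
q = {0: 0.5, 1: 0.5}

def p_joint(x):
    # Joint pmf of (X0, X1, X2) under the stationary chain p.
    prob = pi[x[0]]
    for a, b in zip(x, x[1:]):
        prob *= A[a][b]
    return prob

# Left side of (7.21): (1/n) H_{p||m}(X^n) by direct enumeration, n = 3.
lhs = sum(p_joint(x) * math.log(p_joint(x) / (q[x[0]] * q[x[1]] * q[x[2]]))
          for x in itertools.product((0, 1), repeat=3)) / 3

# Right side: the i = 0 term is D(pi || q); since p is first-order Markov and
# m is memoryless, the i = 1 and i = 2 conditional terms coincide.
d0 = sum(pi[a] * math.log(pi[a] / q[a]) for a in (0, 1))
d1 = sum(pi[a] * A[a][b] * math.log(A[a][b] / q[b])
         for a in (0, 1) for b in (0, 1))
rhs = (d0 + 2 * d1) / 3

print(lhs, rhs)   # the two sides agree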

Using the construction of Theorem 5.3.1 we have also that
$$D_i = H_{p\|m}(X_i|X^i) = H_{p\|m}(X_0|X_{-1},\cdots,X_{-i}) = D(P_{X_0,X_{-1},\cdots,X_{-i}}\|S_{X_0,X_{-1},\cdots,X_{-i}}),$$
where now
$$S_{X_0,X_{-1},\cdots,X_{-i}} = M_{X_0|X_{-1},\cdots,X_{-i}}\,P_{X_{-1},\cdots,X_{-i}}; \qquad (7.22)$$
that is,
$$S_{X_0,X_{-1},\cdots,X_{-i}}(F\times G) = \int_G M_{X_0|X_{-1},\cdots,X_{-i}}(F|x^i)\,dP_{X_{-1},\cdots,X_{-i}}(x^i); \qquad F\in\mathcal{B}_A,\ G\in\mathcal{B}_{A^i}.$$
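To see the construction (7.22) in a small finite case (a hypothetical sketch, not from the text), the Python fragment below takes i = 1 on a binary alphabet, builds $S_{X_0,X_{-1}} = M_{X_0|X_{-1}} P_{X_{-1}}$ from an invented source distribution and reference conditional, and checks that $D(P_{X_0,X_{-1}}\|S_{X_0,X_{-1}})$ reproduces $H_{p\|m}(X_0|X_{-1})$.

import math

# Hypothetical illustration of (7.22) for i = 1 on a binary alphabet.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}   # P(x_{-1}, x_0)
M = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}             # m(x_0 | x_{-1})

P_past = {a: sum(P[(a, b)] for b in (0, 1)) for a in (0, 1)}   # P_{X_{-1}}
S = {(a, b): M[a][b] * P_past[a] for (a, b) in P}              # S = M P

# D(P || S) over the pair (X_{-1}, X_0).
div = sum(p * math.log(p / S[x]) for x, p in P.items() if p > 0)

# H_{p||m}(X_0 | X_{-1}) = E_p[ log p(X_0|X_{-1}) / m(X_0|X_{-1}) ].
cond = sum(p * math.log((p / P_past[x[0]]) / M[x[0]][x[1]])
           for x, p in P.items() if p > 0)

print(div, cond)   # the two quantities coincide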

As before, the $S_{X^n}$ distributions are not in general consistent. For example, they can yield differing marginal distributions $S_{X_0}$. As we saw in the finite case, general conclusions about the behavior of the limiting conditional relative entropies cannot be drawn for arbitrary reference measures. If, however, we assume as in the finite case that the reference measures are Markov, then we can proceed.

Suppose now that under m the process is a k-step Markov process. Then for any n ≥ k, $(X_{-n},\cdots,X_{-k-2},X_{-k-1}) \to X_{-k}^k \to X_0$ is a Markov chain under m and Lemma 5.5.4 implies that

$$H_{p\|m}(X_0|X_{-1},\cdots,X_{-n}) = H_{p\|m}(X_k|X^k) + I_p(X_k;(X_{-1},\cdots,X_{-n})|X^k) \qquad (7.23)$$

and hence from (7.21)
$$\bar{H}_{p\|m}(X) = H_{p\|m}(X_k|X^k) + I_p(X_k;X^-|X^k), \qquad (7.24)$$
where $X^- = (X_{-1},X_{-2},\cdots)$ denotes the infinite past.

We also have, however, that $X^- \to X^k \to X_k$ is a Markov chain under m and hence a second application of Lemma 5.5.4 implies that
$$H_{p\|m}(X_0|X^-) = H_{p\|m}(X_k|X^k) + I_p(X_k;X^-|X^k).$$
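The decomposition behind (7.23) and (7.24) can likewise be checked numerically. The hypothetical Python sketch below takes k = 1, an arbitrary joint pmf for $(X_{-2},X_{-1},X_0)$ under p, and a first-order Markov reference conditional for m, and verifies that $H_{p\|m}(X_0|X_{-1},X_{-2}) = H_{p\|m}(X_0|X_{-1}) + I_p(X_0;X_{-2}|X_{-1})$, the Lemma 5.5.4 identity before the stationarity shift to time k; all numbers are invented.

import itertools, math, random

# Hypothetical check of the Lemma 5.5.4 decomposition with a first-order
# Markov reference m.  Coordinates of P are ordered (x_{-2}, x_{-1}, x_0).
random.seed(1)
w = {x: random.random() for x in itertools.product((0, 1), repeat=3)}
Z = sum(w.values())
P = {x: v / Z for x, v in w.items()}
M = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}   # m(x_0 | x_{-1})

def marg(pmf, idx):
    # Marginal pmf of the coordinates listed in idx.
    out = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

P01 = marg(P, [0, 1])   # P(x_{-2}, x_{-1})
P12 = marg(P, [1, 2])   # P(x_{-1}, x_0)
P1 = marg(P, [1])       # P(x_{-1})

# Left side: H_{p||m}(X_0 | X_{-1}, X_{-2}).
lhs = sum(p * math.log((p / P01[x[:2]]) / M[x[1]][x[2]])
          for x, p in P.items() if p > 0)

# Right side: H_{p||m}(X_0 | X_{-1}) + I_p(X_0; X_{-2} | X_{-1}).
h = sum(p * math.log((P12[x[1:]] / P1[(x[1],)]) / M[x[1]][x[2]])
        for x, p in P.items() if p > 0)
mi = sum(p * math.log(p * P1[(x[1],)] / (P01[x[:2]] * P12[x[1:]]))
         for x, p in P.items() if p > 0)

print(lhs, h + mi)   # the two sides agree up to rounding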



Download



Copyright Disclaimer:
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.