Home > Computers & Technology > Computer Science > AI & Machine Learning > Computer Vision & Pattern Recognition

Information theory, inference, and learning algorithms by David J C MacKay

Author:David J C MacKay , Date: March 26, 2020 ,Views: 291

Information theory, inference, and learning algorithms by David J C MacKay

Author:David J C MacKay
Language: eng
Format: epub
ISBN: 9780521642989
Published: 2003-08-31T00:00:00+00:00

1.2) when the received signal is r = (1, 1, 0) and the channel is a binary symf1

metric channel with flip probability 0.1. The factors f4 and f5 respectively enforce the constraints that x1 and x2 must be identical and that x2 and x3

Figure 26.1. The factor graph

must be identical. The factors f

associated with the function g(x)

1, f2, f3 are the likelihood functions con(26.4).

tributed by each component of r.

A function of the factored form (26.1) can be depicted by a factor graph, in which the variables are depicted by circular nodes and the factors are depicted by square nodes. An edge is put between variable node n and factor node m if the function fm(xm) has any dependence on variable xn. The factor graph for the example function (26.4) is shown in figure 26.1.

The normalization problem

The first task to be solved is to compute the normalizing constant Z. The marginalization problems

The second task to be solved is to compute the marginal function of any variable xn, defined by

Zn(xn) =

P ∗(x).

(26.5)

{xn }, n =n

For example, if f is a function of three variables then the marginal for n = 1 is defined by

Z1(x1) =

f (x1, x2, x3).

(26.6)

x2,x3

This type of summation, over ‘all the xn except for xn’ is so important that it can be useful to have a special notation for it – the ‘not-sum’ or ‘summary’. The third task to be solved is to compute the normalized marginal of any variable xn, defined by

Pn(xn) ≡

P (x).

(26.7)

{xn }, n =n

[We include the suffix ‘n’ in Pn(xn), departing from our normal practice in the rest of the book, where we would omit it.]

Exercise 26.1:[1] Show that the normalized marginal is related to the marginal Zn(xn) by

n(xn)

n(xn) =

(26.8)

We might also be interested in marginals over a subset of the variables, such as

Z12(x1, x2) ≡

P ∗(x1, x2, x3).

(26.9)

All these tasks are intractable in general. Even if every factor is a function of only three variables, the cost of computing exact solutions for Z and for c David J.C. MacKay. Draft 4.0. April 15, 2003

26.2: The sum-product algorithm

363

the marginals is believed in general to grow exponentially with the number of variables N .

For certain functions P ∗, however, the marginals can be computed efficiently by exploiting the factorization of P ∗. The idea of how this efficiency arises is well illustrated by the message-passing examples of chapter 16. The sum-product algorithm that we now review is a generalization of messagepassing rule-set B (p.263). As was the case there, the sum-product algorithm is only valid if the graph is tree-like.

26.2

The sum-product algorithm

Notation

We identify the set of variables that the mth factor depends on, xm, by the set of their indices N (m). For our example function (26.4), the sets are N (1) =

{1} (since f1 is a function of x1 alone), N (2) = {2}, N (3) = {3}, N (4) =

{1, 2}, and N (5) = {2, 3}. Similarly we define the set of factors in which variable n participates, by M(n). We denote a set N (m) with variable n excluded by N (m)\n.

Download

Information theory, inference, and learning algorithms by David J C MacKay.epub

Copyright Disclaimer:
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.

Categories

Computer Vision & Pattern Recognition	Expert Systems
Intelligence & Semantics	Machine Theory
Natural Language Processing	Neural Networks