# What is the difference between a (dynamic) Bayes network and an HMM?

I have read that HMMs, particle filters, and Kalman filters are special cases of dynamic Bayes networks. However, I only know HMMs, and I don't see how they differ from dynamic Bayes networks.

Could somebody please explain?

It would be nice if your answer could be structured like the following, but for Bayes networks:

## Hidden Markov Models

A Hidden Markov Model (HMM) is a 5-tuple $\lambda = (S, O, A, B, \Pi)$:

• $S \neq \emptyset$: a set of states (e.g. "beginning of phoneme", "middle of phoneme", "end of phoneme")
• $O \neq \emptyset$: a set of possible observations (audio signals)
• $A \in \mathbb{R}^{|S| \times |S|}$: a stochastic matrix which gives the probabilities $(a_{ij})$ of getting from state $i$ to state $j$.
• $B \in \mathbb{R}^{|S| \times |O|}$: a stochastic matrix which gives the probabilities $(b_{kl})$ of getting observation $l$ while in state $k$.
• $\Pi \in \mathbb{R}^{|S|}$: the initial distribution over the states.
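To make the 5-tuple concrete, here is a minimal sketch of a toy HMM in plain Python. The state names, observation names, and all probabilities are made-up assumptions for illustration only; `is_stochastic` is a hypothetical helper that checks the "stochastic matrix" requirement (every row is a probability distribution).

```python
import math

# Hypothetical toy HMM; every name and number here is an illustrative assumption.
states = ["begin", "middle", "end"]   # S
observations = ["low", "high"]        # O

# A[i][j]: probability of moving from state i to state j.
A = [
    [0.6, 0.4, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]

# B[k][l]: probability of emitting observation l while in state k.
B = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]

# Pi[i]: probability of starting in state i.
Pi = [1.0, 0.0, 0.0]


def is_stochastic(matrix):
    """Return True if every row is non-negative and sums to 1."""
    return all(
        all(p >= 0.0 for p in row) and math.isclose(sum(row), 1.0)
        for row in matrix
    )


print(is_stochastic(A), is_stochastic(B), is_stochastic([Pi]))
```

The zeros in `A` encode the fixed topology: this toy model can only move left to right through the three states.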

It is usually displayed as a directed graph, where each node corresponds to one state $s \in S$ and the transition probabilities are denoted on the edges.

Hidden Markov Models are called "hidden", because the current state is hidden. The algorithms have to guess it from the observations and the model itself. They are called "Markov", because for the next state only the current state matters.

For HMMs, you give a fixed topology (number of states, possible edges). Then there are 3 possible tasks:

• Evaluation: given an HMM $\lambda$, how likely is it to get the observations $o_1, \dots, o_t$? (Forward algorithm)
• Decoding: given an HMM $\lambda$ and observations $o_1, \dots, o_t$, what is the most likely sequence of states $s_1, \dots, s_t$? (Viterbi algorithm)
• Learning: learn $A, B, \Pi$ (Baum-Welch algorithm, which is a special case of expectation maximization).
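The evaluation and decoding tasks can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two-state model and its numbers are made up, observations are given as column indices into $B$, and there is no scaling or log-space arithmetic, so it would underflow on long sequences.

```python
# Hypothetical two-state, two-observation HMM (numbers are made up).
A = [[0.7, 0.3], [0.4, 0.6]]   # A[i][j]: transition i -> j
B = [[0.9, 0.1], [0.2, 0.8]]   # B[k][l]: state k emits observation l
Pi = [0.5, 0.5]                # initial state distribution


def forward(A, B, Pi, obs):
    """Evaluation: return P(o_1, ..., o_t | lambda)."""
    n = len(Pi)
    # alpha[i] = P(o_1, ..., o_t, s_t = i) for the current t.
    alpha = [Pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def viterbi(A, B, Pi, obs):
    """Decoding: return (most likely state sequence, its joint probability)."""
    n = len(Pi)
    # delta[i] = probability of the best path that ends in state i.
    delta = [Pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        backpointers.append(step)
        delta = new_delta
    # Follow the backpointers from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


obs = [0, 1, 1]
print(forward(A, B, Pi, obs))
print(viterbi(A, B, Pi, obs))
```

Both functions share the same recursion over time; the forward algorithm sums over predecessor states where Viterbi takes the maximum.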

## Bayes networks

Bayes networks are directed acyclic graphs (DAGs) $G = (\mathcal{X}, \mathcal{E})$. The nodes represent random variables $X \in \mathcal{X}$. For every $X$, there is a probability distribution which is conditioned on the parents of $X$:

$P\left(X|\text{parents}\left(X\right)\right)$
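
Taken together, these local distributions define the joint distribution over all variables through the standard factorization for Bayes networks (the indexing $X_1, \dots, X_n$ of the nodes is notation introduced here for illustration):

$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \text{parents}(X_i))$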

• Inference: Given some variables, get the most likely values of the other variables. Exact inference is NP-hard in general. For approximate inference, you can use MCMC.
• Learning: How you learn those distributions depends on the exact problem (source):

• known structure, fully observable: maximum likelihood estimation (MLE)
• known structure, partially observable: Expectation Maximization (EM) or Markov Chain Monte Carlo (MCMC)
• unknown structure, fully observable: search through model space
• unknown structure, partially observable: EM + search through model space
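
For the simplest case in this list (known structure, fully observable), maximum likelihood estimation reduces to counting relative frequencies. The sketch below assumes a hypothetical two-node network `rain → wet_grass` with boolean variables and made-up samples; it is not taken from the linked source.

```python
from collections import Counter

# Hypothetical fully observed samples of (rain, wet_grass); the data is made up.
samples = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
    (False, False), (False, False),
]


def mle_prior(samples):
    """MLE of P(rain = True): the relative frequency of rain."""
    return sum(1 for rain, _ in samples if rain) / len(samples)


def mle_cpt(samples):
    """MLE of P(wet_grass = True | rain) for each observed value of rain."""
    totals = Counter(rain for rain, _ in samples)
    wet = Counter(rain for rain, wet_grass in samples if wet_grass)
    return {rain: wet[rain] / totals[rain] for rain in totals}


print(mle_prior(samples))
print(mle_cpt(samples))
```

With hidden variables these counts are no longer available directly, which is where EM comes in: it replaces them with expected counts.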

## Dynamic Bayes networks

I guess dynamic Bayes networks (DBNs) are also directed probabilistic graphical models. The variability seems to come from the network changing over time. However, it seems to me that this is equivalent to simply copying the same network and connecting every node at time $t$ with the corresponding node at time $t+1$. Is that the case?

1. You can also learn the topology of an HMM. 2. When doing inference with BNs, besides asking for maximum likelihood estimates, you can also sample from the distributions, estimate the probabilities, or do whatever else probability theory lets you. 3. A DBN is just a BN copied over time, with some (not necessarily all) nodes chained from the past to the future. In this sense, an HMM is a simple DBN with just two nodes in each time slice and one of the nodes chained over time.
KT.

I asked someone about this and they said: "HMMs are just special cases of dynamic Bayes nets, with each time slice containing one latent variable, dependent on the previous one to give a Markov chain, and one observation dependent on each latent variable. DBNs can have any structure that evolves over time."
ashley
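
The picture described in the comments above (a BN copied over time, with some nodes chained across slices) can be sketched as follows. `unroll` is a hypothetical helper written for this illustration, not part of any library, and the variable names `S` and `O` are assumptions standing for the hidden state and the observation.

```python
def unroll(intra_edges, inter_edges, num_slices):
    """Unroll a two-slice DBN template into a plain Bayes network.

    intra_edges: (parent, child) pairs within one time slice.
    inter_edges: (parent, child) pairs from slice t to slice t + 1.
    Returns a list of edges between (variable_name, time) nodes.
    """
    edges = []
    for t in range(num_slices):
        edges += [((u, t), (v, t)) for u, v in intra_edges]
        if t + 1 < num_slices:
            edges += [((u, t), (v, t + 1)) for u, v in inter_edges]
    return edges


# An HMM as a DBN: the hidden state S is chained over time,
# and each observation O depends only on the state in its own slice.
hmm_edges = unroll(
    intra_edges=[("S", "O")],
    inter_edges=[("S", "S")],
    num_slices=3,
)
for edge in hmm_edges:
    print(edge)
```

A richer DBN would simply pass more variables and more edges per slice to the same function; the HMM is the case with one hidden and one observed variable.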

Answers:
