HMM的假设:
给定观测序列O=o1o2...oT和模型λ=(π,A,B),求P(O|λ),即给定模型下观测序列的概率是多少?
P(O|λ)=∑_I P(O,I|λ)=∑_I P(I|λ).P(O|λ,I)
=∑_i1,i2,...,iT P(i_1,i_2,...,i_T|λ).P(o_1,o_2,...,o_T|i_1,i_2,...,i_T,λ)
=∑_i1,i2,...,iT P(i_1|λ).P(i_2|λ,i_1),P(i_3|λ,i_1,i_2)...P(i_T|λ,i_1,...,i_T).P(o_1|i_1,λ).P(o_2|i_2,λ)...P(o_T|i_T,λ)
=∑_i1,i2,...,iT π_i1.a_i1i2.a_i2i3...a_iT-1iT.b_i1(o_1).b_i2(o_2)...b_iT(o_T)
定义:
α_t(i)=P(o_1,o_2,...,o_T,i_t=q_i|λ), t=1,2,...,T
α_1(i)=P(o_1,i_1=q_i|λ)=P(i_1=q_i|λ).P(o_1|λ,i_1=q_i)=π_i.b_i(o_1)
已知α_t(i),t=1,2,...,T-1,则
α_t+1(i+1)=P(o_1,o_2,...,o_t,o_t+1,i_t+1=q_i|λ)
=∑_j P(o_1,o_2,...,o_t,i_t=q_j).a_ji.b_i(o_t+1)
=∑_j α_t(j).a_ji.b_i(o_t+1)
则P(O|λ)=∑_i α_T(i)
小例子
# Initialise HMM
library(HMM)
hmm = initHMM(c("A","B"), c("L","R"), transProbs=matrix(c(.8,.2,.2,.8),2),
emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
# Sequence of observations
observations = c("L","L","R","R")
# Calculate forward probablities
logForwardProbabilities = forward(hmm,observations)
print(exp(logForwardProbabilities))
#或者得到hmm和observations之后,直接用forward函数得到上述结果
exp(forward(hmm,observations))#forward
#exp(backward(hmm,observations))#backward
给定观测序列O=o1o2...oT和模型λ=(π,A,B),求出最大概率的隐藏序列I=i1,i2,...,iT?
I=argmax_I P(I|λ,O)
=argmax_I P(I|λ,O).P(O|λ)
=argmax_I P(I,O|λ)=argmax_i1,i2,...,iT P(i_1,i_2,...,i_T,o_1,o_2,...,o_T|λ)
i_tϵ{q1,q2,...,qN}={1,2,...,N}, t=1,2,...,T
定义:δ_t(i)为从时刻1到时刻t,且在时刻t的状态取i的所有路径中概率的最大值
即δ_t(i)=max_i1,i2,...,it-1 P(i_1,i_2,...,i_t-1,i_t=i,o_1,o_2,...,o_t|λ),i=1,2,...,N
则有δ_T(i)=max_i1,i2,...,iT-1 P(i_1,i_2,...,i_T-1,i_T=i,o_1,o_2,...,o_T|λ)
则max_I P(I,O|λ)=max_1≤i≤N δ_T(i)
如何用动态规划——维特比(Viterbi)算法求δ_T(i)?
δ_T-1(i)→δ_T(i):δ_T(i)=max_j δ_T-1(j).a_ji.b_i(o_T)
对t=1,2,...,T-1,
δ_t+1(i)=max_1≤j≤N δ_t(j).a_ji.b_i(o_t+1),t=1,2,...,N
δ_1(i)=P(i_1=i,o_1|λ)=π_i.b_i(o_1),t=1,2,...,N
δ_1(i)→δ_2(i)→...→δ_T(i)→max_1≤i≤N δ_T(i)
(i_T)^* =arg max_1≤i≤N δ_T(i)
(i_T-1)^ =arg max_1≤j≤N δ_T-1(j).a_j,(i_t)^ .b_(i_t)^* (o_t)
=arg max_1≤j≤N δ_T-1(j).a_j,(i_t)^
......
(i_t)^ =arg max_1≤j≤N δ_t(j).a_j,(i_t+1)^
最佳路径为(i_1)^ ,(i_2)^ ,...,(i_T-1)^ ,(i_T)^
小例子
# Initialise HMM
hmm = initHMM(c("A","B"), c("L","R"), transProbs=matrix(c(.6,.4,.4,.6),2),
emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
# Sequence of observations
observations = c("L","L","R","R")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)
#直接用viterbi函数实现上述结果
viterbi(hmm,observations)
给定观测序列O=o1o2...oT,求λ=(π,A,B)?
λ_MLE=arg max_λ P(O|λ)=arg max_λ ∑_I P(O,I|λ)=arg max_λ ∑_I P(I|λ).P(O|λ,I)
EM :含有隐变量的最大似然估计问题
data : O=o1,o2,...,oT
latent varible :I=i1,i2,...,iT
完全数据 :(O,I)=(o1,o2,...,oT,i1,i2,...,iT)
参数 :λ=(π,A,B)
E-step :Q(λ,λ^t)=E_I|λ^t,o logP(O,I|λ)
M-step :π^t+1=arg max_π Q(λ,λ^t)
∴πi^ =α1(i)β1(i)/∑α1(i)β1(i),i=1,2,...,N
小例子
# Initial HMM
hmm = initHMM(c("A","B"),c("L","R"),
transProbs=matrix(c(.9,.1,.1,.9),2),
emissionProbs=matrix(c(.5,.51,.5,.49),2))
print(hmm)
# Sequence of observation
a = sample(c(rep("L",100),rep("R",300)))
b = sample(c(rep("L",300),rep("R",100)))
observation = c(a,b)
# Baum-Welch
bw = baumWelch(hmm,observation,10)
print(bw$hmm)