The Kalman filter implements belief computational for continuous state. It is not applicable to discrete or hybrid state spaces. The filter represents belief by moments parameterization. At time t, the belief is represented by the mean μt and the covariance Σt. Posteriors are Gaussian if the following three properties hold in addition to Markov assumption.
1. Linear Transition
The state transition probability p(xt∣ut,xt−1) must be a linear function in its arguments with added Gaussian noises.
xt=Atxt−1+Btut+ϵt(1)
The state and control are vectors. In our notation, they are vertical vectors.
Here At and Bt are matrices. At is of size n by n and Bt is of size n by m. The dimension of state vector is n and the dimension of control vector is m. The random variable ϵt has the same dimension as the state vector. Its mean is zero and its covariance will be denoted as Rt. It is there to modifies the uncertainty introduced by the state transition. The mean of the posterior state is given by equation 1 and covariance by Rt.
The measurement probability p(zt∣xt) must also be linear with added Gaussian noise.
zt=Ctxt+δt(2)
Here Ctis a matrix of size k by n where k is the dimension of the measurement vector. The δ describes the measurement noise. The distribution of noise is a multivariate Gaussian with zero mean and covariance of Qt.
Given arguments μt−1, Σt−1, ut, and zt, we have the following update rules.
μt=Atμt−1+BtutΣt=AtΣt−1At−1T+Rt(3a)
The predicted belief μt and Σt are calculated to represent the belief bel(xt), one time step later, but before incorporating the measurement zt.
Kt=ΣtCtT(CtΣCtT+QT)−1(3b)
Before we perform the measurement update, we need to compute Kalman gain from equation 3b, which specifies the degree to which the measurement is incorporated into the new state estimate. Then we use the gain to get the new state.
μt=μt+Kt(zt−Ctμt)Σt=(I−KtCt)Σt(3c)
The key concept here is innovation, which is the difference between the actual measurement and expected measurement, denoted by zt−Ctμt .
Code Example
For simplicity sake, I will omit the time dependence for transformation matrices. Let's define three matrices A, B, and C using numpy . Note that A is the state transition model or function, B is the control input model, and C is the observation model.
If we were to express them in matrix form, they would look like the following.
A=1.00.01.01.0B=1.00.00.01.0C=1.00.0
Suppose our robot is at coordinate (0.0,0.0) initially and we don't apply any external control to it. Let's denote x to be μ, x_cov to be Σ, u to be u, and u_cov to be R.