The Probability Hypothesis Density Filter
A. Taylan Cemgil
Signal Processing and Communications Lab.
7 December 2007 NIPS 07 Workshop
Approximate Bayesian Inference in Continuous/Hybrid Systems
Sinusoidal Tracking
with Clark, Peeling and Godsill0 50 100 150 200 250 300 350
−500 0 500 1000 1500 2000 2500 3000 3500 4000
−0.5 0 0.5 1 1.5 2 2.5 3
x 104
Frequency (Hz)
Time Tracking Results
Amplitude
Outline
• Introduction to Multi Object Tracking
• The Probability Hypothesis Density Filter – A toy model
– A short summary of point process theory
• Summary
Stochastic Dynamical System
• Generic Model
xk ∼ p(xk|xk−1) Transition Model yk ∼ p(yk|xk) Observation Model
• Examples: Hidden Markov Model, Kalman filter
x0 x1 . . . xk−1 xk . . . xK
y1 . . . yk−1 yk . . . yK
• Observations yk ∈ Y are projections of the latent states xk ∈
• Exact inference possible only for few cases
Tracking - Filtering
time
state
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1
time
observation
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1
x0 x1 . . . xk−1 xk . . . xK
y y y y
Tracking - Kalman Filtering
0 1 2 3 4 5
0 0.5 1 1.5
p(y k|y 1:k−1)
Phase
Period
• Propagate exact sufficient statistics of the filtering density
A harder scenario – clutter and missing detections
time
state
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1
time
observation
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1 1.5 2
• At each time k, we observe nk observations, at most one corresponding to the true target
A harder scenario – clutter and missing detections
x0 x1 . . . xk−1 xk . . . xK
y11 . . . y1k−1 yk1 . . . yK1
... ... ... ...
y1N . . . yNk−1 ykN . . . yKN
a0 a1 . . . ak−1 ak . . . aK
• The latent discrete switch variables ak denote which of the observations is the true one
Inference in Switching State Space models
• Unlike HMM’s or KFM’s, summing over indicators ak does not simplify the filtering density.
• Number of Gaussian kernels to represent exact filtering density increases exponentially
−7.9036 6.6343
0.76292
−10.3422
−10.1982
−2.393
−2.7957
−0.4593
Approximate Inference
• Sequential Monte Carlo (Particle Filtering)
– Sample branches with a probability propotional to the evidence, Mixture Kalman filter (Chen and Liu 2001)
• Deterministic Approximations
– Assumed density filter (ADF) : Project the filtering density by moment matching onto a tractable family,
• See “Bayesian inference in dynamic models – an overview” by Tom Minka
An even harder scenario
time
state
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1
time
observation
10 20 30 40 50 60 70 80 90 100
10 20 30 40
50 0
0.5 1 1.5 2
• At each time slice, a chain can cease to exist, or a new chain is born
• Clutter and misdetections
An even harder scenario
x10 x11 . . . x1k−1 x1k . . . x1K
... ... ... ...
xM0 xM1 . . . xMk−1 xMk . . . xMK
y11 . . . yk−11 yk1 . . . y1K
... ... ... ...
y1N . . . yk−1N yNk . . . yKN
a0 a1 . . . ak−1 ak . . . aK
• Factorial dynamic model with changing number of latent chains (not modeled here)
• Association problem: combinatorial explosion in the number of switch variables
• Multi-hypothesis tracker (MHT), Joint Probabilistic Data Association filter (JPDA), Harmonic analysis on finite groups (Kondor, Howard, Jebara (2007))
time
observation
10 20 30 40 50 60 70 80 90 100
10
20
30
40
50 0
0.5 1 1.5 2
p_sur = 0.98; % Survival probability b = 0.01; % Birth intensity p_det = 0.5; % Detection probability c = 1/30; % Clutter intensity lam0 = 1; % Prior inensity
time
state
10 20 30 40 50 60 70 80 90 100
10
20
30
40
50
0.2 0.4 0.6 0.8 1
time
state
10 20 30 40 50 60 70 80 90 100
10
20
30
40
50 0
0.2 0.4 0.6 0.8 1
A simplified Model
• Suppose we just want to track the number of objects sk
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
mk Survive Birth
Detect
Clutter
• We want to design a filter to track the number of objects, i.e. want to compute sequentially the filtering density p(sk|y1:k)
Notation
• Poisson Distribution
PO(s; λ) = e
−λs! λ
s• Binomial distribution – Number of succesful outcomes in n independent trials with success probability π
BI(s; n, π) = n s
π
s(1 − π)
n−sBasic Model
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
mk
Survive Birth Detect
Clutter
Survive s¯k|sk−1 ∼ BI(¯sk; sk−1, πsur)
Birth sk = ¯sk + vk vk ∼ PO(vk; b) Detect sˆk|sk = BI(ˆsk; sk, πdet)
Observe in Clutter yk|ˆsk = ˆsk + ek ek ∼ PO(ek; c)
Realisation from the process
p_sur = 0.9; % Survival probability b = 3; % Birth intensity
p_det = 0.5; % Detection probability c = 20; % Clutter intensity
lam0 = 10; % Prior inensity
0 10 20 30 40 50 60 70 80 90 100
10 20 30 40 50
Time
Number of Objects
Obs y
t
True s
t
Superposition of Poisson random variables
s ∼ PO(s; λ) e ∼ PO(e; ν) y = s + e
p(y) = PO(s; λ + ν)
Thinning Poisson Random variables
• Suppose we have n objects. Each object survives independently with probability π.
s|n ∼ BI(s; n, π) n ∼ PO(n; λ)
p(s) = X
n
BI(s; n, π)PO(n; λ) = PO(s; λπ)
Observing the sum of two Poisson Random variables
s ∼ PO(s; λ) e ∼ PO(e; ν) y = s + e
s
e
0 5 10 15 20
0 2 4 6 8 10 12 14 16 18 20
p(s|y) = BI(s; y, λ/(λ + ν))
Moment matching
λ∗ = argmin
λ
KL(BI(s; m, π)||PO(s; λ)) = mπ
0 5 10 15 20
0 0.05 0.1 0.15 0.2 0.25 0.3 0.35
s
Binom(s; y, λ/(λ + ν)) Poisson(s; y λ/(λ + ν))
Probability generating functions (z-transforms)
• For a discrete random variable n ∼ p(n)
G(z) =
∞
X
n=0
p(n)zn
G(1) = 1 G′(1) = hni
BI(s; N, π) ⇔ GBI(z) = (1 − π + πz)N
G′BI(z) = N π(1 − π + πz)N−1 ⇒ hsi = N π
PO(s; λ) ⇔ GPO(z) = exp(λ(z − 1)) G′PO(z) = λ exp(λ(z − 1)) ⇒ hsi = λ
First moment approximation to BI.
Sketch of the derivation of the ADF
• Start : p(s
k−1|m
1:k−1) = PO(s
k−1; λ
k−1|k−1)
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
mk
Survive Birth Detect
Clutter
• Prediction Step (Survive + Birth) :
p(s
k|m
1:k−1) = PO(s
k; λ
k|k−1)
λ = b + π λ
Sketch of the derivation of the ADF
• Update and Project Step (Observe in Clutter) (πdet = 1)
p(sk|m1:k) ∝ BI(sk; mk; λk|k−1/(c + λk|k−1))
≈ PO(sk; λk|k)
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
mk
Survive Birth Detect
Clutter
λ
k|k= m
kλ
k|k−1c + λ
k|k−1Sketch of the derivation of the ADF
• Update and Project Step (Observe in Clutter) (πdet < 1) p(sk|m1:k) ∝ PO(sk; λk|k)
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
mk
Survive Birth Detect
Clutter
λ
k|k= (1 − π
det)λ
k|k−1+ m
kπ
detλ
k|k−1Assumed Density Filter
λ
k|k−1= b + π
surλ
k−1|k−1λ
k|k= (1 − π
det)λ
k|k−1+ m
kπ
detλ
k|k−1c + π
detλ
k|k−10 10 20 30 40 50 60 70 80 90 100
10 20 30 40 50
Observations True number Filt. Estimate
Point Process, Definition
• A random countable set
• Realizations are from the state space
X
∪=
∞
[
n=0
X
nX
0≡ ∅
• Defined by restrictions P(n) on Xn, where the probability observing n points in A is denoted by
P(n)(A × · · · × A)
Point Process
P(0) P(1)(dx1) P(2)(dx1, dx2)
. . .
X0 X1 X2
• All P(n) are symmetric, e.g. P(2)(dx1, dx2) = P(2)(dx2, dx1)
• Normalisation
∞
X P(n)(Xn) = 1
Realisations from a point process
• Choose a set A
• Generate the number of points in A
N
A∼ p(n) = P
(n)(A
n)
P(0) P(1)(dx1) P(2)(dx1, dx2)
Realisations from a point process
• Generate the coordinates from the joint distribution
(x
1, . . . , x
n) ∼ P
(n)(dx
1, . . . , dx
n)/P
(n)(A
n)
−2 −1 0 1 2
−2
−1.5
−1
−0.5 0 0.5 1 1.5 2
−2 −1 0 1 2
−2
−1.5
−1
−0.5 0 0.5 1 1.5 2
Poisson Point Process
• For any A ⊂ X , we denote a poisson point process by X ∼ PP
A(X; λ) X ⊂ A
• The number of points to be observed in A is distributed by
|X ∩ A| = N
A∼ PO(N
A; Λ
A)
Intensity function λ(x) : X → R
+Intensity measure Λ =
Z
λ(x)dx
Sampling from a Poisson Point Process
• Each P
(n)is fully factorised
P
(n)∝
n
Y
i=1
λ(x
i)
P(0) P(1)(dx1) P(2)(dx1, dx2)
. . .
• Need to only represent λ(x), a positive function
Sampling from a Poisson Point Process
• Generate number of points on A
n ∼ P
(n)(A
n) = PO(n; Λ
A) = exp(−Λ
A) n! Λ
nA• For i = 1 . . . n, generate the coordinates independently from
x
i∼ λ(x)/Λ
ASuperposition of Poisson Processes
X
i∼ PP
A(X
i; λ
i(x)) i = 1 . . . L
X ∼ PP
A(X; X
i
λ
i(x))
X = X1 ∪ X2 ∪ · · · ∪ XL
Thinning a Poisson Process
X ∼ PP
A(X; λ(x))
Each point x
i∈ X survives with probability 0 < π
sur(x
i) < 1
X
thin∼ PP
A(X; π
sur(x)λ(x))
Displacement of a Poisson Process
• X
t∼ PP
At(X
t; λ(x
t))
• Suppose each point independently moves according to a transition kernel p(x
t+1|x
t)
X
t+1∼ PP
At+1(X
t+1; λ(x
t+1)) λ(x
t+1) =
Z
A
dx
tp(x
t+1|x
t)λ(x
t)
Partially observing a Poisson Point Process
X ∼ PPA(X; λ(x)) C ∼ PPA(C; c(x)) Y = X ∪ C
• The posterior process p(X|Y ) is in general not a Poisson process
• Find the Probability generating functional of the posterior process
• Find the first (functional) derivative and calculate the first moment
• Moment matching : This gives the intensity function of the “nearest” Poisson process in the KL sense
Probability generating functional
Daley and Vere-Jones 2003• Generalises the probability generating function
G[z] =
∞
X
n=0
Z
Xn
PX(n)(dx1, . . . , dxn)z(x1) · · · z(xn)
• The functional derivative
G(1)[z; ζ] = lim
ǫ→0
G[z + ǫζ] − G[z]
ǫ
• The intensity is recovered by
ΛA = lim
z→1G(1)[z; IA]
Probability Hypothesis Density Filter – The intensity recursion
• Predict
λt|t−1(xt) = bt(xt) + πsur(xt)R dxt−1p(xt|xt−1)λt−1(xt−1)
• Update
λt(xt) = (1 − πdet(xt))λt|t−1(xt) + P
yt∈Yt
πdet(xt)p(yt|xt)λt|t−1(xt)
ct(yt) + R dx′πdet(x′)p(yt|x′)λt|t−1(x′)
sk−1 s¯k sk s¯k+1 . . .
ˆ sk
Survive Birth Detect
Clutter
time
observation
10 20 30 40 50 60 70 80 90 100
5 10
15
20
25 30
35
40
45
50 0
0.5 1 1.5 2 2.5 3 3.5 4
p_sur = 0.98; % Survival probability b = 0.01; % Birth intensity p_det = 0.5; % Detection probability c = 1/3; % Clutter intensity lam0 = 1; % Prior inensity
time
state
10 20 30 40 50 60 70 80 90 100
5 10
15
20
25 30
35
40
45 50
0.2 0.4 0.6 0.8 1 1.2 1.4
time
state
10 20 30 40 50 60 70 80 90 100
5 10
15
20
25 30
35
40
45
50 0
0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
Summary
• The PHD filter is an ADF
– Retains only the first moment (the intensity)
– A second moment approximation exists but is costly (Singh et. al. 2007)
• The PHD filter does not solve the association problem – it is a bypass – Tracks the intensity field over time
• Implemented in practice via sequential Monte Carlo
– A stochastic approximation to a variational approximation
• Future ideas
– A PHD Smoother is not known
Bibliography
D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes, volume I: Springer, New York, second edition, 2003.
R. P. S. Mahler. Multitarget bayes filtering via first-order multitarget moments.
IEEE Transactions on Aerospace and Electronic Systems, October 2003.
S.S. Singh, B-N Vo., A. Baddeley, and S. Zuyev. Filters for spatial point processes.
Technical Report CUED/F-INFENG/TR.591, University of Cambridge, 2007.
B. Vo, S. Singh, and A. Doucet. Sequential Monte Carlo methods for multitarget filtering with random finite sets. IEEE Transactions on Aerospace and Electronic Systems, October 2005.
D. Clark, A. T. Cemgil, P. Peeling, and S. Godsill. Multi-object tracking of sinusoidal components in audio with the gaussian mixture probability hypothesis density filter. Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2007.