Use of Interpolating Functions in Fast State Estimation for Dynamic Systems with Missing Observations†

by KERIM DEMIRBAŞ

Department of Electrical Engineering and Computer Science (M/C 154),

University of Illinois at Chicago, Chicago, IL 60680, U.S.A.

I. Introduction

For the last three decades, researchers have extensively treated recursive state

estimation of the classical dynamic models (or systems) (1-4). These classical

models must be linear functions of the disturbance noise and (additive) observation

noise, but they can be nonlinear functions of the state; and observations are

assumed to be available for all times within a considered time interval. Proposed

classical estimation schemes, such as the (extended) Kalman filter, have had many

applications in the areas of aerospace and electronic systems (5), and economics.

Recursive state estimation has also been considered for dynamic models which are nonlinear functions of the state, disturbance noise and observation noise, and whose observations are assumed to be available for all times within a considered time interval (6-8). The proposed schemes have also been applied to practical systems (9). These schemes prevent the state estimate divergence caused by the model linearization errors which are introduced by classical estimation schemes, such as the extended Kalman filter, for state estimation of nonlinear dynamic models (9).

Recursive state estimation has recently been considered for nonlinear dynamic models with missing observations in a considered time interval, in which the Viterbi decoding algorithm is used (10). The implementation of the proposed scheme requires an exponentially increasing memory with time, which makes state estimation impractical for a long time interval.

In this paper, to overcome the exponentially increasing memory requirement of the scheme proposed in (10), a stack sequential decoding algorithm of Information Theory is used for state estimation of dynamic systems with missing observations. The proposed scheme is faster and more practical than the scheme in (10). The performance of the proposed scheme is also discussed.

†This work was carried out while the author was visiting Bilkent University, Ankara, Turkey.

II. Problem Statement

Consider the closed time interval [0, L], where L is a positive integer. Let A be the set of all discrete times in [0, L], and STOM be a subset of the discrete times in the open interval (0, L), that is, STOM ⊂ A ∩ (0, L). It is assumed that the observations at the times in STOM are missing. In other words, STOM is the set of times at which the observations are missing.

This paper deals with state estimation of nonlinear discrete dynamic models which are defined by

$$x(k+1) = f(k, x(k), w(k)) \qquad \text{the state model} \tag{1}$$

$$z(k) = g(k, x(k), v(k)) \qquad \text{the observation model,} \tag{2}$$

where $k$ denotes the discrete time; $w(k)$ is an $n \times 1$ zero mean disturbance noise vector at time $k$ with known statistics; $x(0)$ is an $m \times 1$ initial state vector with known statistics; $x(k)$, $k > 0$, is an $m \times 1$ state vector at time $k$; $v(k)$ is an $r \times 1$ zero mean observation noise vector at time $k$ with known statistics; $z(k)$ is an $s \times 1$ observation at time $k$; $f(k, x(k), w(k))$ and $g(k, x(k), v(k))$ are given functions which define the state at time $k+1$ and the observation at time $k$ in terms of the state, disturbance noise and observation noise at time $k$; and the initial state and all samples of the disturbance noise and observation noise are independent.

The interest is to estimate the states from time zero to time $L$, denoted by $X^L \triangleq \{x(0), x(1), \ldots, x(L)\}$, by using the set of available observations, denoted by $Z \triangleq \{z(l) : l \in \mathrm{STOA}\}$, where STOA is the set of times at which the observations are available, that is, STOA = A - STOM.

III. Estimation Scheme

The state model is first approximated by a time varying finite state machine (or

model). This finite state model is represented by a trellis diagram. Then the states

are estimated by using a stack sequential decoding algorithm.

The finite state model which approximates the state model is defined by

$$x_q(k+1) = Q(f(k, x_q(k), w_d(k))), \tag{3}$$

where $w_d(k)$ is a discrete disturbance noise vector which approximates the disturbance noise vector $w(k)$, and the possible values of $w_d(k)$ are denoted by $w_{d1}(k), w_{d2}(k), \ldots$ (where the first subscript $d$ stands for the discrete random vector and the second subscript shows the label of the possible value of the discrete random vector) (6); $x_q(0)$ is an initial discrete random vector which approximates the initial state vector $x(0)$, and the possible values of $x_q(0)$ are denoted by $x_{q1}(0), x_{q2}(0), \ldots$, which are called the initial quantization levels or the quantization levels at time zero; $x_q(k)$ is the quantized state at time $k$, and the quantization levels of $x_q(k)$ are denoted by $x_{q1}(k), x_{q2}(k), \ldots$ (where the first subscript $q$ denotes the quantized state and the second subscript shows the label of the quantization level); and $Q(\cdot)$ is the quantizer defined in (6). This quantizer is a function which divides the entire $m$-dimensional Euclidean space into nonoverlapping generalized rectangles (called gates) of the same size and which then assigns to each rectangle its center. The size of each rectangle is referred to as the gate size.
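The quantizer described above can be sketched in a few lines of Python. This is only an illustration: the function name `quantize`, the choice of a single scalar gate size for every coordinate, and the alignment of the gates with the origin (each gate covering a half-open interval of the form [n·g, (n+1)·g) per coordinate) are assumptions made here, not details taken from (6).

```python
import math


def quantize(x, gate_size):
    """Map each component of x to the center of the gate containing it.

    Assumption: gates are aligned with the origin, so along each
    coordinate a gate is the half-open interval [n*g, (n+1)*g).
    """
    return tuple((math.floor(c / gate_size) + 0.5) * gate_size for c in x)


# Example: a two-dimensional state with gate size 0.25.
level = quantize((0.3, -0.1), 0.25)
```

With this convention every point of the space belongs to exactly one gate, so the quantized state always takes one of a countable set of levels.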

The finite state model is represented by a trellis diagram (Fig. 1) in which the quantization levels of $x_q(k)$ are denoted by nodes at the $(k+1)$th column of the trellis diagram, and the transitions between quantization levels are denoted by directed lines. The transition probability from a quantization level $x_{qi}(k-1)$ to a quantization level $x_{qj}(k)$ is denoted by $T(x_{qi}(k-1) \to x_{qj}(k))$ and defined by

$$T(x_{qi}(k-1) \to x_{qj}(k)) \triangleq \mathrm{Prob}\{x_q(k) = x_{qj}(k) \mid x_q(k-1) = x_{qi}(k-1)\} = \sum_{n} \mathrm{Prob}\{w_d(k-1) = w_{dn}(k-1)\},$$

where the summation is taken over all $n$ such that

$$x_{qj}(k) = Q(f(k-1, x(k-1) = x_{qi}(k-1), w(k-1) = w_{dn}(k-1))).$$
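As an illustration of the definition above, the following sketch computes the transition probabilities out of one quantization level for a scalar state. The names `quantize_scalar` and `transition_probabilities`, the origin-aligned gates, and the example model and noise values are all hypothetical; the discrete noise approximation of (6) is not reproduced here.

```python
import math
from collections import defaultdict


def quantize_scalar(x, gate_size):
    """Center of the origin-aligned gate [n*g, (n+1)*g) containing x."""
    return (math.floor(x / gate_size) + 0.5) * gate_size


def transition_probabilities(f, k, x_prev, noise_values, noise_probs, gate_size):
    """Return {x_next: T(x_prev -> x_next)} for one level x_prev at time k.

    Each possible noise value is pushed through the state model f and
    quantized; probabilities of noise values that land in the same gate
    are summed, as in the definition of T.
    """
    T = defaultdict(float)
    for w, p in zip(noise_values, noise_probs):
        x_next = quantize_scalar(f(k, x_prev, w), gate_size)
        T[x_next] += p
    return dict(T)


# Hypothetical scalar model x(k+1) = 0.5 x(k) + w(k) with three noise values.
T = transition_probabilities(lambda k, x, w: 0.5 * x + w,
                             0, 0.125, [0.0, 0.05, 1.0], [0.25, 0.25, 0.5], 0.25)
```

Here the first two noise values lead to the same gate, so their probabilities are merged into a single branch of the trellis.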

The following metrics are also assigned to each node, branch and path of the trellis diagram. The metric of a node (or quantization level) $x_{qi}(k)$ is denoted by $MN(x_{qi}(k))$, and defined as zero except for an initial node, whose metric is defined as the natural logarithm of its occurrence probability, that is

$$MN(x_{qi}(k)) \triangleq \begin{cases} \ln\{\mathrm{Prob}\{x_q(k) = x_{qi}(k)\}\} & \text{if } k = 0 \\ 0 & \text{otherwise,} \end{cases}$$

where ln denotes the natural logarithm. The metric of the branch connecting the node $x_{qi}(k-1)$ to the node $x_{qj}(k)$ is denoted by $MB(x_{qi}(k-1) \to x_{qj}(k))$ and defined by

$$MB(x_{qi}(k-1) \to x_{qj}(k)) \triangleq \ln\{T(x_{qi}(k-1) \to x_{qj}(k))\, p(z(k) \mid x(k) = x_{qj}(k))\},$$

where $p(z(k) \mid x(k) = x_{qj}(k))$ is the conditional density function of the observation at time $k$ given $x(k) = x_{qj}(k)$. If $z(k)$ is a missing observation, then it is first estimated by a function which interpolates available observations in the neighborhood of STOM (11). The metric of a path is defined as the sum of all metrics of the nodes and branches along the path.

FIG. 1. Trellis diagram of state (columns: Time 0, Time 1, ..., Time k-1, Time k, ..., Time L).

Vol. 327, No. 3, pp. 491-501, 1990
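The interpolation step for a missing observation can be sketched as follows. The Lagrange form of the interpolating polynomial is used here as one standard choice; the function name `interpolate_missing` and the numerical values are illustrative assumptions, and the sketch is restricted to scalar observations.

```python
def interpolate_missing(times, values, t_missing):
    """Evaluate at t_missing the polynomial through (times[i], values[i]).

    Two available observations give a first-order polynomial, three give
    a second-order polynomial, and so on (Lagrange form).
    """
    total = 0.0
    for i, (ti, zi) in enumerate(zip(times, values)):
        term = zi
        for j, tj in enumerate(times):
            if j != i:
                term *= (t_missing - tj) / (ti - tj)
        total += term
    return total


# Hypothetical observations at times 3 and 6; estimate the missing z(4), z(5).
z4 = interpolate_missing([3, 6], [3.0, 6.0], 4)
z5 = interpolate_missing([3, 6], [3.0, 6.0], 5)
```

The estimated values then take the place of the missing z(k) when the branch metrics at those times are evaluated.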

The trellis diagram from time zero to time L shows all possible paths, that is, all sequences of quantization levels which the state can take over time. The state estimation problem is then to find a path through this trellis diagram so that the quantization levels along this path become the estimates of the state from time zero to time L. It can be shown (6) that the optimum rule which minimizes the overall error probability is to choose the path with the greatest metric (if there is more than one path with the same metric, choose any one of these at random). The path with the greatest metric can be found by the Viterbi decoding algorithm (VDA) (6, 7, 10). However, the implementation of the VDA requires an exponentially increasing memory with time. To overcome this obstacle, in this paper a stack sequential decoding algorithm (12-14) is used to choose a path so that the quantization levels along this path become the estimates of the state from time zero to time L. Stack sequential decoding algorithms estimate (rather than find) the path with the greatest metric by searching only the paths which most likely contain the path with the greatest metric. Hence, a stack sequential decoding scheme is suboptimum, and its implementation requires a memory which increases with time, but not exponentially as in the Viterbi decoding algorithm. The estimation scheme using a stack sequential decoding algorithm is therefore more practical and faster than the estimation scheme using the Viterbi decoding algorithm.
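A generic stack search over the trellis might look like the sketch below. It is not the specific algorithm of (14): the interface (`initial_nodes`, `successors`), the bounded stack size `max_stack`, and the rule of discarding the lowest-metric entries when the stack overflows are assumptions chosen to keep the example short. Stack algorithms in the coding literature also typically add a bias term to the metric, which is omitted here.

```python
import heapq


def stack_decode(initial_nodes, successors, L, max_stack=1000):
    """Best-first (stack) search for a high-metric path from time 0 to L.

    initial_nodes: list of (x0, node_metric) pairs for time 0.
    successors(k, x): list of (x_next, branch_metric) pairs for the
        branches leaving level x at time k.
    Returns (path, metric) for the first full-length path that reaches
    the top of the stack, or (None, -inf) if the stack empties.
    """
    counter = 0  # unique tie-breaker so paths are never compared
    stack = []
    for x0, m0 in initial_nodes:
        # heapq is a min-heap, so metrics are stored negated.
        heapq.heappush(stack, (-m0, counter, (x0,)))
        counter += 1
    while stack:
        neg_metric, _, path = heapq.heappop(stack)
        k = len(path) - 1
        if k == L:
            return list(path), -neg_metric
        # Extend the top path by every branch leaving its last node.
        for x_next, mb in successors(k, path[-1]):
            heapq.heappush(stack, (neg_metric - mb, counter, path + (x_next,)))
            counter += 1
        # Keep memory bounded by dropping the lowest-metric paths.
        if len(stack) > max_stack:
            stack = heapq.nsmallest(max_stack, stack)
            heapq.heapify(stack)
    return None, float("-inf")
```

Only the path currently on top of the stack is extended at each step, so the number of stored paths grows with the number of extensions performed rather than with the number of paths in the trellis. Because low-metric paths may be discarded, the returned path is an estimate of the path with the greatest metric.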

IV. Performance

Performance of the proposed scheme is quantified by an ensemble upper bound

of the Gallager type. When the stack sequential decoding algorithm presented in

(14) is used for estimation, it can be shown (6) that the ensemble overall error

probability for choosing the correct path is bounded by

where

where $P_E$ is the ensemble overall error probability for choosing the correct path; $K$ is the number of possible paths through the trellis diagram; $\pi_0^{\min}$ and $\pi_0^{\max}$ are the minimum and maximum values of the occurrence probabilities of the initial quantization levels; $\pi_k^{\min}$ and $\pi_k^{\max}$ are the minimum and maximum of the transition probabilities from time $k-1$ to time $k$, respectively; $X'$ is the set of all quantization levels from time 1 to time $L$; and $N'$ is the number of elements in $X'$.

Consider the discrete models defined by

$$x(k+1) = f(k, x(k), w(k)) \qquad \text{the state model}$$

$$z(k) = g(k, x(k)) + v(k) \qquad \text{the observation model,}$$

where $x(0)$ and $v(k)$ are Gaussian with means $m_0$ and 0, and covariances $A_0$ and $A_v(k)$, respectively. Substituting $p(z(k) \mid x(k) = x)$, which is Gaussian, into

the expressions for Rk and S, above, we can obtain

U(.ul,s2)

A

exp- [s(k,s,)-y(k,.uZ)]‘A, ‘(k)[~(k,s,)--g(k..u,)l

6

where the superscript $s$ is the dimension of the observation $z(k)$, the superscript $T$

indicates the transpose, and det stands for the determinant. The bound in (4) is

used as the performance measure of the proposed scheme.
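For the additive Gaussian observation model above, the conditional density that enters the branch metric of Section III has a closed form. The sketch below evaluates the branch metric for a scalar observation; the function names, the scalar restriction, and the example numbers are assumptions made for illustration only.

```python
import math


def gaussian_log_density(z, mean, var):
    """Natural log of the scalar Gaussian density N(mean, var) at z."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (z - mean) ** 2 / var)


def branch_metric(T, z, g, k, x_level, var_v):
    """ln{T * p(z(k) | x(k) = x_level)} for z(k) = g(k, x(k)) + v(k).

    T is the transition probability of the branch, z the (available or
    interpolated) observation at time k, and var_v the variance of v(k).
    """
    return math.log(T) + gaussian_log_density(z, g(k, x_level), var_v)


# Hypothetical branch: T = 0.5, observation model z(k) = 7 x(k) + v(k).
mb = branch_metric(0.5, 7.0, lambda k, x: 7.0 * x, 0, 1.0, 3.0)
```

Working with logarithms turns the product along a path into the sum of node and branch metrics used by the decoding algorithm.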

V. Simulations

State estimation of many examples with Gaussian noise and missing observations was simulated on the IBM 3081 K mainframe computer. In the simulations, the initial state and disturbance noise were approximated by the discrete random variables presented in (6), and the stack sequential decoding algorithm presented in (14) was used. Simulation results of three examples are presented in Figs 2(a)-4(c), where the observations z(4) and z(5) were assumed to be missing. In Figs 2(a)-4(c): the simulated state and observation models are stated in the first and second rows at the left upper corner; ACTUAL, SSDS and KALMAN show the actual values, the values estimated by the proposed scheme, and the values estimated by the (extended) Kalman filter of the states; AAEOP and AAEK indicate the averaged absolute errors for the estimates obtained by the proposed scheme and the Kalman filter; E(Y) and VAR(Y) denote the mean value and variance of the random variable Y; BOUND and ER.COV. yield the bound in (4) and the error variances for the Kalman estimates, respectively; NUM. OF DISC. FOR Y denotes the number of possible values of the discrete random variable which approximates the random variable Y; and GATE SIZE shows the gate size used in (3).

Figures 2(a)-(c) present simulation results of a linear example, whereas Figs 3(a)-4(c) present simulation results of two nonlinear examples. The missing observations z(4) and z(5) were first estimated by using a first-order polynomial which interpolates the available observations z(3) and z(6) for the linear and nonlinear examples in Figs 2 and 3, but by using a second-order polynomial which interpolates the available observations z(3), z(6) and z(7) for the nonlinear example in Fig. 4. In Fig. 2, the Kalman estimates are better than the estimates obtained by the proposed scheme since the Kalman estimates are optimum for linear models with white Gaussian noise, whereas the proposed scheme is suboptimum. Both the proposed scheme and the extended Kalman filter are suboptimum for nonlinear models. The Kalman estimates are better than the estimates obtained by the proposed scheme for the models in Fig. 3, whereas the estimates obtained by the proposed scheme are better than the Kalman estimates for the models in Fig. 4. One must keep in mind: (i) A stack sequential decoding algorithm estimates the path with the greatest metric. Therefore, it may select a path which does not have the greatest metric. This may cause the state estimates to diverge from the actual state values, as in the state estimate divergence caused by the model linearization errors which are introduced by the extended Kalman filter. (ii) The bound of (4) may sometimes become a number greater than one (i.e. useless) for some dynamic models since it is derived by using some inequalities, and this bound is an ensemble bound for the performance of the proposed scheme. Hence, it does not exactly determine the performance of the proposed scheme for a given dynamic system (6).

FIG. 2(c). Absolute and time averaged absolute errors for estimates of states.

FIG. 3(a). Actual and estimated values of states.

FIG. 4(a). Actual and estimated values of states.

VI. Conclusions

A fast estimation scheme is presented for nonlinear discrete dynamic systems (or

models) with missing observations. These models can be nonlinear functions of the

state, disturbance noise and observation noise. The missing observations are first

estimated by interpolating functions. Hence, the accuracy of the proposed scheme

depends upon the estimation accuracy of the missing observations, which is determined

by the interpolating functions used. The proposed estimation scheme requires a

memory which increases less than exponentially with time, whereas the estimation

scheme using the Viterbi decoding algorithm requires an exponentially increasing

memory with time for the implementation.

References

(1) A. P. Sage and J. L. Melsa, “Estimation Theory with Applications to Communications and Control”, McGraw-Hill, New York, 1971.
(2) T. Kailath, “A view of three decades in linear filtering theory”, IEEE Trans. Inf. Theory, Vol. IT-20, No. 2, pp. 146-181, 1974.
(3) J. Makhoul, “Linear prediction: a tutorial review”, Proc. IEEE, Vol. 63, pp. 561-580, 1975.
(4) J. S. Meditch, “A survey of data smoothing for linear and nonlinear dynamic systems”, Automatica, Vol. 9, pp. 151-162, 1973.
(5) C. E. Hutchinson, “The Kalman Filter applied to aerospace and electronic systems”, IEEE Trans. Aerospace Electron. Systems, Vol. AES-20, No. 4, pp. 500-504, 1984.
(6) K. Demirbaş, “New smoothing algorithms for dynamic systems with or without interference”, The NATO AGARDograph Advances in the Techniques and Technology of Applications of Nonlinear Filters and Kalman Filters, No. 256, AGARD, pp. 19-1/66, Mar. 1982.
(7) K. Demirbaş and C. T. Leondes, “Optimum decoding based smoothing algorithm for dynamic systems”, Int. J. Systems Sci., Vol. 16, No. 8, pp. 951-966, Aug. 1985.
(8) K. Demirbaş and C. T. Leondes, “A stack sequential decoding based smoothing algorithm for dynamic systems”, Int. J. Systems Sci., Vol. 17, No. 2, pp. 269-280, Feb. 1986.
(9) K. Demirbaş, “Maneuvering target tracking with hypothesis testing”, IEEE Trans. Aerospace Electron. Systems, Vol. AES-23, No. 6, pp. 757-766, Nov. 1987.
(10) K. Demirbaş, “State estimation for nonlinear discrete dynamic systems with missing observations”, J. Franklin Inst., Vol. 327, No. 1, pp. 49-59, 1990.
(11) S. D. Conte and Carl de Boor, “Elementary Numerical Analysis”, McGraw-Hill, New York, 1972.
(12) G. D. Forney Jr, “Convolutional codes III. Sequential decoding”, Inf. Control, Vol. 25, pp. 267-297, 1974.
(13) F. Jelinek, “A fast sequential decoding algorithm utilizing a stack”, IBM J. Res. Dev., Vol. 13, pp. 675-685, 1969.
(14) A. J. Viterbi and J. K. Omura, “Principles of Digital Communication and Coding”, McGraw-Hill, New York, 1979.
