Small moving object detection in video sequences


Rabi Zaibi¹, A. Enis Cetin¹, Yasemin Yardimci²

¹Department of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey

²Middle East Technical University, Ankara, Turkey

E-mail: cetin@ee.bilkent.edu.tr

Abstract

In this paper, we propose a method for the detection of small moving objects in video. We first eliminate the camera motion using motion compensation. We then use an adaptive predictor to estimate the current pixel from neighboring pixels in the motion compensated image and, in this way, obtain a residual error image. Small moving objects appear as outliers in the residual image and are detected using a statistical Gaussianity test based on higher order statistics. It turns out that, in general, the distribution of the residual error image pixels is almost Gaussian. On the other hand, the distribution of the pixels in the residual image deviates from Gaussianity in the presence of outliers. Simulation examples are presented.

1 Introduction

Detection of small moving objects can be a complicated task when there is noise and the video camera is in motion. Classical object detection methods [1], [2] are geared toward large objects with clear features and boundaries, whereas in our problem the moving region or object may consist of only a few pixels.

In our method, we first eliminate the camera motion using motion compensation. After motion compensation, the resulting image basically contains the moving regions and objects. We process this image using a two-dimensional (2-D) adaptive filter. We adaptively predict each pixel using neighboring pixels and obtain a residual error image. It turns out that the distribution of the residual error image pixels is almost Gaussian in general. However, small moving objects produce outliers in the residual image, as the pixels of the small moving objects cannot be predicted using the neighboring pixels. We detect the outliers using a Higher Order Statistics (HOS) based Gaussianity test [4], [5]. In static regions the test statistic is very close to zero, whereas in regions containing a moving object the distribution of the pixels deviates from Gaussianity and the test statistic produces large values.

In Section 2, we present the 2-D adaptive filtering method which removes the static background. In Section 3, we review the HOS based statistical tests for moving object detection, and we present the results of simulation studies in Section 4.

2 2-D Adaptive Filtering

In Figure 1, the adaptive filtering scheme in two dimensions is illustrated. In the adaptive filtering process, an image pixel $x[m, n]$ at location $(m, n)$ is predicted as a weighted average of the pixels in its region of support. The region of support, $R$, of the adaptive filter is chosen as the pixels surrounding the pixel to be predicted, as shown in Figure 1. The predicted pixel value $\hat{x}[m, n]$ is given as

$$\hat{x}[m, n] = \sum_{k=-r_1}^{r_1} \sum_{\substack{l=-r_2 \\ (k, l) \neq (0, 0)}}^{r_2} w_{(m,n)}[k, l]\, x[m - k, n - l], \quad m = 0, \dots, N_1 - 1, \; n = 0, \dots, N_2 - 1 \tag{1}$$

where $x$ is the motion compensated video frame of size $N_1 \times N_2$, $w_{(m,n)}[k, l]$ are the weight values at $(m, n)$, and $(2r_1 + 1) \times (2r_2 + 1)$ is the size of the region of support, $R$, of the adaptive filter. In our case, $r_1 = r_2 = 1$.

Figure 1: 2-D adaptive filter structure.

0-7803-6293-4/00/$10.00 ©2000 IEEE

The prediction error of the adaptive filter at location $(m, n)$ is calculated as

$$e[m, n] = x[m, n] - \hat{x}[m, n] \tag{2}$$

The weight values $w_{(m,n)}[k, l]$ are adapted according to the 2-D LMS-type adaptation algorithm [8]:

$$w_{(m+1,n)}[k, l] = w_{(m,n)}[k, l] + \mu\, e[m, n]\, x[m - k, n - l] \tag{3}$$

where $(k, l) \in R$ and $\mu$ is the adaptation constant. These weights are adapted using Equation 3 while processing the image in the horizontal direction. In a similar way, in the vertical direction the weight $w_{(m,n+1)}[k, l]$ replaces $w_{(m+1,n)}[k, l]$ in Equation 3.
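The prediction and adaptation steps of Equations 1 to 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lms_residual`, the step size `mu`, the uniform initial weights, the edge-replicating border handling, and the single raster scan (instead of separate horizontal and vertical passes) are all choices made here for illustration.

```python
import numpy as np

def lms_residual(x, mu=1e-6, r1=1, r2=1):
    """Return the prediction error image e[m, n] of a 2-D LMS predictor."""
    x = np.asarray(x, dtype=float)
    n1, n2 = x.shape
    # Offsets (k, l) in the region of support R, excluding the centre (0, 0).
    offsets = [(k, l) for k in range(-r1, r1 + 1)
               for l in range(-r2, r2 + 1) if (k, l) != (0, 0)]
    # Assumption: start from a plain average of the neighbours.
    w = np.full(len(offsets), 1.0 / len(offsets))
    # Assumption: replicate edge pixels so border pixels have a full support.
    xp = np.pad(x, ((r1, r1), (r2, r2)), mode="edge")
    e = np.zeros_like(x)
    for m in range(n1):          # assumption: one raster scan over the frame
        for n in range(n2):
            # Neighbours x[m - k, n - l] for every (k, l) in R.
            nb = np.array([xp[m + r1 - k, n + r2 - l] for k, l in offsets])
            e[m, n] = x[m, n] - w @ nb      # Equations 1 and 2
            w = w + mu * e[m, n] * nb       # Equation 3
    return e
```

For 8-bit intensities the step size has to stay small (here `mu * sum(nb**2)` is well below 2) for the update to remain stable; a different intensity scale would need a different `mu`.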

In Figure 2 (a) and (b), a frame containing a moving object and its following frame are shown. The image obtained from motion compensation is filtered using the 2-D adaptive prediction filter, and the resulting residual error image is shown in Figure 2 (c). The moving object detection is then performed on the error image $e[m, n]$.

3 Higher Order Statistical Test

It has been experimentally observed that in regions with no moving objects, the error image $e[m, n]$ has a Gaussian-like distribution, whereas in regions containing small moving objects it contains outliers. This is due to the fact that the pixels of a small moving object cannot be accurately predicted using the surrounding pixels.

Figure 2: (a) A frame containing a moving object, (b) the following frame, (c) the error image after 2-D adaptive filtering (inverted).

Higher order statistical tests have been used successfully in the detection of microcalcifications in mammogram images [5] and in detecting objects in noisy images [6]. In this paper, we use a Gaussianity test developed in [5] and [4]. The higher order statistic $h(I_1, I_2, I_3, I_4)$ is based on the sample estimates of the first four moments $I_1, I_2, I_3, I_4$ of the prediction error [4]. Estimates of the moments are given by

$$I_k = \frac{1}{MN} \sum_{m} \sum_{n} e^k[m, n], \quad k = 1, 2, 3, 4 \tag{4}$$

where $e[m, n]$ represents the error value at location $(m, n)$ calculated in Equation 2, the sums run over the pixels of the region, and $M \times N$ is the size of the region in which $I_k$ is estimated. The statistic $h(I_1, I_2, I_3, I_4)$ is defined as follows:

$$h(I_1, I_2, I_3, I_4) = I_3 + I_4 - 3 I_1 (I_2 - I_1^2) - 3 I_2^2 - I_1^3 + 2 I_1^4 \tag{5}$$

It is ideally equal to zero when the distribution is Gaussian, and it takes large values when the underlying distribution deviates from Gaussianity. Outliers in the error image are mainly due to moving objects, so $h(I_1, I_2, I_3, I_4)$ takes large values in regions containing such outliers.
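To make the behaviour of the statistic concrete, the sketch below computes sample moments and $h$ for a block of error values. The function name `gaussianity_statistic` is hypothetical. The zero value for Gaussian data follows from the Gaussian moments $I_1 = \mu$, $I_2 = \mu^2 + \sigma^2$, $I_3 = \mu^3 + 3\mu\sigma^2$, $I_4 = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$, which cancel term by term in Equation 5.

```python
import numpy as np

def gaussianity_statistic(e):
    """h(I1, I2, I3, I4) of Equation 5 from the sample moments of e."""
    e = np.asarray(e, dtype=float).ravel()
    # Sample estimates of the first four moments (mean of e**k).
    i1, i2, i3, i4 = (np.mean(e ** k) for k in (1, 2, 3, 4))
    return (i3 + i4 - 3 * i1 * (i2 - i1 ** 2)
            - 3 * i2 ** 2 - i1 ** 3 + 2 * i1 ** 4)

rng = np.random.default_rng(0)          # illustrative synthetic data only
clean = rng.normal(size=(15, 15))       # Gaussian-like residual block
spotted = clean.copy()
spotted[7, 7:9] = 10.0                  # two outlier pixels, as a small object would leave
h_clean = gaussianity_statistic(clean)
h_spotted = gaussianity_statistic(spotted)
```

With only 15 x 15 = 225 samples the estimate for the clean block fluctuates around zero rather than being exactly zero, while the two outlier pixels dominate the third and fourth moments and push `h_spotted` far above it.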

In our approach, the image is analyzed block by block. The statistic is calculated within each block inside the image. These blocks may overlap, as shown in Figure 3. In our experimental work we used blocks of size M = 15 by N = 15, where overlapping occurs at 3 pixel steps.
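The block-by-block scan can be sketched with NumPy's `sliding_window_view`, which exposes every 15 x 15 window as a view so that the moments of all blocks are computed at once. The function name `block_statistics` and the synthetic test image are assumptions made for illustration; only the block size and the 3 pixel step come from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def block_statistics(e, block=15, step=3):
    """h statistic of every block x block window, moved in `step`-pixel steps."""
    e = np.asarray(e, dtype=float)
    # All windows as a view, then keep every `step`-th one in both directions.
    win = sliding_window_view(e, (block, block))[::step, ::step]
    # Sample moments I1..I4 of each window (M = N = block).
    i1, i2, i3, i4 = (np.mean(win ** k, axis=(-2, -1)) for k in (1, 2, 3, 4))
    return (i3 + i4 - 3 * i1 * (i2 - i1 ** 2)
            - 3 * i2 ** 2 - i1 ** 3 + 2 * i1 ** 4)

rng = np.random.default_rng(0)          # illustrative synthetic residual image
e = rng.normal(size=(60, 60))
e[30:32, 30:32] += 12.0                 # a 2 x 2 "small object"
h_map = block_statistics(e)
i, j = np.unravel_index(h_map.argmax(), h_map.shape)
```

Entry `(i, j)` of `h_map` corresponds to the block whose top-left pixel is `(3 * i, 3 * j)`, so the location of the largest value points back to the image region that contains the outliers.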

Figure 3: Illustration of overlapping windows.

Table 1: Minimum and maximum values of the test statistic h.

Regions                  Minimum   Maximum
With moving object       2.3       7.5
Without moving object    -0.41     0.5

The detection method can be considered as a hypothesis testing problem in which the null hypothesis $H_0$ corresponds to the no moving object case and $H_1$ corresponds to the presence of a moving object:

$$H_0: \; |h(I_1, I_2, I_3, I_4)| < T_h$$

$$H_1: \; |h(I_1, I_2, I_3, I_4)| \geq T_h$$

The threshold $T_h$ is experimentally determined. The blocks in which the test statistic exceeds the threshold, $T_h$, are marked as regions containing the small moving objects.

4 Experimental Results and Conclusion

In this section, we present simulation studies. We test the performance of the detection scheme by analyzing 27 video sequences containing small moving objects on various backgrounds. As described in Section 1, motion compensated images are obtained in the first step. A classical block matching based motion compensation algorithm with subpixel accuracy is used [3]. In the second step, motion compensated images are filtered using the adaptive predictor and the residual error images are obtained. The values of the test statistic h in 12 video sequences are given in Table 1. It is clear from this table that a threshold can be selected which distinguishes moving objects from the background.
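As a rough illustration of the block matching step, the sketch below finds the displacement of a single block by exhaustive search with a sum-of-absolute-differences cost. It is a simplification under stated assumptions: it works at integer-pixel accuracy only, whereas the text uses subpixel accuracy, and the function name `match_block`, the block size, and the search range are hypothetical.

```python
import numpy as np

def match_block(prev, curr, top, left, size=8, search=4):
    """Offset (dy, dx) such that prev[top+dy:, left+dx:] best matches the
    size x size block of curr whose top-left corner is (top, left)."""
    block = curr[top:top + size, left:left + size].astype(float)
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if (y < 0 or x < 0 or y + size > prev.shape[0]
                    or x + size > prev.shape[1]):
                continue  # candidate falls outside the previous frame
            cand = prev[y:y + size, x:x + size].astype(float)
            cost = np.abs(block - cand).sum()   # sum of absolute differences
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

rng = np.random.default_rng(0)          # illustrative synthetic frames
prev = rng.random((32, 32))
curr = np.roll(prev, shift=(2, -1), axis=(0, 1))   # simulated camera shift
offset = match_block(prev, curr, top=12, left=12)
```

Applying the estimated offsets block by block to the previous frame gives a motion compensated image in which, ideally, only independently moving regions remain.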

Table 2: Detection performance of each method (false alarms and misses).

In our detection scheme we use an adaptive threshold value which is determined from the first image of the video sequence according to the following formula:

$$T_h = \frac{A}{L} \sum_{m=1}^{L} h_m(I_1, I_2, I_3, I_4) \tag{6}$$

where $h_m(I_1, I_2, I_3, I_4)$ represents the value of the statistic h inside the m-th block, and $L$ represents the total number of blocks inside the image. In our case $A$ is chosen as 12 so that $T_h$ is well above the maximum h value of the blocks containing no moving target (the maximum h value is 0.5 in the training set of 12 videos). Detection results are summarized in Table 2. In all of the 15 test videos the moving objects are determined successfully.
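The threshold rule and the block decision can be sketched as follows, taking the per-block h values as given. The function names `adaptive_threshold` and `detect_blocks` and the sample numbers are hypothetical; the factor 12 is the value of $A$ quoted above.

```python
import numpy as np

def adaptive_threshold(h_first_frame, a=12.0):
    """T_h = (A / L) * sum of h_m over the L blocks of the first frame."""
    h = np.asarray(h_first_frame, dtype=float).ravel()
    return a * h.sum() / h.size

def detect_blocks(h_blocks, threshold):
    """Boolean mask of the blocks for which |h| >= T_h (hypothesis H1)."""
    return np.abs(np.asarray(h_blocks, dtype=float)) >= threshold

# Illustrative numbers only, not taken from the experiments.
t_h = adaptive_threshold([0.1, 0.2, 0.3, 0.2])     # mean 0.2, so T_h = 2.4
mask = detect_blocks([0.4, -0.3, 5.1], t_h)
```

Note that this sketch takes the formula at face value: if the average h of the first frame were zero or negative, the resulting threshold would not be usable, so a practical version would need a guard for that case.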

We also compared the performance of the adaptive predictor to that of the wavelet transform and of adaptive subband decomposition [7]. Motion compensated images are analyzed using (i) the adaptive predictor described in Section 2, (ii) the wavelet transform (subband decomposition), and (iii) adaptive subband decomposition [7].

Typical results of the above methods are shown in Figure 4. The detection performance of these methods is summarized in Table 2. In the test videos, the adaptive predictor produces the best results. Adaptive subband decomposition also detects all of the moving objects but, in two cases, it produces false alarms. With the ordinary wavelet transform, 4 targets are missed. By reducing the threshold all of the targets can be detected, but in this case the number of false alarms increases drastically.

Figure 4: (a) Prediction error image $e[m, n]$ and the detected region (right), (b) sum of the subimages $x_{lh}[m, n]$ and $x_{hl}[m, n]$ and the detected region (right), (c) sum of the subimages $x_{lh}[m, n]$ and $x_{hl}[m, n]$ obtained after adaptive subband decomposition and the detected region (right).

References

[1] A. A. Alatan, L. Onural, M. Wollborn, R. Mech, E. Tuncel, and T. Sikora, "Image sequence analysis for emerging interactive multimedia services - the European COST 211 framework," IEEE Trans. on CAS for Video Tech., Nov. 1998.

[2] A. M. Tekalp, Digital Video Processing. Prentice-Hall, 1995.

[3] R. Rajagopalan, E. Feig, and M. T. Orchard, "Motion optimization of ordered blocks for overlapped block motion compensation," IEEE Trans. on CAS for Video Tech., Apr. 1998.

[4] M. N. Gurcan, Y. Yardimci, and A. E. Cetin, "Influence function based Gaussianity tests for detection of microcalcifications in mammogram images," ICIP'99, 1999.

[5] R. Ojeda, J. Cardoso, and E. Moulines, "Asymptotically Invariant Gaussianity Test for Causal Invertible Time Series," Proceedings of IEEE ICASSP'97, vol. 5, pp. 3713-3716, April 21-24, 1997.

[6] G. B. Giannakis and M. K. Tsatsanis, "Time domain tests for Gaussianity and time-reversibility," IEEE Trans. on Signal Processing, vol. 42, pp. 3460-3472, 1994.

[7] O. N. Gerek and A. E. Cetin, "Linear/nonlinear adaptive polyphase subband decomposition structure for image compression," Proc. of IEEE ICASSP'98, Seattle, U.S.A., May 1998.

[8] O. Arikan, A. E. Cetin, and E. Erzin, "Adaptive Filtering for non-Gaussian stable processes," IEEE Signal Processing Letters, vol. 1, no. 11, pp. 163-165, Nov. 1994.