Detection and Tracking System of Moving Objects Based on MATLAB

DOI : 10.17577/IJERTV3IS100721


Habib Mohammed Hussien

Electrical and Electronics Technology Department Federal TVET Institute

Addis Ababa, Ethiopia

Abstract: Moving object detection and tracking are receiving growing attention with the emergence of surveillance systems. Video surveillance has been used to monitor security-sensitive areas such as banks, department stores, highways, crowded public places, and borders. In this thesis, a video surveillance system with moving object detection and tracking capabilities is presented. The thesis addresses the problems of defining and developing the basic building blocks of a video surveillance system, which requires fast, reliable, and robust algorithms for moving object detection and tracking. The system can process both color and gray images from a stationary camera. It can handle object detection in indoor or outdoor environments and under changing illumination conditions. This paper presents a MATLAB-based detection and tracking system for moving objects and describes how it segments moving objects from the scene. The proposed system is capable of adapting to dynamic scenes, removing shadows, and distinguishing left/removed objects both indoors and outdoors. The proposed technique combines simple frame differencing (FD), simple adaptive background subtraction (BS), and accurate Gaussian modeling to benefit from the high detection accuracy of the Mixture of Gaussians (MoG) solution in outdoor scenes while reducing the computations, making it faster and more suitable for real-time surveillance applications. This study uses the IFD (inter-frame differencing) algorithm and the bounding box method to track the objects.

Keywords: Video Surveillance, Moving Object Detection, Background Subtraction, Moving Object Tracking, Frame differencing, Mixture of Gaussian model

  1. INTRODUCTION

    As surveillance systems become more popular, robust detection and tracking techniques are needed to identify moving objects. Moving object detection and tracking have wide application prospects in science and engineering, including military use, intelligent monitoring, human-machine interfaces, virtual reality, motion analysis, and many other fields. The topic has important research value and is attracting more and more researchers at home and abroad. Current detection and tracking systems mainly use two techniques: the first uses radar technology for tracking, while the second uses image processing to achieve target tracking. Our paper is based on image processing. Tracking moving targets in an image sequence means detecting, in each frame, the independently moving targets or the moving regions of interest to the user (such as human bodies or vehicles) and extracting their position information to obtain the trajectory of each target. A video surveillance system requires fast, reliable, and robust algorithms for moving object detection and tracking. Identifying moving objects from a video sequence is a fundamental and critical task in many computer-vision applications. A common approach is to perform background subtraction, which identifies moving objects from the portion of a video frame that differs significantly from a background model. There are many challenges in developing a good object detection algorithm. First, it must be robust against changes in illumination. Second, it should avoid detecting non-stationary background objects such as swinging leaves, rain, snow, and shadows cast by moving objects. Finally, its internal background model should react quickly to changes in the background such as the starting and stopping of vehicles. After identifying moving objects in a given scene, the next step in video analysis is tracking. Tracking can be defined as the creation of temporal correspondence among detected objects from frame to frame. This procedure provides temporal identification of the segmented regions and generates cohesive information about the objects in the monitored area, such as trajectory, speed, or direction. In everyday life, humans visually detect and track a multitude of objects with certain objectives in mind. Examples are orientation in the environment, recognizing persons in the surroundings, and locating, recognizing, monitoring, and handling objects. Although much has been learned from biological and human vision for image processing and analysis, the markedly different tasks and capabilities of the function modules have to be kept in mind.
    It is not possible to simply transfer or copy human biological vision to a machine vision system. However, the basic functionality of the human vision system can guide us in deciding how these principles can be transferred to technical vision tasks such as motion analysis. The analysis of motion gives access to the dynamics of processes. Motion analysis is generally a very complex problem: the true motion of objects can only be inferred from their observed motion after the 3-D reconstruction of the scene. Video sensors with pan-tilt-zoom capabilities, or mounted on moving platforms, are increasingly used in surveillance and in autonomous vehicles. The outputs of these algorithms can be used both to provide the human operator with high-level data that helps him make decisions more accurately and in a shorter time, and to index and search stored video data effectively offline. Advances in the development of these algorithms would lead to breakthroughs in applications that use visual surveillance. The papers that inspired the algorithm of the proposed system are [1-5, 10, 15, 22, 23, 24, 25, 26, 27, 28].

    This paper presents a detection and tracking system of moving objects for monitoring outdoor and indoor scenes with gradual illumination changes and swinging tree branches. The technique combines frame differencing (FD), background subtraction (BS), and the Mixture of Gaussians model (MoG) to obtain robust detection results by focusing attention on the most probable foreground pixels. We use IFD (the inter-frame differencing tracking algorithm) to track the detected objects.

  2. RELATED WORKS

    There are several background modeling schemes ranging from simple techniques keeping single background state estimates and providing acceptable accuracy for simple applications, all the way to complicated techniques with full density estimates; thus providing better accuracy at the expense of increased memory and complexity. Background subtraction is particularly a commonly used technique for motion segmentation in static scenes [6].

    FD is the simplest technique, with the background being the previous frame. This method has low memory requirements, O(1), high speed, and high adaptability, but suffers from the aperture problem. IVSS [3] uses FD to generate hypotheses about the objects, then verifies them by extracting features with a Gabor filter, and classifies objects using a support vector machine.

    Collins et al. developed a hybrid method that combines three-frame differencing with an adaptive background subtraction model for their VSAM project [3]. The hybrid algorithm successfully segments moving regions in video without the defects of temporal differencing and background subtraction.

    The W4 [4] system uses a statistical background model in which each pixel is represented by its minimum (M) and maximum (N) intensity values and the maximum intensity difference (D) between any consecutive frames observed during an initial training period in which the scene contains no moving objects.

    Stauffer and Grimson [5] presented a novel adaptive online background mixture model that can robustly deal with lighting changes, repetitive motions, clutter, the introduction or removal of objects from the scene, and slowly moving objects.

  3. PROBLEM STATEMENT

    The problems addressed by this system fall into two categories: motion detection and motion tracking. Motion detection involves verifying the presence of a moving object in an image sequence based on the object's temporal change, and possibly locating it precisely for tracking or recognition purposes. Motion tracking is an iterative process of determining the trajectory of a moving object during a video sequence by monitoring the object's spatial and temporal changes, including its presence, position, size, and shape. This is done by solving the temporal correspondence problem, i.e. the problem of matching the target region in successive frames of a sequence of images taken at closely spaced time intervals. The two processes are closely related: motion tracking usually starts with detecting moving objects, while detecting an object repeatedly in subsequent images is often necessary to support and verify tracking. The task is therefore as follows: given a live sequence of frames from a fixed camera, detect all the foreground objects and estimate the trajectory of the object of interest moving in the scene. There are many challenges in developing a good object detection algorithm. First, it must be robust against changes in illumination. Second, it should avoid detecting non-stationary background objects such as swinging leaves, rain, snow, and shadows cast by moving objects. Finally, its internal background model should react quickly to changes in the background such as the starting and stopping of moving objects. In video surveillance there are also tracking problems in addition to detection problems. One is that dynamic objects move in patterns that are highly non-linear. Another is simultaneously tracking multiple moving objects, which raises correspondence, partial overlapping, and occlusion issues. The proposed system must handle these problems correctly.

  4. METHODOLOGY

Each application that benefits from video processing has different needs and thus requires different treatment. However, they have something in common: moving objects. Thus, detecting regions that correspond to moving objects such as people and vehicles in video is the first basic step of almost every vision and video processing system, as shown in Fig. 1 below.

[Fig. 1 block diagram. Block labels: Object detection; Post processing; Connected component analysis; Object tracking; Display; Alert management.]

Fig. 1 A Generic Framework for Video Processing Algorithm

An overview of the entire moving object detection and tracking system is shown in Fig. 2. The technique combines frame differencing, adaptive background subtraction, and MoG to reduce the computations of MoG and improve its accuracy by focusing attention on the most probable foreground pixels. The proposed system is able to detect and track moving objects in a given scene. It uses a stationary camera to acquire video frames from the outside world: processing starts by feeding in video frames from a static camera that monitors a given site. The system works both indoors and outdoors. The first step of our approach is identifying foreground objects against the stationary background, for which we use the hybrid adaptive scheme algorithm. The next step is post processing, which removes noise that cannot be handled by the detection stage. After post processing, connected component analysis groups the connected regions in the foreground map and extracts individual object features such as bounding box, area, and center of mass. The final step of the system is tracking. The tracking algorithm is inter-frame differencing (IFD), which uses object features to track objects from frame to frame.

[Fig. 2 block diagram. Block labels: Input video sequences; current frame and previous frame; Initialization/update; Background model; Frame differencing; Background subtraction; Segment into static/non-static background and threshold update; Region of motion; Match/update/sort distributions; Foreground detection using hysteresis thresholding; Foreground pixels; Post processing; Connected component analysis; Object tracking; Display; Alert management.]

Fig. 2 Block Diagram for Detection and Tracking System

4.1. MOVING OBJECT DETECTION

Distinguishing foreground objects from the stationary background is both a significant and difficult research problem. The first step of almost all visual surveillance systems is detecting foreground objects. This both creates a focus of attention for higher processing levels such as tracking and reduces computation time considerably, since only pixels belonging to foreground objects need to be dealt with. Short- and long-term dynamic scene changes such as repetitive motions (e.g. waving tree leaves), light reflections, shadows, camera noise, and sudden illumination variations make reliable and fast object detection difficult. The proposed system is designed to handle these problems.

4.1.1. FOREGROUND DETECTION

Foreground detection plays a very important role in a video content analysis system. It is a foundation for various post-processing modules such as object tracking, recognition, and counting. Many approaches have been proposed on this topic, differing in the background model and in the procedure used to maintain the model. In the past, the computational power of the processor limited the complexity of foreground detection implementations.

4.1.2. THRESHOLDING

Thresholding is an operation that converts a grayscale image to a binary image. A binary image consists of two colours, black (0) or white (1). A suitable threshold value must be selected in order to separate the object from the background. The thresholding rule is given by Equation (1), and Fig. 3 illustrates the operation on a sub-image.

$$ f(x, y) = \begin{cases} 1\ (\text{'255'}) & \text{if } f(x, y) > T_h \\ 0\ (\text{'0'}) & \text{if } f(x, y) \le T_h \end{cases} \tag{1} $$

where T_h is the threshold value.

Input sub-image            Thresholded output (T = 30)
  1     8    15    80        0   0   0   1
200     3    20   168        1   0   0   1
  4   250    10    40        0   1   0   1
 60    70   100     4        1   1   1   0

Fig. 3 Illustration of Thresholding Operation

In other words, the thresholded image g(x, y) is defined as

$$ g(x, y) = \begin{cases} 1 & \text{if } f_r(x, y) > \text{thresh} \\ 0 & \text{if } f_r(x, y) \le \text{thresh} \end{cases} \tag{2} $$

4.1.3. ADAPTIVE BACKGROUND SUBTRACTION MODEL

Our implementation of the background subtraction algorithm is partially inspired by the study presented in [3] and works on grayscale video imagery from a static camera. Our background subtraction method initializes a reference background with the first few frames of video input. Then it subtracts the intensity value of each pixel in the current image from the corresponding value in the reference background image. Let B_g(x, y) be the corresponding background intensity value for pixel position (x, y), estimated over time from video images I_0(x, y) through I_{t-1}(x, y). As the generic background subtraction scheme suggests, a pixel at position (x, y) in the current video image belongs to the foreground if it satisfies

$$ |I_t(x, y) - B_g(x, y)| > T_h \tag{3} $$

This algorithm (Equation (3)) provides the most complete feature data. Moving objects in the scene are detected from the difference between the current frame and the current background model.

The adaptive background subtraction detection algorithm is described below.

  • fi: A pixel in the current frame, where i is the frame index.

  • m: A pixel of the background model (fi and m are at the same location).

  • di: Absolute difference between fi and m.

  • bi: Foreground mask (0: background, 0xff: foreground).

  • T: Threshold.

  • a: Learning rate of the background.

    1. di = |fi – m|

    2. If di > T, fi belongs to the foreground; otherwise, it belongs to the background.
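The two steps above can be sketched in plain Python as follows. This is a minimal illustration, not the MATLAB implementation behind the paper. The function name `subtract_background`, the list-of-rows image layout, and the toy frames are assumptions made for the example. The list defines a learning rate `a` but gives no update step, so the running-average update applied to background pixels here is also an assumption, modelled on the background update equation that appears later in the paper.

```python
def subtract_background(frame, background, threshold, rate=0.9):
    """Classify each pixel of one grayscale frame and refresh the background.

    frame, background: equally sized lists of rows of intensity values.
    Returns (mask, new_background); the mask holds 0 for background and
    0xFF for foreground, matching the definition of bi in the list above.
    """
    mask, new_background = [], []
    for f_row, m_row in zip(frame, background):
        mask_row, bg_row = [], []
        for f, m in zip(f_row, m_row):
            d = abs(f - m)                  # step 1: di = |fi - m|
            if d > threshold:               # step 2: foreground test
                mask_row.append(0xFF)
                bg_row.append(m)            # assumption: keep the model unchanged under foreground
            else:
                mask_row.append(0)
                # assumption: running-average update weighted by the learning rate
                bg_row.append(rate * m + (1 - rate) * f)
        mask.append(mask_row)
        new_background.append(bg_row)
    return mask, new_background


# Toy 2x3 scene: a bright object covers the middle column.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 11], [9, 180, 10]]
mask, background = subtract_background(frame, background, threshold=30)
```

In this toy call only the middle column exceeds the threshold, so it is the only part marked as foreground, while the remaining background values drift slightly toward the new frame.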

    Although objects that start moving after being stationary are usually detected, they leave behind holes where the newly exposed background imagery differs from the known background model. The background model eventually adapts to these holes, but they generate false alarms for a short period of time. In this algorithm, a pixel in the current frame is considered foreground, or part of the region of motion, if the absolute difference between its current value and its background value is larger than thresh:

    $$ |I_t(x, y) - B_g(x, y)| > \text{thresh} $$

TEMPORAL DIFFERENCING

Temporal differencing makes use of the pixel-wise difference between two or three consecutive frames in video imagery to extract moving regions. It is a highly adaptive approach to dynamic scene changes; however, it fails to extract all relevant pixels of a foreground object, especially when the object has uniform texture or moves slowly. When a foreground object stops moving, the temporal differencing method fails to detect a change between consecutive frames and loses the object. Special supportive algorithms are required to detect stopped objects. We implemented a two-frame temporal differencing method in our system. Let I_t(x, y) represent the gray-level intensity value, in the range [0, 255], at pixel position (x, y) and time instance t of the video image sequence I. The two-frame temporal differencing scheme suggests that a pixel is moving if it satisfies Equation (5):

$$ |I_t(x, y) - I_{t-1}(x, y)| > T_h \tag{5} $$

The temporal differencing based foreground detection algorithm is described below.

  • fI: A pixel in the current frame, where I is the frame index.

  • fI-1: A pixel in the previous frame (fI and fI-1 are at the same location).

  • dI: Absolute difference between fI and fI-1.

  • bI: Foreground mask (0: background, 0xff: foreground).

  • T: Threshold value.

    1. dI = |fI – fI-1|

    2. If dI > T, fI belongs to the foreground; otherwise, it belongs to the background.

PROPOSED SYSTEM (HYBRID ADAPTIVE SCHEME ALGORITHM)

We have developed a hybrid adaptive scheme algorithm for the detection of moving objects, shown in Fig. 4, by combining the temporal differencing technique with the background subtraction technique. As discussed above, the frame differencing algorithm cannot extract the entire shape or the interior information of an object. Background subtraction, on the other hand, is capable of extracting the entire shape and all interior pixels of the moving object. The drawbacks of background subtraction, namely generating false alarms and being unable to detect a stopped target object that starts to move, can be fully overcome by frame differencing. By combining the advantages of these algorithms we obtain an algorithm that is robust to illumination change, computationally efficient, accurate, and highly adaptive to dynamic changes.

Frame differencing algorithm

  • fi-1: A pixel in the previous frame, where i is the frame index.

  • fi: A pixel in the current frame.

  • Background initialization: Bg = fr0.

  • di: Pixel-wise absolute difference between fi and fi-1.

  • di+1: Pixel-wise absolute difference between fi and the background Bg.

  • bi: Foreground mask (0: background, 0xff: foreground).

  • T: Threshold.

    1. di = |fi – fi-1| and di+1 = |fi – Bg|

    2. If di > T and di+1 > T, fi belongs to the foreground; otherwise, it belongs to the background.

[Fig. 4 block diagram. Block labels: Input video sequence; current frame and previous frame; Initialization/update; Background model; Frame differencing; Background subtraction; Segment into static/non-static background and threshold update; Region of motion; Match/update/sort distributions; Foreground detection using hysteresis thresholding; Post processing; Foreground pixels.]

Fig. 4 Proposed technique (hybrid adaptive scheme algorithm) block diagram

COMPUTING REGION OF MOTION

The first step is to compute the region of motion. FD (frame differencing) is applied to get the boundaries of the portion of the scene that is moving. A pixel I_t(x, y) is classified as foreground if the difference between I_t(x, y) and its predecessor at time t-1 is larger than a threshold I_Th, as shown in Equation (6):

$$ |I_t(x, y) - I_{t-1}(x, y)| > I_{Th} \tag{6} $$
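The two-frame differencing test of Equations (5) and (6) can be sketched in a few lines. This is a minimal illustration in plain Python rather than the paper's MATLAB code; the function name `frame_difference_mask` and the one-row toy frames are assumptions made for the example. The toy data shows the weakness noted in the text: for an object of uniform intensity, only the leading and trailing edges change between frames, so interior pixels are missed.

```python
def frame_difference_mask(current, previous, threshold):
    """Mark a pixel 0xFF when |I_t - I_(t-1)| exceeds the threshold, else 0."""
    return [
        [0xFF if abs(c - p) > threshold else 0 for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(current, previous)
    ]


# A uniform object three pixels wide moves one pixel to the right.
previous = [[0, 100, 100, 100, 0, 0]]
current = [[0, 0, 100, 100, 100, 0]]
motion = frame_difference_mask(current, previous, threshold=30)
```

Only the pixel the object left and the pixel it newly covers are flagged, which is why the proposed scheme adds background subtraction to recover the interior of the object.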

Rather than performing the match, update, and sort comparisons for all the pixels in the frame, we update only the parameters of the portion of the scene where an object is suspected and sort the corresponding distributions. Since objects are usually very small compared to the whole frame, this leads to huge reductions in computations. In other words, the proposed technique is beneficial in two ways: it reduces the computations of MoG and improves its accuracy by focusing attention on the most probable foreground pixels.

Even though simple FD is less immune to noise than three-frame differencing (3FD), any noise introduced will be corrected in the later detection stage. FD is very adaptive to dynamic changes and requires keeping only one previous frame. However, FD cannot detect all interior object information and fails to detect anything if the object stops moving at a certain frame. Thus an additional background subtraction technique is required to form the region of motion. A simple background is kept and selectively updated at each frame. A pixel in the current frame is considered part of the region of motion if the absolute difference between its current value and its background value is larger than I_Th:

$$ |I_t(x, y) - B_t(x, y)| > I_{Th} \tag{7} $$

Initially, the background is the first frame, assuming there are no objects. I_Th may be initialized as shown in Equation (8):

$$ I_{Th}(x, y) = m(x, y) + c \cdot s(x, y) \tag{8} $$

where m(x, y) and s(x, y) are the mean and standard deviation of a local area, which is supposed to be small enough to preserve local details but large enough to suppress noise, and c defines how much of the total object boundary is taken as part of the given object. The whole image could also be taken as one area, since the threshold will be corrected for each pixel in the training phase anyway. At each new frame, B(x, y) and I_Th(x, y) are then updated for nonmoving pixels:

$$ B_{t+1}(x, y) = \alpha_1 B_t(x, y) + (1 - \alpha_1)\, I_t(x, y) \tag{9} $$

$$ I_{Th,\,t+1}(x, y) = \alpha_1 I_{Th,\,t}(x, y) + (1 - \alpha_1)\,\bigl(5 \cdot |I_t(x, y) - B_t(x, y)|\bigr) \tag{10} $$

where α_1 is a time constant specifying the rate of adaptation. The final motion region contains interesting moving objects, uninteresting swinging of trees, or both. Eventually, only interesting moving objects should be left. So the next step is to cluster/fill the objects in the motion area and then perform selective MoG to get the interesting objects out of the whole motion area.

SELECTIVE MATCH AND UPDATE

Each pixel is modeled as a mixture of K Gaussians [9]:

$$ f(I_t = u) = \sum_{i=1}^{K} \omega_{i,t}\, \eta(u;\, \mu_{i,t}, \sigma_{i,t}) \tag{11} $$

where η(u; μ_{i,t}, σ_{i,t}) is the ith Gaussian distribution, also called a component, ω_{i,t} is the weight or probability of each Gaussian, and K is the number of distributions, which ranges from 3 to 5. Usually, 3 Gaussian distributions are kept per pixel; at least one distribution is needed to represent foreground objects and two distributions to represent multimodal backgrounds. Increasing the number of distributions improves the performance to a certain extent at the expense of increased memory requirements and computations. Even though K may go up to 7, not much improvement is obtained above K = 5.

For each new frame, the pixels inside the motion area are checked and the corresponding parameters are updated. Given a pixel I_t in the motion area, we check the corresponding distributions for the one that I_t best fits or is most likely to belong to. The best match is defined as the distribution whose mean is not just the closest to I_t but also close enough to be considered alike. Let d_{k,t} be the distance(I_t, μ_{k,t}) for k = 1:K,

$$ \text{match} = \bigl\{ (\mu_{k,t}, \sigma_{k,t}) : d_{k,t} < \lambda\, \sigma_{k,t} \ \text{and}\ d_{k,t} = \min[d_{1,t}, \ldots, d_{K,t}] \bigr\} \tag{12} $$

For grayscale images, λ is usually 2.5, which accounts for almost 98.76% of the values from that distribution [14]. If there is a match, the corresponding parameters are updated as

$$ \mu_{i,t} = (1 - \rho)\, \mu_{i,t-1} + \rho\, I_t \tag{13} $$

$$ \sigma_{i,t}^2 = (1 - \rho)\, \sigma_{i,t-1}^2 + \rho\, (I_t - \mu_{i,t})^2 \tag{14} $$

where ρ = α / ω_{i,t}. This approximation of ρ is faster and more logical than the one used in [9] for winner-takes-all scenarios [15]. Note that distance computations may be modified to avoid square root operations. All weights are then updated as:

$$ \omega_{i,t} = (1 - \alpha)\, \omega_{i,t-1} + \alpha\, M_{i,t} \tag{15} $$

$$ M_{i,t} = \begin{cases} 1 & \text{if matched} \\ 0 & \text{if not matched} \end{cases} \tag{16} $$

Otherwise, the component with the least weight is replaced by a new component with mean I_t, a large variance, and a small weight; the means and variances of the other components are maintained, but their weights are lowered according to (15) with M_{i,t} = 0.

[Displaced text from later in the paper, interleaved with this section in the source: parameter values "… = 0.9, v = 0.09, … = 0.005, T = 0.7 (non busy scene), T_low = 2, T_high = 3.1" (a further subscript, "init", cannot be placed); "the proposed technique outperforms the other techniques and is able to handle scenes with waving trees and light changes. The holes in …"; and the definition Precision = (foreground pixels correctly identified by the algorithm) / (total foreground pixels detected by the algorithm).]

FOREGROUND DETECTION USING HYSTERESIS THRESHOLDING

All the components are then sorted by their values of ω_i/σ_i, with the higher ranked components being classified as background. This is because high values correspond to bigger weights and smaller variances, which means more prominent components. The first B distributions that verify (17) are chosen as background distributions:

$$ B = \arg\min_b \Bigl( \sum_{k=1}^{b} \omega_k > T \Bigr) \tag{17} $$

The threshold T is a value between 0 and 1. If T is very small, most of the distributions will be classified as foreground and consequently at most one distribution will correspond to the background, which means that we will not be able to handle multimodal backgrounds. If T is very large, most of the distributions will be classified as background; thus objects will quickly become part of the background. A trade-off is to choose T around 0.6. This value may vary depending on the type of scene, whether it is a busy scene with lots of moving objects or one with just a few objects.

The next step is to compare the current pixel to these background distributions and classify it as background if it matches any of them, otherwise as a foreground pixel. Instead of using simple matching, Power and Schoonees suggested using hysteresis thresholding when looking for a match [15]. The idea is to have two thresholds, T_low and T_high. If the difference between the current pixel value and the distribution mean is less than T_low, the pixel is strongly classified as background. If the difference is larger than T_high, the pixel is classified as foreground. Otherwise, the pixel is a foreground candidate, or weak candidate.
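The match, update, and background-selection rules of Equations (12) to (17) can be sketched for a single pixel as follows. This is an illustrative reading in plain Python, not the paper's implementation. The names `update_pixel_model` and `background_components`, the dictionary layout, and the values chosen for the new component's "large variance" and "small weight" are assumptions. Two safeguards that the text does not mention are also assumptions: ρ is capped at 1, and the weights are renormalized after each update.

```python
import math


def update_pixel_model(components, value, alpha=0.005, lam=2.5,
                       new_var=900.0, new_weight=0.05):
    """Update one pixel's mixture in place; return the matched component or None.

    components: list of dicts with keys "w" (weight), "mu" (mean), "var" (variance).
    """
    # Equation (12): the closest mean, accepted only within lam standard deviations.
    best = min(components, key=lambda c: abs(value - c["mu"]))
    matched = best if abs(value - best["mu"]) < lam * math.sqrt(best["var"]) else None

    replaced = None
    if matched is None:
        # No match: the least-weight component is replaced by a new one centred
        # on the pixel value (variance and weight values are assumptions).
        replaced = min(components, key=lambda c: c["w"])
        replaced.update(w=new_weight, mu=float(value), var=new_var)

    # Equations (15)-(16): M is 1 for the matched component and 0 otherwise.
    for c in components:
        if c is replaced:
            continue
        c["w"] = (1 - alpha) * c["w"] + alpha * (1.0 if c is matched else 0.0)

    if matched is not None:
        rho = min(1.0, alpha / matched["w"])   # rho = alpha / w; the cap is an assumption
        matched["mu"] = (1 - rho) * matched["mu"] + rho * value            # Equation (13)
        matched["var"] = ((1 - rho) * matched["var"]
                          + rho * (value - matched["mu"]) ** 2)            # Equation (14)

    total = sum(c["w"] for c in components)    # renormalization is an assumption
    for c in components:
        c["w"] /= total
    return matched


def background_components(components, T=0.6):
    """Equation (17): the first components, ranked by w/sigma, whose weights sum past T."""
    ranked = sorted(components, key=lambda c: c["w"] / math.sqrt(c["var"]), reverse=True)
    chosen, cumulative = [], 0.0
    for c in ranked:
        chosen.append(c)
        cumulative += c["w"]
        if cumulative > T:
            break
    return chosen


model = [
    {"w": 0.7, "mu": 50.0, "var": 25.0},
    {"w": 0.2, "mu": 120.0, "var": 36.0},
    {"w": 0.1, "mu": 200.0, "var": 400.0},
]
hit = update_pixel_model(model, 52)     # close to the first component
background = background_components(model)
miss = update_pixel_model(model, 255)   # far from every component
```

In this toy run the value 52 matches the dominant component, which alone exceeds T = 0.6 and is therefore the only background distribution, while the value 255 matches nothing and overwrites the weakest component.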

The threshold T is a value between 0 and 1.if T is very small, the most of the distributions will be classified as foreground and consequently one distribution at most will correspond to the background, this means that we will not be

able to handle multimodal backgrounds. If T is very large, then most of the distribution will be classified as background; thus objects will quickly become part of the background. A trade off would be to choose T around 0.6.this value may vary depends on the type of the scene, whether it is a busy scene with lots of moving objects or just few objects.

The next step is to compare the current pixel to this background distribution and classify it as background if it matches any of the background distributions, otherwise as a foreground pixel. Instead of using simple matching, powers and schooners suggested using hysteresis thresholding when looking for a match [15].The idea is to have two thresholds

further operation so as to improve the efficiency of foreground detection mechanism. The next step is noise removal.

      1. NOISE REMOVAL

        The algorithms that are applied to detect moving object produce the expected foreground. However, it is highly expected to observe some noises that cannot be handled by background model. This noise affects the output of many calculation stages during the processing of a frame and overall mask becomes inaccurate due to noise. In order to get improved results, noise removal is a crucial step.

        and Thigh . If the difference between the current pixel value

        Morphological operation, dilation and erosion, are applied to the foreground pixel map in order to remove noise caused by

        and the distribution mean is larger than

        T ,the pixel is

        high

        camera noise, reflectance noise, background colored object

        classified as foreground. Otherwise, the pixel is a foreground candidate or weak candidate as shown in (14).Additional connected component check procedure is needed to classify the candidate pixels as foreground or background. If a candidate pixel is found to be 8- connected to a foreground pixel, it becomes a foreground pixel. Afterwards, morphological operators are applied to form the final foreground mask.

        di | fr fri 1 | or di 1 | fr Bg | Or for both. (18)

        As in equation (18) explained In this proposed algorithm or

        hybrid adaptive scheme algorithm we will have three kinds of foreground pixels

        Foreground pixels belongs to both frame differencing and adaptive background subtraction. Foreground pixels belongs to either frame differencing or adaptive background subtraction .Foreground pixels that belong to only either to frame differencing or adaptive background subtraction. As we know the main problem with frame differencing is that pixels interior to an object with uniform intensity are not included in the set of moving object pixels. Interior pixels can be filled by applying adaptive background subtraction to extract all of the moving pixels. Let Bgt(x, y) represent the current background intensity value at pixel(x,y) ,then pixels

        that are significantly different from the background model Bgt(x,y)

        noise, and any noises that cannot handled by background model [18],[19].

      2. MORPHOLOGICAL OPERATIONS

Morphological operations, erosion and dilation[20], are applied to the foreground pixel map in order to remove noise that is caused by the rst three of the items listed above. Our am in applying these operations is removing noisy foreground pixels that do not correspond to actual foreground regions.

    1. CONNECTED COMPONENT ANALYSIS

      A set of pixels in an image which are all connected to each other is called a connected component. Finding all connected components in an image and marking each of them with a distinctive label is called connected component labeling After detecting foreground regions and applying post-processing operations to remove noise and shadow regions, the filtered foreground pixels are grouped into connected regions (blobs) and labeled by using a two-level connected component labeling algorithm presented in [18].After finding individual blobs that correspond to objects, the bounding boxes of these regions are calculated.

      $$ b_n(x, y) = \{ (x, y) : | I_t(x, y) - \mathrm{Bg}(x, y) | > \mathrm{thresh} \} \qquad (19) $$

      The pixels $(x, y)$ that satisfy Equation (19) are the moving pixels. Equation (19) overcomes the drawback of frame differencing. For moving and non-moving object pixels, the background and the threshold can then be updated as in Equations (20) and (21) (FG = foreground pixels, BG = non-moving object pixels):

      $$ \mathrm{Bg}_{t+1}(x, y) = \begin{cases} \alpha \, \mathrm{Bg}_t(x, y) + (1 - \alpha) \, I_t(x, y), & (x, y) \in BG \\ \mathrm{Bg}_t(x, y), & (x, y) \in FG \end{cases} \qquad (20) $$

      $$ \mathrm{thresh}_{t+1}(x, y) = \begin{cases} \alpha \, \mathrm{thresh}_t(x, y) + (1 - \alpha) \left( \beta \, \mathrm{fr\_diff}(x, y) \right), & (x, y) \in BG \\ \mathrm{thresh}_t(x, y), & (x, y) \in FG \end{cases} \qquad (21) $$

      Here $\alpha$ is the update rate, $\beta$ is a constant scale factor, and $\mathrm{fr\_diff}(x, y)$ is the frame difference at pixel $(x, y)$.

4.2. POST PROCESSING

      The outputs of the foreground region detection algorithms we explained in the previous three sections generally contain noise and therefore are not appropriate for further processing without special post-processing. In the post-processing step the following basic tasks must be done before doing any

    2. MOVING OBJECT TRACKING

      Tracking is generating the trajectory of an object over time by locating its position in every frame of the video. The aim of object tracking is to establish a correspondence between objects or object parts in consecutive frames and to extract temporal information about objects such as trajectory, posture, speed and direction. The tracking method we have developed is inspired by the study presented in [21]. Our approach, shown in Fig. 5, makes use of object features such as size, center of mass and bounding box, which are extracted in previous steps, to establish a matching between objects in consecutive frames. Obtaining the correct track information is crucial for subsequent actions, such as event modeling and activity recognition.

      Once we get the filtered foreground pixels, connected regions are found in the next step by using a connected component labeling algorithm, and the objects' bounding rectangles are calculated. The labeled regions may contain near but disjoint regions due to defects in the foreground segmentation process. Hence, it was found experimentally to be effective to merge those overlapping isolated regions. Also, some relatively small regions caused by environmental noise are eliminated in the region-level post-processing step.
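The region-level clean-up described above (merging near or overlapping regions and eliminating small ones) could look roughly like the following Python sketch. The parameters `gap` and `min_area`, their default values, and the box layout `(x_min, y_min, x_max, y_max)` with inclusive pixel coordinates are hypothetical; the paper does not give its merging distance or size threshold.

```python
def boxes_close(a, b, gap=0):
    """True if two boxes overlap, or if their edges are at most `gap` apart."""
    return (a[0] <= b[2] + gap and b[0] <= a[2] + gap and
            a[1] <= b[3] + gap and b[1] <= a[3] + gap)


def merge_boxes(boxes, gap=2, min_area=4):
    """Merge boxes that overlap or nearly touch, then drop very small ones."""
    boxes = list(boxes)
    merged = True
    while merged:  # repeat until a full pass finds nothing left to merge
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if boxes_close(boxes[i], boxes[j], gap):
                    a, b = boxes[i], boxes[j]
                    # Replace the pair with the smallest box enclosing both.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    # Area is counted in pixels, with both corners inclusive.
    return [b for b in boxes
            if (b[2] - b[0] + 1) * (b[3] - b[1] + 1) >= min_area]
```

Merging restarts after every successful merge because the enlarged box may now reach a third region that neither original box was close to. The small-region filter runs last so that fragments of one object get a chance to merge before being judged by size.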

Fig. 5 Block diagram of the object tracking system. Blocks, in order: input from detection algorithm; post-processing; connected component analysis; object tracking algorithm; plotting method (edge detection, bounding box); display.

The moving object detection and tracking steps are dependent on each other. Thus, the tracking system would deliver inappropriate results if one of the previous steps does not achieve good performance. An important condition in an object tracking algorithm is that the motion pixels of the moving objects in the images are segmented as accurately as possible.

    1. MOVING OBJECT TRACKING BASED ON REGION

      This method identifies and tracks a blob token or a bounding box, which is calculated for the connected components of moving objects in 2D space. The method relies on properties of these blobs such as size, color, shape, velocity, or centroid. A benefit of this method is that it is time efficient, and it works well for small numbers of moving objects. For example, [40] presents a method for blob tracking: Kalman filters are used to estimate pedestrian parameters, region splitting and merging are allowed, and partial overlapping and occlusion are corrected by defining a pedestrian model.

      As the object tracking block diagram in Fig. 5 shows, we apply morphological filters based on combinations of dilation and erosion to reduce the influence of noise, followed by a connected component analysis that labels each moving object region. Very small regions are discarded. At this stage we calculate the following feature for each moving object region: the bounding rectangle, that is, the smallest rectangle that contains the object region. We keep a record of the coordinates of its upper-left and lower-right corners, which also provides size information (the width and height of each rectangle).

      The basic principle is that, after detecting foreground regions and applying post-processing operations to remove noise and shadow regions, the filtered foreground pixels are grouped into connected regions (blobs) and labeled by a connected component labeling algorithm. After finding the individual blobs that correspond to objects, the bounding boxes of these regions are calculated.
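The text does not spell out how blobs are associated from one frame to the next. One simple possibility, assumed here purely for illustration and not taken from the paper, is greedy nearest-centroid matching between the bounding boxes of consecutive frames. The function names, the `max_dist` gate, and the use of the box centre as a stand-in for the centre of mass are all hypothetical.

```python
import math


def centroid(box):
    """Centre of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def match_boxes(prev_boxes, curr_boxes, max_dist=20.0):
    """Greedily pair previous and current boxes by centroid distance.

    Returns a dict mapping an index in prev_boxes to an index in curr_boxes.
    Boxes left unmatched are candidates for objects that left or entered.
    """
    candidates = []
    for i, p in enumerate(prev_boxes):
        px, py = centroid(p)
        for j, c in enumerate(curr_boxes):
            cx, cy = centroid(c)
            d = math.hypot(px - cx, py - cy)
            if d <= max_dist:  # ignore pairs that moved implausibly far
                candidates.append((d, i, j))
    candidates.sort()  # closest pairs are assigned first
    matches, used_prev, used_curr = {}, set(), set()
    for d, i, j in candidates:
        if i not in used_prev and j not in used_curr:
            matches[i] = j
            used_prev.add(i)
            used_curr.add(j)
    return matches
```

A greedy assignment is cheap and adequate for a small number of well-separated objects, which is the regime the paragraph above says region-based tracking suits; it can mis-assign objects that cross paths or occlude each other.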

      1. IFD (INTER-FRAME DIFFERENCING)

        The external memory is also limited, so we cannot store multiple frames for processing as we did in MATLAB. This makes many complicated algorithms impossible. From these considerations, we decided to implement a simple, direct, but effective algorithm, which we refer to here as the IFD algorithm. It simply finds the difference between the current frame and a reference frame and labels the non-stationary pixels. The mean and standard deviation of the non-stationary pixel positions are calculated and used as an indication of the position and size of the foreground object. This procedure is repeated for each frame, and the tracking output is displayed on the monitor as a rectangular box.
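The IFD procedure can be sketched in a few lines of Python. This is an illustrative reading of the description, not the author's implementation: the threshold `thresh`, the factor `k` that turns the standard deviation into a box half-size, and the function name are assumptions, and frames are taken to be grayscale images stored as lists of rows.

```python
import math


def ifd_box(frame, reference, thresh=30, k=2.0):
    """Locate the moving object by inter-frame differencing.

    Pixels whose absolute difference from the reference exceeds `thresh` are
    treated as non-stationary. The box is centred on their mean position and
    extends k standard deviations in each direction.
    Returns (x_min, y_min, x_max, y_max), or None if nothing moved.
    """
    xs, ys = [], []
    for y, (row, ref_row) in enumerate(zip(frame, reference)):
        for x, (pixel, ref_pixel) in enumerate(zip(row, ref_row)):
            if abs(pixel - ref_pixel) > thresh:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return (mean_x - k * std_x, mean_y - k * std_y,
            mean_x + k * std_x, mean_y + k * std_y)
```

Only the current frame and one reference frame are held at a time, which matches the memory constraint that motivates IFD. Because the box is derived from a single mean and standard deviation, the sketch assumes one dominant moving object per frame; several separate objects would be covered by one large box.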

    2. EVENT MODELING

      This thesis presents a system for adaptive moving object detection and tracking applications. The system has been designed to monitor outdoor as well as indoor environments. The final task of the architecture is to automatically raise alarms when specific events of interest are detected.

    3. GRAPHICAL USER INTERFACE DESIGN

      We used MATLAB to develop the GUI shown in Fig. 6. The GUI was designed to facilitate interactive system operation. It can be used to set up the program, launch it, stop it and display results. During the setup stage the operator is prompted to choose the motion detection and tracking algorithm. When the start/stop toggle button is pressed, the system is launched and the selected program is called to perform the calculations until the start/stop button is pressed again, which terminates the calculation and returns control to the GUI. The detection and tracking results can then be viewed in turn.

      Fig. 6 GUI Layout Design

    4. EXPERIMENTAL RESULTS

      This section presents some of the tested image sequences that highlight the effectiveness of the proposed detection system. These experimental results were obtained using the proposed detection and tracking algorithm discussed above. Fig. 7 compares the results of the different algorithms, and Fig. 8 shows the improved result of the proposed detection system.

      1. DIFFERENT ALGORITHMS DETECTION RESULTS

Fig. 7 Results from different algorithms: (a1), (a2) and (a3) are background images; (b1), (b2) and (b3) are video inputs from the camera; (c1), (c2) and (c3) are detection results from frame differencing; (d1), (d2) and (d3) are detection results from background subtraction; (e1), (e2) and (e3) are detection results from the proposed system (hybrid adaptive scheme).

    1. PROPOSED DETECTION SYSTEM DETECTION RESULTS

      Fig. 8 Proposed detection system detection results

      Fig. 7 shows detection results for single-object and multiple-object video feeds: (a1)-(a3) are background images, (b1)-(b3) are video sequences from a static camera, and (e1)-(e3) are the results of the proposed detection system, the hybrid adaptive scheme algorithm. The detection results show that the algorithm determines the legitimate region(s) and extracts all the information about the moving object(s). Fig. 9 shows the results of the proposed tracking algorithm, which gave the result we needed.

    2. PROPOSED TRACKING ALGORITHM RESULTS

      Fig. 9 Results from proposed tracking algorithms

    3. CONCLUSIONS AND FUTURE WORKS

      In general, this project develops an algorithm for a moving object detection and tracking system. The algorithm was successfully implemented in the MATLAB integrated development environment. As a result, the algorithm is able to detect and track a moving object, as long as the targeted object has emerged fully within the camera's view range. The input for this project is video sequences captured via a USB camera or a built-in integrated webcam. The first step of the algorithm is to sample the video sequence into static frames. This train of frames is originally in red-green-blue (RGB) format. To ease computation, the RGB frames are then converted to a suitable format (binary). Each frame then passes through the filtering process, the adaptive scheme algorithm, thresholding, post-processing and, finally, the tracking process.

      The main objective of this project is to develop an algorithm that is able to detect and track moving objects. In this thesis we presented a set of methods and tools for a moving object detection and tracking system. Our thesis (detection and tracking system of moving objects based on MATLAB) includes two main building blocks: moving object detection and object tracking. Moving object detection segments the moving targets from the background and is the crucial first step in video surveillance. We implemented and tested three different object detection algorithms, namely temporal differencing, background subtraction, and the hybrid adaptive scheme, and compared their detection quality and time performance.

      The proposed technique combines simple frame difference (FD), simple adaptive background subtraction (BS), and accurate Gaussian modeling to benefit from the high detection accuracy of the Mixture of Gaussian solution (MoG) in outdoor scenes while reducing the computations required, thus making it faster and more suitable for surveillance applications. Furthermore, of the three algorithms compared, it gives the most promising results in terms of detection quality, speed and computational complexity for use in surveillance systems with stationary cameras. The tracking is based on inter-frame differencing and the bounding box method. Every system has its own limitations and needs improvements.

      The proposed system is not perfect in every respect. As future work, shadow removal and the handling of sudden illumination changes can be addressed with more robust detection algorithms, which would improve object detection and tracking. From the tracking point of view, partial overlapping, multiple object tracking and occlusion problems can be addressed with a more robust algorithm, leading to a more robust and accurate system for video surveillance.

      BIBLIOGRAPHY

      1. A. J. Lipton, H. Fujiyoshi, and R. S. Patil. Moving target classification and tracking from real-time video. In Proc. of Workshop Applications of Computer Vision, pages 129–136, 1998.

      2. L. Wang, W. Hu, and T. Tan. Recent developments in human motion analysis. Pattern Recognition, 36(3):585–601, March 2003.

      3. R. T. Collins et al. A system for video surveillance and monitoring: VSAM final report. Technical report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, May 2000.

      4. I. Haritaoglu, D. Harwood, and L. S. Davis. W4: A real time system for detecting and tracking people. In Computer Vision and Pattern Recognition, pages 962–967, 1998.

      5. C. Stauffer and W. E. L. Grimson. Adaptive background mixture models for real-time tracking. In Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pages 246–252, 1999.

      6. A. M. McIvor. Background subtraction techniques. In Proc. of Image and Vision Computing, Auckland, New Zealand, 2000.

      7. M. Xu and T. Ellis. Colour-Invariant Motion Detection under Fast Illumination Changes, chapter 8, pages 101–111. Video-Based Surveillance Systems. Kluwer Academic Publishers, Boston, 2002.

      8. S. J. McKenna, S. Jabri, Z. Duric, and H. Wechsler. Tracking interacting people. In Proc. of International Conference on Automatic Face and Gesture Recognition, pages 348–353, 2000.

      9. T. Horprasert, D. Harwood, and L. S. Davis. A statistical approach for real-time robust background subtraction and shadow detection. In Proc. of IEEE Frame Rate Workshop, pages 1–19, Kerkyra, Greece, 1999.

      10. H. T. Chen, H. H. Lin, and T. L. Liu. Multi-object tracking using dynamical graphs matching. In Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 210–217, 2001.

      11. I. Haritaoglu. A Real Time System for Detection and Tracking of People and Recognizing Their Activities. PhD thesis, University of Maryland at College Park, 1998.

      12. X. Zhou, R. T. Collins, T. Kanade, and P. Metes. A master-slave system to acquire biometric imagery of humans at distance. In First ACM SIGMM International Workshop on Video Surveillance, pages 113–120. ACM Press, 2003.

      13. J. Owens and A. Hunter. A fast model-free morphology-based object tracking algorithm. In Proc. of British Machine Vision Conference, pages 767–776, Cardiff, UK, September 2002.

      14. A. Amer. Voting-based simultaneous tracking of multiple video objects. In Proc. SPIE Int. Symposium on Electronic Imaging, pages 500–511, Santa Clara, USA, January 2003.

      15. I. Haritaoglu, D. Harwood, and L. S. Davis. W4: A real time system for detecting and tracking people. In Computer Vision and Pattern Recognition, pages 962–967, 1998.

      16. Y. Ivanov, C. Stauffer, A. Bobick, and W. E. L. Grimson. Video surveillance of interactions. In International Workshop on Visual Surveillance, pages 82–89, Fort Collins, Colorado, June 1999.

      17. C. R. Wren, A. Azarbayejani, T. J. Darrell, and A. P. Pentland. Pfinder: Real-time tracking of the human body. IEEE Pattern Recognition and Machine

      18. I. Haritaoglu, D. Harwood, and L. S. Davis. W4: Who? When? Where? What? A real-time system for detecting and tracking people. In Proc. 3rd Face and Gesture Recognition Conf., pages 222–227, 1998.

      19. F. Heijden. Image Based Measurement Systems: Object Recognition and Parameter Estimation. Wiley, January 1996.

      20. J. Owens and A. Hunter. A fast model-free morphology-based object tracking algorithm. In Proc. of British Machine Vision Conference, pages 767–776, Cardiff, UK, September 2002.

      21. E. G. Zelnio and F. D. Garber. Algorithms for Synthetic Aperture Radar Imagery XIV. Proceedings of the SPIE, volume 6568, pp. 65680U, 2007.

      22. J. Heikkila and O. Silven. A real-time system for monitoring of cyclists and pedestrians. In Proc. of Second IEEE Workshop on Visual Surveillance, pages 74–81, Fort Collins, Colorado, June 1999.

      23. R. Rosales and S. Sclaroff. Improved tracking of multiple humans with trajectory prediction and occlusion modeling. In Proc. of IEEE CVPR Workshop on the Interpretation of Visual Motion, Santa Barbara, CA, 1998.

      24. R. Rosales and S. Sclaroff. Improved tracking of multiple humans with trajectory prediction and occlusion modeling. In Proc. of IEEE CVPR Workshop on the Interpretation of Visual Motion, Santa Barbara, CA, 1998.

      25. R. T. Collins, A. J. Lipton, and T. Kanade. A system for video surveillance and monitoring. In Proc. of the American Nuclear Society (ANS) Eighth International Topical Meeting on Robotics and Remote Systems, April 1999.

      26. I. Haritaoglu, D. Harwood, and L. S. Davis. W4: Real-time surveillance of people and their activities. IEEE Trans. Pattern Analysis and Machine Intelligence, 22(8):809–830, August 2000.

      27. C. Stauffer and W. E. Grimson. Adaptive background mixture models for real time tracking. In IEEE Proc. CVPR, pages 246–252, June 1999.

ACKNOWLEDGEMENTS

First and foremost, I want to thank ALLAH (s.w.t) for giving me the power, knowledge and strength to finish this project and dissertation, the final project of my master's degree.

I am eager to express my most sincere thanks to some invaluable people who made this work possible. My gratitude and appreciation go to Prof. Lili for her always helpful suggestions in discussion and for her support and guidance in all aspects of this work. Her energy and great vision have inspired me many times; her knowledge and understanding of science as a whole have made a difficult subject simple and turned the complicated into the uncomplicated. I would also like to express my sincere gratitude to my friends in TUTE.

Finally, my biggest gratitude goes to my family members, my mother, father, brothers and sisters, and my beloved wife Ferdos Seid, for all their endless love, encouragement and support on this journey.
