Research Projects
 
  
  
 

   "Intelligent" Visual Tracking
We can look at a person and immediately tell what they are doing: running, playing, or just sitting around. Can a machine do the same? Can it also autonomously detect abnormal activities? This project aims to answer these questions by developing intelligent visual tracking systems that perform tracking, recognition, and change detection of motion activities.

We represent the body shape of the target by a set of key feature points called 'landmark points', as shown in the picture. The corresponding landmark shape is the ordered set of landmark points, which represents the body posture at a given time instant. Landmark shapes can be 2D or 3D, depending on whether the landmark points are considered on a 2D plane or in 3D space.
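As a rough illustration of the landmark shape idea, an ordered set of K landmark points can be stored as a K x 2 (or K x 3) array. A common convention in statistical shape analysis, assumed here and not necessarily the exact normalization used in this project, is to remove translation and scale so that only the posture remains. The function name `landmark_shape` and the sample coordinates are hypothetical.

```python
import numpy as np

def landmark_shape(points):
    """Center an ordered (K, d) landmark configuration and scale it to unit norm."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)      # remove translation
    norm = np.linalg.norm(centered)        # overall size of the configuration
    if norm == 0.0:
        raise ValueError("all landmarks coincide; the shape is undefined")
    return centered / norm                 # remove scale

# Hypothetical 2D landmarks (head, shoulders, hips, feet) in pixel coordinates.
frame = [(50, 10), (40, 30), (60, 30), (45, 60), (55, 60), (42, 95), (58, 95)]
shape = landmark_shape(frame)
```

The row order is part of the representation: the same landmark must occupy the same row in every frame for shapes to be comparable over time.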
For the tracking part, we have developed efficient Particle Filtering (PF) algorithms that deal with a large dimensional state space and a multimodal observation likelihood. Both arise when tracking nonstationary shape deformations (i.e. the time evolution of landmark shapes) from noisy, cluttered video observations; the multimodality is quite often due to outlier noise from clutter [details]. Depending on the type of importance sampling, the new algorithms are named PF-EIS (PF with Efficient Importance Sampling), PF-MT (PF with Mode Tracking) and PF-EIS-MT (PF with Efficient Importance Sampling coupled with Mode Tracking). We have shown improved performance of these algorithms over existing PF algorithms (details). The results for body motion activity tracking with the particle filter are shown above. These trackers not only say 'where' the person is but also track the body movements of the target, which can eventually allow activity recognition to be incorporated into a visual tracking system.
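For orientation, the baseline that importance-sampling variants such as those above modify is the standard bootstrap particle filter, which samples from the state transition prior. The sketch below is that generic baseline on a hypothetical 1D random-walk model with Gaussian noise; it is not PF-EIS, PF-MT or PF-EIS-MT, and every name and parameter in it is an illustrative assumption.

```python
import numpy as np

def bootstrap_pf(observations, n_particles=500, process_std=1.0, obs_std=1.0, seed=0):
    """Generic bootstrap particle filter for x_t = x_{t-1} + noise, y_t = x_t + noise."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, size=n_particles)
    estimates = []
    for y in observations:
        # Importance sampling from the state transition prior (random walk).
        particles = particles + rng.normal(0.0, process_std, size=n_particles)
        # Weight by the Gaussian observation likelihood (log domain for stability).
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Posterior mean estimate before resampling.
        estimates.append(float(np.sum(w * particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return np.array(estimates)

# Synthetic test data (assumed model, not project data).
data_rng = np.random.default_rng(1)
true_state = np.cumsum(data_rng.normal(0.0, 1.0, size=50))
observations = true_state + data_rng.normal(0.0, 1.0, size=50)
estimates = bootstrap_pf(observations)
```

Sampling from the prior is simple but wasteful when the state dimension is large, which is the situation the efficient importance sampling and mode tracking variants are meant to address.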
Change/Abnormal Activity Detection

   Compression of 2D and 3D Landmark Shape Data

   Target Tracking Across Illumination Changes

   Deformable Contour Tracking : A Compressed Sensing Approach
Related Publications


Talks  

   Information Hiding Inside Structured Shapes



      
Copyrights : samarjit@iastate.edu (2009)
The key to activity recognition and tracking lies in developing a dynamical model for landmark shape changes/deformations over time. In other words, each motion activity (e.g. running, jumping) has a unique pattern for the time evolution of the landmark shapes corresponding to the body postures. We have developed a dynamical model for time-varying 2D/3D landmark shape deformations corresponding to various activities, which we call "Nonstationary Shape Activity", or NSSA for short. The model relies on defining a time-varying mean shape under the assumption of nonstationary shape deformations. More details can be found in these slides.

Landmark Shape
Shape space
Tracking : Run
Tracking : Jump
Tracking : Playing basketball
We did a simple experiment to test the ability of the NSSA-based tracker to detect changes in activity without completely losing track. The ELL-based (see here) change detection statistic was able to detect the change from run to leap, and the tracker did not lose track for some time after the change. The results are shown below. At the instant of activity transition, i.e. model switching, there is a sharp spike in the ELL measure, which is used to detect the change.
Activity change : from Run to Leap
ELL based Change Detection
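The detection rule described above amounts to flagging the first frame at which the statistic spikes. A minimal sketch, assuming a fixed threshold (the threshold value, the function name `detect_change` and the numbers below are placeholders, not measured ELL values; computing ELL itself is not shown):

```python
import numpy as np

def detect_change(statistic, threshold):
    """Return the first frame index where the statistic exceeds the threshold, or None."""
    above = np.flatnonzero(np.asarray(statistic, dtype=float) > threshold)
    return int(above[0]) if above.size else None

# Synthetic placeholder sequence: roughly flat, then a spike starting at frame 6.
stat = [1.1, 0.9, 1.0, 1.2, 1.0, 1.1, 9.5, 8.7, 7.9]
change_frame = detect_change(stat, threshold=5.0)
```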
The goal of this work is to develop a visual tracking algorithm that is robust to illumination variations in the scene. Changes in lighting conditions change the appearance of the target objects, so standard template matching based trackers eventually fail if illumination changes are not taken into account. We address this problem by modeling the dynamics of illumination change and incorporating it in the state space of the tracker. The increased dimensionality of the state vector would normally require more particles to maintain tracking accuracy. To get around this, a Particle Filter with Mode Tracker (PF-MT) is used to estimate the illumination vector.
This use of PF-MT is motivated by the fact that, except in the case of occlusions, multimodality of the state posterior is usually due to multimodality in the position and scale parameters (e.g. there may be multiple objects in the scene that roughly match the template). In other words, given the position and scale at time t, the posterior of the illumination is unimodal. This posterior is also usually quite narrow, since illumination changes slowly over time. The choice of the illumination model permits the illumination coefficients to be solved in closed form. We also consider the occlusion scenario during tracking, using an outlier noise model to incorporate the effect of occlusion in the observation equation (see the third image).
Related Publications


Talks


Outdoor : Tracking a car
Indoor : Face tracking (minor occlusion)
Indoor : Face tracking (severe occlusion)
We have proposed a novel model-based compression technique for nonstationary landmark shape data extracted from video sequences. The main goal is compact storage of landmark shape data. We use Nonstationary Shape Activity (NSSA) to model the shape sequences. The shape data is encoded by applying Differential Pulse Code Modulation (DPCM) to the shape velocity coefficients under the NSSA model. We have studied the system performance in terms of the compressibility-distortion trade-off. The NSSA-based compression technique has been compared with two other methods based on existing shape modeling techniques, namely Stationary Shape Activity (SSA) and Active Shape Model (ASM). We tested our system with landmark shape data extracted from multiple video sequences of the CMU mocap database. NSSA outperformed both SSA and ASM in terms of compressibility for a given distortion tolerance. The NSSA-based compression technique could therefore be very useful in applications such as the storage of large volumes of biomedical landmark data. The rate-distortion (RD) plots corresponding to NSSA, SSA and ASM are shown here.
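To make the encoding step concrete, here is a minimal sketch of closed-loop DPCM with a previous-sample predictor and a uniform quantizer, applied to one hypothetical coefficient sequence. The predictor, the step size and the function names are assumptions for illustration; the NSSA modeling that produces the shape velocity coefficients is not shown.

```python
import numpy as np

def dpcm_encode(signal, step):
    """Quantize prediction residuals; the predictor is the previous reconstructed sample."""
    codes = np.empty(len(signal), dtype=int)
    prediction = 0.0
    for i, x in enumerate(signal):
        codes[i] = int(np.round((x - prediction) / step))
        prediction += codes[i] * step   # mirror the decoder's reconstruction
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized residuals."""
    return np.cumsum(np.asarray(codes) * step)

# Hypothetical smooth coefficient trajectory over 40 frames.
coeffs = 0.5 * np.sin(np.linspace(0.0, 2.0 * np.pi, 40))
step = 0.05
codes = dpcm_encode(coeffs, step)
decoded = dpcm_decode(codes, step)
```

Because the encoder predicts from its own reconstruction, quantization error does not accumulate: each decoded sample is within half a step of the original. A larger step gives smaller codes and more distortion, which is the trade-off an RD plot summarizes.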
Related Publications



Talks


In this project, we have proposed a novel compressed sensing based algorithm for contour deformation estimation and tracking. We have shown that real-life contour deformations are sparse in the frequency domain and can be reconstructed from a very small number of random observations using compressed sensing. We have demonstrated the algorithm on multiple examples, for both simulated and real-life sequences. Our goal is to use this algorithm for tracking deformable contours in biomedical image sequences corrupted with noise and clutter. (Ongoing work)
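The sparsity claim can be illustrated with a toy example: a deformation signal built from three cosine (DCT) components is recovered from a random subset of its samples. The solver here is Orthogonal Matching Pursuit, chosen for brevity; it is a swapped-in stand-in and not necessarily the reconstruction method used in this project. The signal, sizes and function names are all assumptions.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II synthesis matrix; columns are cosine atoms."""
    k = np.arange(n)
    basis = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n)
    basis[:, 0] *= np.sqrt(1.0 / n)
    basis[:, 1:] *= np.sqrt(2.0 / n)
    return basis

def omp(A, y, max_atoms, tol=1e-8):
    """Greedy sparse recovery: pick the best-correlated atom, then refit by least squares."""
    col_norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        corr = np.abs(A.T @ residual) / col_norms
        corr[support] = 0.0                      # never pick an atom twice
        support.append(int(np.argmax(corr)))
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ sol
    coeffs = np.zeros(A.shape[1])
    coeffs[support] = sol
    return coeffs

n, m = 128, 48                                   # signal length, number of observations
basis = dct_basis(n)
true_coeffs = np.zeros(n)
true_coeffs[[2, 5, 9]] = [1.0, -0.8, 0.6]        # hypothetical sparse spectrum
deformation = basis @ true_coeffs

rng = np.random.default_rng(0)
idx = rng.choice(n, size=m, replace=False)       # random observation locations
A, y = basis[idx, :], deformation[idx]
recovered = basis @ omp(A, y, max_atoms=10)
```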
Basic Idea of the Algorithm

Goal : Use the contour at 't' and the image frame at 't+1' to estimate the contour at 't+1'



Deformable Contours Tracking : Examples

This work was performed during my internship at Mitsubishi Electric Research Labs (MERL), Cambridge, MA. We developed a new technique for embedding a message within structured shapes. Any changes in the shape owing to the embedded message should be invisible to a casual observer but detectable by a specialized decoder. The message embedding algorithm represents shape outlines as a set of cubic Bezier curves and straight line segments. By slightly perturbing the Bezier curves, a single shape can spawn a library of similar-looking shapes, each corresponding to a unique message. This library is efficiently stored using Adaptively Sampled Distance Fields (ADFs), which also facilitate rendering of the modified shapes at the desired resolution and fidelity. Given any modified shape, a forensic detector applies Procrustes analysis to determine the embedded message. We could decode up to 90% of the embedded message. (Details omitted. Two patents are expected to be filed later this year.)
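Since the details of the actual scheme are omitted, the following is only a toy sketch of the general idea under stated assumptions: one cubic Bezier segment, a library built by shifting one control point by a message-dependent amount, and decoding by the smallest Procrustes distance. The perturbation rule, the library size and all names are hypothetical; ADF storage and rendering are not shown.

```python
import numpy as np

def cubic_bezier(ctrl, n_samples=50):
    """Sample a cubic Bezier curve defined by four control points."""
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    p0, p1, p2, p3 = np.asarray(ctrl, dtype=float)
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

def procrustes_distance(a, b):
    """Full Procrustes distance after removing translation, scale and rotation.

    Reflections are also allowed here, which keeps the sketch short.
    """
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    a0 /= np.linalg.norm(a0)
    b0 /= np.linalg.norm(b0)
    s = np.linalg.svd(a0.T @ b0, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - s.sum() ** 2)))

def build_library(base_ctrl, n_messages, delta=0.05):
    """Hypothetical embedding rule: message m shifts control point 1 upward by m * delta."""
    library = []
    for m in range(n_messages):
        ctrl = np.asarray(base_ctrl, dtype=float).copy()
        ctrl[1, 1] += m * delta
        library.append(cubic_bezier(ctrl))
    return library

def decode(observed, library):
    """Return the index of the library shape closest in Procrustes distance."""
    return int(np.argmin([procrustes_distance(observed, s) for s in library]))

library = build_library([(0, 0), (1, 2), (3, 2), (4, 0)], n_messages=4)

# Simulate rendering message 2 at a different position, size and orientation.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
observed = 1.7 * library[2] @ rot.T + np.array([5.0, -3.0])
message = decode(observed, library)
```

Procrustes alignment removes position, size and orientation, so the decoder responds only to the shape change introduced by the perturbation.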
Application : Data Hiding in Text Documents
Do you see a difference ?

  Projects from Graduate Course Work
Related Publications





  Teaching : EE528 (Digital Image Processing)
Original

Data embedded