  • Fundamental relationship between bilateral kernel and locally adaptive regression kernel
    • Description: The relationship between the bilateral kernel function and the recently proposed locally adaptive regression kernel is examined. Despite their different implementations, both locally adaptive approaches are designed to prevent averaging across edges while smoothing an image. This similarity suggests the two can be linked, even though each has developed into a well-established theory in its own field. First, the locally adaptive regression kernel is analyzed theoretically. Then, the connection between the methods is explored by applying the spectral distance measure to the bilateral kernel. Finally, a direct relation is established between the bilateral kernel and the locally adaptive regression kernel.
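A minimal sketch of the bilateral kernel at the heart of both methods: each neighbor's weight is the product of a spatial Gaussian and a photometric (range) Gaussian, so weights collapse across intensity edges. The parameter values below are illustrative, not the paper's:

```python
import numpy as np

def bilateral_weight(center, neighbor, center_val, neighbor_val,
                     sigma_s=2.0, sigma_r=0.1):
    """Bilateral kernel weight: spatial Gaussian times photometric Gaussian.
    sigma_s and sigma_r are hypothetical values for illustration."""
    spatial = np.exp(-np.sum((np.asarray(center, float)
                              - np.asarray(neighbor, float))**2)
                     / (2.0 * sigma_s**2))
    photometric = np.exp(-(center_val - neighbor_val)**2 / (2.0 * sigma_r**2))
    return spatial * photometric
```

A neighbor with a similar intensity keeps a high weight; one across an intensity edge is effectively excluded from the average, which is what prevents smoothing across edges.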
  • Occlusion filling in stereo: Theory and experiments
    • Description: A number of stereo matching algorithms developed in recent years also successfully detect occlusions in stereo images. These algorithms, however, typically fall short of a systematic study of occlusions; they predominantly emphasize matching and regard occlusion filling as a secondary operation. Filling occlusions is nevertheless useful in many applications, such as image-based rendering, where 3D models should be as complete as possible. In this paper, we study occlusions systematically and propose two algorithms that fill occlusions reliably by applying statistical modeling, visibility constraints, and scene constraints. We introduce a probabilistic, model-based filling order of the occluded points to maintain consistency in filling. Furthermore, we show how an ambiguity in interpolating the disparity value of an occluded point can safely be avoided using color homogeneity when the point’s neighborhood spans multiple scene surfaces. A comparative study shows that, statistically, the new algorithms deliver good quality results compared to existing algorithms.
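The scene constraint behind occlusion filling can be illustrated with a toy 1D filler that assigns each occluded pixel the smaller (farther) of its nearest valid disparities, since stereo occlusions normally belong to the background surface. This is a deliberate simplification of the paper's probabilistic, model-based filling order:

```python
import numpy as np

def fill_occlusions_1d(disparity, occluded):
    """Fill occluded pixels in one scanline with the smaller of the two
    nearest valid disparities (the background surface). A toy sketch,
    not the paper's statistically modeled filling order."""
    d = np.asarray(disparity, dtype=float).copy()
    n = len(d)
    for i in np.flatnonzero(occluded):
        # nearest valid disparity to the left and right of pixel i
        left = next((d[j] for j in range(i - 1, -1, -1) if not occluded[j]), None)
        right = next((d[j] for j in range(i + 1, n) if not occluded[j]), None)
        candidates = [v for v in (left, right) if v is not None]
        if candidates:
            d[i] = min(candidates)  # prefer the background disparity
    return d
```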
  • Integration of multispectral face recognition and multi-PTZ camera automated surveillance for security applications
    • Description: Due to increasing security concerns, a complete security system should consist of two major components, a computer-based face-recognition system and a real-time automated video surveillance system. […] Thus, we present an automated method that specifies the optimal spectral ranges under the given illumination. Experimental results verify the consistent performance of our algorithm via the observation that an identical set of spectral band images is selected under all tested conditions. Our discovery can be practically used for a new customized sensor design associated with given illuminations for improved face recognition performance over conventional broad-band images. In addition, once a person is authorized to enter a restricted area, we still need to continuously monitor his/her activities for the sake of security. Because pan-tilt-zoom (PTZ) cameras are capable of covering a panoramic area and maintaining high resolution imagery for real-time behavior understanding, research in automated surveillance systems with multiple PTZ cameras has become increasingly important. Most existing algorithms require prior knowledge of the intrinsic parameters of the PTZ camera to infer the relative positioning and orientation among multiple PTZ cameras. To overcome this limitation, we propose a novel mapping algorithm that derives the relative positioning and orientation between two PTZ cameras based on a unified polynomial model. […]


  • Outdoor Scene Image Segmentation Based on Background Recognition and Perceptual Organization
    • Description: In this paper, we propose a novel outdoor scene image segmentation algorithm based on background recognition and perceptual organization. We recognize background objects such as the sky, the ground, and vegetation based on color and texture information. For the structurally challenging objects, which usually consist of multiple constituent parts, we develop a perceptual organization model that can capture the nonaccidental structural relationships among the constituent parts of the structured objects and, hence, group them together accordingly without depending on a priori knowledge of the specific objects. Experimental results show that the proposed method outperformed two state-of-the-art image segmentation approaches on two challenging outdoor databases (Gould data set and Berkeley segmentation data set) and achieved accurate segmentation quality on various outdoor natural scene environments.
  • Bilateral Kernel-Based Region Detector
    • Description: In this paper, we present a new locally adaptive region detector called the Bilateral kernel-based Region Detector (BIRD). The method detects stable regions in images by consecutively computing a multiscale decomposition based on the bilateral kernel. BIRD regards a region as covariant if it exhibits predictability in its photometric distance over spatial distance. Distinctiveness and robustness across scales are achieved by selecting extremely stable regions through sequential scales. Our method is simple and easy to implement. Experimental results show that it outperforms competing affine region detectors in region detection efficiency.


  • Depth Map Enhancement Using Adaptive Steering Kernel Regression Based on Distance Transform
    • Description: In this paper, we present a method to enhance noisy depth maps using adaptive steering kernel regression based on distance transform. Data-adaptive kernel regression filters are widely used for image denoising by considering spatial and photometric properties of pixel data. In order to reduce noise in depth maps more efficiently, we adaptively refine the steering kernel regression function according to local region structures, flat and textured areas. In this work, we first generate two distance transform maps from the depth map and its corresponding color image. Then, the steering kernel is modified by a newly designed weighting function directly related to the joint distance transform. The weighting function expands the steering kernel in flat areas and shrinks it in textured areas toward local edges in the depth map. Finally, we filter the noise in the depth map with the refined steering kernel regression function. Experimental results show that our method outperforms the competing methods in objective and subjective comparisons for depth map enhancement.
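The expand-in-flat/shrink-near-edges behavior can be sketched as a scalar scale factor driven by the distance to the nearest edge. The functional form and constants below are assumptions; the paper's weighting function is defined on the joint distance transform of the depth map and its color image:

```python
import numpy as np

def kernel_scale(dist_edge, h0=1.0, alpha=2.0, tau=3.0):
    """Hypothetical kernel-footprint scale: approaches h0 at an edge
    (dist_edge = 0) and saturates at h0 * (1 + alpha) deep inside flat
    regions. h0, alpha, and tau are illustrative constants."""
    d = np.asarray(dist_edge, dtype=float)
    return h0 * (1.0 + alpha * d / (d + tau))
```

Scaling the steering kernel bandwidth by such a factor widens the support (stronger smoothing) in flat areas while keeping it tight near depth edges.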
  • Depth Data Calibration and Enhancement of Time-of-flight Video-plus-Depth Camera
    • Description: In this paper, we present a method to calibrate and enhance depth information captured by an infrared (IR)-based time-of-flight video-plus-depth camera, the “Kinect camera”. For depth data calibration, we use color and IR images of a chessboard with on-off halogen light sources to calculate camera parameters of the video and IR sensors in the Kinect camera. For depth data enhancement, we introduce weighted joint bilateral filtering based on distance transform of the color and depth images. Experimental results show that our method calibrates the video and depth sensors successfully and reduces the noise in captured depth images efficiently.
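A basic joint bilateral filter of the kind this method builds on can be sketched as follows: the depth map is smoothed, but the range weights come from the registered color image, so depth discontinuities stay aligned with color edges. This omits the paper's distance-transform weighting, and the parameter values are illustrative:

```python
import numpy as np

def joint_bilateral_depth(depth, color, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Joint bilateral filter: average depth values with spatial weights
    from pixel distance and range weights from the guidance color image.
    Minimal sketch; illustrative sigma values, no distance-transform term."""
    h, w = depth.shape
    out = np.zeros((h, w), dtype=float)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2.0 * sigma_s**2))
    dpad = np.pad(depth.astype(float), radius, mode='edge')
    cpad = np.pad(color.astype(float), radius, mode='edge')
    win = 2 * radius + 1
    for i in range(h):
        for j in range(w):
            dwin = dpad[i:i + win, j:j + win]
            cwin = cpad[i:i + win, j:j + win]
            # range weights come from the color image, not the depth map
            rng = np.exp(-(cwin - color[i, j])**2 / (2.0 * sigma_r**2))
            wgt = spatial * rng
            out[i, j] = np.sum(wgt * dwin) / np.sum(wgt)
    return out
```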
  • A novel performance evaluation paradigm for automated video surveillance systems
    • Description: Most existing performance evaluation methods concentrate on defining various metrics over a wide range of conditions and generating standard benchmarking video sequences to examine the effectiveness of a video tracking system. It is common practice to incorporate a robustness margin or factor into the system/algorithm design. However, these deterministic approaches often lead to overdesign, which increases costs, or underdesign, which causes frequent system failures. In order to overcome the aforementioned limitations, we propose an alternative framework to analyze the physics of the failure process via the concept of reliability. In comparison with existing approaches where system performance is evaluated based on a given benchmarking sequence, the advantage of our proposed framework lies in that a unified and statistical index is used to evaluate the performance of an automated video surveillance system independent of input sequences. Meanwhile, based on our proposed framework, the uncertainty of a failure process caused by the system’s complexity, imprecise measurements of the relevant physical constants and variables, and the indeterminate nature of future events can be addressed accordingly.


  • Can You See Me Now? Sensor Positioning for Automated and Persistent Surveillance
    • Description: Most existing camera placement algorithms focus on coverage and/or visibility analysis […]. In this paper, we propose sensor-planning methods that improve existing algorithms by adding handoff rate analysis. Observation measures are designed for various types of cameras so that the proposed sensor-planning algorithm is general and applicable to scenarios with different types of cameras. The proposed sensor-planning algorithm preserves necessary uniform overlapped FOVs between adjacent cameras for an optimal balance between coverage and handoff success rate. In addition, special considerations such as resolution and frontal-view requirements are addressed using two approaches: 1) direct constraint and 2) adaptive weights. The resulting camera placement is compared with a reference algorithm published by Erdem and Sclaroff. Significantly improved handoff success rates and frontal-view percentages are illustrated via experiments using indoor and outdoor floor plans of various scales.
  • Imaging-based thermal modelling and reverse engineering of as-built automotive components: A case study
    • Description: Virtual prototyping of objects with thermal characteristic requirements depends on several aspects such as the geometry of the component, material properties, ambient environmental conditions and most importantly temperature curves/heat patterns of the component when functional. In this case study, we present the data acquisition methodology towards thermal modeling of as-built automotive parts into their virtual prototypes. We explain the imaging-based reverse engineering pipeline suitable for our application towards recovering 3D geometry and identify tools for measuring temperature curves in the data collection process. Further, we show results of immersing the reverse engineered mesh in the thermal simulation environment and verify the finite-element based simulation results to agree with the thermal image sequences. Our experimental results based on automobiles are able to address the issue of thermal modeling and verification of virtual vehicle components even when the computer aided design (CAD) models are not available. Finally, we conclude our investigation with the experimental achievability and limitations of imaging-based thermal modeling of vehicle components.
  • 3D Video Generation and Service based on a TOF Depth Sensor in MPEG-4 Multimedia Framework
    • Description: In this paper, we present a new method to generate and serve 3D video represented by video-plus-depth using a time-of-flight (TOF) depth sensor. In practice, depth images captured by the depth sensor have critical problems, such as optical noise, unmatched boundaries with their corresponding color images, and depth flickering artifacts in the temporal domain. In this work, we enhance the noisy depth images by performing a series of processing steps including joint bilateral filtering with inner-edge selection, outer-boundary refinement by a robust image matting method, and temporal consistency based on motion estimation. Thereafter, the generated high-quality video-plus-depth is combined with computer graphics models in the MPEG-4 multimedia framework. Finally, the immersive video content is streamed to consumers to enjoy 3D view. Experimental results show that our method can minimize the inherent problems of depth images significantly and serve 3D video successfully in the MPEG-4 multimedia framework.
  • Fusing continuous spectral images for face recognition under indoor and outdoor illuminants
    • Description: Novel image fusion approaches, including physics-based weighted fusion, illumination adjustment, and rank-based decision level fusion, for spectral face images are proposed for improving face recognition performance compared to conventional images. A new multispectral imaging system is briefly presented which can acquire continuous spectral face images, as a proof of concept, with fine spectral resolution in the visible spectrum. Several experiments are designed and validated by calculating the cumulative match characteristics of probe sets via the well-known recognition engine FaceIt. Experimental results demonstrate that the proposed fusion methods outperform conventional images when gallery and probes are acquired under different illuminations and with different time lapses. In the case where probe images are acquired outdoors under different daylight situations, the fused images outperform conventional images by up to 78%.
  • Camera handoff with adaptive resource management for multi-camera multi-object tracking
    • Description: Camera handoff is a crucial step to obtain a continuously tracked and consistently labeled trajectory of the object of interest in multi-camera surveillance systems. Most existing camera handoff algorithms concentrate on data association, namely consistent labeling, where images of the same object are identified across different cameras. However, many questions in developing an efficient camera handoff algorithm remain unsolved. In this paper, we first design a trackability measure to quantitatively evaluate the effectiveness of object tracking, so that camera handoff can be triggered in a timely manner and the camera to which the object of interest is transferred can be selected optimally. Three components are considered: resolution, distance to the edge of the camera’s field of view (FOV), and occlusion. In addition, most existing real-time object tracking systems see a decrease in frame rate as the number of tracked objects increases. To address this issue, our handoff algorithm employs an adaptive resource management mechanism that dynamically allocates cameras’ resources to multiple objects with different priorities so that the required minimum frame rate is maintained. Experimental results illustrate that the proposed camera handoff algorithm improves the overall tracking rate by 20% in comparison with the algorithm presented by Khan and Shah.
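One plausible way to combine the three named components into a single trackability score is a weighted sum over inputs normalized to [0, 1]; the weights and the linear form here are assumptions, not the paper's actual measure:

```python
def trackability(resolution, edge_dist, occlusion, w=(0.4, 0.4, 0.2)):
    """Illustrative trackability score in [0, 1] from the three components
    the paper names: resolution, normalized distance to the FOV edge, and
    occlusion fraction. Weights w are hypothetical; handoff would be
    triggered when the score drops below some threshold."""
    return w[0] * resolution + w[1] * edge_dist + w[2] * (1.0 - occlusion)
```

Under this sketch, an object near the FOV edge or heavily occluded scores low, signaling that handoff to a better-placed camera should be triggered.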
  • A new method for the registration of three-dimensional point-sets: The Gaussian Fields framework
    • Description: In this paper, we present a 3D automatic registration method based on Gaussian Fields and energy minimization. A continuously differentiable energy function is defined, which is convex in a large neighborhood of the alignment parameters. We show that the size of the region of convergence can be significantly extended, reducing the need for close initialization and overcoming the local convergence problems of standard Iterative Closest Point (ICP) algorithms. Moreover, the Gaussian criterion can be applied with linear computational complexity using Fast Gauss Transform methods. Experimental evaluation of the technique using synthetic and real datasets demonstrates the usefulness as well as the limits of the approach.
  • Multi-camera Positioning for Automated Tracking Systems in Dynamic Environments
    • Description: Most existing camera placement algorithms focus on coverage and/or visibility analysis, which ensures that the object of interest is visible in the camera’s field of view (FOV). According to recent literature, handoff safety margin is introduced to sensor planning so that sufficient overlapped FOVs among adjacent cameras are reserved for successful and smooth target transition. In this paper, we investigate the sensor planning problem when considering the dynamic interactions between moving targets and observing cameras. The probability of camera overload is explored to model the aforementioned interactions. The introduction of the probability of camera overload also considers the limitation that a given camera can simultaneously monitor or track a fixed number of targets and incorporates the target’s dynamics into sensor planning. The resulting camera placement not only achieves the optimal balance between coverage and handoff success rate but also maintains the optimal balance in environments with various target densities. The proposed camera placement method is compared with a reference algorithm by Erdem and Sclaroff. Consistently improved handoff success rate is illustrated via experiments using typical office floor plans with various target densities.
  • Spatial and Temporal Enhancement of Depth Images Captured by a Time-of-flight Depth Sensor
    • Description: In this paper, we present a new method to enhance depth images captured by a time-of-flight (TOF) depth sensor spatially and temporally. In practice, depth images obtained from TOF depth sensors have critical problems, such as optical noise, unmatched boundaries, and temporal inconsistency. In this work, we improve depth quality by performing a newly designed joint bilateral filtering, color segmentation based boundary refinement, and motion estimation based temporal consistency. Experimental results show that the proposed method significantly minimizes the inherent problems of the depth images so that we can use them to generate a dynamic and realistic 3D scene.
  • A Reliability Assessment Paradigm for Automated Video Tracking Systems
    • Description: Most existing performance evaluation methods concentrate on defining separate metrics over a wide range of conditions and generating standard benchmarking video sequences for examining the effectiveness of video tracking systems. In other words, these methods attempt to design a robustness margin or factor for the system. They are deterministic: a robustness factor of, for example, 2 or 3 times the expected number of subjects to track or the strength of illumination is built into the design. This often results in overdesign, thus increasing costs, or underdesign, causing failure through unanticipated factors. In order to overcome these limitations, we propose in this paper an alternative framework to analyze the physics of the failure process and, through the concept of reliability, determine the time to failure in automated video tracking systems. The benefit of our proposed framework is that we can provide a unified and statistical index to evaluate the performance of an automated video tracking system for a task to be performed. At the same time, the uncertainty about a failure process, which may be caused by the system’s complexity, imprecise measurements of the relevant physical constants and variables, or the indeterminate nature of future events, can be addressed accordingly based on our proposed framework.
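As an illustration of a reliability-style index, the standard Weibull reliability function gives the probability that a system has not yet failed by time t. The Weibull form and its parameters are assumptions here, not the failure model the paper derives:

```python
import math

def weibull_reliability(t, eta, beta):
    """Weibull reliability R(t) = exp(-(t/eta)**beta): probability the
    tracker survives past time t. eta is the characteristic life, beta the
    shape parameter; both are illustrative, not values from the paper."""
    return math.exp(-((t / eta) ** beta))
```

Such an index is independent of any particular benchmark sequence: it summarizes the failure process statistically rather than scoring performance on one input.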
  • Camera handoff and placement for automated tracking systems with multiple omnidirectional cameras
    • Description: […] In this paper, we design an observation measure to quantitatively formulate the effectiveness of object tracking, so that we can trigger camera handoff in a timely manner and select the next camera appropriately before the tracked object falls out of the field of view (FOV) of the currently observing camera. In the meantime, we present a novel solution to the consistent labeling problem in omnidirectional cameras. […] Experiments show that our proposed observation measure quantitatively formulates the effectiveness of tracking, so that camera handoff can smoothly transfer objects of interest. Meanwhile, our proposed consistent labeling approach performs as accurately as the geometry-based approach without tedious calibration processes and outperforms Calderara’s homography-based approach. Our proposed camera placement method exhibits a significant increase in camera handoff success rate at the cost of slightly decreased coverage, as compared to Erdem and Sclaroff’s method, which does not consider the requirement for overlapped FOVs.
