A Novel SFP Extracting/Tracking Framework for Rigid Microstructure Measurement

3D measurement and reconstruction of rigid microstructures rely on precise matching of structural feature points (SFPs) between microscopic images. However, most existing algorithms fail to extract and track feature points at the microscale because of the poor quality of microscopic images. This paper presents a novel framework for extracting and matching SFPs in microscopic images under the two stereo microscopic imaging modes of our system, namely, the fixed-positioning stereo mode and the continuous-rotational stereo mode. A 4-DOF (degree of freedom) micro visual measurement system is developed for 3D projective structural measurement of rigid microstructures using the SFPs obtained from microscopic images by the proposed framework. Under the fixed-positioning stereo mode, a similarity-pictorial structure algorithm is designed to preserve illumination invariance in SFP matching, while a method based on a particle filter with affine transformation is developed for accurate tracking of multiple SFPs in image sequences under the continuous-rotational stereo mode. The experimental results demonstrate that the problems of visual distortion, illumination variability, and irregular motion estimation in the micro visual measurement process can be effectively resolved by the proposed framework.


Introduction
Micro stereovision technology makes it possible to achieve 2D/3D information extraction and 3D reconstruction. It has been extensively used in micromanipulation, microassembly, microrobot navigation, bioengineering, and so forth. 3D reconstruction of small objects in microscopic images is challenging because of their small size. Existing methods such as structured-light-based 3D reconstruction [1, 2] and binocular stereo methods [3] are not directly suitable for micro applications and need to be modified. For rigid 3D microstructures, the accurate matching of structural feature points between microscopic images is one of the most important steps in micro stereovision computation. In this paper we define the structural feature points (SFPs) as those key points that can fully represent the rigid microstructure, namely, the intersection points of 3 planes on the micro 3D structure (Figure 1(a)). Although vision methods based on feature point tracking play an important role in normal-scale computer vision applications such as 3D reconstruction, segmentation, and recognition, little research has focused on microscale applications other than medical ones. This is due to the poor contrast and low quality of microscopic images, which result mainly from the imaging complexity of the optical microscope. The complex structure of the microscopic lens system [4] requires a variety of optical elements, which can introduce a wide array of image distortions.
In this paper we present a framework for extracting and matching SFPs in microscopic images under the fixed-positioning and continuous-rotational stereo imaging modes, respectively. We believe that the proposed framework can play an important role in 3D reconstruction based on microscopic images.
First, microscopic images have specific drawbacks such as blurred edges, geometrical distortions, serious dispersion, and disturbance by noise (especially noise caused by illumination changes), which make SFP detection and matching more difficult. As a result, many popular key point detecting and matching methods [5-9] fail to achieve in microscopic images (Figures 1(b)-1(d)) the accuracy they reach on normal-scale images. New methods suitable for SFP extraction and matching in microscopic images are therefore required. We develop a novel illumination-invariant method based on a similarity-pictorial structure algorithm (similarity-PS) to solve the SFP extracting and matching problem in the fixed-positioning stereo mode.
Second, existing feature point tracking algorithms for continuous image sequences are usually classified as methods based on template, motion parameters, and color patch [10]; none of them meets the needs of SFP tracking in microscale images. Representative work includes the following. Continuously adaptive mean shift (Camshift) [11] takes the color histogram as the object model for tracking in rich-color images, but it performs poorly when tracking feature points against a complicated background with areas of similar color. Yao and Chellappa first designed a probabilistic data association filter with an extended Kalman filter (EKF) [12] to estimate the rotational motion and continuously track the feature points across frames, which effectively resolved the occlusion problem. However, the real location must be predicted by probability analysis during arbitrary motion of the object.
Buchanan proposed combining local and global motion models with the Kanade-Lucas-Tomasi (KLT) tracker to accurately track multiple feature points of nonrigid moving objects [13]. However, this strategy fails if motion predictions cannot be made for the subsequences consisting of the initial frames. Kwon et al. presented a template tracking approach that applies the particle filtering (PF) algorithm on the affine group [14, 15]; it can accurately track a large single object, but it performs well only in single-template tracking. We extend their method to the multiple-point tracking case in the proposed monocular microvision system under the continuous-rotational stereo mode. Since the images of the measured rigid microstructure keep a fixed spatial relationship between global structures and local features (affine invariance) during rotation, the tracking problem is greatly simplified by the affine transformation and covariance descriptor in our framework.

Imaging Modes of Proposed Micro Stereoscopic Measurement System
We developed a 4-DOF micro stereoscopic measurement system for the measurement of 3D micro objects. The system consists of a 4-DOF stage, a fixed-mounted illumination source, and an SLM-based optical imaging system. A monocular tube microscope is used in the system to reduce the imaging complexity. In our system the stereoscopic imaging relationship can be realized in two modes.

(i) Fixed-Positioning Stereo Mode (Mode 1)
A rotation by a fixed angle is performed, followed by a tilting motion. Images are captured at the end of each movement. Problem: the changes in illumination direction cause large contrast and intensity changes between the microscopic images captured at different positions.

(ii) Continuous-Rotational Stereo Mode (Mode 2)
A tilting movement is performed first, followed by a continuous rotational movement. Image sequences are captured during the rotation. Problem: the motion blur caused by the continuous rotation degrades the quality of the microscopic image sequences.

Pictorial Structure Method
A PS model is represented by a collection of parts with spatial relationships between certain pairs of them. The PS model can be expressed by a graph $G = (V, E)$, where the vertices $V = \{v_1, v_2, v_3, \ldots, v_n\}$ correspond to the parts and $\{v_i, v_j\} \in E$ denotes the edge for each pair of connected parts $v_i$ and $v_j$. An object is expressed by a configuration $L = (l_1, l_2, \ldots, l_n)$, where $l_i$ represents the location of part $v_i$. For each part $v_i$, the appearance match cost function $a_i(I, l)$ represents how well the part matches the image $I$ when placed at location $l$. A simple pixel template matching is used for this cost function in [16]. The connections between the locations of parts define the structure match cost. The cost function $t_{ij}(v_i, v_j)$ represents how well the locations $l_i$ of $v_i$ and $l_j$ of $v_j$ agree with the object model. Therefore, the cost function $L^*$ for PS includes 2 parts (the appearance cost function and the structure cost function):

$$L^* = \sum_{i=1}^{n} a_i(I, l_i) + \sum_{\{v_i, v_j\} \in E} t_{ij}(v_i, v_j). \tag{2.1}$$

The best match of SFPs can be obtained by minimizing $L^*$.
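To make the minimization concrete, the following is a minimal sketch in Python of a brute-force search over candidate part locations. It is not the authors' implementation: the function names, the toy candidate lists, the appearance costs, and the distance-based structure cost are all hypothetical, and an exhaustive search is only practical for a handful of parts and candidates.

```python
import math
from itertools import product

def ps_cost(config, appearance, structure, edges):
    """Total PS cost: appearance cost of every part plus structure cost of every edge."""
    unary = sum(appearance[i][loc] for i, loc in enumerate(config))
    pairwise = sum(structure(i, j, config[i], config[j]) for i, j in edges)
    return unary + pairwise

def best_configuration(candidates, appearance, structure, edges):
    """Exhaustively search the candidate locations of all parts for the minimum cost."""
    return min(product(*candidates),
               key=lambda cfg: ps_cost(cfg, appearance, structure, edges))

# Toy model (all values hypothetical): two parts expected to lie 10 px apart.
candidates = [[(0, 0), (5, 5)], [(10, 0), (30, 30)]]
appearance = [{(0, 0): 0.1, (5, 5): 0.4}, {(10, 0): 0.2, (30, 30): 0.1}]
edges = [(0, 1)]

def structure(i, j, li, lj, expected=10.0):
    # Assumed structure cost: deviation of the inter-part distance from its expected length.
    return abs(math.dist(li, lj) - expected)

best = best_configuration(candidates, appearance, structure, edges)
```

In this toy case the second part's cheapest appearance match, at (30, 30), is rejected because it violates the expected spacing, which is the behavior the structure term is meant to provide.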

The Local Self-Similarity Descriptor
The self-similarity descriptor was proposed by Buchanan and Fitzgibbon [13]. Figure 2 illustrates the procedure for generating the self-similarity descriptor $d_q$ centered at $q$ within a large image, where $q$ is a pixel in the input image.
The green square region is a small image patch (typically 5 × 5 or 3 × 3) centered at $q$. The larger blue square region is a big image region (typically 30 × 30 or 40 × 40), also centered at $q$. First, the small image patch is compared with the larger image region using the sum of squared differences (SSD). For color images, a transformation to the CIE L*a*b* color space is needed. Second, the correlation surface is normalized to eliminate illumination influences. Finally, the normalized correlation surface is transformed into a "correlation surface" $S_q(x, y)$:

$$S_q(x, y) = \exp\left(-\frac{\mathrm{SSD}_q(x, y)}{\max\left(\mathrm{var}_{\mathrm{noise}}, \mathrm{var}_{\mathrm{auto}}(q)\right)}\right), \tag{2.2}$$

where $\mathrm{SSD}_q(x, y)$ is the normalized correlation surface and $\mathrm{var}_{\mathrm{noise}}$ is a constant that corresponds to acceptable photometric variations (in color, in illumination, or due to noise), which is 150 in the paper. $\mathrm{var}_{\mathrm{auto}}(q)$ is the maximal variance of the difference of all patches within a very small neighborhood of $q$ (of radius 1) relative to the patch centered at $q$. The correlation surface $S_q(x, y)$ is then transformed into log-polar coordinates centered at $q$ and partitioned into 20 × 4 bins ($m = 20$ angles, $n = 4$ radial intervals). We choose the maximal value in every bin (this helps the descriptor adapt to nonrigid deformation). All the maximal values form an $m \times n$ vector, which is the self-similarity descriptor centered at $q$. Finally, this descriptor vector is normalized to the range $[0, 1]$ by linearly stretching its values.
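The procedure can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not the implementation used in the paper: the function names are hypothetical, a 3 × 3 patch and a region of radius 6 are used to keep the example small, the separate normalization of the SSD surface is omitted, and var_auto is approximated by the largest SSD within the radius-1 neighborhood of q.

```python
import math

def patch_ssd(img, ay, ax, by, bx, r):
    """Sum of squared differences between the (2r+1)x(2r+1) patches centred at a and b."""
    return sum((img[ay + dy][ax + dx] - img[by + dy][bx + dx]) ** 2
               for dy in range(-r, r + 1) for dx in range(-r, r + 1))

def self_similarity(img, qy, qx, patch_r=1, region_r=6, n_ang=20, n_rad=4,
                    var_noise=150.0):
    """Log-polar self-similarity descriptor with n_ang * n_rad values in [0, 1]."""
    # Assumed stand-in for var_auto: largest SSD between the patch at q and
    # the patches centred within radius 1 of q.
    var_auto = max(patch_ssd(img, qy + dy, qx + dx, qy, qx, patch_r)
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    denom = max(var_noise, var_auto)
    bins = [0.0] * (n_ang * n_rad)
    for dy in range(-region_r, region_r + 1):
        for dx in range(-region_r, region_r + 1):
            rho = math.hypot(dy, dx)
            if rho == 0 or rho > region_r:
                continue
            # Correlation surface value S_q at this offset.
            s = math.exp(-patch_ssd(img, qy + dy, qx + dx, qy, qx, patch_r) / denom)
            # Log-polar bin indices (angle bin, log-radius bin).
            a = int((math.atan2(dy, dx) % (2 * math.pi)) / (2 * math.pi) * n_ang) % n_ang
            k = min(int(math.log(rho + 1) / math.log(region_r + 1) * n_rad), n_rad - 1)
            # Keep the maximal value in every bin.
            bins[a * n_rad + k] = max(bins[a * n_rad + k], s)
    lo, hi = min(bins), max(bins)
    # Linear stretch to [0, 1].
    return [(b - lo) / (hi - lo) if hi > lo else 0.0 for b in bins]

# Synthetic test image and a uniformly brightened copy (both hypothetical).
img = [[(7 * x + 13 * y) % 31 for x in range(20)] for y in range(20)]
brighter = [[v + 50 for v in row] for row in img]
d = self_similarity(img, 10, 10)
d_bright = self_similarity(brighter, 10, 10)
```

Because the descriptor is built from differences between patches of the same image, a uniform brightness offset leaves it unchanged in this sketch, which illustrates (for the additive case only) why the descriptor is attractive under varying illumination.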

Similarity-PS Algorithm
We introduce the local "self-similarity" descriptor-matching approach for SFP detection into a simplified PS model. This subsection covers the following 3 main steps.

Extraction of the "Template" Description T
Manually marked SFP coordinates on the microstructure are used for the training process. We describe the marked points by self-similarity descriptors. For every point, the descriptors of all its examples are averaged to obtain a trained descriptor, which acts as the appearance description for the model.

Dense Computation of the Local Self-Similarity Descriptor
In this paper, the descriptors $d_q$ are computed throughout the tested image $I$ at positions 2 pixels apart from each other. Higher precision can be obtained if the search covers every pixel.

Detection of Similar Descriptors of T within I
The region of interest in $I$ for an SFP is the one with the smallest weighted Euclidean distance between its self-similarity descriptor vector and the trained descriptor. The coordinates of all interest points in $I$ are recorded as the locations of the candidate key points. We usually find many candidate key points for one marked point. In our experiment 200 points are chosen (the number varies for different templates), and the Euclidean distances between the candidate key point descriptors and the trained descriptors are recorded as $a_i(I, l_j)$, linearly normalized to the range $[0, 1]$. The appearance cost function $a_i(I, l_j)$ is thereby determined. We substitute the obtained self-similarity model for the appearance model in (2.1); the best-matching SFPs for the PS model are then obtained by minimizing $L^*$. Since the microscopic images are not always of the same size, we achieve scale invariance by calculating the self-similarity descriptors at multiple scales on both the patterns and the testing images. Moreover, scale-invariant structures are obtained by multiplying all the mean and variance values of the structural distances between SFPs by a scale factor.
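The candidate selection step can be sketched as below. The function name, the dictionary layout (location mapped to descriptor), and the toy two-dimensional descriptors are assumptions made for illustration; the paper keeps about 200 candidates per marked point, whereas the example keeps 2.

```python
import math

def appearance_costs(trained, descriptors, weights=None, keep=200):
    """Keep the `keep` closest candidates and normalize their distances to [0, 1]."""
    weights = weights or [1.0] * len(trained)
    # Weighted Euclidean distance between each dense descriptor and the trained one.
    dists = {loc: math.sqrt(sum(w * (a - b) ** 2
                                for w, a, b in zip(weights, d, trained)))
             for loc, d in descriptors.items()}
    best = sorted(dists, key=dists.get)[:keep]
    lo, hi = dists[best[0]], dists[best[-1]]
    span = (hi - lo) or 1.0  # avoid division by zero when all distances coincide
    return {loc: (dists[loc] - lo) / span for loc in best}

# Toy data (hypothetical): three candidate locations with 2-D descriptors.
trained = [0.0, 1.0]
descriptors = {(0, 0): [0.0, 1.0], (2, 0): [1.0, 1.0], (4, 0): [0.0, 0.5]}
costs = appearance_costs(trained, descriptors, keep=2)
```

The returned dictionary plays the role of the appearance cost for one part: each retained candidate location is mapped to a cost between 0 (best match) and 1 (worst retained match).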

The Affine Motion of Tracking Points
The second form of 2.1 using the exponential map is for the convenience of denoting the affine transformation of the imaging process.
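As an illustration of how an exponential map turns a combination of basis elements into an affine transformation, the sketch below exponentiates 3 × 3 generators in homogeneous coordinates with a truncated Taylor series. The two generators and all function names are hypothetical choices for this example and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm(a, terms=30):
    """Matrix exponential of a 3x3 matrix by a truncated Taylor series."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        # term holds a^n / n! after this update.
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def scaled(basis, c):
    """Multiply a generator by a scalar coefficient."""
    return [[c * x for x in row] for row in basis]

# Hypothetical generators in homogeneous coordinates.
TRANSLATE_X = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
SCALE = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

shift = expm(scaled(TRANSLATE_X, 4.0))   # translation by 4 px along x
zoom = expm(scaled(SCALE, math.log(2)))  # isotropic scaling by a factor of 2
```

Working with generators keeps every sampled transformation a valid affine matrix, since the last row of the exponential stays (0, 0, 1) regardless of the coefficients.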

Tracking by Particle Filter with Affine Motion
State estimation and sampling for tracked points in each frame: the efficiency of the particle filter tracking algorithm relies mainly on importance sampling and on the calculation of the weights $w_t^{i,j}$. The measurement likelihood $p(y_t^i \mid X_t^i)$ is independent of the current state particles $X_{0:t}^{i,j}$. The measurement state equation can be expressed in the discrete setting as

where $v_t$ is a zero-mean Gaussian measurement noise, $a_i$ is the AR process parameter, and $\tau_t$ is zero-mean Gaussian noise. $\varepsilon = (1/12)^{1/2}$ corresponds to capturing 12 frames per second. The affine transformation basis elements $\xi_m$ ($m = 1, \ldots, 3$) correspond to translation, shearing, or scaling transformations of the templates.
It is necessary to approximate the particles $\{R_t^{i,1}, \ldots, R_t^{i,G}\}$ and resample them according to their weights at every time step. To simplify the computation and avoid direct calculation, all the resampled particles are expected to be quite similar to each other. Denoting by $M_t^i$ the arithmetic mean of $M_t^{i,j}$, the sample mean of $X_t^{i,j}$ can be approximated as

Calculation of the covariance descriptor for the tracked point templates: the spatial structure constraint vector $h(x)$ represents the distance and the angle between the origin (or polar axis) and each tracked pixel in the polar coordinate system. $I(X_t^i)$ denotes the image pixel intensity. $I_x$, $I_y$, $I_{xx}$, and $I_{yy}$ represent the first- and second-order image derivatives in the Cartesian coordinate system [14, 16]. $s$ is the size of the template window, and $\bar{h}$ is the mean value of $h(X_t^i)$. The covariance descriptor $S$ of the point template patches can be given as

(2.6)
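For orientation, a covariance descriptor of a template patch can be sketched as the sample covariance of per-pixel feature vectors. The feature set below (intensity plus first- and second-order central differences) is a reduced, assumed version of the vector described above, the 1/(N - 1) normalization is an assumption, and the function names and test image are hypothetical.

```python
def pixel_features(img, y, x):
    """Intensity and first- and second-order central differences at (y, x)."""
    ix = (img[y][x + 1] - img[y][x - 1]) / 2.0
    iy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    return [float(img[y][x]), ix, iy, float(ixx), float(iyy)]

def covariance_descriptor(features):
    """Sample covariance matrix of the feature vectors of one template patch."""
    n, d = len(features), len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    return [[sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in features) / (n - 1)
             for b in range(d)] for a in range(d)]

# Hypothetical 6x6 image I(y, x) = x^2 + 3y and a 4x4 template window inside it.
img = [[x * x + 3 * y for x in range(6)] for y in range(6)]
feats = [pixel_features(img, y, x) for y in range(1, 5) for x in range(1, 5)]
S = covariance_descriptor(feats)
```

For this synthetic image the vertical gradient and the second derivatives are constant over the window, so their variances in S are zero, while the horizontal gradient 2x varies and produces a nonzero entry.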
Measurement of relative distance: to ensure that the covariance descriptors change continuously, it is necessary to collect image covariance descriptors and calculate the principal eigenvector and the geodesic distance between the two group elements $\{S(X_t^i), \bar{S}\}$ and $\{I(X_t^i), \bar{I}\}$. The measurement function is defined using the distance-from-feature space and the distance-in-feature space for similarity comparison purposes [18]; the measurement equation in [19] can then be expressed more explicitly as

where $c_n$ is the projection coefficient and $\rho_n$ is the eigenvalue corresponding to the principal eigenvector (these two parameters are used to calculate the distance-in-feature space), and $\bar{I}$ represents the mean intensity of the point. The measurement likelihood is described as

where $R$ is the covariance of the zero-mean Gaussian noise $v_t$. Once the measurement likelihood $p(y_t^i \mid X_t^i)$ is obtained, we can calculate and normalize the importance weights for the tracked points and thereby realize the tracking of multiple SFPs in long microscopic sequences.
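The weighting and resampling steps can be illustrated with the sketch below. It assumes a scalar measurement distance per particle and a scalar noise variance r, so the likelihood takes the form exp(-d^2 / (2r)); systematic resampling is used as one common choice and is not stated in the paper. All names are hypothetical.

```python
import math
import random

def normalized_weights(distances, r=1.0):
    """Gaussian likelihood exp(-d^2 / (2 r)) per particle, normalized to sum to 1."""
    lik = [math.exp(-d * d / (2.0 * r)) for d in distances]
    total = sum(lik)
    return [l / total for l in lik]

def systematic_resample(weights, rng=random):
    """Return particle indices drawn by systematic resampling."""
    n = len(weights)
    start = rng.random() / n          # single random offset in [0, 1/n)
    indices, cum, i = [], weights[0], 0
    for k in range(n):
        u = start + k / n             # evenly spaced sampling positions
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        indices.append(i)
    return indices

# Hypothetical measurement distances of three particles for one tracked point.
weights = normalized_weights([0.0, 1.0, 5.0])
indices = systematic_resample(weights, random.Random(0))
```

Particles with small measurement distance receive most of the weight and are therefore duplicated by the resampling step, while particles far from the measurement tend to be dropped.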

Experimental Results and Discussion
The SFP detecting/matching experiments are carried out on a standard microchip with 3D microstructures. Some of the results of both the PS algorithm and the proposed method are shown in Figure 3, while a comparison of the accumulated error, measured by the Euclidean distance in pixels between the tracked points and the ground truth positions, is shown in Figure 4. The results show that our method improves the correct detection rate of SFPs for images under large illumination changes in Mode 1. The reason for this improvement is that the change in gray values is largely caused by the varying illumination, while the local structure remains static. Therefore, it is a good choice to introduce the local self-similarity descriptors into PS for our application.
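One plausible way to compute such an accumulated error is sketched below, assuming that the per-frame error is the sum of the Euclidean distances of all tracked points to their ground truth positions and that the curve is the running total over frames. The exact definition used for Figure 4 is not given in the text, and the function name and sample coordinates are hypothetical.

```python
import math
from itertools import accumulate

def accumulated_error(tracked, ground_truth):
    """Running total of per-frame Euclidean errors (in pixels) over a sequence."""
    per_frame = [sum(math.dist(p, g) for p, g in zip(pts, gts))
                 for pts, gts in zip(tracked, ground_truth)]
    return list(accumulate(per_frame))

# Hypothetical two-frame sequence with a single tracked point.
tracked = [[(0, 0)], [(3, 4)]]
ground_truth = [[(0, 0)], [(0, 0)]]
errors = accumulated_error(tracked, ground_truth)
```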
We also provide some SFP tracking results in Mode 2 obtained by our proposed method and by KLT for comparison, as shown in Figure 5. The tracking results of the proposed method show a higher localization accuracy of the SFPs in long micro image sequences.
To further demonstrate the effectiveness of our method, we present some of the 3D projective reconstruction results based on the tracking results in Figure 6. The results of the proposed method reconstruct the 3D structure from the microscopic sequences, while the KLT method failed.

Conclusions
This paper developed a framework for extracting and matching SFPs in microscopic images for 3D micro stereoscopic measurement in two stereo imaging modes. The proposed SFP tracking framework ensures illumination invariance in the fixed-positioning stereo mode and robustness in the continuous-rotational stereo mode. The effectiveness of our tracking framework has been empirically verified in visual 3D projective reconstruction with microscopic images. In Mode 2 of our system, there is an inevitable tracking error caused by motion blur; we therefore plan to use the method described in [20] to deal with this problem in future work. Our future research will also focus on micro visual measurement planning similar to the method described in [21] and on optimal 3D micro model representation [22].

Figure 1: Structural feature points on a rigid 3D microstructure (a) and the failed detecting and matching results of some popular methods (b), (c), and (d). These methods failed to locate the positions of the SFPs precisely.
$[\ldots, I_x, I_y, \tan^{-1}(I_x / I_y), I_{xx}, I_{yy}]^T$ is added in our method for minimizing the influence of illumination, distortion, motion blur, and noise interference. Here b

Figure 3: Tracking results in Mode 1 by the PS algorithm (left two images) and our proposed method (right two images). Each image pair corresponds to a different illumination condition.

Figure 5: The first and second columns show the accurate tracking results of the proposed PF with affine transformation, while the third and fourth columns show the tracking results of the KLT algorithm with lost or dislocated tracks.