Automatic Segmentation and Measurement on Knee Computerized Tomography Images for Patellar Dislocation Diagnosis

Traditionally, for diagnosing patellar dislocation, clinicians make manual geometric measurements on computerized tomography (CT) images taken in the knee area, a process that is often complex and error-prone. Therefore, we develop a prototype CAD system for automatic measurement and diagnosis. We first segment the patella and the femur regions on the CT images and then automatically measure two geometric quantities, patellar tilt angle (PTA) and patellar lateral shift (PLS), on the segmentation results, which are finally used to assist in diagnosis. Experiments validate the proposed quantities and demonstrate the effectiveness of the proposed algorithms.


Introduction
Patellar dislocation occurs when the patella slips out from the patellar surface of the femur. It is a common knee injury that may happen when people, especially teenagers and athletes, do vigorous physical exercise, e.g., playing basketball or football. To help the diagnosis, computerized tomography (CT) images are often taken at the knee area. On the knee CT images, clinicians usually make manual measurements and make a diagnosis according to the measured results. The manual measurement is often complex, tedious, and error-prone. Therefore, a fully automatic, computer-based approach is highly desirable.
Computed tomography has been widely used to diagnose knee joint pathologies. Correspondingly, knee CT images have been automatically or semiautomatically processed and analyzed (e.g., [1][2][3][4][5]) for computer-aided diagnosis. Subburaj et al. [1] proposed a computer-graphics-based method to automatically localize and label anatomical landmarks on the 3D bone model reconstructed from knee CT images of a patient. Krcah et al. [2] proposed to segment the femur in 3D CT volumes based on graph cuts and a bone boundary enhancement filter. Jang et al. [3] compared and validated various segmentation algorithms to segment the knee CT images and construct a corresponding 3D model. Wu et al. [4] proposed to segment multiple bones around the knee joint with severe pathologies to help patient-specific orthopedic knee surgery planning. Mezlini et al. [5] proposed to measure the knee joint space based on semiautomatic CT image segmentation for the monitoring of osteoarthritis progression. However, to the best of our knowledge, no efforts have been published specifically for automatic measurement on knee CT images for the purpose of patellar dislocation diagnosis.
The major contributions of our work reside in the following aspects. Firstly, we propose two quantities, patellar tilt angle (PTA) and patellar lateral shift (PLS), to measure on the knee CT images. Secondly, in order to make the automatic measurement, we propose computing algorithms to segment the patella and femur regions in the CT images and measure the proposed quantities on the segmented regions. Finally, we conduct experiments to verify the validity and effectiveness of the measured results for computer-aided diagnosis (CAD). Note that a preliminary version of our work has been published in reference [6].
Extending the preliminary work [6], we utilize the correlation between adjacent CT images by bone region prediction for bone region segmentation and make more complete experimental validation in terms of accuracy of measurement and applicability for CAD in this work.

Scheme Overview
The proposed scheme takes a specific portion of knee CT images as input and conducts a complete and automatic process of bone region segmentation and geometric measurement.

Input Images.
The source CT images for a patient are acquired by scanning the middle part of his or her leg. While being scanned, the patient may move his or her leg naturally through a range of knee angles, resulting in multiple sequences of CT images sampled at a preset temporal frequency. For each image sequence acquired at a time instance, we only use a portion that corresponds to cross sections through the femur and the patella. As an example, the anatomical structure of the middle part of a leg is shown in Figure 1(a) with the femur, patella, and tibia labelled. As shown in Figure 1(a), the portion of CT images that we use corresponds to the cross sections between the two planes marked with blue parallelograms. Examples of the CT images in this portion are shown in Figure 1(b), and the ideal segmentation result for a CT image is shown in Figure 1(c), where the femur and the patella regions are neatly segmented and the narrow gap between them corresponds to the sutura.
We presume that the input CT images are ordered such that images of higher scanning positions on the leg go earlier in the sequence. We use Figure 2 to illustrate the automatic segmentation and measurement process. For an input CT image sequence as shown in Figure 2(a), we first segment the femur and the patella regions on each image to get the result shown in Figure 2(b). Based on the profiles of the segmented regions, we use least squares fitting to find the central planes of the femur and the patella, respectively, as shown in Figure 2(c). Finally, we quantify the geometric relationship between the two central planes by PTA and PLS, which provide the basis for patellar dislocation diagnosis.

Segmentation of Femur and Patella Regions
Segmenting two solid bone regions corresponding to the femur and the patella, respectively, in each knee CT image is a key step in our scheme. It is a challenging task due to the following characteristics of knee CT images (as illustrated by Figure 3): (1) a single CT image usually contains responses of bones and other tissues (e.g., soft tissues) simultaneously and is contaminated with noise; (2) the patella and the femur regions may be very close to each other (e.g., only a couple of pixels apart or even locally fused) in many CT images; and (3) different parts (e.g., cortical bone tissue, spongy bone tissue, bone marrow, and bone cavity) inside a bone usually have different radiological densities, leading to highly variant gray levels of pixels in one bone region.
The generic problem of image segmentation has been researched for decades. For a survey of the early algorithms, we refer to references [7,8]. Later on, with the development of medical imaging technology, intensive and specific efforts have been made to segment various types of medical images. The existing medical image segmentation algorithms can be classified as threshold-based methods [2], region-based methods [9][10][11], edge-based methods [12], active-contour-model-based methods [13][14][15][16][17][18][19][20][21][22], hybrid methods [23][24][25][26], and others [27][28][29][30][31][32].
Among the various methods, the active-contour-model-based ones appear more advantageous to us. Relatively speaking, they handle structures with high topological complexity well and achieve subpixel accuracy and robustness against noise. In addition, they integrate easily with other segmentation techniques and facilitate intuitive interaction [33,34]. In particular, we choose the Chan-Vese (C-V) region-based active contour model [17] for our knee CT image segmentation, as it is in general less sensitive to initialization and noise than many other methods [13][14][15][16][17][18][19] of its category. Further, according to our experiments (see Section 5), it yields better segmentation results than the other selected active-contour-model-based methods [18,21,22] when used with our proposed framework.
The existing image or medical image segmentation algorithms usually assume that the pixels inside a meaningful region have highly uniform levels of intensity. In addition, the contrast between meaningful and nonmeaningful regions and the noise level in an image also influence the segmentation results. These algorithms cannot be directly applied for our purpose due to the highly challenging characteristics of the knee CT images, as described earlier in the text. Therefore, we propose to improve the quality of the knee CT images first, by increasing the uniformity of pixel intensities, enhancing the contrast, and suppressing the noise, before making the final segmentation. Specifically, we process the CT images in an input sequence one by one in spatial order. For each CT image, we (1) enhance its contrast to increase the gray levels of the bone tissue pixels and decrease those of the soft tissue and noise pixels, (2) predict the bone and sutura regions in it and modify its pixel values accordingly, utilizing the segmentation result of the previously processed CT image, if any, and (3) employ the C-V region-based active contour method to make the final segmentation on the modified image.
Details of these steps are given in the following sections.

Contrast Enhancement.
The common global contrast enhancement method based on histogram manipulation does not work well for our case. The reason is that the pixels' gray levels concentrate around very low and very high values (see Figure 4), leaving little room for contrast enhancement. Instead, we propose a contrast enhancement method based on local characteristics around each pixel. Observing that higher gray levels correspond to bone tissues and lower gray levels to soft tissues and probably noise, we increase the gray level of a pixel with a brighter neighborhood and decrease that of a pixel with a darker neighborhood.
Specifically, we perform a nonlinear scaling of each pixel's gray level according to its neighboring pixels' gray levels [6]. For a pixel, $p_0$, we denote its gray level as $g_0$ and the gray levels of all the other pixels in $p_0$'s 3 × 3 neighborhood as $g_i$ ($i \in \{1, 2, \ldots, 8\}$). Assuming that the maximum gray level is 255, we update $g_0$ to $g_0'$ according to equation (1). We find by experiments that the above process, when iterated for two or three passes, yields good results.
We show the effect of contrast enhancement on an example CT image in Figure 5. However, we find that the intensity of some pixels in the sutura region is undesirably increased at the same time, reducing the gap between the two bone regions and adding to the difficulty of bone region segmentation. This issue is addressed by the proposed bone region prediction technique, as described in the following section.

Prediction of Bone Regions.
A narrow and vague gap between bone regions and inhomogeneous pixel intensity within bone regions are limiting factors for accurate bone region segmentation. In order to address these issues, we propose a bone region prediction process that further improves the CT image quality to facilitate accurate segmentation, as detailed below.
On any input CT image sequence used in our experiments, we observe two facts: firstly, the femur and the patella regions are relatively small and wide apart and contain highly homogeneous pixel intensity in the initial CT images of the sequence; secondly, the shape and the position of a bone's profile vary only slightly between two adjacent CT images in the sequence.
The former implies that we may apply a prevalent image segmentation algorithm on the first CT image (after contrast enhancement) in the sequence to obtain a good result, while the latter implies that a good segmentation result on a CT image may be utilized to predict the bone and sutura regions in the next image to be segmented.
Assume that we are currently processing the $(n+1)$-th ($n \geq 0$) original CT image, $I^{n+1}_{ori}$, in the sequence. After the contrast enhancement, we obtain $I^{n+1}_{enh}$. If $n = 0$, we simply use $I^{n+1}_{enh}$ as the modified image, $I^{n+1}_{mod}$, which is to be segmented. Otherwise, we already have the segmentation result for the $n$-th image, which is a binary image, $I^{n}_{seg}$, with "255"-pixels for the bone regions and "0"-pixels for the background. Using $I^{n}_{seg}$, we improve the quality of $I^{n+1}_{ori}$ by a proposed process of bone region prediction to obtain $I^{n+1}_{mod}$, as detailed below.
Firstly, based on $I^{n}_{seg}$, we predict in $I^{n+1}_{ori}$ the sutura region, $Q_1$, and the local bone region, $Q_2$, around the sutura, enabling us to treat these local regions with special care in the following steps. Specifically, in $I^{n}_{seg}$, we morphologically dilate the femur region, $F^{n}$, and the patella region, $P^{n}$, to $F^{n}_{d}$ and $P^{n}_{d}$, respectively, by a disk with a radius of $r_{big}$ pixels. Empirically, we take $r_{big} \in [8, 12]$. Locations in $Q = F^{n}_{d} \cap P^{n}_{d}$ which correspond to nonbone pixels ("0"-pixels) in $I^{n}_{seg}$ form the predicted sutura region, $Q_1$, and $Q_2 = Q - Q_1$ gives the predicted local bone region around the sutura in $I^{n+1}_{ori}$. An example of the local region prediction is shown in Figure 6(a) with $Q_1$ and $Q_2$ colored in yellow and blue, respectively.
Secondly, we selectively revert pixels in $I^{n+1}_{enh}$. This is based on the observation that the contrast enhancement tends to narrow the gap between the two bone regions (see Figures 5(c) and 5(d)), adding to the difficulty of bone region segmentation.
Thirdly, in order to increase the bone regions' density homogeneity, we combine $I^{n+1}_{enh}$ with $I^{n}_{seg}$ to obtain $I^{n+1}_{mod}$ according to equation (2), where $\alpha$, $\beta$, and $th_a$ are parameters to control the degree of fusion. We empirically use $\alpha = 0.4$, $\beta = 0.6$, and $th_a = 0.5 \times 255$. In extreme cases, if $th_a = 0$, $I^{n+1}_{mod} = I^{n+1}_{enh}$, and if $th_a = 256$, $I^{n+1}_{mod} = \alpha I^{n}_{seg} + \beta I^{n+1}_{enh}$.
Fourthly, we reduce the intensities of the predicted local region, $Q$, around the sutura in $I^{n+1}_{mod}$, making a clearer separation of the two bone regions. This is achieved by selectively updating pixels in $I^{n+1}_{mod}$ according to equation (3), where $th_b$ is a threshold set to the mean gray level of all our test CT images, and we empirically use $\mu_1 = 0$ and $\mu_2 = 0.5$. By equation (3), we weaken pixels in $Q$ whose original intensities are below a threshold. If a pixel's original intensity is above the threshold, however, it is probably a bone pixel and we leave it untouched. Note that we weaken pixels in two subregions, $Q_1$ and $Q_2$, since the predicted sutura region, $Q_1$, is usually not completely precise and we choose to weaken selected pixels in a wider local area, i.e., $Q_1 + Q_2$.
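The fusion step can be illustrated with a short sketch. It is not the authors' implementation: the text above only fixes the two extreme cases ($th_a = 0$ and $th_a = 256$), so the per-pixel rule below, which blends wherever the enhanced gray level is below $th_a$, is an assumption chosen to be consistent with those cases. The function name `fuse_with_previous` is hypothetical.

```python
import numpy as np


def fuse_with_previous(i_enh, i_seg_prev, alpha=0.4, beta=0.6, th_a=0.5 * 255):
    """Blend the enhanced image with the previous binary segmentation.

    Assumption (not stated in the text): a pixel is blended when its
    enhanced gray level is below th_a; otherwise it keeps its enhanced value.
    """
    i_enh = np.asarray(i_enh, dtype=np.float64)
    i_seg_prev = np.asarray(i_seg_prev, dtype=np.float64)
    # Weighted combination of the previous segmentation and the enhanced image.
    blended = alpha * i_seg_prev + beta * i_enh
    # Apply the blend only to pixels below the threshold.
    return np.where(i_enh < th_a, blended, i_enh)
```

With this rule, `th_a=0` returns the enhanced image unchanged and `th_a=256` blends every pixel, matching the two extreme cases described in the text.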
Lastly, based on $I^{n}_{seg}$, we predict in $I^{n+1}_{mod}$ a thin layer, $B$, of pixels between the two bone regions when they get close to each other and set these pixels to "0" for further separation of the bone regions. Specifically, in $I^{n}_{seg}$, we morphologically dilate $F^{n}$ and $P^{n}$ by a disk with a radius of $r_{small}$ pixels to $F'^{n}_{d}$ and $P'^{n}_{d}$, respectively, and obtain $B = F'^{n}_{d} \cap P'^{n}_{d}$. Empirically, we take $r_{small} \in [4, 6]$. An example is shown in Figure 6(b) with $B$ colored in red. Depending on the shapes of and the distance between the two bone regions, there may be none, one, or multiple connected components in $B$.
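The construction of the local regions around the sutura can be sketched as follows. This is a minimal illustration under stated assumptions: the previous segmentation is given as two separate boolean masks (femur and patella), the disk dilation is implemented directly with NumPy shifts rather than with a particular morphology library, and the names `dilate_disk` and `predict_local_regions` are hypothetical.

```python
import numpy as np


def dilate_disk(mask, radius):
    """Binary dilation of a boolean mask by a disk of the given pixel radius."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx <= radius * radius:
                # OR in the mask shifted by (dy, dx).
                out |= padded[radius + dy:radius + dy + h,
                              radius + dx:radius + dx + w]
    return out


def predict_local_regions(femur_prev, patella_prev, r_big=12, r_small=5):
    """Return (Q1, Q2, B) predicted from the previous image's bone masks."""
    femur_prev = np.asarray(femur_prev, dtype=bool)
    patella_prev = np.asarray(patella_prev, dtype=bool)
    bone_prev = femur_prev | patella_prev
    # Q: overlap of the two bone regions after dilation by the large disk.
    q = dilate_disk(femur_prev, r_big) & dilate_disk(patella_prev, r_big)
    q1 = q & ~bone_prev   # predicted sutura region (nonbone pixels in Q)
    q2 = q & bone_prev    # predicted local bone region, Q - Q1
    # B: thin separating layer, from dilation by the small disk.
    b = dilate_disk(femur_prev, r_small) & dilate_disk(patella_prev, r_small)
    return q1, q2, b
```

The defaults `r_big=12` and `r_small=5` follow the values reported for Figure 6; any value in the empirical ranges given in the text could be used instead.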

C-V Region-Based Active Contour Segmentation.
After $I^{n+1}_{ori}$ ($n \geq 0$) is modified to $I^{n+1}_{mod}$, we employ the C-V model to segment $I^{n+1}_{mod}$. The C-V model was originally proposed by Chan and Vese [17] and is based on the following energy model:
$$F^{CV}(c_1, c_2, C) = \mu \cdot \mathrm{Length}(C) + \nu \cdot \mathrm{Area}(C_{in}) + \lambda_1 \int_{C_{in}} |I(x, y) - c_1|^2 \, dx \, dy + \lambda_2 \int_{C_{out}} |I(x, y) - c_2|^2 \, dx \, dy,$$
where $\lambda_1$, $\lambda_2$, $\mu$, and $\nu$ are constants, $C_{in}$ and $C_{out}$ represent the regions inside and outside contour $C$, respectively, and $c_1$ and $c_2$ correspond to the average pixel intensity in $C_{in}$ and $C_{out}$, respectively. The solution of optimal contour $C$ is reached by minimizing the energy function $F^{CV}(c_1, c_2, C)$, resulting in an optimal segmentation of the image $I$.
Figure 6: Predictions of various local regions around the sutura using a sample segmentation image. As shown in (a), the predicted sutura region ($Q_1$) and the local bone region around it ($Q_2$) are colored in yellow and blue, respectively. As shown in (b), the predicted thin pixel layer ($B$) between the bone regions is colored in red. An overlaid visualization of all these predicted local regions is given in (c). We use $r_{big} = 12$ and $r_{small} = 5$ when constructing these local regions.
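To make the roles of the terms in the C-V energy concrete, the sketch below evaluates the energy for a given binary partition of an image. It is only an illustration, not the level-set solver of [17]: it assumes the standard Chan-Vese form (a length penalty, an area penalty, and two squared-deviation fitting terms), approximates the contour length by counting 4-neighbour label changes, and uses the hypothetical name `chan_vese_energy`.

```python
import numpy as np


def chan_vese_energy(image, inside, lam1=1.0, lam2=1.0, mu=0.0, nu=0.0):
    """Evaluate a discrete Chan-Vese energy for a boolean 'inside' mask.

    Returns (energy, c1, c2), where c1 and c2 are the mean intensities
    inside and outside the contour.
    """
    image = np.asarray(image, dtype=np.float64)
    inside = np.asarray(inside, dtype=bool)
    outside = ~inside
    c1 = image[inside].mean() if inside.any() else 0.0
    c2 = image[outside].mean() if outside.any() else 0.0
    # Fitting terms: squared deviation from the region means.
    fit_in = ((image[inside] - c1) ** 2).sum()
    fit_out = ((image[outside] - c2) ** 2).sum()
    # Crude contour length: number of 4-neighbour pairs with different labels.
    length = ((inside[1:, :] != inside[:-1, :]).sum()
              + (inside[:, 1:] != inside[:, :-1]).sum())
    area = inside.sum()
    energy = mu * length + nu * area + lam1 * fit_in + lam2 * fit_out
    return float(energy), float(c1), float(c2)
```

A partition that matches the true two-level structure of an image has zero fitting energy, whereas a partition that cuts across it is penalized, which is why minimizing the energy drives the contour toward the bone boundaries.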
As an example, we show an original CT image in Figure 7(a), the image modified by the contrast enhancement in Figure 7(b), and the image modified further by the bone regions prediction in Figure 7(c). We observe in Figure 7(c) that the bone region prediction leads to improved intensity homogeneity of the bone regions, suppressed soft tissue intensities, and well cleared sutura between the bone regions. Applying C-V models on the three images, we obtain the corresponding segmentation results as shown in Figures 7(d), 7(e), and 7(f ), respectively. Comparing these three figures, we observe that the proposed contrast enhancement and bone regions prediction techniques lead to significantly improved segmentation results.

Automatic Measurement
In a segmented CT image, we expect to have two major regions with the right shapes, corresponding to the femur and the patella, respectively. In rare cases, more or fewer than two regions may be segmented on a CT image, and/or the shapes of the segmented bone regions may change drastically between CT images, mostly due to low CT image quality. These cases can be easily detected based on the number and the geometric properties (e.g., position and area) of the segmented regions. We simply discard these outlier cases and do not use them for measurement.
The CT images are acquired on parallel cross sections of the knee region, as shown in Figure 1. As such, we locate a few key points on each CT image on the boundaries of the femur region and the patella region, respectively, and then compute the central planes for the femur and the patella bones by optimally fitting those key points on the CT images.

Selection of Key Points.
For the femur region in each CT image, we select three points as the key points: the two central valley points along the boundary and the midpoint between the leftmost and the rightmost points. For the patella region in each CT image, we select three points as the key points: the two central peak points along the boundary and the midpoint between the leftmost and the rightmost points. These key points can be easily identified through boundary tracking and inflection point detection. This key point selection scheme is illustrated in Figure 8(a).

Plane Fitting.
The central plane of the femur (patella) bone is determined by optimally fitting a plane to the key points of the femur regions (patella regions) on the stack of CT images. In general, denoting the points as $p_i(x_i, y_i, z_i)$ ($i = 1, 2, \ldots, K$) and the plane equation as $z = ax + by + c$, the plane that optimally fits those points can be obtained by
$$\min_{a, b, c} \sum_{i=1}^{K} (a x_i + b y_i + c - z_i)^2,$$
which can be solved with the least squares method.
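The least squares plane fit can be sketched in a few lines. This is a minimal illustration assuming the key points are collected into a K × 3 array of (x, y, z) coordinates; the function name `fit_plane` is hypothetical, and `numpy.linalg.lstsq` is used as one convenient solver rather than the solver used in the original system.

```python
import numpy as np


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to a (K, 3) array of key points."""
    pts = np.asarray(points, dtype=np.float64)
    # Each row of the design matrix is [x_i, y_i, 1]; the target is z_i.
    design = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(design, pts[:, 2], rcond=None)
    a, b, c = coeffs
    return float(a), float(b), float(c)
```

At least three non-collinear key points are needed for a unique solution; with three key points per slice and many slices, the system is heavily overdetermined, which is what makes the fit robust to individual key point errors.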

PTA and PLS Measurement.
We measure the patellar tilt angle, θ, between the femur's and the patella's central planes, as illustrated in Figure 8(b). It is measured by the angle between the normals of the two bones' central planes. Further, we measure the patellar lateral shift, D, between a pair of parallel approximate central planes of the femur and the patella, as illustrated in Figure 8(c). For this purpose, we fit a pair of parallel planes to the femur regions' and the patella regions' key points, respectively, and measure the distance, D, between the planes. Assuming that the equations of the two parallel planes are $z = ax + by + c_1$ and $z = ax + by + c_2$, given the femur regions' key points as $p_i(x_i, y_i, z_i)$ ($i = 1, 2, \ldots, K$) and the patella regions' key points defined analogously, the parallel plane fitting is done by using the least squares method.
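Both measurements can be sketched directly from the plane equations. The code below is an illustrative sketch under stated assumptions: planes are represented as coefficient tuples (a, b, c) of z = ax + by + c, the tilt angle is taken as the acute angle between the normals (a, b, -1), the parallel planes are fitted jointly with shared (a, b) and separate offsets, and the names `tilt_angle_deg` and `fit_parallel_planes` are hypothetical.

```python
import math

import numpy as np


def tilt_angle_deg(plane_f, plane_p):
    """Acute angle in degrees between two planes given as (a, b, c)."""
    n1 = np.array([plane_f[0], plane_f[1], -1.0])
    n2 = np.array([plane_p[0], plane_p[1], -1.0])
    cos_t = abs(n1 @ n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))
    return math.degrees(math.acos(min(1.0, cos_t)))


def fit_parallel_planes(femur_pts, patella_pts):
    """Fit z = a*x + b*y + c1 (femur) and z = a*x + b*y + c2 (patella).

    Returns (a, b, c1, c2, distance), where distance is the perpendicular
    distance between the two parallel planes.
    """
    f = np.asarray(femur_pts, dtype=np.float64)
    p = np.asarray(patella_pts, dtype=np.float64)
    # Unknowns are ordered as [a, b, c1, c2].
    rows_f = np.column_stack([f[:, 0], f[:, 1], np.ones(len(f)), np.zeros(len(f))])
    rows_p = np.column_stack([p[:, 0], p[:, 1], np.zeros(len(p)), np.ones(len(p))])
    design = np.vstack([rows_f, rows_p])
    target = np.concatenate([f[:, 2], p[:, 2]])
    (a, b, c1, c2), *_ = np.linalg.lstsq(design, target, rcond=None)
    distance = abs(c1 - c2) / math.sqrt(a * a + b * b + 1.0)
    return float(a), float(b), float(c1), float(c2), float(distance)
```

The distance formula follows from writing both planes as ax + by - z + c = 0, so the gap along the common normal is |c1 - c2| divided by the normal's length. The result is in the units of the key point coordinates and would still need the pixel-to-millimeter conversion described in the experiments.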

Results and Discussion
In the experiments, we conduct automatic segmentation and measurement on our dataset of knee CT images using the proposed scheme and validate the results of both the segmentation and the measurement.

Dataset.
Our dataset is composed of fifteen patients' knee CT images that were acquired using the Toshiba Aquilion ONE CT scanner in the affiliated hospital of Shandong University of TCM. Among the fifteen patients, ten are female and five are male. While being scanned, each patient was asked to move her/his legs freely from 0° to about 90°, and 22 CT image sequences were sampled, one at each of 22 time instances during the scanning process. Each CT image sequence includes 320 images, 70 of which, corresponding to the upper part of the leg (ref. Figure 1(a)), are used as the input to our system. The CT scanner is set up such that the thickness of each slice and the interval between two adjacent slices are both 0.5 mm, the default window width is 30 HU, the window level is 320 HU, and every CT image has a resolution of 512 × 512.

Validation of Bone Region Segmentation.
In this section, we validate the bone region segmentation results both visually and quantitatively. In order to validate our choice of the C-V model [17], we compare with the following benchmark methods for image segmentation: the bias-corrected fuzzy c-means method (BCFCM) proposed by Mohamed et al. [35], the updated region-based active contour method using region-scalable fitting (RSF) energy function proposed by Li et al. [18], the level set method with bias field (LSEBFE) proposed by Li et al. [21], and the active contours driven by local image fitting energy (LIF) proposed by Zhang et al. [22]. For each image segmentation method, we run it both without and with our proposed framework, meaning that we run it both directly on the original CT images and on the CT images after modification with the approach proposed in Sections 3.1 and 3.2.
BCFCM modifies the objective function of the standard fuzzy c-means (FCM) algorithm to compensate for intensity inhomogeneities and allows the labeling of a pixel (voxel) to be influenced by the labels in its immediate neighborhood, which leads to better segmentation results than the standard FCM. RSF is a modified region-based active contour model using local intensity information at a controllable scale, which can preserve local details better and has higher robustness to intensity inhomogeneity. Note that BCFCM and RSF have been widely used in medical image segmentation. LSEBFE is a region-based level set method with bias field. It derives a local intensity clustering property of the image intensities and defines a local clustering criterion function, which is integrated with respect to the neighborhood center to give a global criterion of image segmentation. This criterion defines an energy in terms of the level set functions and a bias field that accounts for the intensity inhomogeneity of the image. It is more robust to initialization, faster, and more accurate than the well-known piecewise smooth model. LIF is a region-based active contour model that embeds the image's local information. It uses Gaussian filtering for the variational level set to regularize the level set function. It not only ensures the smoothness of the level set function but also eliminates the requirement of reinitialization. Both LSEBFE and LIF are proposed to segment images with intensity inhomogeneities.

Visual Validation.
In this section, we present the segmentation results of C-V, BCFCM, RSF, LSEBFE, and LIF on two representative challenging CT images, as shown in Figure 9. In the first image (in Figure 9(a)), the two bone regions are very close to each other while in the second image (in Figure 9(m)), there is more significant noise and weaker bone boundary response. Besides, both images have a high level of intensity inhomogeneity. In Figure 9, the first column shows the original CT images and the ground truth of their segmentations provided by experienced clinicians (i.e., Jiushan Yang, Shaoshan Wang, and Ruiqi Zou), and the following columns show the segmentation results by the five image segmentation methods, respectively. Further, the segmentation results in the first and the third rows are obtained with our framework (i.e., CT image modification followed by image segmentation) while those in the second and the fourth rows are obtained without our framework (i.e., they are obtained by direct image segmentation).
Comparing the segmentation results with and without our framework in Figure 9, we observe that for any of the image segmentation methods, our proposed framework improves the performance by a large margin, leading to more neatly and accurately segmented femur and patella regions. This also demonstrates the robustness of the proposed framework to soft tissue responses, intensity inhomogeneities, and noise in the CT images. Comparing the segmentation results of all the five image segmentation methods with our framework, we observe that the C-V method is the most advantageous in terms of accuracy and smoothness of segmented bone boundaries, confirming our choice of the C-V method in the proposed scheme.

Quantitative Validation.
For the quantitative validation, we randomly choose the automatic segmentation results on 30 CT images from each leg of each patient's dataset and also manually mark the bone region segmentation on each chosen CT image, which is used as the ground-truth reference. Similar to Yao et al. [36], we use three metrics, i.e., overlap rate (OLR), false-positive rate (FPR), and Dice similarity coefficient (Dice), to quantitatively validate the segmentation accuracy. On a CT image, if we denote the automatic and the ground-truth segmentations of a bone region as $R_a$ and $R_g$, respectively, these metrics are defined as $\mathrm{OLR} = |R_a \cap R_g| / |R_g| \times 100\%$, $\mathrm{FPR} = |R_a - R_g| / |R_a| \times 100\%$, and $\mathrm{Dice} = 2 \times |R_a \cap R_g| / (|R_a| + |R_g|)$. In addition, we measure the separation rate, SR, of the patella and the femur regions in the segmentation result. If they are completely separated, we set SR = 100%; otherwise, SR = 0%.
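The three overlap metrics can be computed directly from a pair of binary masks. The following is a minimal sketch assuming both masks are nonempty boolean arrays of the same shape; the function name `segmentation_metrics` is hypothetical.

```python
import numpy as np


def segmentation_metrics(auto_mask, truth_mask):
    """Return (OLR in %, FPR in %, Dice) for an automatic vs. ground-truth mask.

    Assumes both masks contain at least one foreground pixel.
    """
    r_a = np.asarray(auto_mask, dtype=bool)
    r_g = np.asarray(truth_mask, dtype=bool)
    inter = np.logical_and(r_a, r_g).sum()
    # Overlap rate: share of the ground truth covered by the automatic result.
    olr = 100.0 * inter / r_g.sum()
    # False-positive rate: share of the automatic result outside the ground truth.
    fpr = 100.0 * np.logical_and(r_a, ~r_g).sum() / r_a.sum()
    # Dice coefficient: twice the overlap over the sum of the two region sizes.
    dice = 2.0 * inter / (r_a.sum() + r_g.sum())
    return float(olr), float(fpr), float(dice)
```

For example, if the ground truth covers four pixels and the automatic result covers four pixels of which three coincide, the sketch yields OLR = 75%, FPR = 25%, and Dice = 0.75.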
In Table 1, we show the mean and standard deviation statistics of OLR, FPR, and Dice and the mean statistics of SR for all the five image segmentation methods (C-V, BCFCM, RSF, LSEBFE, and LIF) with and without our framework (i.e., CT image modification followed by binary image segmentation) on all the 30 × 2 × 15 test CT images. Here, OLR, FPR, and Dice metrics are computed by treating the patella and the femur regions as one united bone region in each CT image.
From Table 1, we observe that (1) for any of the five methods, its mean OLR, mean Dice, and mean SR values are all increased and its mean FPR value is decreased when our framework is used, showing the effectiveness of our proposed CT image modification technique; (2) for any of the five methods, its mean SR value with our framework reaches 100%, showing the effectiveness of our proposed bone regions prediction technique in separating the two bone regions; and (3) when used with our framework, the C-V method yields the largest mean Dice, a SR value of 100%, the second largest mean OLR, and the third smallest mean FPR value and appears superior to the other methods considering all metrics overall.
In Table 2, we show the overlap rate and false-positive rate statistics for the femur and the patella regions on both legs of all the patients. For each bone region on each leg of each patient, we compute the two rates on all the 30 chosen CT images, average the rates over the 30 samples, and list the average in Table 2 where OLR F (OLR P ) and FPR F (FPR P ) mean the overlap rate and the false-positive rate of the femur (patella) region, respectively. Further, we compute the mean and the standard deviation on each of the OLR F , OLR P , FPR F , and FPR P statistics of each leg and place them at the bottom two rows in Table 2.
From Table 2, we observe that (1) for either leg and either bone region, the mean overlap rate is close to 95% and the mean false positive rate is close to 2%, showing the high accuracy of our bone regions segmentation scheme; (2) the standard deviations of the various rate statistics are all below or slightly above 3%, showing the stability and robustness of our bone regions segmentation scheme; and (3) rates of the same type on both legs are quite comparable, again confirming the stability and robustness of our bone regions segmentation scheme.
We show the Dice statistics for the femur and the patella regions on both legs of all the patients in Table 3. From Table 3, we see that all the Dice statistics are above or slightly below 0.96 and the standard deviations of the Dice coefficient are close to 0.02, further confirming the high accuracy, stability, and robustness of our bone region segmentation scheme.
Figure 9: Segmentation results by different image segmentation methods on two representative challenging CT images. The first column shows the original CT images and the ground truth of their segmentations, and the following columns show the segmentation results by C-V, BCFCM, RSF, LSEBFE, and LIF, respectively. The segmentation results in the first and the third rows are obtained with our proposed framework while those in the second and the fourth rows are obtained without our framework.

Measured Angles and Distances.
In this validation, we pick the CT image sequences taken at four random time instances, $T_1$, $T_2$, $T_3$, and $T_4$, for four randomly picked patients' left or right legs. For the CT image sequence at each time instance, we use our system to automatically measure the angle (i.e., the PTA), θ, and the distance (i.e., the PLS), D, between the two bones' central planes. Note that when θ > 15°, we do not measure D. For the purpose of comparison, we ask several radiologists to measure the same parameters on the CT images manually. We use degrees for the angle measurement and millimeters for the distance measurement. Note that for the automatic measurement, we have converted pixels to millimeters, knowing that one pixel corresponds to 0.95 millimeters in the acquired images. The corresponding statistics are given in Table 4, from which we see that there is very little difference between the automatically and the manually measured angles. Similarly, the automatically and the manually measured distances also closely match each other.

Diagnosis by Measured Results.
We further test the accuracy and reliability of using the automatically measured results for diagnosis. According to orthopedists, the angle, θ, between the femur and the patella bones' central planes provides the most important basis for patellar dislocation diagnosis. Thus, patellar dislocation may be straightforwardly diagnosed by comparing the measured angle against threshold values. Specifically, as an initial test, we set our system to automatically diagnose normal if θ ≤ 10°, patellar subluxation if 10° < θ < 30°, and patellar dislocation if θ ≥ 30°.
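The initial angle-based rule is simple enough to state as code. The sketch below encodes only the three thresholds given above; the function name `diagnose_by_angle` and the returned label strings are illustrative choices, not part of the original system.

```python
def diagnose_by_angle(theta_deg):
    """Map a measured patellar tilt angle (in degrees) to a diagnosis label."""
    if theta_deg <= 10.0:
        return "normal"
    if theta_deg < 30.0:
        return "patellar subluxation"
    return "patellar dislocation"
```

Because the rule is a hard threshold, measurements close to 10° can flip between labels under small measurement errors.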
For this test, we have the dataset of 30 legs from 15 patients. For each leg, we use all the 22 CT image sequences and the average results of the 22 image sequences to make the diagnosis. Among the 30 samples, 11 samples (36.7%) are diagnosed as normal, 16 samples (53.3%) as patellar subluxation, and 3 samples (10%) as patellar dislocation, based on the automatically measured angles and the above-described diagnosis rule. On the same set of data, our orthopedists made diagnoses as well through manual measurement and clinical analysis and diagnosed 10 samples (33.3%) as normal, 17 samples (56.7%) as patellar subluxation, and 3 samples (10%) as patellar dislocation. Comparing the automatic and the manual diagnosis results, we find that the error rates of the automatic diagnosis on normal, patellar subluxation, and patellar dislocation are 9.1%, 7.1%, and 0%, respectively. Further, we visualize the distribution of the automatically measured angles with respect to the orthopedists' manual diagnosis results in Figure 10. We see from Figure 10 that all the cases with θ ≥ 30° are diagnosed by the orthopedists as patellar dislocation, and the majority of the cases with θ ≤ 10° and 10° < θ < 30° are diagnosed by the orthopedists as normal and patellar subluxation, respectively. There is fuzziness only for a small portion of cases with θ closely around 10°.
As a refined test, we further investigate the effectiveness of using distance as an auxiliary means for the patellar dislocation diagnosis. We only focus on the samples with 5° < θ < 15°, as there is fuzziness for samples with θ around 10° in our initial test. For these samples, the distances between the two bones' central planes are automatically measured and their distribution with respect to the orthopedists' diagnosis results is plotted in Figure 11. From Figure 11, we see that a distance threshold of 4.5 mm will accurately separate the cases of normal and patellar subluxation, thus eliminating the errors of diagnosis in our initial test where only angles are used.
Figure 10: Distribution of automatically measured angles with respect to orthopedists' diagnosis: 0-normal, 1-patellar subluxation, and 2-patellar dislocation. Axes: orthopedists' diagnosis and measured angles.

Conclusions
In this work, we have developed a system for automatic segmentation and measurement on knee CT images. Firstly, on each CT image in an input sequence, we segment the femur and the patella regions; thereafter, we identify key points on the bone regions' boundaries and conduct optimal fitting to obtain the central planes of the two bones; finally, angles and distances between the central planes are measured, which can be used to assist doctors in patellar dislocation diagnosis.
Of the whole process, the biggest challenges lie in the bone region segmentation, due to the confusion from soft tissue responses and noise, inhomogeneity of bone region intensities, and close or even fused bone regions in the sutura area. To overcome these challenges, novel and effective methods are proposed to improve the quality of input CT images by enhancing the contrast of each CT image and predicting the bone regions in a CT image utilizing the coherence between adjacent CT images. The improved CT images are finally segmented using a region-based active contour method. The accuracy and robustness of the automatic segmentation and measurement results are validated in our experiments.
In the future, we will extend our system to measure more parameters as needed for the diagnosis. Furthermore, we will investigate reconstructing a 3D volume of the bones from the CT images and conduct measurements on this 3D volume with increased capability and flexibility.

Abbreviations
CT: Computerized tomography
CAD: Computer-aided diagnosis
PTA: Patellar tilt angle
PLS: Patellar lateral shift
C-V: Chan-Vese
BCFCM: Bias-corrected fuzzy c-means method
RSF: Region-scalable fitting
FCM: Fuzzy c-means
LSEBFE: Level set method with bias field
LIF: Local image fitting
OLR: Overlap rate
FPR: False-positive rate
Dice: Dice similarity coefficient
SR: Separation rate

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
A preliminary version of this work was presented at the 2013 IEEE International Conference on Image Processing (https://ieeexplore.ieee.org/document/6738233).

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Figure 11: Distribution of automatically measured distances with respect to orthopedists' diagnosis: 0-normal and 1-patellar subluxation. The distance corresponds to the cases with measured angles between 5° and 15°. Panel title: measured distance vs. orthopedists' diagnosis. Axes: orthopedists' diagnosis and measured distance (mm).