Several segmentation methods are implemented and applied to segment the facial masseter tissue from magnetic resonance (MR) images. The common idea behind all methods is to exploit prior information from MR images of different individuals when segmenting a test MR image. Standard atlas-based segmentation methods and probabilistic segmentation methods based on Markov random fields use labeled prior information. In this study, a new approach is also proposed in which unlabeled prior information from a set of MR images is used to segment masseter tissue in a probabilistic framework. The proposed method requires only a seed point that indicates the target tissue and performs automatic segmentation of the selected tissue without using a labeled training set. The segmentation results of all methods are validated and compared, with particular attention to the influence of labeled or unlabeled prior information and of initialization. It is shown that, with appropriate modeling, labeled prior information is not needed. The best accuracy is obtained by the proposed approach, in which unlabeled prior information is used.
Recent advances in medical imaging have enabled the derivation of useful information about different body parts and tissues. Two major imaging modalities, computed tomography (CT) and magnetic resonance imaging (MRI), are commonly used as sources to extract anatomical structures. While CT is preferred for hard tissues such as bone, MR images are commonly used for evaluating the presence and extent of soft tissue volumes such as the brain and heart.
Nowadays, doctors and clinical specialists take advantage of imaging modalities to gather anatomical information about a patient and use this information in diagnosis and prognosis. A further step is to use artificial intelligence to automate parts of this process, in particular the segmentation of target tissues.
Currently, most of the automatic soft tissue segmentation methods in the literature consider tissues like brain, heart, and lung as target tissues and there are very few works about facial soft tissue (FST) (e.g., facial muscles) segmentation. Considering the key role of the face in human life and the huge increase in craniofacial surgeries around the world, FST segmentation has become more important in recent days. Planning before a facial surgery by performing the modifications virtually prior to the actual operation [
Soft tissue segmentation is complicated because soft tissues do not have a constant shape. It becomes even more complicated when soft tissues interfere with each other, which is always the case for FSTs. To address these problems, prior information is commonly used in various ways to improve segmentation quality.
By prior information, we mean the knowledge that we took from a set of individual MRI scans which can be considered as the training set and used to determine prior shapes and locations of the target tissues. This is quite like the method when a specialist doctor extracts the target tissue in a new image based on his/her past experience of viewing thousands of similar images. The standard method is to manually label the training data and construct an atlas from it [
So far, these methods have been used in the literature for soft tissues other than FSTs. We implemented representative examples of these methods and compared them on the segmentation of the masseter muscle.
Moreover, we propose a new segmentation approach that requires very little user interaction. Instead of manually labeled atlases, unlabeled training images are used as hidden atlases in order to evaluate the effect of unlabeled prior information. The main reason for using unlabeled prior information is that manually labeling tens of medical image data sets is a complicated, time-consuming, and error-prone task.
In our previous work [
The unlabeled prior knowledge was used in our MRF structure, and the segmentation results were optimized iteratively with the expectation maximization (EM) algorithm. Finally, we compared our method, which uses unlabeled prior information, with the previously mentioned methods, which use labeled prior information, and evaluated the advantages and disadvantages of all methods.
Although there are plenty of soft tissue segmentation methods in the literature, facial soft tissue (FST) segmentation has received relatively little attention. Given the visual similarity between FSTs and other soft tissues such as the brain, the segmentation process could in theory be the same. However, because FSTs have a more complicated and more interfering structure than other soft tissues, more precise and powerful segmentation methods are needed for FST segmentation.
Facial soft tissues are usually small and surrounded with other tissues that share the same intensity values [
Purely intensity-based segmentation and classification methods assign a label to each pixel in the image and require only the intensity information that is generated by the MR imaging device. However, in medical image segmentation, different anatomical structures may have the same intensity values or intensity distributions that cannot be distinguished from each other. In such cases, extra information should be considered and included in the segmentation process. Spatial information like neighborhood relationships between pixels can be very useful in segmenting individual tissues. In addition to geometrical constraints, relationships between several different but similar data sets can also be considered. The additional data that is used in a segmentation process is called the prior information. Soft tissue segmentation methods usually use prior information in a different manner to improve the segmentation accuracy. The prior information is included mostly in the form of single or multiple atlases. An atlas can be presented as a single manually segmented data (2D image or 3D voxel volume or 2D/3D sequences) or can be formed from multiple manually segmented data [
As the number of atlases fused increases, the average segmentation accuracy increases [
Atlases should be registered to the query data before the segmentation process. Segmentations in atlases are transformed to the query data and subsequently fused or combined. One way of atlas-based segmentation is to transform the atlas segments to the test data by using nearest-neighbor interpolation so that each atlas provides a discrete labeling for each voxel. The final label can then be decided by “majority vote” [
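The majority-vote fusion step can be illustrated with a minimal sketch. It assumes that the atlas label maps have already been registered and resampled to the test image grid and that labels are small non-negative integers; the function name `majority_vote` and the tie-breaking rule (the lowest label wins) are illustrative assumptions, not taken from the cited work.

```python
import numpy as np

def majority_vote(label_maps):
    """Fuse registered atlas label maps by per-pixel majority vote."""
    # Stack to shape (n_atlases, height, width).
    stack = np.stack([np.asarray(m, dtype=int) for m in label_maps])
    n_labels = int(stack.max()) + 1
    # Count, for every pixel, how many atlases voted for each label.
    votes = np.stack([(stack == k).sum(axis=0) for k in range(n_labels)])
    # argmax returns the first maximum, so ties go to the lowest label.
    return votes.argmax(axis=0)
```

With three atlases, a pixel labeled 1 by two of them and 0 by the third is assigned label 1.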
Another method to incorporate the atlas in the segmentation process is to use MRF (Markov random field) or HMRF (hidden Markov random field) models. MRF models are commonly used for unsupervised segmentation of medical data since smoothness constraint can easily be incorporated to the model by neighboring relations among the pixels to be segmented. The first studies of brain segmentation use the basic HMRF formulation where smoothness is defined based on the resemblance of the neighbors [
In [
However, the usual way of improving the MRF performance in segmentation is to use parametric model where the parameters are learned from the image usually by EM (expectation maximization) algorithm [
In [
A manually constructed probabilistic atlas is used in [
In the literature, using the atlas as the prior probability of the labels is the most common way to incorporate prior information into the segmentation. However, this requires manually segmented atlases to be prepared. In this study, we propose another way of incorporating prior information, in which no manually labeled atlas is required.
All methods mentioned above perform segmentation of soft tissues such as the brain, lungs, and heart. Very few studies have considered facial soft tissue (FST) segmentation in MR images.
In the literature, FST segmentation is mostly done for clinical purposes with manual or other simple segmentation methods where human interaction is required. Manual segmentation can also be combined with the help of segmentation tools as in [
Anatomical visualization is another application of FST segmentation. In [
Other than manual methods, there are some other automatic or semi-automatic methods studied for FST segmentation. The main problem with classification algorithms in FST segmentation is the presence of several tissue types in one MRI slice. These tissue types may be different in the corresponding slices among different individuals. Therefore, the segmentation results may be poor or too many manual interactions may be needed.
Ng et al. [
All these methods need user interaction at several steps of the segmentation process. In addition, a manual thresholding method is used to exclude bone and fat, which makes the method less automatic.
The complete and automatic segmentation of facial soft tissues remains an unsolved problem. In this work, we investigate some of the methods that have been tested on other soft tissues and modify them for use in FST segmentation.
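Our aim in this study is to investigate the role of labeled or unlabeled prior information in facial soft tissue segmentation. For this purpose, we apply several existing two-dimensional (2D) segmentation methods to the target facial soft tissue. These methods are chosen because they are representative of the previous literature and use prior information in one way or another. A comparison between them clarifies different aspects of prior knowledge-based segmentation. These methods are as follows: atlas-based segmentation by registration of labeled training images to the test image (Method a); MRF-EM segmentation initialized by seeded region growing, without prior information (Method b); MRF-EM segmentation initialized by a modified region growing algorithm that uses the unlabeled training set (Method c); and MRF-EM segmentation in which labeled training images are used for both the initialization and the MRF model (Method d).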
Then our newly proposed segmentation approach, MRF-based segmentation using unlabeled prior information (Method e), will be introduced and applied to the same image sets for 2D segmentation.
The masseter muscle of the head is selected as the target tissue in this study. The masseter is a large, strong muscle responsible for jaw motion. An axial view of both right and left masseter muscles in an MR image is shown in Figure
A sample slice. Target tissue borders are shown in green.
All images used in this work are whole head and neck 3D MRI sets which are obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [
A block diagram is given in Figure
Block diagram of all methods.
The method is similar to [
We assume that pixels are connected to each other through the neighboring system as shown in Figure
Neighboring system. The dark cube is the current pixel and the bright cubes are the neighbors.
Here
Each element of
Let
Segmentation process is defined as assigning a unique value to each site in
Then the set of labeling for all sites in
We call the family
A set of random variables
The Hammersley-Clifford theorem [
Then, the conditional probability
Here, the model includes only pairwise
Here,
By using Bayes estimation, the posterior probability can be computed from the prior distribution (i.e., smoothness in MRF literature) and the likelihood,
Here, the parameter set is
Minimizing the Bayes risk is equal to maximizing the posterior probability. The expectation maximization (EM) algorithm is employed to maximize the posterior probability. In this iterative algorithm, the posterior probability for step
Then the model parameters are obtained in the maximization step as follows:
This process is repeated until the likelihood difference, that is,
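The alternation of expectation and maximization steps and the stopping rule on the likelihood difference can be illustrated with a deliberately simplified sketch. It assumes a two-class Gaussian intensity model and replaces the MRF clique potentials with a fixed per-pixel foreground prior map, so it is not the model used in this study. The names `gaussian_pdf` and `em_two_class`, the tolerance, and the initialization from the prior map are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density evaluated element-wise."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_two_class(image, prior_fg, n_iter=100, tol=1e-6):
    """EM for background/foreground Gaussians with a per-pixel prior (sketch)."""
    x = np.asarray(image, dtype=float).ravel()
    p_fg = np.clip(np.asarray(prior_fg, dtype=float).ravel(), 1e-6, 1 - 1e-6)
    prior = np.stack([1.0 - p_fg, p_fg])  # shape (2, n_pixels)

    # Assumed initialization: split pixels by the class the prior favours.
    # Both groups must be non-empty for the initial statistics to be defined.
    init_fg = p_fg > 0.5
    mean = np.array([x[~init_fg].mean(), x[init_fg].mean()])
    var = np.array([x[~init_fg].var(), x[init_fg].var()]) + 1e-6

    prev_loglik = -np.inf
    for _ in range(n_iter):
        # E-step: posterior class probabilities from prior times likelihood.
        lik = np.stack([gaussian_pdf(x, mean[k], var[k]) for k in range(2)])
        joint = prior * lik
        evidence = joint.sum(axis=0) + 1e-300
        post = joint / evidence

        # M-step: posterior-weighted mean and variance of each class.
        weight = post.sum(axis=1) + 1e-12
        mean = (post * x).sum(axis=1) / weight
        var = (post * (x - mean[:, None]) ** 2).sum(axis=1) / weight + 1e-6

        # Stop when the log-likelihood changes by less than the tolerance.
        loglik = np.log(evidence).sum()
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik

    return post[1].reshape(np.shape(image)), mean, var
```

Thresholding the returned foreground posterior at 0.5 gives a binary segmentation. A full MRF-EM implementation would additionally update the label prior from the neighborhood at every iteration, which this sketch omits.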
Here,
Unlike other methods that perform a MAP estimation to estimate the labeling and use it in pair-wise clique potential computation, we define the prior probability
A presentation of prior information used in Method e.
As can be observed from the image, the prior information gives a good estimate of the pixels that may belong to the target tissue. The image resembles the mental picture that a specialist might form after viewing thousands of MR images.
The important point about this picture is that the target tissue is fully separated from the neighboring tissues. This property greatly helps the segmentation of FSTs, which are generally connected to their neighboring tissues.
The segmentation methods were validated by comparing the automatic segmentation results with manual segmentation results. For this purpose, the target tissue was segmented manually in each slice by an expert. This process was repeated for all 10 experimental sets, and the manual segmentations were used only as the ground truth.
We used dice metric
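As a minimal sketch of this validation measure, assuming the standard definition of the Dice coefficient for two binary masks $A$ and $B$, $2|A \cap B| / (|A| + |B|)$, the overlap between an automatic and a manual segmentation could be computed as follows. The function name `dice_score` and the convention of returning 1.0 for two empty masks are illustrative assumptions.

```python
import numpy as np

def dice_score(auto_mask, manual_mask):
    """Dice overlap between two binary masks, in the range [0, 1]."""
    a = np.asarray(auto_mask, dtype=bool)
    b = np.asarray(manual_mask, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total
```

Multiplying the result by 100 gives a percentage comparable to the accuracy values reported in this section.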
The overall accuracy results of 2D segmentation for all methods for all sets are shown in Table
Accuracy results (%) for all data sets and methods.
Case | M. a | M. b | M. c | M. d | M. e |
---|---|---|---|---|---|
1 | 66.66 | 88.24 | 86.87 | 86.98 | 93.07 |
2 | 83.23 | 86.12 | 93.33 | 93.37 | 83.74 |
3 | 76.72 | 88.01 | 84.18 | 71.56 | 90.03 |
4 | 30.24 | 49.92 | 82.86 | 82.68 | 78.39 |
5 | 84.75 | 19.34 | 78.90 | 84.16 | 90.67 |
6 | 60.98 | 75.30 | 77.56 | 59.88 | 77.23 |
7 | 81.97 | 68.97 | 74.38 | 71.89 | 66.49 |
8 | 82.16 | 92.65 | 75.68 | 89.14 | 93.40 |
9 | 83.35 | 82.40 | 90.71 | 94.69 | 96.11 |
10 | 79.13 | 68.25 | 92.89 | 91.05 | 96.07 |
Average | 72.92 | 71.92 | 83.74 | 82.54 | 86.52 |
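The column averages can be checked directly from the per-case values. The following sketch recomputes them with NumPy and also reports which method scores highest on each case; the variable names are illustrative, and the rows are taken to correspond to cases 1 to 10 in order.

```python
import numpy as np

methods = ["a", "b", "c", "d", "e"]

# Accuracy (%) per case (rows) and method (columns), copied from the table.
accuracy = np.array([
    [66.66, 88.24, 86.87, 86.98, 93.07],
    [83.23, 86.12, 93.33, 93.37, 83.74],
    [76.72, 88.01, 84.18, 71.56, 90.03],
    [30.24, 49.92, 82.86, 82.68, 78.39],
    [84.75, 19.34, 78.90, 84.16, 90.67],
    [60.98, 75.30, 77.56, 59.88, 77.23],
    [81.97, 68.97, 74.38, 71.89, 66.49],
    [82.16, 92.65, 75.68, 89.14, 93.40],
    [83.35, 82.40, 90.71, 94.69, 96.11],
    [79.13, 68.25, 92.89, 91.05, 96.07],
])

# Mean accuracy of each method over the ten cases.
averages = accuracy.mean(axis=0)

# Method with the highest accuracy on each case.
best_per_case = [methods[i] for i in accuracy.argmax(axis=1)]

for name, avg in zip(methods, averages):
    print(f"Method {name}: {avg:.2f}")
print("Best method per case:", best_per_case)
```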
Atlas-based segmentation is known to be successful in brain tissue classification but as you can see in Table
This method depends completely on the atlas: when the shape and position of the tissue in the atlas differ from those in the test data, the registration may produce a badly wrong result.
As you can see in Table
This problem can be solved either by selecting the experimental data similar to the atlas or by increasing the number of training images in a way that covers all possible shapes, which is not very realistic.
The segmentation result for Method b is very successful in most cases, such as sets 1, 2, 3, and 8, but in some cases, such as sets 4 and 5, segmentation results are poor. To investigate this issue, we checked the initial labeling for the worst result (i.e., set 5) and the best result (i.e., set 8). The region growing outcome for sets 5 and 8 is shown in the original image in Figures
Initial labeling for (a) set 5 with RG, (b) set 8 with RG, (c) set 5 with modified RG, and (d) set 8 with modified RG.
Manual segmentation is given top left. Segmentation results using (a) method a, (b) method b, (c) method c, (d) method d, and (e) method e.
In Method c, we tried to solve the problem of initial labeling where a modified region growing algorithm (vii) was used for initialization. As you see in Table
The initial labeling with modified RG for sets 5 and 8 are shown in Figures
The segmentation performance for Method d is close to the MRF-based segmentation with modified region growing method (Method c) but it is about 1% lower. Despite the large amount of manual interaction required for the prior information in the MRF-EM part, this method shows lower accuracy than the previous method. This is mostly because of the initial estimation that is constant (
Finally, our proposed method (Method e) shows the best overall performance among all tested methods. In 6 of the 10 data sets, the accuracy of this method is over 90%. The worst results are for sets 6 and 7, which also yield relatively poor results with the standard MRF-EM method (Method c), so we can conclude that poor initialization is the problem in these cases. This explanation does not hold, however, for the lower accuracies on sets 2 and 4.
The main difficulty is finding a generic solution that gives good accuracy for all images. This requires a training set large enough to cover all possible shapes. Since manual labeling is not required for Method e, using many data sets as prior information is feasible. Another source of error may be the affine registration, which can sometimes cause poor initialization.
When labeled prior information is used with the modified RG algorithm, Method c which uses MRF modality performs better than other methods. When unlabeled information is used, then Method e performs better than all other methods. It is important to note that the only manual interaction is the selection of a seed point and a threshold for region growing algorithm. The threshold value is kept constant because of the previously applied histogram equalization algorithm. Since Method e does not use any labeled training images, selection of a seed point for target tissue indication is inevitable. The rest of the method is fully automatic. The segmentation results of all methods for a test image are shown in Figure
Results of the segmentation algorithm for the bottom slices.
The average accuracy of the proposed method is about 3.5% lower than the results of our previous work [
In this study, we tested four different state-of-the-art methods for facial soft tissue segmentation on magnetic resonance images. Each method incorporates prior information into the segmentation process in a different way. Some use labeled data as the prior information, some use this labeled data only as an initial estimate of the segment, and some do not use prior information at all.
Our main interest in this work was to investigate the role of prior information in FST segmentation by using different methods. We applied all these methods to 10 MRI data sets belonging to different individuals, with the aim of segmenting the masseter muscle in each. The experimental MRI sets were registered in three dimensions before segmentation, so that the slices roughly corresponded to each other.
Method a is fully based on the registration of the labeled training images to the test image. The average accuracy of this method is 72.92%. Method b is an MRF-EM based segmentation method where initial segment estimation is obtained by region growing which starts from a seed point. No prior information is used in this method and the acquired average accuracy is 71.92%.
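The seed-based initialization used in Method b can be sketched as a simple region growing routine. The acceptance criterion (absolute intensity difference from the seed pixel within a fixed threshold) and the 4-connected neighborhood are assumptions made for illustration, as is the function name `region_grow`; the exact criterion used in the experiments may differ.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, threshold):
    """Grow a 4-connected region from `seed` = (row, col) in a 2D image."""
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    mask = np.zeros((height, width), dtype=bool)
    seed_value = image[seed]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbors of the current pixel.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not mask[nr, nc]:
                # Assumed criterion: intensity close to the seed intensity.
                if abs(image[nr, nc] - seed_value) <= threshold:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask
```

Pixels with a similar intensity that are not connected to the seed are left out of the region, which is why neighboring tissues that touch the target and share its intensity can leak into the initial estimate.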
According to our results, although Method a uses labeled prior information, the accuracy of Method b is very close to it. This shows that atlas-based methods are not as successful as expected in segmentation of FSTs. The most important reasons for this failure are the variation of the tissue shape among the sets and the existence of similar tissues in the neighborhood of the target tissue.
Method c is similar to Method b, except that the initial segment estimate is obtained by the modified region growing algorithm, which uses prior information from the unlabeled training set. The average accuracy improves by about 12 percentage points (from 71.92% to 83.74%), which emphasizes the importance of the initial estimate in the MRF-EM process and of using prior information.
In Method d, a similar MRF-EM framework is used, but this time the labeled training images are used for both the initialization and the MRF model. The method reaches 82.54% accuracy, which is close to that of Method c (83.74%), which does not use any manual labeling. We may conclude that indicating the target tissue with a seed point and a threshold (as in Method c) is more informative for the MRF-EM framework than labeled atlases.
Finally, Method e uses unlabeled prior information both in the initial estimation and during MRF-EM optimization. The average accuracy of this method is 86.52%, which is the best result among the tested methods. The proposed approach starts from the same initial estimates as Method c, but it also uses prior information inside the MRF-EM process, which yields an improvement of about 3 percentage points (from 83.74% to 86.52%) in the final segmentation accuracy. The importance of prior information is even clearer when Method b is compared with the proposed method: using prior information yields an improvement of about 15 percentage points (from 71.92% to 86.52%).
In the previous studies, Ng et al. [
We also believe that increasing the number of training sets, even though they are unlabeled, will improve the accuracy of the method.
The final goal of this study is to segment the masseter tissue three dimensionally and use the results to construct a realistic biomechanical face model for any individual.
The authors do not have any conflict of interest with the content of the paper.
Data used in the preparation of this article were obtained from Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (