Protocol for Quantification of Defects in Natural Fibres for Composites

Natural bast-type plant fibres are attracting increasing interest for use in structural composite applications, where high quality fibres with good mechanical properties are required. A protocol for the quantification of defects in natural fibres is presented. The protocol is based on the experimental method of optical microscopy and the image analysis algorithms of the seeded region growing method and Otsu’s method. The use of the protocol is demonstrated by examining two types of differently processed flax fibres, giving mean defect contents of 6.9 and 3.9%, a difference that is found to be statistically significant. The protocol is evaluated with respect to the selection of image analysis algorithms, and Otsu’s method is found to be a more appropriate method than the alternative coefficient of variation method. The traditional way of defining defect size by area is compared to the definition of defect size by width, and it is shown that both definitions can be used to give unbiased findings for the comparison between fibre types. Finally, considerations are given with respect to true measures of defect content, number of determinations, and number of significant figures used for the descriptive statistics.


Introduction
Natural fibres, in the form of bast fibres from plants like flax, hemp, and jute, have made a noteworthy contribution to the composite industry since the 1990s [1,2]. Having been used mainly for semistructural applications, for example, automotive interior panels, natural fibres are now attracting increasing interest for structural applications as well, for example, wind turbine rotor blades [3]. To be successful in these structural applications, high quality fibres with good mechanical properties are required to form a competitive materials alternative to the conventional glass and carbon fibres.
A central quality parameter for natural fibres is the amount of fibre defects, which is expected to affect their strength properties and their reinforcement efficiency in composites [4]. Observations of natural fibres using polarised optical microscopy reveal that the fibres contain a number of structural irregularities distributed along their lengths (Figure 1). It is believed that these irregularities are local misalignments of the cellulose microfibrils within the fibre cell wall, and, as such, they are believed to form weak points and are therefore denoted defects; other terms like kink bands and dislocations are also frequently used in the literature [5]. Previous studies have shown that the defect content and the mechanical properties of the fibres are correlated with their processing history [6][7][8][9].
The application of polarised optical microscopy to identify defects in natural fibres was originally proposed by Preston [10] and has later been revisited by Thygesen and Hoffmeyer [5], where the physical principles of the technique are described in detail. The technique has been used in a number of studies to investigate fundamental materials research questions regarding defects in natural fibres (e.g., [7,11,12]). In parallel to this work of a merely academic nature, there is a need for the establishment of an industrially oriented protocol for the quantification of defects in natural fibres. This will form a tool for the industry, that is, fibre suppliers, textile preform producers, and composite manufacturers, to evaluate fibre quality and to support the joint goal of achieving high quality natural fibres with improved properties.
The aim of the present study is to present a protocol based on simple experimental methods, together with easily implementable image analysis algorithms, to allow for automatic determination of the defect content in natural bast-type fibres from plants. Examples of results and data analysis will be shown for two differently processed flax fibres, and the method will be evaluated with respect to the selection of image analysis algorithms and the definition of defect size. Finally, some considerations for the use of the protocol will be given.

Protocol for Quantification of Fibre Defects
A flowchart showing the involved steps and elements of the protocol is presented in Figure 2, and the related descriptions are given in the subsections below. Here, a list of the required equipment, materials, and software follows, where the specifications for the ones used in the present study are given in brackets: precision tweezers (Dumont, Dumoxel 5), glass slides for microscopy (dimensions of 25 × 40 × 0.15 mm), double-sided tape, optical microscope with polarisation filters (Leitz Aristomet), camera on microscope (Leica Microsystems DFC290; image resolution 2048 × 1536; 8-bit grayscale), and computer software for algorithms and image analysis (MATLAB).

Sample Preparation.
Extracting single fibres from a fibre bundle (e.g., in the form of a yarn) can be achieved by the use of precision tweezers. Wetting the fibre bundle in demineralised water before extraction may be helpful to reduce adherence between the single fibres. The extraction of single fibres must be performed very carefully so as not to inflict additional damage on the fibres. In particular, it should be ensured that the fibre region to be inspected for fibre defects is kept undamaged; for example, this region should not be grabbed by the tweezers.
The extracted single fibre is carefully placed on a glass slide and fastened with double-sided tape at the ends of the slide (see Figure 3). A droplet of demineralised water is placed onto the fibre, at the middle part of the glass slide. Another glass slide is placed on top to encapsulate the fibre and the water, and this completes preparation of a sample.

Image Acquisition.
The prepared fibre samples are inspected using optical microscopy. For each section of a fibre observed in the microscope, two images A and B are acquired. The first image A is acquired to quantify the area of the fibre, and the second image B is acquired to quantify the area of the defects. This approach is similar to the one used in the study by Thygesen and Hoffmeyer [5]; however, the method has been modified in the present study.
The two images A and B are acquired by using different microscope settings. For image A, dark field illumination is used to have the best possible contrast between the edges of the fibre and the surrounding background. An example can be seen in Figure 4(a). For image B, polarised light is used to visualize the defects within the fibre and to have the best possible contrast between the defects and the undefected parts of the fibre. This is achieved by using a cross-polar setup and by aligning the fibre in the direction of one of the polarisation filters. An example can be seen in Figure 4(c).
In preliminary work, images of fibre sections were acquired along the fibre with overlap between the sections, and these images were then stitched together (using an image analysis built-in function) to create a large image canvas to be analysed. It was however found that this leads to defects that are not equally focused along the fibre, which disqualified that approach.

Image Analysis.
Image analysis methods are applied to the acquired pairs of images. For image A, to quantify the fibre area, the applied method is the seeded region growing method. For image B, to quantify the defect area, the applied method is Otsu's method. As an alternative method for the defect area, the coefficient of variation method (as used in the study by Thygesen and Hoffmeyer [5]) is also presented here, and a comparison study will be presented.
Common for the three image analysis methods is the objective to segment the processed image into a foreground (fibre or defect area) and a background. An image can be mathematically represented by an $M \times N$ matrix where $M$ and $N$ are the pixel height and width of the image, respectively. For an 8-bit grayscale image, as suggested for this protocol, all elements of such a matrix have a value between 0 (completely black) and 255 (completely white). The objective of the image segmentation is to assign the value zero to all background pixels and positive nonzero values to the foreground pixels.

The Seeded Region Growing Method for Fibre Area.
The seeded region growing method segments images by growing regions from a single seed point. The method works well for segmenting images containing several areas of similar grayscale intensity separated by areas of dissimilar grayscale intensity. The algorithm is presented in Algorithm 1, and an example of the seeded region growing method in process is presented in Figure 5.
The algorithm starts at a single seed point for which all adjacent pixels are stored in a list, which is denoted the sequentially sorted list (SSL). From that list, the pixel with the grayscale level intensity closest to the region mean (initially the intensity of the seed point pixel) is added to the region. Afterwards, all pixels adjacent to this new pixel are added to the SSL, and again the pixel with grayscale level intensity closest to the region mean is added to the region. Several strategies exist to determine a criterion for which pixels should be grown into a given region and when the region should stop growing [13]. The applied algorithm in the presented protocol uses a simple criterion relating to the mean pixel value of the already grown region. The criterion states that the while loop, which continuously adds pixels to the SSL, continues as long as the smallest difference in grayscale level intensity between the pixels in the SSL and the mean of the pixels in the already grown region, $\Delta_{\min}$, is lower than a specified limit value $T$, and breaks otherwise. The applied value for the limit value $T$ is the standard deviation of pixel values for the entire image multiplied by a factor 0.9. The initial seed point needed in the algorithm may be placed at any point in the image that is not a part of the fibre area; it is suggested simply to place the initial seed point at one of the corners of the image.
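The growing loop described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the study (which was written in MATLAB): the function name grow_region, the use of the population standard deviation for the limit value, and the toy image are assumptions made here. The toy image keeps the background connected; how the protocol treats a fibre that splits the background into two parts is not stated in the text and is not addressed.

```python
from statistics import pstdev

def grow_region(image, seed=(0, 0), factor=0.9):
    """Grow a region from `seed`; return a mask that is True for grown pixels."""
    rows, cols = len(image), len(image[0])
    # Limit value: standard deviation of the whole image times 0.9
    # (population standard deviation is an assumption of this sketch).
    limit = factor * pstdev([v for row in image for v in row])
    mask = [[False] * cols for _ in range(rows)]
    flagged = [[False] * cols for _ in range(rows)]
    r0, c0 = seed
    mask[r0][c0] = flagged[r0][c0] = True
    total, size = image[r0][c0], 1
    candidates = []  # plays the role of the SSL: (row, col, intensity)
    current = seed
    while True:
        r, c = current
        # Add the unflagged four-neighbours of the newest region pixel.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not flagged[nr][nc]:
                flagged[nr][nc] = True
                candidates.append((nr, nc, image[nr][nc]))
        if not candidates:
            break
        mean = total / size
        best = min(range(len(candidates)),
                   key=lambda i: abs(candidates[i][2] - mean))
        nr, nc, value = candidates[best]
        if abs(value - mean) >= limit:
            break  # closest candidate is too different: stop growing
        candidates.pop(best)
        mask[nr][nc] = True
        total, size = total + value, size + 1
        current = (nr, nc)
    return mask

# Toy image: dark background (10) around a bright 3 x 4 "fibre" (200).
image = [[10] * 6 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 5):
        image[r][c] = 200
background = grow_region(image)
fibre_area = sum(not v for row in background for v in row)
print(fibre_area)
```

Because the seed sits in the background, the fibre area is taken as the number of pixels that were not grown into the region. The linear search for the closest candidate keeps the sketch short; a sorted list would be faster.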
The algorithm of the seeded region growing method is quite simple in mathematical terms, but it is computationally expensive since a large number of pixels have to be handled individually. To lessen the computational effort, an initial image analysis algorithm can be performed to crop the image to contain only the fibre, that is, to remove a large part of the background. A simple summation of row values to an array is performed, followed by a peak finding function to detect the probable edges of the fibre. Then, the gap between the peaks is extended by a safety margin to ensure that the entire fibre is within the peak points, and the image is cropped from there. Subsequently, the seeded region growing algorithm is applied to the cropped area only. This will drastically reduce the computational time.
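The cropping step can be illustrated as follows. This is a sketch under stated assumptions: the fibre is taken to run horizontally so that its two bright edges appear as the two strongest peaks in the row-sum profile, and the function name crop_rows, the simple local-maximum peak finder, and the margin of 2 rows are choices made here, not values from the protocol.

```python
def crop_rows(image, margin=2):
    """Return (top, bottom) row indices bounding the fibre, bottom exclusive."""
    sums = [sum(row) for row in image]  # one summed intensity per row
    # Interior local maxima of the row-sum profile.
    peaks = [i for i in range(1, len(sums) - 1)
             if sums[i] > sums[i - 1] and sums[i] >= sums[i + 1]]
    if len(peaks) < 2:
        raise ValueError("could not find two fibre edges")
    # Keep the two strongest peaks as the probable fibre edges.
    strongest = sorted(peaks, key=lambda i: sums[i], reverse=True)[:2]
    first, second = sorted(strongest)
    # Extend the gap between the peaks by the safety margin.
    top = max(first - margin, 0)
    bottom = min(second + margin + 1, len(sums))
    return top, bottom

# Toy dark field image: bright edges (250) around a dimmer fibre interior (100).
image = [[5] * 4 for _ in range(12)]
for r in (5, 6, 7):
    image[r] = [100] * 4
for r in (4, 8):
    image[r] = [250] * 4
top, bottom = crop_rows(image)
cropped = image[top:bottom]
print(top, bottom, len(cropped))
```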
In the original description on which the algorithm is based [13], the SSL is a list sorted with respect to pixel grayscale level intensity such that pixels can be more efficiently added to the grown region. Using the original description of the algorithm may reduce the computational time, but the final result remains identical to the one obtained using Algorithm 1.

Otsu's Method for Defect Area.
Building on the idea of segmenting an image using a threshold value, Otsu's method provides a method for segmenting an image into more than two discrete grayscale intensity levels by finding multiple thresholds $t_1, t_2, \ldots, t_n$. The segmentation is then performed by substituting the value 0 for all values in the image matrix below $t_1$, the value 1 for all values ranging between $t_1$ and $t_2$, the value 2 for the next range, and so forth. In practice, however, more than three threshold values (resulting in more than four image segments) require excessive computational effort to be feasible in the case of a large number of high resolution images.

Algorithm 1: Seeded region growing method.
Seed = $x, y$    ⊳ The start seeding point
$\mu_R = I(x, y)$    ⊳ The starting mean, equal to seed point intensity
$n = 1$    ⊳ Size of the initial region to be grown
while $\Delta_{\min} < T$ do
    for $j = 1, \ldots, 4$ do    ⊳ If the neighbourhood used is four neighbourhood
        $x_j = x$ + neighbor position
        $y_j = y$ + neighbor position
        if not already flagged then
            Add pixel to SSL    ⊳ Add pixel coordinates and intensity value to the SSL
            flag pixel
        end if
    end for
    Add $\min |I - \mu_R|$    ⊳ Add the SSL-pixel closest to region mean to growing region
    Add ...
The values of the thresholds can be obtained by minimizing the within-class variance of the segmented histogram of the image. The minimum within-class variance exists when the between-class variance is at maximum [13]. The between-class variance is prescribed in (1) for the case of three thresholds giving four segments:

$$\sigma_b^2 = \sum_{i=0}^{3} \omega_i \left(\mu_i - \mu\right)^2 \qquad (1)$$

where $\sigma_b^2$ is the between-class variance, $\omega_i$ is the weight of the $i$'th segment, $\mu_i$ is the mean of the $i$'th segment, and $\mu$ is the overall mean. The algorithm is presented in Algorithm 2. The image histogram is used as input. The algorithm loops through all possible combinations of threshold levels. The combination of threshold levels that provides the maximum between-class variance is regarded as the optimal threshold levels.
In the presented protocol, Otsu's method is used to obtain four segments of an image. Based on preliminary work of inspecting a large number of images, it was found that the most robust and reliable results are obtained by assigning the segments 0 and 1 to the background and the segments 2 and 3 to the foreground; that is, the last two segments are used to quantify the defect area.
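The exhaustive threshold search and the assignment of segments 2 and 3 to the foreground can be sketched as below. This is an illustrative reading of the description, not Algorithm 2 itself: the function names otsu_thresholds and is_defect, the convention that a pixel equal to a threshold belongs to the upper segment, and the 16-level toy histogram are assumptions made here.

```python
from bisect import bisect_right
from itertools import combinations

def otsu_thresholds(hist, n_thresholds=3):
    """Return the thresholds that maximise the between-class variance."""
    levels, total = len(hist), sum(hist)
    # Prefix sums: cum_w[k] = pixels below level k, cum_m[k] = sum of their levels.
    cum_w, cum_m = [0], [0]
    for level, count in enumerate(hist):
        cum_w.append(cum_w[-1] + count)
        cum_m.append(cum_m[-1] + level * count)
    overall_mean = cum_m[-1] / total
    best_var, best = -1.0, None
    # Loop through all possible combinations of threshold levels.
    for cuts in combinations(range(1, levels), n_thresholds):
        bounds = (0, *cuts, levels)
        var = 0.0
        for lo, hi in zip(bounds, bounds[1:]):
            count = cum_w[hi] - cum_w[lo]
            if count == 0:
                continue  # an empty segment contributes nothing
            mean = (cum_m[hi] - cum_m[lo]) / count
            var += (count / total) * (mean - overall_mean) ** 2
        if var > best_var:
            best_var, best = var, cuts
    return best

def is_defect(value, thresholds):
    """Segments 2 and 3 count as foreground; segments 0 and 1 as background."""
    return bisect_right(thresholds, value) >= 2

# Toy histogram with 16 gray levels and four well separated clusters.
hist = [0] * 16
for level in (1, 5, 9, 13):
    hist[level] = 10
thresholds = otsu_thresholds(hist)
print(thresholds, is_defect(9, thresholds), is_defect(5, thresholds))
```

For a full 8-bit histogram the same loop visits roughly 2.7 million threshold triplets per image, which gives a feel for why more than three thresholds quickly becomes impractical.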

Coefficient of Variation Method for Defect Area.
The principle of the coefficient of variation method is to find a threshold value where the foreground area is stabilized with respect to the threshold value. This principle is exemplified in Figure 6 showing a plot of the relation between the foreground area (i.e., the defect area) and the threshold value. The slanting plateau of the curves is found by comparison of the coefficient of variation calculated by analysing the foreground area from a given threshold value, and from the 10 previous and 10 succeeding foreground areas, corresponding to the 10 previous and 10 succeeding threshold values; that is, the coefficient of variation of a given threshold value is calculated based on 21 values of foreground areas. The threshold value that yields the lowest coefficient of variation is chosen as the stable threshold. In Figure 6, results are shown for four different images, and the stable threshold values are found to be in the range from 80 to 130. The algorithm of the coefficient of variation method is presented in Algorithm 3.
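A compact sketch of this threshold search is given below. It is not Algorithm 3: the function name stable_threshold, the choice to count pixels strictly brighter than the threshold as foreground, and the rule of skipping windows with no foreground at all (where the coefficient of variation is undefined) are assumptions made for the illustration.

```python
from statistics import mean, pstdev

def stable_threshold(pixels, half_window=10, levels=256):
    """Return the threshold whose neighbouring foreground areas vary least."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    # areas[t] = number of pixels strictly brighter than threshold t.
    areas, above = [0] * levels, 0
    for t in range(levels - 1, -1, -1):
        areas[t] = above
        above += hist[t]
    best_t, best_cv = None, float("inf")
    for t in range(half_window, levels - half_window):
        # Foreground areas of the 10 previous, the current and the
        # 10 succeeding threshold values: 21 values in total.
        window = areas[t - half_window:t + half_window + 1]
        avg = mean(window)
        if avg == 0:
            continue  # no foreground left in this window
        cv = pstdev(window) / avg
        if cv < best_cv:
            best_t, best_cv = t, cv
    return best_t

# Toy data: dark background spread over levels 0-59, defects over levels 200-219.
pixels = [v for v in range(60) for _ in range(15)]
pixels += [v for v in range(200, 220) for _ in range(5)]
print(stable_threshold(pixels))
```

In this toy example the foreground area is constant for every threshold between the two pixel populations, so the selected threshold falls on that plateau.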

Data Analysis.
For each of the imaged fibre sections, the defect content is calculated as the ratio between the determined defect area (image B) and fibre area (image A). The defect content can be expressed in fractions, or in percentages. The boundaries for the defect content are 0%, equivalent to a fibre section completely free of defects, and 100%, equivalent to a fibre section completely filled with defects.
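As a small illustration of this ratio, the sketch below computes the defect content from two binary masks. The function name defect_content and the representation of the masks as nested lists of booleans are assumptions made here.

```python
def defect_content(fibre_mask, defect_mask):
    """Defect content in percent: defect area divided by fibre area."""
    fibre_area = sum(sum(row) for row in fibre_mask)    # from image A
    defect_area = sum(sum(row) for row in defect_mask)  # from image B
    if fibre_area == 0:
        raise ValueError("fibre area is zero")
    return 100.0 * defect_area / fibre_area

# Toy masks: a fibre of 2 x 5 pixels containing a single defect pixel.
fibre_mask = [[False] * 5, [True] * 5, [True] * 5, [False] * 5]
defect_mask = [[False] * 5 for _ in range(4)]
defect_mask[1][2] = True
print(defect_content(fibre_mask, defect_mask))
```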
Based on a number of fibre samples, and a number of fibre sections imaged for each sample, the obtained data of defect contents can be analysed. Descriptive statistics, for example, a box plot with quartiles and fractiles to show the distribution, and values for mean and standard deviation, together with statistical tests, for example, a Student's $t$-test, can be used to evaluate fibre quality and to compare results between fibre types.

Materials
Two types of flax yarns supplied by Safilin (France) were examined to demonstrate the use of the protocol. Fibres from the two yarns will be denoted by ESTLYS and LML, which are the codes used by the supplier for the two yarns. The ESTLYS yarn is based on short technical fibres, a byproduct from the scutching process, and spun by semiwet ring spinning with a twist number of 430 turns per meter. The LML yarn is based on long technical fibres, a direct product from the scutching process, and spun by wet ring spinning with a twist number of 250 turns per meter. Since the two yarns are processed differently, the fibres are expected to show a difference in defect content. In the present study, however, it will not be attempted to study in detail any correlation of defect content with the process conditions; this should be the topic for further work.

Results and Discussion
For each of the two flax fibre types, ESTLYS and LML, 6 fibre samples were prepared, and 5-19 fibre sections were imaged for each sample, giving in total 60 and 77 determinations of defect content for the two fibres, respectively. The determined defect contents for the two types of flax fibres are shown in Figure 7 in the form of a box plot. It is shown that although the distributions are overlapping, they are shifted with respect to each other, with median values of 5.8 and 3.2% for the ESTLYS and LML fibres, respectively. The unequal spacing between the quartiles and fractiles indicates that the distributions are slightly skewed. This is most pronounced for the ESTLYS fibres, which are skewed toward the lower defect contents. In addition, it can be observed that the spread of data is larger for the ESTLYS fibres. The mean ± standard deviation for the defect content is calculated to be 6.9 ± 5.0 and 3.9 ± 3.1% for the ESTLYS and LML fibres, respectively. If the data are assumed to be normally distributed, the Student's $t$-test for two populations with unequal variance can be applied. The result of this statistical test shows that the null hypothesis of equal means can be rejected at a significance level of 1%. Thus, the defect content in the two types of fibres can be said to be significantly different from each other. The ESTLYS fibres, being a byproduct in the yarn process, show larger defect contents than the LML fibres.
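The test can be retraced from the reported summary statistics alone. The sketch below applies the unequal-variance form of the t-test to the rounded means, standard deviations, and numbers of determinations quoted above; the function name welch_t is an assumption, and because the inputs are rounded the result is only indicative of the calculation made on the raw data.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and degrees of freedom for two samples with unequal variance."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# ESTLYS: 6.9 +/- 5.0 % (60 determinations); LML: 3.9 +/- 3.1 % (77 determinations).
t, df = welch_t(6.9, 5.0, 60, 3.9, 3.1, 77)
print(f"t = {t:.2f}, df = {df:.0f}")
```

The statistic comes out at about 4.1 with roughly 93 degrees of freedom, well above the two-sided 1% critical value of about 2.6 for that many degrees of freedom, which is consistent with the rejection of equal means stated above.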
Next, an example will be presented to demonstrate that the determined defect contents are highly sensitive to the selected image analysis algorithm. In the above description of the protocol, the coefficient of variation method is presented as an alternative to Otsu's method for quantifying the defect area. In Figure 8, the determined defect contents using the two methods are plotted against each other. Each data point in the figure corresponds to a single fibre section. The data, although scattered, show a clear tendency that the coefficient of variation method results in lower values of defect content than Otsu's method; that is, most data points fall below the line of equality ($y = x$). Moreover, it can be observed that the coefficient of variation method results in defect contents for the ESTLYS and LML fibres that are much closer to each other, with values of mean ± standard deviation of 4.1 ± 4.8 and 3.1 ± 2.7%, respectively. Thus, even though the results show the same tendency of the ESTLYS fibres having a larger defect content, the numerical difference is smaller. Otsu's method gives data in a larger range; that is, it is more sensitive, and therefore it is a more appropriate method to use when different fibre types are to be compared.
In the presented protocol, the size of the defects is defined by their area, which is also the definition used in most previous studies (e.g., [5]). However, it can be argued that the size of the defects should be defined by their width, that is, the width of the defect area in the fibre direction, presuming that the fibre is affected by a defect along this entire fibre length. This definition has been used in the study by Hänninen et al. [7], where the fibre defects were measured manually (in contrast to the use of image analysis algorithms). In the present study, a simple computational routine has been applied to resemble the width definition of defect size, and the defect contents have been determined accordingly. Figure 9 shows the results of defect contents determined using the two definitions of defect size. Firstly, it can be observed, as expected, that the defect contents determined by the width definition are consistently larger than the ones determined by the area definition; that is, all data points fall below the line of equality ($y = x$). Secondly, it can be observed that the two definitions result in defect contents that are approximately linearly proportional to each other, which can also be expected due to the almost similar shape of the defects (see Figures 4(c) and 4(d)). The mean ± standard deviation of the determined defect contents using the width definition for the ESTLYS and LML fibres are 20.2 ± 9.6 and 13.2 ± 8.3%, respectively. These values are about 3 times higher than the values obtained by using the area definition, which corresponds to the slope of the linear regression line in Figure 9. Altogether, it is shown that in principle both defect definitions can be used to give unbiased findings for the comparison of defect content between fibre types.
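The routine used for the width definition is not described in the text, so the following is only one plausible reading of it: with the fibre running along the image columns, a column counts as defected if it contains at least one defect pixel, and the defect content is the share of defected columns among the columns occupied by the fibre. The function name and the toy masks are assumptions.

```python
def defect_content_by_width(fibre_mask, defect_mask):
    """Percent of the fibre length (columns) touched by at least one defect pixel."""
    n_cols = len(fibre_mask[0])
    fibre_cols = [any(row[c] for row in fibre_mask) for c in range(n_cols)]
    defect_cols = [any(row[c] for row in defect_mask) for c in range(n_cols)]
    length = sum(fibre_cols)
    if length == 0:
        raise ValueError("fibre length is zero")
    defected = sum(f and d for f, d in zip(fibre_cols, defect_cols))
    return 100.0 * defected / length

# Toy fibre of 4 x 10 pixels with one small defect spanning two columns.
fibre_mask = [[True] * 10 for _ in range(4)]
defect_mask = [[False] * 10 for _ in range(4)]
for r, c in ((1, 2), (2, 2), (1, 3)):
    defect_mask[r][c] = True
by_width = defect_content_by_width(fibre_mask, defect_mask)
by_area = 100.0 * 3 / 40  # area definition for the same masks
print(by_width, by_area)
```

In the toy case the width definition gives a larger value than the area definition, in line with the tendency reported for Figure 9, because a defect rarely spans the full fibre diameter.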
Finally, some considerations are given for the use of the presented protocol.
(i) It should be realised that the method of using optical microscopy and image analysis for defect quantification in natural fibres does not provide true measures for the defect content. By using this method, the defects in the three-dimensional volume of a fibre are projected into a two-dimensional image representation, and this introduces systematic errors. The location and size of the defects within the fibre cannot readily be scaled into a two-dimensional representation of the defect content. Moreover, the method is highly sensitive to the applied microscopy settings and the selected image analysis algorithms, and this introduces random errors. Nevertheless, the method provides measures for the defect content that are suitable for the comparison between fibre types.
(ii) No recommendations are given for the required number of fibre samples and fibre sections to be imaged for each fibre type. In the case of the presented example with the flax fibres, about 60-80 determinations of defect content were made. This is considered to be the minimum number of determinations due to the large scatter of data. Further work is needed to study the variation in defect content along a single fibre, and between fibres, in order to establish the statistical basis for selection of the required number of determinations.
(iii) In the presented study, the mean and standard deviation values are given with two significant figures, for example, 6.9 ± 5.0%, to reflect that the precision of the method in itself is considered to be 0.1% (in the case where a given fibre section is repeatedly observed and measured). However, due to the large variation in defect content between fibre sections, the standard deviations are of substantial magnitude, giving coefficients of variation in the range 70-80%. In light of this large variation, the mean and standard deviation values (in addition to other descriptive statistics) should more correctly be given with one significant figure only, for example, 7 ± 5%, when the defect content is to be reported in an applied context.
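The arithmetic behind consideration (iii) can be checked with a few lines. The helper round_sig is a hypothetical name introduced here; the input values are the reported means and standard deviations.

```python
from math import floor, log10

def round_sig(x, digits=1):
    """Round x to the given number of significant figures."""
    if x == 0:
        return 0.0
    return round(x, digits - 1 - floor(log10(abs(x))))

# Reported mean and standard deviation of defect content, in percent.
for name, mean, sd in (("ESTLYS", 6.9, 5.0), ("LML", 3.9, 3.1)):
    cv = 100 * sd / mean  # coefficient of variation in percent
    print(f"{name}: CV = {cv:.0f}%, "
          f"reported as {round_sig(mean):.0f} ± {round_sig(sd):.0f}%")
```

Both coefficients of variation fall between 70 and 80%, and rounding to one significant figure turns 6.9 ± 5.0% into 7 ± 5%, as in the example given in (iii).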

Conclusions
A protocol for the quantification of defects in natural fibres is presented to be used as an industrial tool to evaluate fibre quality. The protocol is based on the experimental method of optical microscopy, together with the image analysis algorithms of the seeded region growing method and Otsu's method, and this allows for automatic determination of the defect content in the fibres.
To demonstrate the use of the protocol, two types of differently processed flax fibres are examined. The resulting distributions of defect contents are shown to be slightly skewed, with a large spread of data. The mean defect content for the two fibre types is determined to be 6.9 and 3.9%, and the difference is tested to be statistically significant.
The presented protocol is evaluated with respect to the selection of image analysis algorithms and the definition of defect size. Otsu's method is found to be a more appropriate method than the alternative coefficient of variation method; it is more sensitive giving data in a larger range which is beneficial when different fibre types are to be compared. The traditional way of defining size of defects by their area is compared to the definition of size of defects by their width in the fibre direction. Defect contents determined by the two definitions are approximately linearly proportional to each other, which can be explained by the almost similar shape of the defects. It is shown that in principle both defect definitions can be used to give unbiased findings for the comparison of defect content between fibre types.
Finally, some considerations are given for the use of the protocol. It is emphasized that the method of using optical microscopy and image analysis for defect quantification does not provide true measures for the defect content due to the introduction of both systematic and random errors; however, the method is suitable for the comparison between fibre types. Further work is needed to establish the statistical basis for the selection of the required number of determinations of defect content for a given fibre type; however, 60-80 determinations are considered to be the minimum. The precision of the method in itself is considered to be 0.1%, which supports that the determined descriptive statistics can be given with two significant figures, for example, a mean of 6.9%. However, in light of the large spread of data, giving large coefficients of variation, it is more correct to use only one significant figure, for example, a mean of 7%, when the defect content is to be reported in an applied context.