An Approach Based on the Exploratory Data Analysis to Relate the Wear Behavior with the Microstructure of Ductile Cast Irons

The aim of this work is to propose a new methodology to relate Ductile Cast Irons (DCIs) wear behavior with the separation distances and sizes of the graphite nodules through an Exploratory Data Analysis (EDA). This methodology consists of morphological image processing tools (compacity and size distribution curves), an EDA performed by the use of box plots and an EDA-based section classifying algorithm. This algorithm classifies the microstructure of DCIs into classes and levels grouping different behaviors of the separation distances and sizes of graphite nodules. Finally, it was found, through a number of tribological tests, that the obtained classes and levels have a different wear behavior. The results achieved by this methodology were compared with those of traditional techniques used to characterize the microstructure of the material.


Introduction
Ductile Cast Irons (DCIs) are ferrous alloys in which graphite precipitates are embedded in a metallic matrix in the form of spherical nodules [1]. DCIs are low-cost materials that combine good moldability, mechanical strength, and machinability. In addition, they are thermally conductive and resistant to wear and corrosion. For these reasons, DCIs are widely used to manufacture mechanical pieces such as cams, camshafts, crankshafts, cylinder heads, and engine blocks [2,3]. In most of these pieces, constant sliding between contacting surfaces causes friction. Excessive friction leads to premature mechanical wear, which increases the risk of failure in machinery elements. This has created the need for new methodologies and technologies to evaluate the mechanical behavior of materials. Works that study the behavior of materials analyze the relationship of mechanical properties, thermal treatment, and manufacturing process, among others, with the microstructure of the material [4][5][6][7]. These works have shown that the microstructure of a material directly affects some of its macroscopic properties. Consequently, a proper microstructural characterization is necessary.
The most common techniques employed to characterize DCI microstructure involve the chemical composition and microstructural parameters such as nodular density, percentage of graphite, average nodule size, and nodularity. The chemical composition has been associated with various mechanical properties (transverse rupture strength, hardness, wear, yield strength, tensile strength, elastic modulus, fracture toughness, and plastic deformation), as well as with the cooling method and coating techniques [8][9][10][11][12][13][14][15][16][17][18]. The analysis of the chemical composition of a material only identifies the percentage of each element present; it does not identify particular characteristics of the geometry of the graphite nodules. On the other hand, the graphite percentage, nodular density, and nodularity have been associated with new alloys and heat treatments [19][20][21][22][23]. Furthermore, Likhite et al. [7] graphically related the nodular density, average graphite-nodule size, and graphite percentage to the modulus of elasticity of low-carbon cast iron. Čanžar et al. [24] studied the behavior of DCIs under cyclic deformation and fatigue as a function of nodular density, average nodule size, and nodularity. As the previous works show, microstructural characterizations rely on the basic parameters of nodular density, percentage of graphite, and nodularity, which give the number of nodules per unit area (N/mm²), the volume fraction of graphite (%), and the percentage of nodules that comply with a given circularity criterion, respectively. However, these parameters neither provide information about the spatial distribution of the graphite nodules nor describe how similar or different the nodule sizes are.
The graphite nodules in DCIs reduce friction and wear. Adhesive and abrasive wear, in particular, are forms of deterioration that occur when a hard, rough surface slides across a softer surface. When the graphite nodules are subjected to wear, they release small molecules that form a thin lubricating layer, so that the forces opposing the movement approach zero. This reduces friction and, in turn, wear. Therefore, if the graphite morphology is modified, the wear resistance is expected to be affected too [12][25][26][27]. As a consequence, microstructures with small graphite nodules and short separation distances between them would be expected to present better lubrication because of the proximity of the nodules, reducing friction and therefore showing better wear behavior. Thus, it is necessary to develop methodologies capable of describing the separation distance and size distribution of graphite nodules, and not only the commonly used parameters, because of the important role these distributions play in the wear properties.
Besides the traditional parameters used in most of the works mentioned above (chemical composition, nodularity, nodular density, and percentage of graphite), morphological image processing tools have been applied in metallography. For example, Morales-Hernández et al. [28] introduce the concept of compacity curves, characterizing the separation distances between the graphite nodules in a DCI by means of a sequence of openings called a granulometry (mathematical morphology). Liu et al. [29] characterize the carbides formed in an iron foundry through the size, size distribution curves, number, and volume fraction of secondary carbides. That study does not analyze the curves obtained from the size distributions any further, and it considers only the size distribution of the carbides, not the distribution of their separation distances. Paredes-Orta et al. [12] study the wear behavior of ductile iron castings based on the concept of nodule clusters, using the distribution curves of separation distances between the graphite nodules. Escobar et al. [30] study the effect of pouring temperature on the thermal-microstructural response of a eutectic spheroidal graphite cast iron through a metallographic analysis of the number and size of graphite nodules at the end of the process. The results of these studies are based only on a qualitative description of the compacity or size distribution curves of graphite nodules. Moreover, none of these works presents a formal statistical analysis that yields numerical parameters to characterize the compacity and size distribution curves, which makes the curves difficult or even impossible to interpret.
The scientific contribution of this work is the development of a new methodology to relate DCI wear behavior to the separation distances and sizes of the graphite nodules. The proposed methodology classifies microstructures with similar characteristics regarding the separation distances and sizes of graphite nodules, and shows that the microstructures contained in each class and level have a similar wear behavior. The graphite nodules release small molecules that act as lubricating layers when the graphite is under wear; since the sizes and spatial distributions of graphite nodules differ in each class and level, each will present a distinct lubrication behavior, which is reflected in a different wear behavior. Therefore, the proposed methodology finds a relationship capable of grouping the DCI sections that have a similar wear behavior. To establish this relationship, a formal statistical analysis is included to describe the separation distances and sizes of graphite nodules and, based on the characteristics obtained, to classify the DCI microstructure. The methodology is based on a series of morphological image processing tools (compacity and size distribution curves of graphite nodules); with this information, an Exploratory Data Analysis (EDA) is performed using box plots. From the EDA, the interquartile range (IR) and the maximum value (Max) are obtained. These statistical parameters are used to classify the sections into classes and levels according to the separation distance and size distributions of the graphite nodules. Finally, a series of tribological tests validated that the classes and levels obtained have different wear behavior. The effectiveness of this work was tested on a fork drive shaft that has DCI regions with thick and thin walls, since these regions have different microstructural characteristics due to their different thermal behavior.

Materials and Methods
A general diagram of the proposed methodology is shown in Figure 1. The methodology is applied to an automotive piece made of DCI and consists of four stages: (a) the first stage is the experimental materials, in which the automotive piece is sectioned according to its distinct thicknesses and images of each section are then acquired with an optical microscope; (b) the second stage is the microstructural analysis, in which two morphological tools, the compacity and size distribution curves, are applied to the acquired images to obtain the separation distances between nodules and the sizes of the nodules, respectively; (c) the third stage is the Exploratory Data Analysis (EDA), which is done through box plots of the separation distances between the nodules and box plots of the nodule sizes; (d) the fourth stage is the section classifying algorithm, a two-phase algorithm based on the EDA that classifies the microstructures into what are proposed as classes and levels. For each section, a class is related to the maximum separation distance between the nodules, the maximum size of the nodules, and the average of these parameters; a level is related to the length of the box region of both box plots (separation distances between the nodules and sizes of the nodules). These classes and levels group the sections with similar behavior of the separation distances and sizes of graphite nodules. Finally, the proposed methodology was validated by means of a series of tribological tests.

Experimental Materials.
For the development of this work, an automotive piece made of DCI (Figure 2(a)) was analyzed. Its regions share the same chemical composition, casting process, and cooling conditions but differ in geometry.
The chemical composition of the piece is as follows: C = 3.5-3.9%, Mn = 0.15-0.35%, Si = 2.25-2.75%, S = 0.01-0.025%, and P = 0.05% max. The automotive piece corresponds to the fork of a cardan shaft and was sectioned into five parts (Figure 2(b)) based on thickness, because thickness affects the microstructure of the graphite nodules in each part. Once the piece was sectioned, images of each section were taken with an optical microscope at a magnification of 200x; the number of images taken differs from section to section, since each section has to meet the condition of analyzing between 2000 and 2500 particles.

Microstructure Analysis.
The microstructural analysis consists of the application of two morphological tools: the compacity curve, which gives the separation distances between graphite nodules, and the size distribution curve, which gives the sizes of the graphite nodules. Both curves are based on the morphological definitions of granulometry and antigranulometry, terms introduced by Serra [31].
From these definitions, the spatial distribution of nodules is described by a compacity curve, which is a regression curve of granulometry curves (openings). The concept of compacity is considered to be directly related to the separation distance between nodules [28]. Compacity curves give the percentage of the ferritic matrix that corresponds to each separation distance between nodules. The size distribution curve of the nodules, in contrast, is a regression curve of antigranulometry curves (closings), and it shows the percentage of graphite that corresponds to each size of the graphite nodules.
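A one-dimensional toy analogue can make the two curves concrete. In the sketch below (an illustration only, not the authors' implementation), a single image row is encoded with 1 for ferritic matrix and 0 for graphite; this encoding, the function names, and the sample row are assumptions. In one dimension, an opening with a segment of length λ removes matrix runs shorter than λ, and a closing fills graphite runs shorter than λ, so the share of pixels lying in runs of each exact length plays the role of the compacity curve (matrix runs, that is, separation distances) and of the size distribution curve (graphite runs, that is, nodule sizes). Real curves are computed on two-dimensional images with two-dimensional structuring elements.

```python
from itertools import groupby

MATRIX, GRAPHITE = 1, 0  # assumed encoding: bright matrix, dark graphite


def run_lengths(signal, value):
    """Lengths of the maximal runs of `value` in a 1-D sequence."""
    return [sum(1 for _ in group) for key, group in groupby(signal) if key == value]


def measure_by_length(signal, value):
    """Fraction of `value` pixels that lie in runs of each exact length.

    A run of length n survives an opening (or closing, for the dual phase)
    by a segment of length n but not by one of length n + 1, so this is the
    1-D discrete analogue of a granulometric size distribution.
    """
    runs = run_lengths(signal, value)
    total = sum(runs)
    if total == 0:
        return {}
    distribution = {}
    for n in runs:
        distribution[n] = distribution.get(n, 0.0) + n / total
    return distribution


# Hypothetical image row, not measured data.
row = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1]

separations = measure_by_length(row, MATRIX)   # share of matrix per separation
sizes = measure_by_length(row, GRAPHITE)       # share of graphite per nodule size

for length in sorted(separations):
    print(f"separation {length}: {separations[length]:.3f} of matrix")
for length in sorted(sizes):
    print(f"size {length}: {sizes[length]:.3f} of graphite")
```

The two dictionaries are the raw material that a box plot would later summarize.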

Exploratory Data Analysis (EDA).
The EDA allows identifying and describing the main features of the distribution of a data set. Here, the EDA consists of box plots, with a series of statistical parameters, built for the separation distances between nodules and for the sizes of the nodules in each section of Figure 2(b). A box plot represents the distribution of a quantitative variable using five parameters: minimum value (Min), first quartile ($Q_1$), median (Med) or second quartile ($Q_2$), third quartile ($Q_3$), and maximum value (Max). A box plot thus divides the data into four regions, each containing (approximately) the same number of data. The five parameters required to obtain the box plot are indicated in Figure 3, which also shows the interquartile range (IR), the difference between $Q_3$ and $Q_1$. As shown in Figure 3, a larger region of a box plot (compared with another of its regions) means that the data are less dense in the range of that region, that is, that the data have a higher dispersion, and not that there are more data in that interval [32,33]. Conversely, a smaller region means that the data are denser and more concentrated (more homogeneous) in the range of that region. The length of each region is therefore inversely proportional to the data density in that region, or directly proportional to the data dispersion.
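The five box plot parameters and the IR can be computed with the Python standard library, as in the minimal sketch below. The quartile convention is an assumption: several conventions exist and the text does not state which one was used, so `method="inclusive"` (linear interpolation over the observed data) is only one reasonable choice. The function name and the sample values are hypothetical.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return Min, Q1, Med, Q3, Max and the interquartile range IR."""
    values = sorted(data)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Assumed convention: "inclusive" interpolates linearly between
    # order statistics and treats the data as the whole population.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "Min": values[0],
        "Q1": q1,
        "Med": med,
        "Q3": q3,
        "Max": values[-1],
        "IR": q3 - q1,
    }


# Hypothetical separation distances (arbitrary units), not measured data.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
print(summary)
```

Only Max and IR are carried forward to the classification; the remaining parameters serve to draw the box plot itself.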

Section Classifying Algorithm.
The sections of the automotive piece, Figure 2(b), are classified according to their EDA behavior. The classification is performed in two phases. The first phase groups the sections into three classes according to the upper limit parameter ($\mathrm{Lim}_{\sup}$): it evaluates whether the Max value of each section is greater or less than the $\mathrm{Lim}_{\sup}$ of both box plots, and the sections with similar behavior are grouped. Once the sections are grouped into classes, the second phase subclassifies the sections contained in each class into three levels according to the dispersion threshold ($T$): it evaluates whether the length ratio parameter (Rat) is greater or less than $T$ for both box plots of each section, and the sections with similar behavior are grouped. This second phase is carried out for all classes obtained in the first phase. Therefore, different classes are obtained, and each class contains various levels, which are determined independently for each class.
The first phase is performed with the parameter $\mathrm{Max}^{k}_{i}$, where $k$ specifies the sizes of the nodules ($k = s$) or the separation distances between nodules ($k = d$) and $i$ is the number of the section (Figure 2(b)) to which the parameter belongs. The first phase is based on the upper limit, $\mathrm{Lim}^{k}_{\sup}$, defined for the separation distances between the nodules ($k = d$) as well as for the sizes of the nodules ($k = s$) as the average of the values $\mathrm{Max}^{k}_{i}$ over the $n$ sections:
$$\mathrm{Lim}^{k}_{\sup} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Max}^{k}_{i} \qquad (1)$$
The first phase of the section classifying algorithm is described by the flowchart of Figure 4. This classification groups the behavior of the two box plots (separation distances and sizes of the nodules) into three classes, where a class represents a box plot behavior with respect to the calculated limits, as shown in Table 1.
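The first phase can be sketched in a few lines of Python. The upper limit follows the definition in the text (the average of the per-section Max values). The mapping of outcomes to class letters is an assumption, because the contents of Table 1 are not reproduced here: the sketch assigns Class A when neither Max exceeds its limit, Class B when exactly one does, and Class C when both do. Treating a value equal to the limit as "not above" is also an assumption, and the Max values are hypothetical.

```python
def upper_limit(max_values):
    """Lim_sup: arithmetic mean of the per-section Max values."""
    return sum(max_values.values()) / len(max_values)


def classify_phase_one(max_size, max_dist):
    """Group sections by how many of their two Max values exceed the limits.

    Assumed mapping (Table 1 is not available): 0 above -> "A",
    1 above -> "B", 2 above -> "C". Ties count as not above.
    """
    lim_size = upper_limit(max_size)
    lim_dist = upper_limit(max_dist)
    classes = {}
    for section in max_size:
        above = (max_size[section] > lim_size) + (max_dist[section] > lim_dist)
        classes[section] = "ABC"[above]
    return lim_size, lim_dist, classes


# Hypothetical Max values per section, not the measured ones.
max_size = {1: 6.0, 2: 10.0, 3: 9.0, 4: 9.0, 5: 6.0}
max_dist = {1: 4.0, 2: 8.0, 3: 5.0, 4: 7.0, 5: 6.5}

lim_size, lim_dist, classes = classify_phase_one(max_size, max_dist)
print(lim_size, lim_dist, classes)
```

Because the limits are averages over the sections under study, adding or removing a section changes the limits and may change the classes of the others.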
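Once the sections are classified into three classes, the second phase is applied to each class obtained from the first phase, using the length ratio
$$\mathrm{Rat}^{k}_{i} = \frac{\mathrm{IR}^{k}_{i}}{\mathrm{Min\,IR}^{k}_{c}} \qquad (2)$$
where $\mathrm{Min\,IR}^{k}_{c}$ is the minimum of the $\mathrm{IR}^{k}_{i}$ of the sections contained in a given class $c$, $k$ denotes the sizes of the nodules ($s$) or the separation distances between nodules ($d$), and $i$ is the section number. $\mathrm{Rat}^{k}_{i}$ thus expresses how large each $\mathrm{IR}^{k}_{i}$ is with respect to $\mathrm{Min\,IR}^{k}_{c}$. As mentioned before, large box lengths (large IR) represent a large dispersion of the data contained in the box; on the contrary, short box lengths mean a higher concentration (less dispersion) of the data. Therefore, $\mathrm{Rat}^{k}_{i}$ represents how large the data dispersion of the box region is with respect to the minimum dispersion, $\mathrm{Min\,IR}^{k}_{c}$, of the box plots in the class. In order to classify each section of a class into a level, $\mathrm{Rat}^{k}_{i}$ is compared with the dispersion threshold $T^{k}$, which represents the maximum dispersion allowed with respect to $\mathrm{Min\,IR}^{k}_{c}$. This second phase is described by the flowchart of Figure 5 and summarized in Table 2.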
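The second phase can be sketched as follows, applied to the sections of a single class. The level rule follows Table 2 (both ratios below the threshold gives Level 1, only one gives Level 2, neither gives Level 3). The default threshold of 1.25 is the value reported in the results; treating a ratio equal to the threshold as "not smaller" is an assumption, and the function names and IR values are hypothetical.

```python
def length_ratios(iqr_by_section):
    """Rat = IR / Min IR, computed within one class for one box plot."""
    min_iqr = min(iqr_by_section.values())
    if min_iqr <= 0:
        raise ValueError("the minimum IR must be positive")
    return {section: iqr / min_iqr for section, iqr in iqr_by_section.items()}


def classify_phase_two(iqr_size, iqr_dist, thr_size=1.25, thr_dist=1.25):
    """Assign a level to every section of one class, following Table 2."""
    rat_size = length_ratios(iqr_size)
    rat_dist = length_ratios(iqr_dist)
    levels = {}
    for section in iqr_size:
        # Count how many of the two ratios are smaller than their threshold.
        below = (rat_size[section] < thr_size) + (rat_dist[section] < thr_dist)
        levels[section] = {2: 1, 1: 2, 0: 3}[below]
    return levels


# Hypothetical IR values for the two sections of one class.
iqr_size = {2: 3.0, 4: 2.0}
iqr_dist = {2: 2.0, 4: 2.2}

levels = classify_phase_two(iqr_size, iqr_dist)
print(levels)
```

Since the section holding the minimum IR always has a ratio of 1, a class with a single section necessarily falls into Level 1 under this rule.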
Once the sections are classified, a number of tribological tests are performed to validate that, in fact, different EDA behaviors of the sizes and separation distances of the nodules represented by classes and levels have an effect on wear.

Experimental Materials.
In Figure 6 different microstructures for each section of the automotive piece (Figure 2(b)) are presented. As mentioned in the experimental materials section, different geometries in a casting piece produce changes in its microstructure, causing different nodule sizes and different separation distances between nodules, as observed in Figure 6. For each section, the condition of analyzing between 2000 and 2500 particles, imposed because of the diversity of sizes in each microstructure, is satisfied. The number of images taken for each section is shown in Figure 6.

Microstructure Analysis.
Some works have performed microstructural analyses using techniques such as compacity and size distribution curves, but none of them has so far reported a formal statistical analysis of these curves or a relation between these techniques and wear [22,23]. The results of applying these techniques to characterize the microstructure of the five sections of the automotive piece of Figure 2(b) are shown in Figures 7 and 8. The section curves obtained (Figures 7 and 8) vary widely and overlap each other, making it difficult to describe their main characteristics. For this reason, these techniques, by themselves, do not provide enough information to establish a relationship with wear. However, the information they provide is used to develop the proposed methodology, which is the aim of this work.

Exploratory Data Analysis (EDA).
This methodology proposes applying the EDA to the information obtained through the size distribution and compacity curves in order to describe the main features of the resulting curves. The results of applying the EDA to the size distribution and compacity curves are shown in Figures 9(a) and 9(b), respectively.

Section Classifying Algorithm.
Following the proposed methodology, the upper limits were obtained by (1). The limit obtained for the box plots of the sizes of the nodules was $\mathrm{Lim}^{s}_{\sup} = 8.03$, and the limit for the separation distances between nodules was $\mathrm{Lim}^{d}_{\sup} = 5.90$. Using these limits, the Max values obtained, and the first phase of the section classifying algorithm (Figure 4), the sections were classified as shown in Table 3.
The box plots shown in Figures 10(a) and 10(b) present the classified sections according to the results in Table 3. Section 1 (blue) was grouped in Class A, sections 3 and 5 (orange) were grouped in Class B, and sections 2 and 4 (green) were grouped in Class C. The second phase of the section classifying algorithm was applied to each section within the classes obtained in the first phase, and the results are shown in Table 4. For this second phase, the dispersion thresholds for the nodule sizes and for the separation distances were assigned as $T^{s} = 1.25$ and $T^{d} = 1.25$; these identify whether the box region is longer or shorter than 125% of the smallest box region length in a class. As can be seen in Table 4, section 1 was grouped in Class A-Level 1, sections 3 and 5 were classified into Class B-Level 2, and sections 2 and 4 were classified into Class C, with section 2 in Level 2 and section 4 in Level 1.

Validation-Testing Wear.
To validate the results obtained with the proposed methodology, a number of tribological tests were performed on the sections of the automotive piece (Figure 2). The equipment used for the wear tests was a CSM Instruments Standard Tribometer. The wear tests were performed according to the ASTM G99-05 standard, specifically by the weight loss method.

The Weight Loss Method.
The weight loss method was used in this study, with pins of 6 mm and 15 mm in diameter, a wear radius of 2.5-3 mm, a linear velocity of 15 cm/s, a load of 2 N, and 1000 laps. All samples (each section of Figure 2(b)) were cleaned with acetone in the contact area before testing to remove any grease residue or other surface contaminants. Before and after testing, the samples were weighed on a balance with an accuracy of ±0.001 g. Each wear test lasted about an hour, and five tests were performed for each sample. The wear results were calculated from measurements of the lost volume, which is based on the size of the mark formed by the pin at the end of the test. The wear mark was measured with an optical microscope, and the lost volume was calculated by the method proposed in the ASTM G99-05 standard from the radius of the wear mark, the width of the wear mark, and the radius of the pin used in the test. Twenty measurements of the wear mark were performed to obtain an average.
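The equation itself did not survive in the text above. As a hedged illustration, the sketch below implements the geometric relation that ASTM G99 gives for the disk volume loss produced by a spherical-ended pin, assuming no significant pin wear, together with its small-width approximation. That this is the exact form used in this study is an assumption, as are the function names and the sample dimensions.

```python
import math


def disk_volume_loss(track_radius, track_width, pin_radius):
    """Disk volume loss (mm^3) from wear track geometry, all inputs in mm.

    Circular-segment cross section swept around the track:
    V = 2*pi*R * [r^2 * asin(d / (2r)) - (d / 4) * sqrt(4r^2 - d^2)]
    """
    R, d, r = track_radius, track_width, pin_radius
    if not 0 < d <= 2 * r:
        raise ValueError("track width must be in (0, 2 * pin_radius]")
    segment_area = r * r * math.asin(d / (2 * r)) - (d / 4) * math.sqrt(4 * r * r - d * d)
    return 2 * math.pi * R * segment_area


def disk_volume_loss_approx(track_radius, track_width, pin_radius):
    """Small-width approximation: V = pi * R * d^3 / (6 * r)."""
    return math.pi * track_radius * track_width ** 3 / (6 * pin_radius)


# Hypothetical geometry: 3 mm track radius, 0.2 mm track width,
# pin of 6 mm diameter (3 mm radius). Not measured values.
exact = disk_volume_loss(3.0, 0.2, 3.0)
approx = disk_volume_loss_approx(3.0, 0.2, 3.0)
print(f"exact = {exact:.6f} mm^3, approx = {approx:.6f} mm^3")
```

For narrow tracks the two expressions agree closely, which is why the approximation is often sufficient when the track width is small compared with the pin radius.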

Wear Test Results.
In Table 5, the results of the wear tests for each of the sections of Figure 2, according to the weight loss method, are shown. Section 1 has the smallest wear, with a value of 0.006 mm³, followed by section 5 with 0.017 mm³ and section 3 with 0.018 mm³; section 4 presents a wear of 0.022 mm³, and section 2 has the largest wear, with a value of 0.031 mm³.

Discussion
This section shows that the DCI wear behavior is independent of the behavior of traditional parameters such as the graphite percentage, nodular density, and nodularity. It would nevertheless be useful to find a relationship that establishes criteria for classifying DCIs with similar microstructural characteristics and therefore predicts which DCIs will have a similar wear behavior. The scientific contribution of this work is a methodology that finds a relationship capable of grouping the DCI sections that have a similar wear behavior.
As recalled from the introduction, the graphite of the nodules behaves as a self-lubricant, which should produce a direct relationship: the greater the amount of graphite, the higher the wear resistance. Nevertheless, as can be seen in Figure 11, section 5 has the minimum graphite percentage (12.75%), with a wear of 0.017 mm³. On the other hand, section 3 has the highest graphite percentage (14.97%) and would be expected to have the highest wear resistance; however, it has a wear of 0.018 mm³. Therefore, the graphite percentage does not play an important role in the wear resistance. Figure 12 depicts the wear against nodular density, where it can be seen that, for sections 5, 4, and 2, the wear increases as the nodular density does. From these data, an increase in nodular density would be expected to increase the wear. On the contrary, for sections 3 and 5 the wear decreases while the nodular density increases. Consequently, it is impossible to predict the wear behavior by means of the nodular density alone. The wear behaves similarly with respect to nodular density and nodularity, as can be seen in Figures 12 and 13. Thus, neither the nodular density nor the nodularity is enough to classify the DCI according to its wear behavior.
Regarding the results of the proposed methodology, the classes and levels obtained are shown in Table 6. Section 1, classified as Class A-Level 1, has the lowest wear (0.006 mm³); sections 3 and 5, contained in Class B-Level 2, have higher wear (0.018 mm³ and 0.017 mm³, respectively); section 4, contained in Class C-Level 1, has a wear of 0.022 mm³; finally, section 2, classified as Class C-Level 2, presents the highest wear (0.031 mm³).
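The grouping described above can be tabulated directly from the values quoted in the text. The short sketch below uses only the wear volumes and the class and level labels stated in this section; the variable names are illustrative, and the within-group spread is just one simple way to express how similar the sections of a group are.

```python
from statistics import mean

# Wear volume (mm^3) per section, as reported in the wear test results.
wear_mm3 = {1: 0.006, 2: 0.031, 3: 0.018, 4: 0.022, 5: 0.017}
# (class, level) per section, as reported for the classifying algorithm.
label = {1: ("A", 1), 2: ("C", 2), 3: ("B", 2), 4: ("C", 1), 5: ("B", 2)}

# Collect the wear values of the sections that share a class and level.
groups = {}
for section, wear in wear_mm3.items():
    groups.setdefault(label[section], []).append(wear)

# Order the groups by mean wear and measure the spread inside each group.
ranking = sorted(groups, key=lambda g: mean(groups[g]))
spread = {g: max(values) - min(values) for g, values in groups.items()}

for g in ranking:
    print(f"Class {g[0]}-Level {g[1]}: mean wear {mean(groups[g]):.4f} mm^3, "
          f"spread {spread[g]:.4f} mm^3")
```

The only group with more than one section (Class B-Level 2) shows a spread much smaller than the differences between groups, which is the pattern the discussion relies on.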
Part of the contribution of this work is the capability of the proposed methodology to classify the sections of a DCI with similar wear behavior into classes and levels. A class with its own level represents a cluster of microstructures with similar characteristics regarding the separation distances and sizes of their graphite nodules. With the proposed methodology, the different microstructures of a DCI could be classified into different classes and levels based on a set of similar characteristics. Based on the classes and levels obtained, it can be seen that Class A (Table 6) has the most wear-resistant behavior; this is because the microstructure in Class A has smaller nodules and shorter separation distances, as can be seen from the compacity and size distribution curves. Since the graphite nodules are closer and smaller, the nodular density is higher, and more molecules are released to form a higher density of lubrication layers, which significantly reduces friction along the microstructure and, consequently, the wear. Thus, microstructures with small nodules and short separation distances between them have a better wear behavior. Such a classification would be impossible with the traditional parameters, since they consider neither the distribution of the sizes of the graphite nodules nor the spatial distribution of the nodules within the ferritic matrix.

Conclusions
It was shown in this work that the traditional parameters, nodular density, percentage of graphite, and nodularity, are not sufficient to establish a relationship with the wear of a DCI piece, because these parameters do not consider the distribution of the sizes of the graphite nodules, or the spatial distribution of the nodules within the ferritic matrix.
On the other hand, it was shown that the proposed methodology, based on the use of EDA, establishes criteria for classifying the sections of an automotive piece into different classes and levels. The obtained classes and levels group similar behavior of the separation distances and sizes of the nodules, a complicated task when no statistical analysis is used, as was seen in the results of applying only the compacity and size distribution curves.
The present work shows that sections contained in a particular class and level have a similar wear behavior, as seen in the results obtained with the proposed methodology. Therefore, when the wear and the microstructure of a DCI are related, it is important to analyze the separation distances and sizes of the nodules and not only the traditional parameters, since the latter are not sufficient to establish a relationship with wear.
The proposed methodology can detect, within a set of DCI microstructures, a particular microstructure whose microstructural characteristics or wear behavior differ from those of the rest of the set. This methodology could therefore be implemented as a quality procedure to detect microstructures that do not meet certain microstructural characteristics or wear behavior required by a specific application.

Figure 1: The general diagram of the proposed methodology.

Figure 2: (a) Automotive piece to be analyzed; (b) sections of automotive piece.

Figure 4: Flowchart for the first phase of the section classifying algorithm.

Figure 5: Flowchart for the second phase of the section classifying algorithm.
As can be seen in Figures 9(a) and 9(b), every section presents a different distribution, which indicates a distinct behavior of both the nodule sizes and the separation distances between nodules.

Figure 7: Size distribution curves of the graphite nodules.

Figure 11: Graphic of the wear versus graphite percentage.

Figure 12: Graphic of the wear versus nodular density.

Table 1: Summary of the first phase of the section classifying algorithm.

Table 2: Summary of the second phase of the section classifying algorithm.
Length behavior of the box region of the box plot | Level
$\mathrm{Rat}^{k}_{i}$ for both box plots of the section is smaller than the dispersion threshold $T^{k}$ | 1
$\mathrm{Rat}^{k}_{i}$ for only one box plot of the section is smaller than the dispersion threshold $T^{k}$ | 2
$\mathrm{Rat}^{k}_{i}$ for both box plots of the section is bigger than the dispersion threshold $T^{k}$ | 3

Table 3: Results of the first phase of the section classifying algorithm.

Table 4: Results of the section classifying algorithm.

Table 5: Wear test results.

Table 6: Comparison between traditional parameters, results of the proposed methodology, and wear tests.