Austempered ductile iron has emerged as a notable material in several engineering fields, including marine applications. The carbon content of austenite after austenization but before the austempering process, which generates the bainitic matrix, proves critical in controlling the resulting microstructure and thus the mechanical properties. In this paper, support vector regression is employed to establish a relationship between the initial carbon concentration in austenite and the austenization temperature and alloy contents, thereby enabling improved control of the mechanical properties of austempered ductile irons. In particular, the paper emphasizes a methodology tailored to a limited amount of available data with intrinsically contracted and skewed distributions. The data, collected from a variety of sources, present the further challenge of highly uncertain variance. To meet these challenges, the authors present a hybrid model consisting of a histogram-equalization procedure and a support-vector-machine-(SVM-)based regression procedure, yielding a more robust relationship. The results show greatly improved accuracy of the proposed model in comparison with two previously established methodologies: the sum squared error of the present model is less than one fifth of that of either previous model.
1. Introduction
Austempered ductile iron (ADI) is a specialty heat-treated material that takes advantage of the near-net-shape technology and low-cost manufacturability of ductile iron castings to yield a high-strength, low-cost, and highly abrasion-resistant material. ADI has become an established alternative in many applications that were previously the exclusive domain of steel castings, forgings, weldments, powdered metals, and aluminum forgings and castings [1–6]. The material has also been proven to perform very well under different wear mechanisms such as rolling contact fatigue, adhesion, and abrasion [7, 8]. Considering its low cost, design flexibility, good machinability, high strength-to-weight ratio, and good toughness, wear resistance, and fatigue strength, the usage of ADI has now extended into marine applications, with increasing interest in the study of its corrosion behavior and coatings [5, 9–12].
ADI is obtained by heat treating ductile irons to produce a bainitic matrix, which consists of strong bainitic ferrite platelets and tough high-carbon retained austenite, along with spheroidal graphite nodules [1–3]. The typical microstructure of ductile irons, shown in Figure 1(a), includes spheroidal graphite nodules and the matrix surrounding them. The bainitic matrix of an austempered ductile iron [8] is illustrated in Figure 1(b); a significant amount of retained austenite is present in the form of films and blocks in the matrix.
Typical SEM microstructures of (a) a ductile iron, with a (ferritic) matrix surrounding spheroidal graphite nodules, and (b) an austempered ductile iron, showing detailed bainitic morphology in which distinct ferrite platelets are clearly observed [8].
The heat treatment for developing a bainitic matrix includes two steps. First, ductile irons are heated to an austenization temperature (Tγ) of around 1550–1750°F to change the original matrix into austenite and are then quenched down to the bainite formation temperature range (450–750°F) for one to three hours, during which bainitic ferrite grows isothermally at the expense of austenite, before cooling down to ambient temperature [1–3, 13, 14]. The austenization reverts the matrix structure to the high-temperature austenite phase and in the meantime determines the initial carbon concentration in austenite (Cγ0) before the austempering process, since the graphite nodules in ductile irons are both a sink and a source for carbon atoms. During the isothermal formation of bainite, termed the austempering process, bainitic ferrite forms in a displacive manner at the expense of austenite and partitions excess carbon into the surrounding austenite, which is gradually enriched during the process. The transformation stops, before all austenite is consumed, when the carbon content of austenite reaches a level at which it is thermodynamically impossible for the transformation to proceed further [14–16].
2. Role of Austenization Temperature
The mechanical properties of ADI obtained from a given ductile iron are closely related to its microstructure, which can be controlled by manipulating the austenization and austempering temperatures [1–5]. The initial carbon content in austenite (Cγ0), which is dictated by the austenization temperature (Tγ), has two significant consequences for the final microstructure of ADI. Firstly, Cγ0 affects the choice of austempering temperature, since the temperature range for bainite formation is a strong function of carbon and, to a lesser extent, of the other alloy elements [13, 14, 17]; the austempering temperature is in turn the most important factor in controlling ADI's mechanical properties, since the nature of the bainitic ferrite formed varies with temperature. Additionally, at a given austempering temperature, a higher Cγ0 results in a lower volume fraction of bainite formed during the austempering process [18, 19]. Less bainite formation leads to more retained austenite in the final microstructure as well as less carbon enrichment of that austenite, resulting in a more blocky morphology of retained austenite, which is mechanically unstable and thus detrimental to the mechanical properties [14].
There are two established empirical formulas for estimating Cγ0. The first involves only the austenization temperature and the silicon content [3]:
\[ C_{\gamma 0} = \frac{T_{\gamma}}{420} - 0.17\,(\mathrm{wt\%Si}) - 0.95, \tag{1} \]
where Tγ is in °C, and wt%Si denotes the weight percentage of the silicon content. The other includes further common alloy contents [20]:
\[ C_{\gamma 0} = 1.61\times 10^{-6}\,T_{\gamma}^{2} + 3.35\times 10^{-4}\,T_{\gamma} + 0.006\,(\mathrm{wt\%Mn}) - 0.11\,(\mathrm{wt\%Si}) - 0.07\,(\mathrm{wt\%Ni}) + 0.014\,(\mathrm{wt\%Cu}) - 0.3\,(\mathrm{wt\%Mo}) - 0.435, \tag{2} \]
where wt%Mn, wt%Si, wt%Ni, wt%Cu, and wt%Mo denote the weight percentages of the manganese, silicon, nickel, copper, and molybdenum contents, respectively. It has been indicated that both formulas achieve only limited accuracy compared with experimental results [5]. The unsatisfactory performance of the models is not unexpected, owing not only to the scarcity of available data but also to their unspecified variance resulting from the different instruments and measuring methods employed in the respective data sources. It is clear, nonetheless, that formulas employing merely linear multivariate regression are incapable of producing an accurate model. Moreover, an examination of the data points shows that the distributions of the corresponding features are seriously contracted and skewed. To achieve more reliable accuracy, an inversed lognormal histogram equalizer and a support-vector-machine-(SVM-)based regression are introduced to manipulate the data points. The histogram equalizer, as a preprocessor, reduces the irregularity and enhances the contrast in the distribution of the data points. After equalizing the attribute contrast, an SVM-based regression is employed to fit the input, with its highly uncertain variance, into a noise-tolerant relationship between the austenization temperature Tγ and the alloy contents on one side and the initial carbon concentration Cγ0 on the other. To introduce the model, the procedure of the inversed lognormal histogram equalizer for manipulating the input data is outlined first, followed by the procedure of the SVM-based regression model. Finally, the established model and its prediction results are presented and compared with the previous models.
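For reference, the two published estimates in (1) and (2) can be written directly as functions. The alloy contents are in wt%, Tγ is in °C, and the example values below are illustrative only, not taken from the paper's dataset:

```python
# Sketch of the two empirical estimates of the initial austenite carbon
# content C_gamma0, equations (1) and (2) in the text.

def c_gamma0_eq1(t_gamma_c, si):
    """Equation (1): austenization temperature (degC) and silicon only."""
    return t_gamma_c / 420.0 - 0.17 * si - 0.95

def c_gamma0_eq2(t_gamma_c, mn, si, ni, cu, mo):
    """Equation (2): quadratic in temperature plus common alloy contents."""
    return (1.61e-6 * t_gamma_c ** 2 + 3.35e-4 * t_gamma_c
            + 0.006 * mn - 0.11 * si - 0.07 * ni
            + 0.014 * cu - 0.3 * mo - 0.435)

# Illustrative example: a ductile iron austenized at 900 degC with 2.5 wt% Si
print(round(c_gamma0_eq1(900.0, 2.5), 3))  # -> 0.768
```

Both estimators are simple closed forms, which is precisely why they cannot capture the nonlinear, noisy relationship the SVR model targets below.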
3. Proposed Model

3.1. Inversed Lognormal Histogram Equalizer

In this study, the histogram equalizer is employed to derive a balanced contrast among the regression input attributes. Histogram equalization is often utilized to increase the global contrast of two-dimensional digital images: through a mapping operation, the intensities of an image are distributed expansively over the corresponding histogram, and the discrimination of details thus increases. This enhanced expansiveness is conducive to applications such as X-ray, thermogram, and face detection images [21, 22]. The present study adapts the two-dimensional mapping to the one-dimensional case so as to treat each input attribute, namely the chemical compositions and Tγ, separately. A preliminary inspection of the input attributes shows that their histograms are more or less imbalanced. Following the general assertion that skewed data distributions in nature favor a lognormal distribution [23], a lognormal distribution is assumed for fitting the attributes. An ideal histogram equalizer generates a uniformly distributed histogram after the equalization; an inversed lognormal histogram equalizer, mapping the attributes from a lognormal to a uniform distribution, is hence employed to deal with the attributes' imbalanced tendency.
The lognormal distribution is defined as
\[ T(\rho) = \frac{1}{\rho\sqrt{2\pi\sigma_{\rho}^{2}}}\exp\!\left(-\frac{(\ln\rho-\mu_{\rho})^{2}}{2\sigma_{\rho}^{2}}\right), \tag{3} \]
where μρ and σρ² are the mean and variance of the natural logarithm of the corresponding θ, respectively. The inversed lognormal histogram equalizer can then be given as
\[ \rho = T^{-1}(\theta). \tag{4} \]
The mapping converts the lognormally distributed θ into the uniformly distributed ρ through the inverse function T⁻¹. The uniformly distributed ρ offers a wider and more balanced global contrast, which is advantageous for the generation of the fitted function.
To compose a set of input features that contribute equally to the subsequent regression, the attributes ρ are furthermore normalized as follows:
\[ x_i = \frac{\rho_i - \rho_i^{\min}}{\rho_i^{\max} - \rho_i^{\min}}, \qquad i = 1,\ldots,6. \tag{5} \]
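The steps of (3)–(5) can be sketched in code. One plain reading of the equalizer, consistent with the definitions above, is the lognormal CDF mapping (a lognormal variate pushed through its own CDF is uniform on [0, 1]); the sample data below are illustrative, not the paper's:

```python
import math

def lognormal_to_uniform(theta, mu, sigma):
    """Map a lognormally distributed attribute value onto [0, 1] via the
    lognormal CDF, so the mapped values are (ideally) uniformly distributed.
    One plain interpretation of the 'inversed lognormal histogram
    equalizer' of equations (3)-(4)."""
    z = (math.log(theta) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def min_max_normalize(values):
    """Equation (5): rescale an attribute column into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative right-skewed sample of silicon contents (wt%)
si = [2.1, 2.3, 2.4, 2.5, 2.6, 3.4]
mu = sum(math.log(v) for v in si) / len(si)
sigma = (sum((math.log(v) - mu) ** 2 for v in si) / len(si)) ** 0.5
rho = [lognormal_to_uniform(v, mu, sigma) for v in si]
x = min_max_normalize(rho)
print([round(v, 2) for v in x])
```

The mapping is monotone, so the ordering of the attribute values is preserved while their histogram is flattened toward uniformity.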
3.2. Support Vector Regression
The SVM-based regression, known for handling generalized models of complex uncertain relationships, has been gaining popularity and has shown much improved results [24–26], even when only a few experimental data points, also called observations here, are available. The present study aims to employ support vector regression (SVR) to obtain a more accurate model of the relationship between the initial carbon concentration in austenite and the austenization temperature and alloy contents. By virtue of structural risk minimization [27, 28], the SVM-based method for regression [29–32], similar to that for classification [33, 34], simultaneously minimizes both the model complexity and the empirical errors, and in turn creates a predictor with a wide margin. In classification, the wide margin represents a high generalization capability for separating unlabeled samples. In regression, on the other hand, the wide margin represents a smooth approximating function that rejects variance from noise as much as possible. In contrast to traditional statistical or ANN regression models, which derive the approximating function by minimizing the training error between observed and corresponding predicted responses, SVR minimizes a generalization error that combines the training error with a regularization term controlling the model complexity. Minimizing this generalization error largely rejects highly variant noise and yields a rigid regression.
The SVR is intrinsically a kernel-based method [35]. Given a learning set S = {(x1, y1), …, (xi, yi), …, (xl, yl)}, an approximating function ŷ = f(x) can be established for further prediction. In S, xi denotes the d-dimensional input vector, xi = [xi1, xi2, …, xid]ᵀ ∈ ℜᵈ, and yi ∈ ℜ denotes the corresponding target value of input xi. By using the ε-insensitive loss function (Figure 2(a)) to regularize the degree of rigidness, the optimized f can include all inputs xi within a boundary of ±ε deviation while keeping the boundary (a tube in space) as straight as possible (Figure 2(b)). This essentially regularized rigidness helps SVR find an optimized generalization for the regression. By introducing the kernel trick [34, 35], the regression function can be described as f(x) = κ(w, x) + b, where w = [w1, w2, …, wd]ᵀ ∈ ℜᵈ denotes a weight vector, b denotes the bias term, and κ(·,·) denotes a kernel function adopted to deal with the nonlinearity of the regression. Putting these elements together, the fitting of SVR can be formally expressed as a primal convex optimization problem as follows:
\[ \min_{w,\xi,\bar{\xi}} \; \frac{1}{2}\|w\|^{2} + \lambda\sum_{i=1}^{l}(\xi_i + \bar{\xi}_i), \tag{6} \]
subject to
\[ \begin{aligned} y_i - \kappa(w, x_i) - b &\le \varepsilon + \xi_i, \\ \kappa(w, x_i) + b - y_i &\le \varepsilon + \bar{\xi}_i, \\ \xi_i, \bar{\xi}_i &\ge 0, \quad i = 1,\ldots,l, \end{aligned} \tag{7} \]
where λ denotes the regularization factor, and the slack variables ξ = [ξ1, ξ2, …, ξl]ᵀ and ξ̄ = [ξ̄1, ξ̄2, …, ξ̄l]ᵀ are introduced to allow errors, coping with otherwise infeasible constraints in the optimization and forming a soft margin. The parameter ε, associated with the ε-insensitive loss function Lε(ξ) = max(|ξ| − ε, 0), controls the error tolerance of the regression. The loss function defines the ε-tube which carries out the rigidness of the approximated function, so ε affects the smoothness of the induced regression as well as the number of support vectors. The parameter λ, on the other hand, controls the tradeoff between keeping the subsequent f straight and limiting the deviations to less than ±ε. Let α = [α1, α2, …, αl]ᵀ and ᾱ = [ᾱ1, ᾱ2, …, ᾱl]ᵀ be the Lagrange multiplier vectors for the first two sets of constraints in (7), and form the Lagrangian of the primal problem (6)-(7). The Wolfe dual [36] can then be obtained by differentiating the Lagrangian with respect to w and b as follows:
\[ \max_{\alpha,\bar{\alpha}} \; -\frac{1}{2}\sum_{i,j=1}^{l}(\bar{\alpha}_i-\alpha_i)(\bar{\alpha}_j-\alpha_j)\,\kappa(x_i,x_j) - \varepsilon\sum_{i=1}^{l}(\bar{\alpha}_i+\alpha_i) + \sum_{i=1}^{l}y_i(\bar{\alpha}_i-\alpha_i), \tag{8} \]
subject to
\[ \sum_{i=1}^{l}(\alpha_i-\bar{\alpha}_i) = 0, \qquad \alpha_i,\bar{\alpha}_i \in [0,\lambda], \;\; \forall i. \tag{9} \]
Equations (8) and (9) constitute a quadratic optimization problem, and the optimized α* and ᾱ* can therefore be obtained after the optimization procedure. To take advantage of the sparseness of support vectors [32, 35], only those xi with nonzero αi*, called support vectors (SVs), are taken into account to form the consequent f. With the SVs, the weight vector w can be computed as w = ∑_{xi∈SV}(ᾱi* − αi*)xi, and therefore
\[ f(x) = \sum_{x_i\in SV}(\bar{\alpha}_i^{*}-\alpha_i^{*})\,\kappa(x_i,x) + b. \tag{10} \]
Several kernel functions have been introduced for SVR, including linear, polynomial, and Gaussian kernels [32, 35]. A straightforward way of selecting a kernel function is to choose the one that reflects the natural tendency of the distributed data. In this study, the Gaussian kernel
\[ \kappa(x_i,x_j) = \exp\!\left(-\frac{\|x_i-x_j\|^{2}}{2\sigma^{2}}\right), \tag{11} \]
where σ denotes the width parameter of the corresponding basis function, was employed to accommodate the nonlinearity of the present problem.
The soft margin ε-insensitive loss function (a) corresponds to a linear support vector regression (b).
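The building blocks above, the ε-insensitive loss, the Gaussian kernel (11), and the support-vector expansion (10), can be sketched directly. The support vectors and coefficients below are illustrative placeholders, not a model fitted by solving the dual (8)-(9):

```python
import math

def eps_insensitive_loss(residual, eps):
    """Epsilon-insensitive loss: zero inside the +/-eps tube."""
    return max(abs(residual) - eps, 0.0)

def gaussian_kernel(xi, xj, sigma):
    """Equation (11): Gaussian (RBF) kernel on d-dimensional inputs."""
    sq = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Equation (10): f(x) = sum_i (alpha_bar_i - alpha_i) k(x_i, x) + b.
    `coeffs` holds the differences (alpha_bar_i - alpha_i) for the SVs."""
    return sum(c * gaussian_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Illustrative (hypothetical) support vectors and dual coefficients
svs = [[0.1, 0.2], [0.8, 0.5]]
coeffs = [0.7, -0.3]
y = svr_predict([0.4, 0.3], svs, coeffs, b=0.9, sigma=1.0)
print(round(y, 3))  # -> 1.294
```

Note the sparsity of (10): only support vectors, the points with nonzero dual coefficients, enter the prediction sum.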
3.3. Flowchart of Proposed Model
Following the steps mentioned earlier, a flowchart of the whole proposed model is illustrated in Figure 3. The model comprises two main steps, "data preparation" and "function approximation." From the gathered raw input data, the model can automatically generate an approximating function f(x) for estimating Cγ0.
Flowchart of the proposed model.
4. Results and Discussions
Forty-two experimental data points of Cγ0 were collected from the literature [2, 37–44] for the study. The data points, containing six original attributes θ1(Tγ), θ2(wt%Si), θ3(wt%Mn), θ4(wt%Mo), θ5(wt%Ni), and θ6(wt%Cu), were recorded to establish the relationship Cγ0 = f(θ1–θ6) for prediction. The mean, variance, and skewness of the original attributes were determined, as listed in Table 1. The large variance and skewness found are undesirable, though hardly surprising given that these data points were collected from different sources, in which different instruments and measuring methods were employed.
Table 1: Corresponding metrics of the attributes before and after the inversed lognormal histogram equalization.

             θ before equalization           ρ after equalization (normalized)
             Mean     Variance   Skewness    Mean     Variance   Skewness
Tγ           907.00   2285.70    1.38        0.14     0.07       2.79
wt% Si       2.56     0.08       2.20        0.28     0.17       1.15
wt% Mn       0.31     0.04       1.16        0.41     0.22       0.42
wt% Mo       0.16     0.06       1.21        0.35     0.11       0.71
wt% Ni       0.29     0.28       1.78        0.46     0.03       0.72
wt% Cu       0.18     0.18       2.87        0.71     0.09       −0.50
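The paper does not state which skewness estimator produced Table 1; a common moment-based choice can be sketched as follows (the sample below is illustrative, not the paper's data):

```python
def sample_skewness(values):
    """Moment-based (population) skewness g1 = m3 / m2^(3/2).
    One common estimator; the paper's exact choice is unspecified."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# A right-skewed toy sample gives positive skewness, like most of the
# raw attributes in Table 1
print(round(sample_skewness([1.0, 1.0, 1.0, 10.0]), 3))  # -> 1.155
```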
To enhance the discriminative information contained in the attributes and to ensure satisfactory global contrast, the inversed lognormal histogram equalizer was applied, mapping the original attributes θi to equalized attributes ρi. Figure 4 shows the normalized histograms of the attributes before and after the equalization, and the corresponding metrics of the equalized attributes are compared with those of the original attributes in Table 1. The table shows that the skewness of every attribute except Tγ is significantly reduced; with values close to zero, the remapped attributes are less skewed and more evenly distributed in Figure 4. The panels of Figure 4 also allow the attributes' global contrast to be examined. Compared with the equalized attributes ρ1, ρ2, and ρ3, the original attributes θ1, θ2, and θ3 are more widely spread in global contrast; this unexpected narrowing may be due to overequalization. To fulfill the objective of the study, the combination of original and equalized attributes with the wider spread of contrast was selected to make up the SVR input. Consequently, the three original attributes θ1, θ2, and θ3, together with the three equalized attributes ρ4, ρ5, and ρ6, were chosen for further regression.
Histograms of attributes before (θ1–θ6) and after (ρ1–ρ6) the histogram equalization. For better comparison, all the attributes are normalized for chartings.
The attributes, consisting of the equalized ρ4–ρ6 and the original sets ρ1 = θ1, ρ2 = θ2, and ρ3 = θ3, were then normalized into the range [0, 1] to contribute consistently to the learning process, forming the input vector x = [x1, x2, …, x6]ᵀ.
There are three adjustable parameters, σ, ε, and λ, in the SVR learning phase. To calibrate the parameters for an optimized model, cross-validation (CV) was undertaken: the dataset was randomly partitioned into three groups of 21 data points for the training set, 11 for the validation set, and 10 for the test set. With the training and validation datasets independent of each other, cross-validation was carried out to pursue the lowest generalization error and obtain the corresponding optimized parameters σ*, ε*, and λ*. The model parameterized by σ*, ε*, and λ* is the most generalized model for the prediction of Cγ0 and, as presented in the previous section, is resistant to input noise. To select the most generalized model, the sum squared error and mean squared error
\[ \mathrm{SSE} = \sum_{i=1}^{l_{\mathrm{set}}}(\hat{y}_i - y_i)^{2}, \qquad \mathrm{MSE} = \frac{1}{l_{\mathrm{set}}}\sum_{i=1}^{l_{\mathrm{set}}}(\hat{y}_i - y_i)^{2} \tag{12} \]
are adopted to evaluate the errors in the cross-validation, where lset denotes the set-length of the chosen set, and yi and y^i denote observed and corresponding predicted output responses, respectively.
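The two metrics of (12) translate directly into code; a minimal sketch:

```python
def sse(y_pred, y_true):
    """Sum squared error over a dataset, as in equation (12)."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))

def mse(y_pred, y_true):
    """Mean squared error: SSE divided by the set length l_set."""
    return sse(y_pred, y_true) / len(y_true)

# Illustrative two-point example
print(sse([1.0, 2.0], [1.5, 1.0]), mse([1.0, 2.0], [1.5, 1.0]))  # -> 1.25 0.625
```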
Since there are three parameters to tune, adaptively searching the whole parameter space is inefficient and may converge at a slower rate. In this study, a grid-search method incorporating priority steps was adopted to speed up the search for the optimal solution. The width parameter σ of the basis function determines the nonlinear transformation of the input data; in general, the larger the width, the more linear the induced model will be. The parameter λ is a regularization factor controlling the tradeoff between the training error and the complexity of the induced model; a larger λ penalizes the training error more heavily and induces a more complex model [35]. These two parameters, σ and λ, are the dominant parameters of the SV machine and were designated to be determined first in this study. After some preliminary tests, the ranges σ ∈ [5, 25] and λ ∈ [10^{3}, 10^{7}] were selected for the grid search of (λ, σ). With σ* and λ* determined, the cross-validation then seeks the optimal ε* to achieve the most generalized model. In this study, a relatively small range, [10^{−3}, 10^{−1}], was specified for seeking ε*.
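The two-stage search can be sketched as follows. Here `validation_error(sigma, lam, eps)` stands in for a full train-then-validate cycle; the toy error surface and the default ε used while gridding (λ, σ) are assumptions for illustration, not the paper's procedure in detail:

```python
import itertools

def staged_grid_search(validation_error):
    """Sketch of a priority-stepped grid search: fix (sigma, lambda) on a
    coarse grid first, then tune epsilon with the winners held fixed."""
    sigmas = [5, 10, 15, 20, 25]                 # sigma in [5, 25]
    lambdas = [10 ** k for k in range(3, 8)]     # lambda in [1e3, 1e7]
    eps_default = 1e-2                           # assumed placeholder
    best_sl = min(itertools.product(sigmas, lambdas),
                  key=lambda sl: validation_error(sl[0], sl[1], eps_default))
    epsilons = [10 ** -k for k in (1, 2, 3)]     # epsilon in [1e-3, 1e-1]
    best_eps = min(epsilons,
                   key=lambda e: validation_error(best_sl[0], best_sl[1], e))
    return best_sl[0], best_sl[1], best_eps

# Toy stand-in error surface (purely illustrative, not the paper's data)
def toy_error(sigma, lam, eps):
    return (sigma - 20) ** 2 + abs(lam - 1e7) / 1e7 + abs(eps - 5e-3)

print(staged_grid_search(toy_error))
```

Fixing the dominant pair (σ, λ) first reduces the search from a full 3-D grid to two much smaller sweeps.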
Figure 5 shows the cross-validated SSEs corresponding to changes of σ, λ, and ε. From the panels, λ*, σ*, and ε* were chosen as 10^{7}, 20, and 5 × 10^{−3}, respectively, with a minimal SSE of 4.55 × 10^{−2}. With these optimized parameters, the SVR model was then used to verify its prediction capability on the standalone test dataset.
The sum squared errors (SSEs) in cross-validation according to different (a) λ, (b)σ, and (c) ε settings.
As indicated by the unit-slope graphs in Figure 6, which illustrate the discrepancies between predicted and experimental data, the SVR model established with the chosen parameters shows much improved accuracy compared with the two previous models given in (1) and (2). The MSEs of the three models, on the validation dataset as well as the test dataset, are detailed in Table 2. For example, the error of the present SVR model on the test dataset is only about one third to one fifth of those of the two previous models.
Table 2: The mean squared errors (MSEs) of different models.

Model                      Training set    Validation set   Test set
SVM model                  3.91 × 10^{−4}  4.13 × 10^{−3}   5.52 × 10^{−3}
Regression function (1)    1.13 × 10^{−2}  0.98 × 10^{−2}   1.47 × 10^{−2}
Regression function (2)    1.42 × 10^{−2}  2.05 × 10^{−2}   3.07 × 10^{−2}
Experimental Cγ0 data are plotted against those calculated Cγ0 obtained by the present SVR model, regression function (1), and function (2).
The present model illustrates a more accurate prediction of the initial carbon concentration in austenite after austenization but before austempering for heat treating ductile iron into ADI. With more accurate control of initial austenite carbon concentration, the austempering temperature can be appropriately selected to produce desired ADI microstructure after austempering and ultimately to meet target mechanical properties.
5. Conclusion
In the present paper, support vector regression was used to establish a relationship between the initial carbon concentration of austenite after austenization (Cγ0) and the austenization temperature (Tγ) and alloy contents in austempering processes. The results indicate that SVM regression greatly improves the accuracy of Cγ0 prediction in comparison with two established equations based on linear regression: the overall sum squared error of the present method is roughly one fifth and one eighth of those of the two previous models, respectively. Better control of Cγ0 has proven critical for achieving the desired microstructures and mechanical properties of ADI, which has been applied, among numerous fields, in many marine applications.
The present study also demonstrates the possibility of employing a similar procedure to deal with contracted and skewed observations of highly uncertain variance. SVR, characterized by high noise resistance as well as flexibility in trading off accuracy against complexity, is well suited to observations collected from multiple sources whose instruments and measurement methods vary widely.
Acknowledgments
This work was supported by grants from the National Science Council of Taiwan, under Contracts NSC 100-2218-E-149-002 and NSC 99-2221-E-149-009. The authors would like to thank Hsin-Liang Tai for his great contribution to the paper, Professors Jui-Jen Chou and Daw-Kwei Leu for their useful comments on the study, and Mohammad Arif for his help in formatting this paper.
References

[1] Rundman K. B., Moore D. J., Hayrynen K. L., Dubensky W. J., Rouns T. N., "The microstructure and mechanical properties of austempered ductile iron."
[2] Shea M. M., Ryntz E. F., "Austempering nodular iron for optimum toughness."
[3] Voigt R. C., Loper C. R., "Austempered ductile iron—process control and quality assurance."
[4] Hayrynen K. L., Brandenberg K. R., Keough J. R., "Applications of austempered cast irons."
[5] Uma T. R., Simha J. B., Murthy K. N., "Influence of nickel on mechanical and slurry erosive wear behaviour of permanent moulded toughened austempered ductile iron."
[6] Sahin Y., Erdogan M., Kilicli V., "Wear behavior of austempered ductile irons with dual matrix structures."
[7] Laino S., Sikora J. A., Dommarco R. C., "Development of wear resistant carbidic austempered ductile iron (CADI)."
[8] Chang L. C., Hsui I. C., Chen L. H., Lui T. S., "A study on particle erosion behavior of ductile irons."
[9] Venkatesan R., Venkatasamy M. A., Bhaskaran T. A., Dwarakadasa E. S., Ravindran M., "Corrosion of ferrous alloys in deep sea environments."
[10] Hemanth J., "Solidification and corrosion behaviour of austempered chilled ductile iron."
[11] Hsu C. H., Lu J. K., Tsai R. J., "Characteristics of duplex surface coatings on austempered ductile iron substrates."
[12] Krawiec H., Stypuła B., Stoch J., Mikołajczyk M., "Corrosion behaviour and structure of the surface layer formed on austempered ductile iron in concentrated sulphuric acid."
[13] Rouns T. N., Rundman K. B., Moore D. M., "On structure and properties of austempered ductile cast iron."
[14] Chang L. C., "An analysis of retained austenite in austempered ductile iron."
[15] Chang L. C., "Carbon content of austenite in austempered ductile iron."
[16] Bhadeshia H. K. D. H., Edmonds D. V., "The mechanism of bainite formation in steels."
[17] Chang L. C., Hsui I. C., Chen L. H., Lui S. T., "Influence of austenization temperature on the erosion behavior of austempered ductile irons."
[18] Putatunda S. K., Gadicherla P. K., "Influence of austenitizing temperature on fracture toughness of a low manganese austempered ductile iron (ADI) with ferritic as cast structure."
[19] Darwish N., Elliott R., "Austempering of low manganese ductile iron, part II: influence of austenitising temperature."
[20] Neumann F.
[21] Gonzalez R. C., Woods R. E.
[22] Štruc V., Žibert J., Pavešić N., "Histogram remapping as a preprocessing step for robust face recognition."
[23] Limpert E., Stahel W. A., Abbt M., "Log-normal distributions across the sciences: keys and clues."
[24] Gong Y. B., Chen S. F., "Gray cast iron strength prediction model based on support vector machine."
[25] Tang X., Zhuang L., Jiang C., "Prediction of silicon content in hot metal using support vector regression based on chaos particle swarm optimization."
[26] Liu Y., Yu H., Gao Z., Li P., "Improved online prediction of silicon content in iron making process using support vector regression with novel outlier detection."
[27] Vapnik V. N.
[28] Vapnik V. N.
[29] Vapnik V., Golowich S. E., Smola A. J., "Support vector method for function approximation, regression estimation, and signal processing," in Mozer M., Jordan M., Petsche T. (eds.).
[30] Drucker H., Burges C. J. C., Kaufman L., "Support vector regression machines," in Mozer M., Jordan M., Petsche T. (eds.).
[31] Schölkopf B., Bartlett P. L., Smola A., Williamson R., "Support vector regression with automatic accuracy control," Proceedings of the 8th International Conference on Artificial Neural Networks, 1998, pp. 111–116.
[32] Smola A. J., Schölkopf B., "A tutorial on support vector regression," NeuroCOLT Technical Report NC-TR-98-030, University of London, London, UK, 1998.
[33] Cortes C., Vapnik V., "Support-vector networks."
[34] Burges C. J. C., "A tutorial on support vector machines for pattern recognition."
[35] Schölkopf B., Smola A. J.
[36] Fletcher R.
[37] Rouns T. N., Rundman K. B., "Constitution of austempered ductile iron and kinetics of austempering."
[38] Rundman K. B., Klug R. C., "An X-ray and metallographic study of an austempered ductile cast iron."
[39] Verhoeven J. D., EI Nagar A., EI Sarnagawa B., Cornwell D. P., "A study of austempered ductile cast iron," in Fredriksson H., Hillert M. (eds.), Proceedings of the Physical Metallurgy of Cast Iron, Materials Research Society Symposium, vol. 34, 1985, p. 387.
[40] Grech G., Young J. M., "Effect of austenitising temperature on tensile properties of Cu-Ni austempered ductile iron."
[41] Darwish N., Elliott R., "Austempering of low manganese ductile irons—part 1: processing window."
[42] Nieswaag H., Nijhof J. W., "Influence of silicon on bainite transformation in ductile iron, relation to mechanical properties," in Fredriksson H., Hillert M. (eds.), Proceedings of the Physical Metallurgy of Cast Iron, Materials Research Society Symposium, vol. 34, 1985, pp. 411–415.
[43] Dorazil E.
[44] Aranzabal J., Gutierrez I., Rodriguez-Ibabe J. M., Urcola J. J., "Influence of heat treatments on microstructure and toughness of austempered ductile iron."