A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods consider neither the scalability of the classification results nor their subsequent application to product configuration, the classification becomes an isolated operation. However, transforming customer requirement information into quantifiable values would allow a dynamic classification according to specific conditions and would enable an association with product configuration in an enterprise. This paper introduces a classification analysis based on quantitative standardization, which focuses on (i) expressing customer requirement information mathematically and (ii) classifying customer requirement information for product configuration purposes. The classification analysis treats customer requirement information as follows: first, the information is transformed into standardized values using mathematics; it is then classified by calculating its dissimilarity with the general customer requirement information related to the product family. Finally, a case study is used to demonstrate and validate the feasibility and effectiveness of the classification analysis.


Introduction
In the past decades, the classification of customer requirement information (CRI) has become increasingly important in the entire product development and manufacturing process. Numerous researchers have accomplished valuable work in developing approaches to classification, and three of these methods, that is, those based on the Kano model, the product hierarchical structure, and the data format, are the most prominent.
The Kano model was introduced by Noriaki Kano, who classified customer preferences into threshold, performance, and excitement attributes to guide design decisions [1]. The Kano model is often associated with quality function deployment (QFD) and fuzzy mathematics, which are utilized to determine the weights or importance of customer requirements [2,3]. The product hierarchical structure classifies CRI into different types according to modern product hierarchies, such as function, form, extension, and price requirements [4,5], and is a form of qualitative analysis intended to provide performance indicators. The data format divides CRI into binary, option, parameter, description, and interpretation data types in terms of common data formats on the Web [6,7]. Compared with the product hierarchical structure and the Kano model, the classification based on the data format has already solved the problem of quantitative expression and analysis of CRI. Moreover, the premise of a classification based on the data format is to collect CRI via the Web, an approach that is already adopted and is likely to be widely utilized in the future [8]. The classification of CRI using the data format facilitates not only the subsequent analysis of the information but also its management. Nevertheless, classification based on the data format has certain limitations, because this method does not take into account the number of data formats that exist on the Web or how this classification relates to the entire product configuration process.

Mathematical Problems in Engineering
In summary, traditional classification methods focus on customer requirements rather than on product configuration, such that the classification results are neither extendable nor scalable. The processes of CRI classification and product configuration can occur independently of one another. For instance, customers' preferences can be transformed into product design [9,10] without performing CRI classification, although CRI classification results are useful for determining them. Additionally, the lack of mathematical analysis of CRI means that it is difficult to use the classification results directly for product configuration purposes. Therefore, the modern approach to CRI classification is no longer simply a matter of dividing the information into groups; instead, it has to consider (i) how to analyze and express CRI with mathematical methods and (ii) how to classify CRI for the purpose of product configuration.
In this paper, we introduce a classification analysis based on quantitative standardization. Section 2 describes a mathematical model to analyze CRI in terms of product families. Section 3 provides details as to how to transform CRI into quantitative standards. On the basis of Sections 2 and 3, Section 4 presents the classification method. Finally, in Section 5 a case is demonstrated to confirm the feasibility and effectiveness of the proposed method.

Background Review
2.1. CRI Structured Procedure. The increasing application of e-commerce has been transforming the acquisition method of choice from offline to online. Online CRI acquisition mainly depends on the Web and benefits from the advantages it offers in terms of efficiency, affordability, and convenience [11]. Online CRI can be submitted in the form of XML documents, which can contain multiple kinds of data and can adapt to the dynamic development of CRI. An XML document containing CRI is referred to as a customer requirement information document (CRID). Because of the XML framework, each CRID can tag various CRI features of each customer; however, these features contain multiple data types characterized by fuzziness, concealment, and similarity, such that it is difficult to correctly identify the information for use in the process of product configuration. Thus, it is necessary to transform CRI into a structured model corresponding to the product family model to enable the information to be translated into product development for manufacturing. Research on the structure model of the product family has led to the construction of general customer requirement information (GCRI) [12], which abstracts a series of similar CRI features whose personalized features are distinguished by the specific values of cases [13]. Figure 1 illustrates the standardization procedure in which CRI is transformed into GCRI, which means that features with values in CRID categories can be transformed into the corresponding GCRI classes in a structured model.

CRI Document Model.
In this model, the CRI is submitted in the form of a CRID. Because a CRID is a document, a document representation model can guide the CRID representation. Because a document is composed of words, the word is the most widely used unit of information in document modeling [14]; namely, a document representation model can be established by using the characteristics of words and is implemented by the Vector Space Model (VSM). The VSM [15] is a vector model that places the characteristics of words extracted from a document in a Euclidean space [16]. A word characteristic corresponds to a separate term. If a term occurs in the document, its value in the vector is nonzero [17].
The main idea of VSM is as follows: let $D = [d_1, d_2, \ldots, d_n]$ be a document set, with each document $d_i$ ($i = 1, 2, \ldots, n$) represented by a set of terms $T = [t_1, t_2, \ldots, t_m]$; any $t_j$ ($j = 1, 2, \ldots, m$) corresponds to one dimension in the VSM such that $D$ can be an $n \times m$ matrix, which means that documents can be mapped to points in the VSM and their similarity can be calculated by distances. To date, the VSM is the most efficient and useful document representation model because it transforms the similarity between two documents into the similarity between two vectors [18]. Thus, the representation model of the CRID is a one-dimensional vector containing the values of the CRI features. If the number of CRID tags is $m$ and the CRI features are independent of each other, any CRID $c_i$ can be represented as a one-dimensional set $[x_{i1}, x_{i2}, \ldots, x_{im}]$. On the basis of this, if the number of CRIDs is $n$ and each CRID defines $m$ features, there will be an $n \times m$ matrix shown as follows:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}. \quad (1)$$

In the matrix $X$, $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) is the value of feature $j$ submitted by customer $i$. However, an existing problem of CRI features is that the information may occur as a combination of multiple data types such as numbers and words. This issue would have to be addressed by analyzing the similarity of these values to introduce quantitative standardization of CRI, such that $x_{ij}$ in matrix $X$ could be changed into a uniform fundamental unit [19].
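The vector representation described above can be illustrated with a minimal Python sketch. It assumes that the CRI feature values have already been converted to numbers; the three example rows and the helper names `euclidean_distance` and `pairwise_distances` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical, already-quantified CRID matrix: one row per customer (CRID),
# one column per CRI feature. The values are illustrative only.
crid_matrix = [
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
    [1.0, 0.25, 0.0],
]


def euclidean_distance(a, b):
    """Distance between two documents represented as equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(matrix):
    """Return the symmetric n x n distance matrix for the n rows of `matrix`."""
    n = len(matrix)
    return [
        [euclidean_distance(matrix[i], matrix[j]) for j in range(n)]
        for i in range(n)
    ]


distances = pairwise_distances(crid_matrix)
```

In this sketch, two CRIDs with identical feature values lie at distance 0, which is the situation in which documents are treated as equivalent in a vector space.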

Classification and Clustering.
Classification and clustering are two major methods for information analysis, especially for data in XML documents [20]. The aim of classification is to build a classifier based on some cases with some attributes to describe the objects or one attribute to describe the group of the objects. Then, the classifier is used to predict the group attributes of new cases based on the values of other attributes [21]. The aim of clustering is to find groups of objects such that the objects in a group will be similar to one another and different from the objects in other groups.
The clustering algorithm has access only to the set of features describing each object; it is not given any labels as to where each of the instances should be placed within the partition [22]. Thus, classification is supervised learning because the targets are predefined, whereas clustering is generally used in an unsupervised fashion.
The typical clustering algorithm is $k$-means [23], which aims to partition $n$ observations into $k$ clusters. Each observation belongs to the cluster with the nearest mean. Since the sum of squares is the squared Euclidean distance [24], this is intuitively the "nearest" mean. $k$-means clustering is fast to compute and is compatible with massive data. However, because it is unsupervised, the issue arises of how to choose $k$ and the centroids. Subsequent research on semisupervised clustering [25] aims to remedy this defect, because guiding a clustering algorithm is very efficient for improving its quality.
In practice, if the centroids and $k$ can be defined by some technical indicators, a classifier can be built based on the dissimilarity calculation among objects. The following sections introduce the construction of such a classifier for CRI.
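The idea of a classifier with predefined centroids can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this paper: the centroid labels, coordinates, and the function name `nearest_centroid` are hypothetical, and Euclidean distance is assumed as the dissimilarity.

```python
import math


def nearest_centroid(observation, centroids):
    """Return the label of the centroid closest to `observation`.

    `centroids` maps a label to a point; Euclidean distance is assumed.
    """
    return min(
        centroids,
        key=lambda label: math.dist(observation, centroids[label]),
    )


# Hypothetical predefined centroids; in k-means these would be estimated,
# whereas here they are fixed in advance.
centroids = {"basic": (0.0, 0.0), "premium": (1.0, 1.0)}
observations = [(0.1, 0.2), (0.9, 0.7), (0.4, 0.3)]

labels = [nearest_centroid(obs, centroids) for obs in observations]
```

Because the centroids are supplied rather than learned, no iteration is needed and the number of groups is fixed by the number of centroids.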

Quantitative Standardization of CRI
The CRI features contain multitype data values in a CRID, including nominal and scaled variables. The nominal variables are binary and multiple, and the scaled variables are measured. The process of CRI quantitative standardization is shown in Figure 2 and has the purpose of realizing quantification for the assignment of uniform fundamental units of CRI feature values.

Quantitative Analysis.
The nominal binary variables take the values 0 and 1, where 0 means that a CRI feature value does not exist and 1 means that it exists. The scaled variables are real values with fundamental units. The nominal multiple variables differ from the former two in that they have multiple states, which may not correspond to real values. Thus, a quantitative analysis is required for variable assignment.
To account for both the differences among the various features and the differences between the states of a feature, methods based on fuzzy mathematics and linguistic representation are proposed for the quantitative analysis.
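(i) Quantitative Analysis Based on Fuzzy Mathematics. Let $U$ be a discussion universe; if the mapping $\mu_{\tilde{A}} : U \to [0, 1]$, $u \mapsto \mu_{\tilde{A}}(u) \in [0, 1]$ defines a fuzzy set $\tilde{A}$ in $U$, $\mu_{\tilde{A}}(u)$ will be named the membership function of the fuzzy set $\tilde{A}$ [26], which can be expressed as

$$\tilde{A} = \left(\mu_{\tilde{A}}(u_1), \mu_{\tilde{A}}(u_2), \ldots, \mu_{\tilde{A}}(u_n)\right). \quad (2)$$

The defuzzification calculation formula of $\tilde{A}$ is

[Equation (3): defuzzification formula of the fuzzy set $\tilde{A}$]

The membership function is also represented as a fuzzy distribution, of which the trapezoidal distribution is commonly used. The trapezoidal distribution $\tilde{A}$ is described by four parameters $\tilde{A} = (a, b_1, b_2, c)$, whose fuzzy membership function can be expressed as

$$\mu_{\tilde{A}}(x) = \begin{cases} 0, & x < a, \\ \dfrac{x - a}{b_1 - a}, & a \le x < b_1, \\ 1, & b_1 \le x \le b_2, \\ \dfrac{c - x}{c - b_2}, & b_2 < x \le c, \\ 0, & x > c. \end{cases} \quad (4)$$

According to Chen's model [27], the defuzzification calculation formula of $\tilde{A}$ is

[Equation (5): defuzzification formula of the trapezoidal distribution $\tilde{A}$ according to Chen's model]

(ii) Quantitative Analysis Based on Linguistic Representation. The linguistic representation is achieved by constructing a fuzzy function with the linguistic evaluation scale rule in Table 1. The number of linguistic terms is determined by the number of states of the nominal multiple variable, so that a linguistic set {N, VL, L, RL, LL, M, LH, RH, H, VH, P} can be defined. Finally, the linguistic set is transformed into fuzzy functions by linguistic representation [28,29].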
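As an illustration of the fuzzy quantities discussed above, the following Python sketch evaluates a trapezoidal membership function and defuzzifies a discrete fuzzy set. Several points are assumptions: the parameter names `a`, `b1`, `b2`, `c` are illustrative, and the membership-weighted average used in `centroid_defuzzify` is one common defuzzification choice that is substituted here for demonstration; it is not claimed to be the formula of Chen's model [27] or the formula used in this paper.

```python
def trapezoidal_membership(x, a, b1, b2, c):
    """Membership degree of x in a trapezoidal fuzzy number (a, b1, b2, c).

    Assumes a <= b1 <= b2 <= c: the degree rises linearly on [a, b1],
    equals 1 on [b1, b2], and falls linearly on [b2, c].
    """
    if not a <= b1 <= b2 <= c:
        raise ValueError("parameters must satisfy a <= b1 <= b2 <= c")
    if b1 <= x <= b2:
        return 1.0
    if a < x < b1:
        return (x - a) / (b1 - a)
    if b2 < x < c:
        return (c - x) / (c - b2)
    return 0.0


def centroid_defuzzify(universe, memberships):
    """Membership-weighted average of a discrete fuzzy set.

    This is an assumed, commonly used defuzzification, not Chen's model.
    """
    total = sum(memberships)
    if total == 0:
        raise ValueError("fuzzy set has no support")
    return sum(u * m for u, m in zip(universe, memberships)) / total


# Hypothetical trapezoid and discrete fuzzy set, for illustration only.
rising = trapezoidal_membership(0.3, 0.2, 0.4, 0.6, 0.8)
plateau = trapezoidal_membership(0.5, 0.2, 0.4, 0.6, 0.8)
crisp = centroid_defuzzify([0.0, 0.5, 1.0], [0.0, 1.0, 1.0])
```

A linguistic term such as "M" could be mapped to one such trapezoid, after which a defuzzification step yields a single quantitative value per state.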

Standard Transform.
The similarity of quantitative CRI with different fundamental units is measured by using a standard transform formula:

[Equation (6): standard transform formula]

This standard transform formula eliminates the influence of fundamental units and unifies the CRI standardized values in the range [0, 1].
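A transform with the two stated properties (unit independence and a [0, 1] range) can be sketched with min-max scaling. This is an assumption made for illustration: min-max scaling is a common way to obtain these properties, but the exact standard transform formula of this paper is not reproduced here. The function name and the example values are hypothetical.

```python
def min_max_standardize(values):
    """Rescale a list of quantitative feature values to the range [0, 1].

    Assumed form: (v - min) / (max - min). A constant feature carries no
    information about differences, so it is mapped to 0.0 throughout.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical scaled features with different fundamental units.
screen_inches = [4.0, 5.0, 6.0]
storage_gb = [16, 32, 64, 128]

screen_std = min_max_standardize(screen_inches)
storage_std = min_max_standardize(storage_gb)
```

After such a transform, features measured in inches and in gigabytes become directly comparable, which is what a dissimilarity calculation across features requires.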

Classification Analysis of CRI
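(C) In (8), if the dissimilarity between CRID $i$ and CRID $j$ is 0, the two CRIDs will be merged such that the dimensionality of the CRID matrix is reduced. The resulting $h \times m$ ($h \le n$) matrix is shown as

[Equation (9): reduced $h \times m$ matrix]

(E) In (10), if the dissimilarity between a feature value and its GCRI case is nonzero, the feature value will be marked, and its corresponding CRID will be submitted and recorded. The classification results depend on the number of nonzero dissimilarities in the CRID.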
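Steps (C) and (E) can be sketched in Python as follows. The dissimilarity measure is an assumption: the absolute difference of standardized values is used purely for illustration and stands in for formulas (8) and (10), which are not reproduced here. All function names and example values are hypothetical.

```python
def feature_dissimilarity(x, y, tol=1e-9):
    """Assumed per-feature dissimilarity: 0 if values match, else |x - y|."""
    diff = abs(x - y)
    return 0.0 if diff <= tol else diff


def crid_dissimilarity(row_a, row_b):
    """Assumed CRID dissimilarity: sum of per-feature dissimilarities."""
    return sum(feature_dissimilarity(x, y) for x, y in zip(row_a, row_b))


def merge_identical_crids(matrix):
    """Step (C): keep one representative per group of CRIDs at dissimilarity 0."""
    reduced = []
    for row in matrix:
        if all(crid_dissimilarity(row, kept) != 0 for kept in reduced):
            reduced.append(row)
    return reduced


def marked_features(row, gcri_case):
    """Step (E): indices of features whose value differs from the GCRI case."""
    return [
        j
        for j, (x, g) in enumerate(zip(row, gcri_case))
        if feature_dissimilarity(x, g) != 0
    ]


# Hypothetical standardized CRIDs and a hypothetical GCRI case (the centroid).
crids = [
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.78, 1.0],
    [1.0, 0.5, 0.0],
]
gcri_case = [0.0, 0.5, 1.0]

reduced = merge_identical_crids(crids)
marked_counts = [len(marked_features(row, gcri_case)) for row in reduced]
```

In this sketch, the number of marked features per CRID plays the role of the class: a count of 0 means that the CRID matches the GCRI case, and larger counts indicate CRIDs that depart further from it.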

CRI Quantitative Standardization for a Smart Phone.
The CRI scaled features for a smart phone have real values with fundamental units, some of which are chosen for the demonstration of quantitative standardization.The data is initially subjected to the process of quantitative analysis and standard transformation described in Section 3 and the resulting standardized values are listed in Table 2. Similarly, the standardized values of CRI nominal binary features are shown in Table 3.
The CRI nominal multiple features for a smart phone have no fewer than two states, for example, depth and color.Because these states are not expressed by quantitative values, they have to be transformed by fuzzy mathematics and linguistic representation to obtain quantitative values, which are subsequently changed into standardized values.The process of quantitative standardization is presented in Tables 4 and 5.
In Table 4, three states of depth are transformed into a trapezoidal distribution by the linguistic evaluation scale rule and fuzzy membership distribution. Then, the quantitative values of defuzzification are calculated by Chen's model. After performing a standard transformation, the standardized values of the three depth states are 0, 0.78, and 1, respectively. Table 5 contains the results of the transformation of seven states of color into a fuzzy set vector by a color universe based on RGB [30]. The quantitative values and the standardized values are calculated by a defuzzification formula of a fuzzy set vector.

CRI Classification Analysis for a Smart Phone.
If the GCRI feature set and their cases (set of standardized values) are selected as indicated in Table 6, these cases will produce the quantitative standardized feature values of 30 CRIDs listed in Table 7, which constitute a 30 × 6 matrix. In accordance with the dissimilarity measurement, the dimension reduction matrix is a 22 × 6 matrix.
Finally, the values in Table 7 were processed in terms of the dissimilarity measurement formula for the selected GCRI. From the microscopic view, the classification analysis also reveals which GCRI features are easily challenged or ignored. Thus, for those challenged GCRI features, adjustments to the product family model would be considered for better product configuration. Furthermore, because the GCRI can be updated, a flexible classification can be achieved if the GCRI is used as the centroid.

Further Discussion
Production in enterprises is largely oriented towards CRI. Because original CRI can hardly be used for modeling a product family, GCRI is widely adopted in order to transform CRI into a structured model. This structured model composed of GCRI is referred to as the product family model. Based on the GCRI and the product family, modular product configuration can select appropriate modules and manufacture desirable products. Taking the case of a smart phone, Figure 4 presents product configuration with GCRI and its product family. Thus, using GCRI to distinguish CRI can effectively assist in product configuration.
In the CRI classification analysis for the smart phone, the dissimilarity distribution indicates which CRIDs can meet product configuration. The global analysis diagram shown in Figure 5 gives the proportion of dissimilarity with the GCRI set across all CRIDs and the proportion of dissimilarity with the GCRI cases in each CRID. The enterprise can determine the processing level of each CRID according to its proportion of dissimilarity with the GCRI cases; CRIDs with a higher proportion often require more complex treatment.
Likewise, the CRI classification analysis reveals which GCRI features are easily challenged or ignored. The feature analysis diagram shown in Figure 6 exposes the statistics of features that differ from those in the GCRI cases. Popular CRI features, such as the color feature, are clearly shown. Moreover, enterprises are able to select those elements they wish to include in the GCRI feature set by either adding or removing features presented in this analysis result. These results are expected to be useful for product family renewal.

Conclusions
This paper proposes a classification approach for realizing CRI quantitative analysis aimed at supporting product family adaptation in enterprises. The CRI classification analysis considers both the scalability of the classification results and their subsequent application to product configuration. At the technical level, considering that CRI feature values consist of multiple data types, such as numbers and words, a quantitative analysis based on fuzzy mathematics and linguistic representation is presented. This analysis is capable of revealing not only the differences between the CRI features of a product but also the differences among the states in each CRI feature, thereby avoiding shortcomings such as incomplete expression of states and meaningless assignment of features. Furthermore, the dissimilarity among CRI feature values is measured by utilizing a standard transformation that eliminates the influence of different fundamental units. An association between the classification analysis and product configuration is achieved by using a flexible classification in which the selected GCRI is regarded as the centroid. Therefore, the determination of the classification results is no longer an isolated operation; instead, it reveals which CRI can meet product configuration and which GCRI features can assist in improving the product family.
In engineering practice, the classification analysis turns CRI into quantified, recognizable information that is compatible with other management systems in enterprises, such as ERP or PDM. This is helpful for rapid development and intelligent configuration of the final product. Meanwhile, by using GCRI to discover specific features, enterprises can determine the market positioning of their future products and predict the corresponding product family model. Although the proposed approach is demonstrated by analyzing selected features of smart phones to verify its feasibility and effectiveness, it can be extended to other consumer electronics.
As future work, we will consider the study of a new framework that enables CRI classification analysis to deal with Big Data.

China (2015BAA058). The authors express their most sincere appreciation to Professor Quan LIU, who provided an experimental platform together with valuable advice. The authors are also grateful for assistance received from Professor Qingsong AI and Professor Ping LOU regarding revising the paper.

Figure 1: Summary of the standardization procedure.

Figure 4: Product configuration with GCRI and product family.

Table 1: Linguistic evaluation scale rule.
As mentioned in the CRI standardization procedure, CRI features with values in CRIDs can be transformed into corresponding GCRI with cases in a structured model.If the GCRI is regarded as the centroid, the CRI features can be classified by their dissimilarity with GCRI cases.Because it is possible to change the selection of GCRI flexibly according to specific conditions, such as technical abilities or information variation, the classification can achieve diversification.
(A) If there are $n$ CRIDs, in accordance with the GCRI feature set $\{g_1, g_2, \ldots, g_m\}$, there will be $m$ selected CRI features to constitute an $n \times m$ matrix:

[Equation (7): $n \times m$ matrix of selected CRI feature values]

Table 2: Quantitative analysis of scaled feature variables.

Table 3: Quantitative analysis of nominal binary feature variables.

Table 4: Quantitative analysis of nominal multiple feature variables: depth.

Table 5: Quantitative analysis of nominal multiple feature variables: color.

Table 6: GCRI feature set and cases.

Table 7: Quantitative standardized feature values for smart phone.