A Data-Driven Customer Quality of Experience System for a Cellular Network

Improving customer-perceived service quality is a critical mission of telecommunication service providers. Using 35 billion call records, we develop a call quality score model to predict customer complaint calls. The score model consists of two components: service quality score and connectivity score models. It also incorporates human psychological impacts such as the peak and end effects. We implement a large-scale data processing system that manages real-time service logs to generate quality scores at the customer level using big data processing technology and analysis techniques. The experimental results confirm the validity of the developed model in distinguishing probable complaint callers. With the adoption of the system, the first call resolution rate of the call center increased from 45% to 73%, and the field engineer dispatch rate decreased from 46% to 25%.


Introduction
The desire to understand customer experience was stimulated by intense competition in the telecommunications industry. As the market saturated, thereby intensifying competition, service providers endeavored to increase customer retention. Customer churn, which can result from customer dissatisfaction [1], is disadvantageous for service providers: it causes not only decreases in profit but also the loss of opportunities for existing customers to recommend their service operator to other users. Moreover, attracting a new customer costs more than retaining an existing one [2]. For these reasons, improving customer experiences has been a critical factor in business.
Quality of experience (QoE) has been of intense interest to both academia and industry for the last decade because it can provide a more holistic view of customer experience than its technology-centric complement, quality of service (QoS) [3]. In particular, the implementation of QoE systems has been enabled by developments in computing technology such as big data analytics and cloud computing. Smartphones and other devices have generated a large volume of data in an environment in which information can be stored in distributed storage, called cloud storage, at low cost. In addition, technology development in distributed data processing, such as Apache Spark, enables data processing at low cost [4].
Telecommunication service providers have developed large-scale systems to estimate overall customer satisfaction and have made efforts to improve it. In mobile communication customer surveys in Hong Kong, coverage and transmission quality were found to be important factors [5]. The relationships between customer satisfaction, switching cost, and customer loyalty in France were studied [6]. Customer satisfaction has been proposed to be measured through sales, installation, product use, repair, and billing experiences [7,8]. A model was developed to explain the overall perceived performance at the customer level based on billing, branch networks, fault repair, service provision, brand image, and product satisfaction [9]. Because these approaches rely on surveys, they can be used for a posteriori evaluation, but not for predictive purposes.
Telecommunication service providers manage customer relationships using various types of key performance indicators (KPIs) to evaluate the overall customer-perceived quality. Among the KPIs, customer dissatisfaction with a service is particularly important. There is a positive correlation between customer dissatisfaction and customer complaining behavior [10].

Mobile Information Systems
Experiencing dissatisfaction, customers may take several actions to dissuade their friends or acquaintances from using the service provider's products or services, or to complain through consumer advocacy groups [11].
More recent endeavors implement QoS-/QoE-based systems to measure the status of the network. A method using the operational data from user equipment and different network elements is proposed to overcome the challenges and limitations of manual drive testing [12]. Under the paradigm of big data, the probability of subscriber churn can reportedly be predicted well when intersubscriber influence is taken into account [13]. A network planning tool was developed using the estimated QoS scores based on user-collected measurements through supervised learning tools [14]. A SlopeOne model-based QoS prediction method is also proposed to overcome the difficulty of customer clustering in the popular collaborative filtering algorithm [15]. For the proper management of mobile edge computing ecosystems, a reference architecture is proposed to leverage network functions virtualization and software-defined networking technologies [16].
Many of the studies use consumers' subjective quality experience to improve network operation and consumer relationships, following ITU standards such as the mean opinion score (MOS, ITU-T P.800) [17] or the E-model (ITU-T G.107) [18]. MOS is a subjective quality evaluation model using the averages of individuals' subjective evaluations of the quality of telephone calls. The E-model was developed to calculate MOS using objective metrics such as delay time and packet loss rate for a session; this model is extended to consider video and data (ITU-T P.910) [19].
However, we note that these models may not show customers' overall experience scores; rather, they evaluate only the session. They also fail to explain the differences between customers, as they are prediction models based on logical inferences for consumer-perceived quality. In other words, these models may have little predictive power regarding overall customer-perceived quality.
Among the various complaint actions, the complaint call is an index that can accurately show customer dissatisfaction. Yurtseven advocated using the complaint call as a proxy for the overall perceived quality [20]. The service level index (SLI) measures customers' perceived quality from individual experiences [21]. In this paper, we develop a model that is different from SLI with the objective of predicting the likelihood of complaint calls. Quick and effective responses to complaint calls can prevent customer churn [22].
To the best of our knowledge, little effort has been devoted to predicting the likelihood of complaint calls. This may be attributed to several difficulties in indexing the complaint call. First, many customers who have experienced poor service quality do not make complaint calls [10]. Moreover, complaint behavior also depends on the type of goods: 49.6%, 29.4%, and 23.2% of customers take no action for perishable goods, durable goods, and services, respectively. Second, not every complaint call is associated with poor service quality; customers sometimes make complaint calls even though they experienced a high QoS. For these reasons, additional effort is required to develop a model to predict the likelihood of complaint calls.
Our contributions are summarized as follows:
(i) We propose a customer score model using machine learning techniques where the target variable of customer complaint calls has significant relevance to customer-perceived quality. This method integrates a subjective quality model with an objective one; its quality evaluations for individuals are automated. The objective quality model consists of access and service quality models.
(ii) We configure a real-time cross-call detailed record (cross-CDR) database using (telephone number, call ID) as a key, based on various CDRs from in-memory computing using the open source Spark software. This method should enable real-time customer quality evaluation, monitoring, and analysis for several billion customer-level quality logs per day. We expect that this method will be used as a building block towards future self-organizing networks (SONs): an autonomous SON requires a feedback loop as an input, and we think the customer score model could be a very valuable input to a SON by signaling poorly performing network blocks.
The remainder of the paper is organized as follows. In Section 2, we compare the KPI-based method, which is a popular quality management method, and the customer-experience-based method. In Section 3, we introduce the composition of the overall system. In Section 4, we introduce a method to develop the access quality score model, the service quality score model, and the subjective quality score model. In Section 5, we verify the proposed models and present the experimental results. Finally, we conclude the paper in Section 6.

CEI-Based Quality Management
Customer experience indicator (CEI) quality management is a new paradigm in telecommunication quality management that focuses more on individual customers. Before this paradigm, the focus of quality management was more on network systems and equipment.

Network Oriented Quality Management.
To deliver excellent QoS, many telecommunication service providers adopt a quality management framework that consists of quality definition, data collection, analysis, and corrective actions [23]. They select and monitor various KPIs. KPIs can be at the service level or at the system equipment level, such as the success rate for radio resource control requests of the evolved Node B (eNB), the paging success rate and attach success rate of the mobility management entity (MME), and the bearer request success rate of the packet gateway (PGW). The thresholds for each KPI are set, and various data from the network management system, protocol analyzers, and complaint calls are measured and collected. The collected data are compared against the thresholds and analyzed. If necessary, corrective actions are taken to improve the quality of the network systems. KPI-based quality control can be limited because there can be too many KPIs to monitor as the size of the network increases. Moreover, many KPIs are correlated, which makes it difficult to determine the origin of any network problem that may arise. In addition, KPIs do not necessarily reflect the user-level QoE, because they typically blur individuals' characteristics by using averages for one part of the network.

Individual Oriented CEI-Based Quality Management.
The concept of QoE combines user perceptions, experiences, and expectations with nontechnical and technical parameters to better address customer needs [24]. Qualinet defines it as the degree of delight or annoyance of the user of an application or service [25]. For example, the customer experience management index collects customers' experiences of benefit, convenience, and so on, from each unit of a network [26]. The CEI is different from KPIs in that it is an individual-level indicator that reflects different customer behaviors. The CEI concept attempts to incorporate customers' individual experiences, needs, and behaviors with the technical KPIs to promote the optimal use of available resources. CEI-based quality management can achieve high retention rates, produce favorable recommendations, and (potentially) stimulate more service use. This method can locate problems and make improvements to support the provision of acceptable service experiences to customers.

Limitations of KPI-Based Systems.
The following issues can occur in current KPI-based quality management systems.
(i) KPI Management Overhead. Telecommunication service providers manage the KPIs of various networks such as wireless access, transport, and core networks. As the network becomes more complex, it may generate large overhead costs for network monitoring, troubleshooting, and improving KPIs. To ensure the best possible QoE on service, and to confirm that services are fit for purpose, it is necessary to manage many different KPIs in near-real-time. However, generally available systems and reports provide averages and do not cater to the experience of individual customers.
(ii) Slow Response to Customer Complaints. Managing KPIs at the device level or the system level may cause slow responses to individual customer complaint calls. The network engineers must determine the root cause of the problem after reception of a call without individual-level experience information. Additionally, the lack of quantitative data for customer-perceived quality can prevent the service provider from undertaking proactive actions. Misclassification of customer complaints may cause unnecessary dispatches to sites, which can be expensive. Approximately 46% of dispatches to sites are unnecessary: Figure 8 shows that, after the system adoption, the rate of resolution by field engineers decreased from 46% to 25%, a relative reduction of 1 − 0.25/0.46 ≈ 0.46.
(iii) Inefficient Use of Resources. Without knowledge regarding customer experiences, the available resources might not be used where they are needed most. Understanding the individual perceived quality levels can be helpful for locating problematic areas that require more investments or improvements. As budgets are limited, improvements can be applied in a suitable order.

System Configuration and Data Collection
To implement a real-time individual-level perceived quality score system, we may need to assess tens of millions of customers making several billion calls per day. Figure 1 shows the system configuration that we implemented to generate a perceived quality score for each user.
The LTE network consists of eNB, MME, serving gateway (SGW), PGW, and call session control function (CSCF) with many interfaces. The eNB is the hardware that communicates directly with mobile phones. The MME, the key control node for the LTE access network, is responsible for authentication, service activation/deactivation, and so on. The SGW routes and forwards data packets and is involved in the handover process as the mobility anchor. The PGW is the point of exit and entry to the external data network. The CSCF deals with the signaling of phone calls. For more details on the LTE system, refer to [27].
Three interfaces are shown in Figure 1. S1-C is the interface between eNB and MME for the control plane protocol. S1-U is the interface between eNB and SGW for the per-bearer user plane tunneling and inter-eNB path switching during handover. SGi is the interface among PGW, CSCF, and the Internet. We implemented multiple tappers (three in Figure 1) for packet probing on the control plane (S1-C), the data plane (S1-U), and the SGi interfaces. The protocol analyzer decodes tapped packet streams and generates various CDRs. A single phone call generates multiple CDRs (S1 CDR, session initiation protocol (SIP) CDR, and VoLTE CDR), each of which is stored in a different database in the Hadoop distributed file system (HDFS). Every minute, newly generated CDRs are stored in the HDFS using a file transfer protocol (FTP) batch. User profile data from the business support system (BSS) are also stored in HDFS.
We used Kafka (a distributed messaging system "developed for collecting and delivering high volumes of log data with low latency" [28]) and Spark (a cluster computing technology utilizing in-memory cluster computing to increase processing speed [29]) for converting the CDR log and user profile data into cross-CDRs. To process several billion records in real-time, we allocate 12 servers with 384 cores to the conversion. The six Kafka servers collect CDRs from HDFS and feed them to twelve Spark servers. The Spark servers process CDRs to generate new records in the cross-CDR database. As a single call has multiple CDRs (CDR-S1, CDR-SIP, and CDR-VoLTE), they are combined into one record. The Spark composer module identifies CDRs from the identical calls using phone number and call start/end time and allocates a call ID. It also appends KPIs for connections, KPIs for SIP (a signaling protocol for multimedia sessions [30]), and KPIs for VoLTE and equipment ID. The equipment information and customer profiles are also added to the cross-CDRs by the Spark servers. The data in the cross-CDR database flows into the perceived quality model to evaluate the QoS at the call level in real-time; the customer-perceived quality score is generated at the subscriber level.
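The composer step described above can be illustrated with a minimal single-machine sketch in Python. It is not the authors' Spark code: the record fields (`phone`, `start`, `end`, `kind`, `kpis`), the overlap-based matching rule, the `tolerance` parameter, and the call ID format are all assumptions made for illustration.

```python
from collections import defaultdict

def compose_cross_cdrs(cdrs, tolerance=2.0):
    """Group per-interface CDRs of the same call into one cross-CDR.

    Assumption: two CDRs belong to the same call when they share a phone
    number and the later one starts no more than `tolerance` seconds after
    the current call window ends.
    """
    by_phone = defaultdict(list)
    for cdr in cdrs:
        by_phone[cdr["phone"]].append(cdr)

    cross = []
    for phone, records in by_phone.items():
        records.sort(key=lambda r: r["start"])
        current = None
        seq = 0
        for r in records:
            if current is not None and r["start"] <= current["end"] + tolerance:
                # Same call: attach this CDR's KPIs and extend the time window.
                current["kpis"][r["kind"]] = r["kpis"]
                current["end"] = max(current["end"], r["end"])
            else:
                # New call: allocate a (hypothetical) call ID and start a record.
                seq += 1
                current = {
                    "key": (phone, f"{phone}-{seq}"),
                    "start": r["start"],
                    "end": r["end"],
                    "kpis": {r["kind"]: r["kpis"]},
                }
                cross.append(current)
    return cross

# Illustrative input: three CDRs from one call, then a later S1 CDR.
cdrs = [
    {"phone": "0101", "start": 0.0, "end": 60.0, "kind": "S1", "kpis": {"con_fail": 0}},
    {"phone": "0101", "start": 0.5, "end": 59.0, "kind": "SIP", "kpis": {"setup_ms": 310}},
    {"phone": "0101", "start": 1.0, "end": 58.5, "kind": "VoLTE", "kpis": {"emos_defects": 1}},
    {"phone": "0101", "start": 300.0, "end": 320.0, "kind": "S1", "kpis": {"con_fail": 1}},
]
records = compose_cross_cdrs(cdrs)
```

Keying each combined record by (phone number, call ID) mirrors the cross-CDR key described in the contributions; a production version would additionally have to handle late-arriving CDRs and clock skew between probes.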

Developing a Perceived Quality Score Model for Complaint Calls
In this section, we discuss our score model to predict customer complaint calls, because the complaint calls are a proxy for customer-perceived quality. Our work is focused on voice call quality. To develop the voice call quality model, we work with call quality-related customer complaints only, excluding other ones such as billing, device, and so on. As shown in Figure 2, the score model consists of service quality and connectivity quality. We develop two models, $S_1$ and $S_2$, for service quality and connectivity quality, respectively. These scores are combined into the final experience score $S$ in the following manner:

$$S = \min(S_1, S_2). \qquad (1)$$

Each $S_i$, $i$ = 1 and 2, ranges from 1 to 100. We use the min function $\min(S_1, S_2)$, supposing that the worse of the two scores, $S_1$ and $S_2$, will determine the customer experience. We also consider the psychological effect of peak and end effects. The peak and end effect implies that people judge an experience largely based on how they felt at its peak and at its end, rather than based on the sum or average of the individual moments of the experience. To capture the worst experience, we used an SPC-based approach, which is described in Section 4.3 (2). Development of a quantitative customer-perceived quality model is useful for improving the a priori customer satisfaction. Improved service and complaint handling are possible if the perceived quality levels are better understood by service providers. However, utilizing the complaint call as a target variable is challenging for several reasons.

Challenges
(1) Customers with No Action. It is well known that a large portion of dissatisfied customers take no action [10]: for services, 29.4% of customers with low quality took no action. On the other hand, for telecommunication services, customers with good network quality sometimes also make complaint calls. The combination of these two opposite errors makes the development of a good scoring system a challenging problem.
(2) Imbalanced Data. Learning from imbalanced datasets can be problematic, making meaningful analyses difficult. As complaint calls account for a very small fraction (less than 0.01%) of daily calls, it is difficult to create an accurate model for predicting complaint calls. We tried known methods for tackling data imbalance, such as undersampling [31,32], without success. Specifically, the following methods were tried [31]: (1) collection of additional data, (2) resampling, and (3) different algorithms. We first tested with the data from the past three months without success. The undersampling method was then tried by matching the ratio of the two user groups (those who made complaint calls and those who did not). We then applied various methods including random forest, SVM, LASSO, and decision tree. Those methods exhibit an accuracy between 59.9% and 72.6%, while the accuracy of our method is 91.5%.
(3) Skewed Independent Variables. Many KPIs (such as call drop rate or blocking rate), which are used as explanatory variables, have very biased distributions ranging from 0.1% to 0.01% or less. Therefore, these indices are not good candidates for independent variables.
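For readers unfamiliar with the resampling step mentioned under challenge (2), the following Python sketch shows one common form of random undersampling, in which the majority group is reduced to match the size of the minority group. The function name, the `ratio` and `seed` parameters, and the record layout are hypothetical; the text does not specify how the authors implemented this step, and they report that it did not yield a usable model.

```python
import random

def undersample(records, is_complaint, ratio=1.0, seed=0):
    """Keep all minority records and a random subset of majority records.

    `ratio` is the number of majority records kept per minority record
    (1.0 gives two groups of equal size). This is an illustrative choice.
    """
    minority = [r for r in records if is_complaint(r)]
    majority = [r for r in records if not is_complaint(r)]
    rng = random.Random(seed)
    keep = min(len(majority), int(round(ratio * len(minority))))
    return minority + rng.sample(majority, keep)

# Synthetic example: 5 complaint callers among 10,000 users.
users = [{"id": i, "complained": i < 5} for i in range(10_000)]
balanced = undersample(users, lambda u: u["complained"])
```

With so few positive cases, the balanced set is tiny compared with the original data, which is one plausible reason why this approach struggles on data as imbalanced as the complaint-call records.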

Key Ideas
(1) Independent Variable Selection with Clustering. Typical supervised learning methods such as regression and the support vector machine did not generate meaningful models because of the imbalanced data and/or the skewness of the variables. We repeated an unsupervised method of cluster analysis until we found a cluster with above-normal complaint call rates. We analyzed and characterized these clusters to choose meaningful variables to use as independent variables.
(2) SPC-Based Poor Quality Pattern Variable Selection. The "peak and end rule" in psychology states that people judge an experience largely based on how they felt at its peak and at its end, rather than based on its average [33]. To model an abnormally poor experience of a voice call, we use statistical process control (SPC), which is useful for detecting abnormal patterns. Among many rules of the SPC, we selected a few of them that significantly explain the likelihood of complaint calls.
(3) Use of Pseudo Target Variable via PCA. We do not use a linear regression model with the target variable of complaint calls, since it is difficult to estimate individual customers' call quality experiences. Instead, we construct a pseudo variable representing the independent variables; this pseudo variable will be used as a target variable. We set the pseudo variable as the first principal component in principal component analysis (PCA). The pseudo variable is used to maintain the characteristics of the independent variables while relaxing the imbalanced class characteristics of the target variable.

Service Quality
(1) Variable Selection via Clustering. First, we identify the 16 candidate variables among the KPIs of the quality of calls. These variables are related to packet loss rate, delay, jitter, and bursts. After cleansing the data, we repeated $k$-means analysis by varying the number of clusters and the weights of each variable until we found a meaningful cluster with much higher complaint call rates than the average. Table 1 shows the clustering results for three groups, one normal group (1) and two poor groups (2 and 3). The size of the normal group is 95.4%; that of the poor groups is 4.6%. Note that the second and third groups have 1.57 and 3.75 times higher complaint call rates, respectively, than that of the normal group.
From the results of the clustering analysis, the variables that significantly explain the complaint call rate are identified. We dropped several variables with low explanatory power considering the multicollinearity between the variables and each variable's explanation rate. We chose the drop defect rate ($x_1$), Tx eMOS defect rate ($x_2$) (eMOS stands for the E-model/MOS score [18]), and Rx eMOS defect rate ($x_3$) as the independent variables for the customer-perceived quality model, where the eMOS score ranges from 1 (worst) to 5 (best). The drop defect rate ($x_1$) measures the drop rate of the calling services. The eMOS score is calculated every 5 seconds, and it is considered a defect if the score is less than or equal to 2. The eMOS defect rate is the ratio of the number of eMOS defects to the total number of eMOS scores. Tx eMOS defect rate ($x_2$) and Rx eMOS defect rate ($x_3$) represent the fraction of poor quality periods for outgoing and incoming calls, respectively.
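The eMOS defect rate defined above reduces to a simple ratio, sketched here in Python. The function name and the sample scores are illustrative; the defect threshold of 2 and the 5-second sampling interval come from the text, while returning 0.0 for a call with no samples is an assumption.

```python
def emos_defect_rate(scores, defect_threshold=2.0):
    """Fraction of 5-second eMOS samples at or below the defect threshold."""
    if not scores:
        return 0.0  # assumption: no samples means no observed defects
    defects = sum(1 for s in scores if s <= defect_threshold)
    return defects / len(scores)

# Illustrative Tx direction of one call: eight 5-second samples, two defects.
tx_scores = [4.1, 4.0, 1.8, 2.0, 3.9, 4.2, 4.3, 4.0]
x2 = emos_defect_rate(tx_scores)  # 2 defects out of 8 samples
```

The same function applied to the Rx samples of a call would give the Rx eMOS defect rate.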
(2) SPC-Based Abnormal Quality Pattern Variables Selection. Based on the peak and end rule from the field of psychology, we try to detect abnormally poor QoS as it can affect the perceived quality for customers. We adopted a method based on SPC, which is known to be very helpful for detecting abnormal behavior [34].
We used popular rules called the Western Electric rules, especially the zone rules and the asymmetric rules (refer to https://en.wikipedia.org/wiki/Western_Electric_rules), which are used to find abnormal patterns in SPC (for other approaches to detect anomalies, refer to [35,36]). Examples are "two consecutive values higher (or lower) than $2\sigma$" or "four consecutive points out of five higher (or lower) than $\sigma$", which we use to detect bad perceived quality from frequent, consecutive, or intense defects at the subscriber level. Each call generates a sequence of eMOS scores, as shown in Figure 4. Each point in the figure corresponds to an eMOS score for an interval of $t$ = 5 seconds, and $\mu$ is the mean of the eMOS scores. In Rule 1, if two consecutive points are higher (or lower) than $2\sigma$, we judge it as an anomaly. As a point outside $2\sigma$ is a rare event, it is even rarer to observe such points consecutively, as in Rule 1.
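To make the two rules concrete, the following Python sketch counts their occurrences in a sequence of eMOS scores. It is an interpretation under stated assumptions, not the authors' implementation: the mean `mu` and standard deviation `sigma` are passed in rather than estimated, deviations are measured from the mean, points must fall on the same side of the mean, and overlapping windows are each counted once.

```python
def count_rule1(scores, mu, sigma):
    """Count pairs of consecutive points both beyond 2*sigma on the same side."""
    hi, lo = mu + 2 * sigma, mu - 2 * sigma
    count = 0
    for a, b in zip(scores, scores[1:]):
        if (a > hi and b > hi) or (a < lo and b < lo):
            count += 1
    return count

def count_rule2(scores, mu, sigma):
    """Count 5-point windows with at least 4 points beyond 1*sigma on one side."""
    hi, lo = mu + sigma, mu - sigma
    count = 0
    for i in range(len(scores) - 4):
        window = scores[i:i + 5]
        above = sum(1 for s in window if s > hi)
        below = sum(1 for s in window if s < lo)
        if above >= 4 or below >= 4:
            count += 1
    return count

# Illustrative call with a short severe dip followed by a longer mild dip.
scores = [4.0, 4.1, 2.8, 2.7, 3.9, 3.4, 3.3, 3.2, 3.4, 4.0]
x4 = count_rule1(scores, mu=4.0, sigma=0.5)
x5 = count_rule2(scores, mu=4.0, sigma=0.5)
```

In this example the severe dip (two points below 3.0) triggers Rule 1 once, while the longer mild dip triggers Rule 2 in four overlapping windows; whether overlapping windows should be counted separately is a design choice the text leaves open.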
Among the candidates in Table 2, Rules 1 and 2 are selected as variables $x_4$ and $x_5$, respectively. If the ratio of $r_a$ to $r_n$ is higher than a threshold, we consider the rule to be correlated with customer complaint calls, where $r_a$ is the average complaint call rate for those who exhibit the abnormal pattern of the rule and $r_n$ is the rate for those who have no such pattern. We use 1.5 as the threshold; there is a significant gap between the ratios ($r_a/r_n$) of Rule 2 (1.625) and Rule 3 (1.083). The variable $x_4$ counts the number of patterns of "2 consecutive points higher (or lower) than $2\sigma$" from the chart. Similarly, the variable $x_5$ counts the number of patterns of "4 consecutive points out of five higher (or lower) than $\sigma$".
(3) Pseudo $Y$ via PCA. PCA is commonly used to extract the representative characteristics of a dataset. Because of the imbalanced class characteristics of the variables, we could not develop a meaningful regression model for $Y$ as a complaint call rate. Instead, we generate the pseudo variable $Y'$ by applying PCA to the set of $x_1$, $x_2$, and $x_3$. We set the first principal component as the pseudo target variable $Y'$. We did not include the two derived variables, $x_4$ and $x_5$, in the PCA, so as to avoid distortion caused by the correlated independent variables. We constructed regression models using the five independent variables (including the two derived variables) to explain the pseudo variable using various construction options. Then, we chose the regression model whose estimations show the greatest similarity with the values of the true target variable as the final call quality model. We tested the model's goodness of fit using the complaint call rate, rather than statistical performance indexes, so as to increase the explanatory power of our regression model. We calculated the average complaint call rate of users whose score is less than or equal to 30 and selected the model with the highest complaint call rate.
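The pseudo target construction can be sketched as follows. This is a minimal pure-Python illustration, assuming PCA on the centered (unstandardized) covariance matrix and using power iteration to find the first principal component; whether the authors standardized the variables, and which software they used for this step, is not stated. The sign convention (loadings oriented so that they sum to a positive value, so larger defect rates give larger pseudo values) and the sample data are also assumptions.

```python
def first_principal_component(rows, iters=200):
    """Return (loadings, scores) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    if sum(v) < 0:  # fix the arbitrary sign of the eigenvector
        v = [-x for x in v]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Illustrative (x1, x2, x3) rows: drop, Tx eMOS, and Rx eMOS defect rates.
rows = [
    (0.00, 0.01, 0.00),
    (0.01, 0.02, 0.01),
    (0.00, 0.00, 0.01),
    (0.05, 0.20, 0.15),
    (0.10, 0.30, 0.25),
]
loadings, pseudo_y = first_principal_component(rows)
```

The resulting `pseudo_y` values would then serve as the regression target for the five independent variables, as described above.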
(4) Service Score Model $S_1$. The resultant service regression model $y_1$ is a regression on five variables, where $x_1$ is the drop defect rate, $x_2$ is the Tx eMOS defect rate, $x_3$ is the Rx eMOS defect rate, $x_4$ is the count of Rule 1, and $x_5$ is the count of Rule 2. The variable $y_1$ is zero if all variables are zero, which means the best quality; a higher $y_1$ means poorer quality for customers. Based on the value of $y_1$, the output of the regression model, the score $S_1$ is calculated as shown in Table 3. The zero-percentile, which corresponds to the regression model value $y_1 = 0$, gets the score of 100, and the 99-percentile, which corresponds to $y_1 = 0.5$, gets the score of 10. The idea behind the function is that we would like to focus on the 10% of customers who are highly likely to make complaint calls, namely those who have scores less than 30. In that regime, the 1-percentile has a score less than 10; the 5-percentile has a score less than 20; and the 10-percentile has a score less than 30. For the other 90 percent of customers, we use a linear score function $S_1 = 100 - 0.8p$, where $p$ is the percentile, except for scores between 30 and 40.

Connectivity Quality Score Model $S_2$.
The connectivity score is developed to model the perceived quality of connection establishment and connection drop. To evaluate the connectivity quality score, we selected approximately 100 KPI variables from S1/SIP/VoLTE CDRs, some of which are shown in Figure 1. With basic statistical analyses, we chose 24 variables, as shown in Table 4, that can represent the overall connectivity quality: we omitted variables with missing values and high correlations using the multicollinearity test. Except for the decision tree model, most predictive models with supervised learning did not work because of the high imbalance of the datasets and the skewness of the variables. Unlike for service quality models, we think that a decision tree model is obtainable here because the connection quality is either one (if a service is connected) or zero (otherwise). We used the SAS Eminer package [37] to derive the decision tree with the C4.5 algorithm [38]; the resulting decision tree model for complaint calls is shown in Figure 5. The target variable is whether a call contains complaint calls or not; the independent variables are the 24 variables in Table 4. We varied the parameters (such as maximum number of branches, maximum depth, and leaf size) to find the decision tree with the highest accuracy. The tree achieved an accuracy of 99.6146%, a sensitivity of 83.3333%, and a specificity of 99.6151%.
The decision tree shows that the number of connection failures (CNT_CON_FAIL) is very important. If this number is higher than 20, the score is zero; if it is between 2.5 and 20, the score is 20. The next branch is the number of service request mobile termination (SRMT) failures (CNT_SRMT_FAIL). After a connection has been established, if the service request fails, CNT_SRMT_FAIL is increased by one. If the number of failures is higher than 8.5, the score is 40; otherwise, the next variable is the number of initial registration failures (CNT_INIT_REG_FAIL), which happen in the authentication phase. If the number of failures is higher than 15.5, the score is 20. Otherwise, the method checks the real-time transport protocol (RTP) defect rate (RTP_DEF_RATE). The RTP packets should be observed in both directions; if they are observed in only one direction or not at all, it is considered a defect. The RTP defect rate is the ratio of the number of RTP defects to the total number of connections. If the rate is higher than the threshold, the score is 20. Otherwise, the score is 100.
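The branch logic described above can be written as a short Python function. The thresholds 20, 2.5, 8.5, and 15.5 and the resulting scores come from the text; the RTP defect rate threshold is not given, so it is a required parameter here and the value 0.1 in the example is purely hypothetical. Treating a count of exactly 20 as part of the "between 2.5 and 20" branch is also an assumption.

```python
def connectivity_score(cnt_con_fail, cnt_srmt_fail, cnt_init_reg_fail,
                       rtp_def_rate, rtp_threshold):
    """Map connectivity KPIs of a subscriber to a score of 0, 20, 40, or 100."""
    if cnt_con_fail > 20:          # very frequent connection failures
        return 0
    if cnt_con_fail > 2.5:         # between 2.5 and 20 connection failures
        return 20
    if cnt_srmt_fail > 8.5:        # service request (mobile termination) failures
        return 40
    if cnt_init_reg_fail > 15.5:   # initial registration failures
        return 20
    if rtp_def_rate > rtp_threshold:  # one-way or missing RTP streams
        return 20
    return 100

# Hypothetical RTP threshold; the source does not state its value.
RTP_THRESHOLD = 0.1
score = connectivity_score(cnt_con_fail=1, cnt_srmt_fail=9,
                           cnt_init_reg_fail=0, rtp_def_rate=0.0,
                           rtp_threshold=RTP_THRESHOLD)
```

Because the tree has only a handful of leaves, the model yields just four distinct score values, which is consistent with the four points reported for the connectivity score.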
The connectivity score $S_2$ is calculated using the percentile-score mapping of Table 3. Each box in the decision tree of Figure 5 has a complaint call rate. For example, the complaint call rate of the bold-line box under the CNT_CON_FAIL node in Figure 5 is 0.24%. The complaint call rate is mapped into a percentile; the percentile is mapped into the score using Table 3. The rate 0.24% corresponds to the 95-percentile and the score of 20.
Figure 6 shows the complaint call rates versus the connectivity quality score. As the decision tree generates 4 scores of 0, 20, 40, and 100, there are 4 dots in the figure. It shows that the extremely poor quality score of 0 has a high complaint call rate of 0.73%, while the score of 100 has a rate of 0.018%. The worst score group has a 40 times higher rate than the best score group.
Perceived Quality Score.
The perceived quality score $S$ is the minimum of the two scores, namely, the service score $S_1$ and the connectivity score $S_2$, as in Equation (1). This choice can be justified by the fact that, according to the peak and end rule, customers react more to the worse experience (minimum of the two) than to the average experience (weighted sum).
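A small Python example illustrates how the minimum differs from an averaged score. The function name and the sample scores are illustrative only.

```python
def perceived_quality_score(s1, s2):
    """Combine service score s1 and connectivity score s2 by taking the worse one."""
    return min(s1, s2)

# A subscriber with good voice quality but poor connectivity.
s1, s2 = 85, 20
s = perceived_quality_score(s1, s2)
average = (s1 + s2) / 2  # shown only for contrast; not used by the model
```

Here the minimum gives 20, flagging the subscriber as a likely complaint caller (score of 30 or less), whereas an equal-weight average of 52.5 would hide the poor connectivity experience.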

Experiments
We developed the model using the data collected from August 10 to 16, 2016, from a production network. It contains CDRs of 35 billion calls from 9 million subscribers. The validation is performed with the test data from August 17 to 23, 2016. In the corresponding figure, one line results from the training data, while the other line, marked with squares, results from the validation data. The two lines show a very similar pattern, with the largest difference of 11% at the score of 10. The complaint call rate is low when the objective score is higher than or equal to 40 and starts jumping from the score of 30 or less. The average complaint call rate of the low score group (30 or less) is six times higher than that of the high score group (higher than 30), and it is 11 times higher than the average. The ratio of the complaint call rate of the worst group (score 10 or less) to that of the best score group (higher than 90) is 22. The figure confirms the validity of the proposed score as a good measure for predicting complaint calls.

Comparison with Random Guess Model.
Table 5 shows the performance comparison of our model with that of a random guess model, which predicts by random coin flipping. The random guess model predicts a complaint call with a probability of 0.067%, which is the average complaint call rate. The true positive value of the proposed model, $1.08 \times 10^{-6}$, is 2.4 times higher than that of the random guess model, $0.45 \times 10^{-6}$. Observe also that the true negative value is improved by a similar amount even though it is very small due to the high imbalance of the data. The accuracy, sensitivity, and specificity are also improved.
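The random guess baseline can be checked with a short calculation, sketched here in Python. It assumes that Table 5 reports each confusion-matrix cell as a fraction of all cases and that the guess is independent of the true label; the function name and dictionary keys are illustrative.

```python
def random_guess_rates(prevalence, guess_prob):
    """Expected confusion-matrix fractions for an independent random guess."""
    tp = prevalence * guess_prob
    fn = prevalence * (1 - guess_prob)
    fp = (1 - prevalence) * guess_prob
    tn = (1 - prevalence) * (1 - guess_prob)
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "accuracy": tp + tn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

p = 0.00067  # average complaint call rate of 0.067%
baseline = random_guess_rates(prevalence=p, guess_prob=p)
ratio = 1.08e-6 / baseline["tp"]  # proposed model's true positive vs. baseline
```

Under these assumptions the expected true positive fraction is $p^2 \approx 0.45 \times 10^{-6}$, which matches the baseline value quoted above, and the ratio to the proposed model's value is about 2.4.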

Comparison of KPI versus CEI-Based Model: Investment Targets.
We selected the top-ranked worst performing cells, which are the top candidates for upgrade and repair, with both methodologies and compared them, as shown in Table 6. The KPI-based method, which is the method currently in use, considers important KPIs such as the cell outage rate, the paging success rate, the number of radio link failures, and the number of radio resource control setup failures. Each KPI value is mapped to a KPI score between 0 and 100, and the total score of a cell is a weighted sum of its KPI scores. For example, the higher the outage rate, the lower the score. In the new CEI-based model, we defined bad quality users as those with a score less than or equal to 30. We counted the number of bad quality users in every cell and sorted the cells in descending order of that count. Table 6 compares the average complaint call rates of the cells selected by each method. Observe that the complaint call rates from the CEI-based method are two to six times higher than those from the KPI-based method. For the top 10 cells, the average complaint call rates are 0.78% for the CEI-based method and 0.13% for the KPI-based method. The CEI-based method is therefore better than the KPI-based method at selecting the cells that cause complaint calls.
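The two ranking procedures can be contrasted with a short sketch. Only the threshold of 30 comes from the text; the KPI names, the weights, the cell identifiers, and the attribution of each user score to a single cell are hypothetical choices made for illustration.

```python
from collections import Counter

BAD_SCORE_THRESHOLD = 30  # from the text: bad quality users have a score <= 30


def kpi_cell_score(kpi_scores: dict, weights: dict) -> float:
    """Weighted sum of per-KPI scores, each already mapped to 0..100."""
    return sum(weights[name] * kpi_scores[name] for name in weights)


def worst_cells_by_kpi(cell_kpis: dict, weights: dict, n: int) -> list:
    """Return the n cells with the lowest weighted KPI score."""
    totals = {cell: kpi_cell_score(k, weights) for cell, k in cell_kpis.items()}
    return sorted(totals, key=totals.get)[:n]


def worst_cells_by_cei(user_scores, n: int) -> list:
    """user_scores: iterable of (cell_id, user_score) pairs.
    Return the n cells with the most bad quality users."""
    bad_users = Counter(
        cell for cell, score in user_scores if score <= BAD_SCORE_THRESHOLD
    )
    return [cell for cell, _ in bad_users.most_common(n)]


# Hypothetical inputs.
weights = {"outage": 0.5, "paging": 0.3, "rlf": 0.2}
cell_kpis = {
    "A": {"outage": 40, "paging": 90, "rlf": 80},
    "B": {"outage": 90, "paging": 80, "rlf": 70},
    "C": {"outage": 95, "paging": 95, "rlf": 90},
}
user_scores = [("A", 80), ("A", 25), ("B", 10), ("B", 30),
               ("B", 20), ("C", 90), ("C", 35)]

print(worst_cells_by_kpi(cell_kpis, weights, 1))
print(worst_cells_by_cei(user_scores, 1))
```

In this toy case the two methods disagree: cell A has the lowest KPI score, while cell B serves the most bad quality users, which is the kind of divergence that Table 6 quantifies through complaint call rates.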

Complaint Call Response Performance.
Figure 8 shows the trends in the first call resolution (FCR) rate of the call center and in the field dispatch rate after the adoption of the system. The FCR rate is the fraction of customers whose issues were resolved the first time they called. We monitored the trends for up to seven months after adoption. Prior to the introduction of the system, the FCR rate was about 45%. After adoption, the FCR rate increased gradually to 51%, 60%, 67%, and 73%, and then leveled off at around 72%. The decreasing line shows the field dispatch rate (FDR) of field engineers. When issues are not resolved at the call center, field engineers are dispatched to solve them. Prior to adoption, the FDR was between 45% and 50%. After adoption, it decreased to 25%, which is about half of the prior rate. With the adoption of the system, call center representatives can determine customer quality issues more accurately and resolve many of them in advance, which reduces the demand on the field engineers.

Conclusions
Using a machine learning methodology, we developed a scoring model to predict the likelihood of customer complaint calls as a proxy for customer-perceived quality. The developed model consists of two components: service quality and connectivity quality. The service quality model considers the perceived service quality during the sessions, while the connectivity quality model considers connection establishment. We overcame several issues using clustering-based variable selection, pseudo target variable generation using PCA, and the introduction of SPC-based variables. We implemented the model in a real production network system that handles several billion calls per day. To manage the tens of billions of CDRs, we used the open source Kafka and Spark software packages. The developed system merges various CDRs to generate a cross-CDR database that is used by the quality model.
The validation test showed that the score model has strong explanatory power. Individuals from the lowest scored group (a score less than or equal to 30) have a 20 times higher likelihood of making complaint calls than the highest score group. Also, the sensitivity of the proposed model is 2.4 times higher than that of the random guess model. Compared to the legacy system, the new system based on the proposed scoring model is better at detecting the base stations that cause a high level of complaint calls. When cells are selected with the new method, the complaint call rate of the top ten worst performing cells is about six times that of the cells selected with the legacy method. By upgrading or repairing these cells first, we believe that customer dissatisfaction can be handled more efficiently. The first call resolution rate, that is, the fraction of customers whose issues are resolved at the first call, increased from 45% to 72% within six months of adoption, which shows the effectiveness of the proposed system.
Integration of a subjective quality model with the overall quality model is a future research direction. Additionally, the differences in the data service quality or video service quality from the call quality can be integrated with the objective quality model. Our final goal is to generate data that can be used to improve business performance by analyzing the influence of network quality on customer net profit scores and customer churn.

Figure 1: System diagram and an example cross-CDR database.

Table 3: Scoring of the objective quality score (score 1).

Figure 5: Decision tree model for connectivity quality.

Figure 6: Connectivity quality score versus complaints call rate.

Table 1: Variable selection via clustering analysis.

Table 2: Selection of rules for derived variables.

Figure 4: Matched SPC rule having an effect on the complaint call rate.

Table 4: Selection of variables for connectivity quality. Variables: drop count, connect success rate, connect fail rate, drop rate, no Tx-RTP one-way-call defect, no Rx-RTP one-way-call defect, no Tx-RTP one-way-call defect rate, no Rx-RTP one-way-call defect rate.

Table 5: Performance comparison with random guess model.

Table 6: Comparison of complaint call rates for the selected base stations, with selection sizes of 10, 50, 100, and 200.