Estimating Spectral Efficiency Curves from Connection Traces in a Live LTE Network

In cellular networks, spectral efficiency is a key parameter when designing network infrastructure. Despite the existence of theoretical models for this parameter, experience shows that real spectral efficiency is influenced by multiple factors that vary greatly in space and time and are difficult to characterize. In this paper, an automatic method for deriving the real spectral efficiency curves of a Long Term Evolution (LTE) system on a per-cell basis is proposed. The method is based on a trace processing tool that makes the most of the detailed network performance measurements collected by base stations. The method is conceived as a centralized scheme that can be integrated in commercial network planning tools. Method assessment is carried out with a large dataset of connection traces taken from a live LTE system. Results show that spectral efficiency curves differ largely from cell to cell.


Introduction
In the coming years, an exponential growth of cellular traffic is expected. Specifically, a 10-fold increase in mobile data traffic is forecast from 2015 to 2021 [1]. Meanwhile, the proliferation of smartphones and tablets has changed the most demanded services in cellular networks. These changes will continue with the massive deployment of machine-type communications in Internet-of-Things applications [2]. To cope with these changes, future mobile networks will have to combine multiple technologies. Thus, service and network heterogeneity has been identified as a critical issue in future 5G networks [3,4].
In parallel, the increasing size and complexity of cellular networks is making it very difficult for operators to manage their networks. Thus, network management is one of the main bottlenecks for the successful deployment of mobile networks. To tackle this problem, industry fora and standardization bodies set up activities in the field of Self-Organizing Networks (SON) while defining 4G networks [5]. Self-organization refers to the capability of network elements to self-plan, self-configure, self-tune, and self-heal [6]. This need for self-organization has also been identified by vendors, which now offer automated network management solutions to reduce the workload of operational staff.
Legacy SON solutions are restricted to the replication of routine tasks that were done manually in the past. Currently, network planning and optimization is mostly based on performance counters and alarms in the network management system [7][8][9]. Thus, other data from network equipment and interfaces that could give very detailed information is discarded. Such information is only used in rare cases for troubleshooting after a tedious analysis. However, with recent advances in information technologies, it is now possible to process all these data on a regular basis by means of Big Data Analytics (BDA) techniques [10]. In cellular networks, "big data" refers to configuration parameter settings, performance counters, alarms, events, charging data records, or trouble tickets [11].
While BDA has long attracted the attention of the computing research community, this field is relatively new in the telecommunication industry. In [3], the authors propose a generic framework for improving SON algorithms with big data techniques to meet the requirements of future 5G networks. With a more limited scope, a self-tuning method is proposed for adjusting antenna tilts in a Long Term Evolution (LTE) system on a cell basis based on call traces [12]. Likewise, a review of network data used for self-healing in cellular networks is presented in [13]. However, few works have used BDA for self-planning cellular networks.
In radio network planning, the key figure of merit to evaluate network (or channel) capacity is spectral efficiency (SE). A theoretical upper bound on the channel capacity of a single-input single-output wireless link is given by the Shannon capacity formula [14]. This formula can be adapted to approximate the maximum channel capacity under certain assumptions specific to each radio access technology [15][16][17][18][19]. However, even if channel capacity is mainly determined by signal quality, it is also affected by the radio environment (user speed, propagation channel, etc.), the traffic properties (service type, burstiness, etc.), and the techniques in the different communication layers (multiantenna configuration, interference cancellation, channel coding, radio resource management, etc.). As considering all these factors is extremely difficult, most network planning tools rely on mapping curves relating signal quality to SE (a.k.a. SE curves), generated by link-level simulators [20][21][22]. This approach is still limited, as simulators make simplifications for computational reasons, and there remains the problem of selecting the combination of simulation parameters that closely matches reality.
In this work, a new automatic method for deriving the real SE mapping curves for the downlink of an LTE system on a cell-by-cell basis is proposed. The method is based on a trace processing tool that makes the most of detailed network performance measurements collected by base stations (specifically, signal strength, traffic, and resource utilization measurements). Method assessment is carried out with a large dataset of connection traces taken from a live LTE system. The main contributions of this work are (a) a data-driven methodology for deriving SE mapping curves from real network measurements, which can be integrated in commercial network planning tools, and (b) a set of SE curves obtained from connection traces collected in two live LTE systems.
The rest of the paper is organized as follows. Section 2 presents the classical approach to derive SE curves in radio network planning tools. Section 3 explains the trace collection process. Section 4 describes the new methodology to derive SE curves from user connection traces. Section 5 presents the results of the proposed method over a real trace dataset taken from the live network. Finally, Section 6 presents the main conclusions of the study.

Current Approach
In wireless technologies, SE is strongly affected by the link adaptation scheme. For clarity, a brief overview of the link adaptation process in LTE is first given. Then, the classical abstraction model of the link layer integrated in most network planning tools is explained.

Link Adaptation
In LTE, link adaptation (LA) is achieved by dynamically changing the Modulation and Coding Scheme (MCS) depending on radio link conditions. Figure 1 shows the structure of the classical LA scheme for the downlink of LTE [23]. Better radio link conditions translate into a higher reported Channel Quality Indicator (CQI), thus allowing the base station (eNB) to select more efficient MCSs (i.e., higher order modulations with more bits per symbol and less redundancy). Conversely, in poor radio link conditions, a lower CQI is reported, and more robust MCSs are selected (i.e., lower order modulations with fewer bits per symbol and more redundancy).
The actual signal-to-interference-plus-noise ratio (SINR) values triggering the use of different MCSs in the Inner Loop Link Adaptation (ILLA) are vendor-specific and depend on the network conditions assumed by the vendor (radio environment, antenna configuration, traffic properties, network features, etc.).

Link Abstraction Model.
As a result of LA, SE (and link capacity) can be treated as a function of SINR. In most network planning tools, SINR is estimated on a per-location basis. Then, the maximum SE of a single-input single-output system (in bits/s/Hz) for infinite block length and infinite decoding complexity in an Additive White Gaussian Noise (AWGN) channel can be obtained by the Shannon capacity formula [14] as

$$\mathrm{SE} = \log_2\left(1 + \mathrm{SNR}\right), \tag{1}$$

where SNR is the signal-to-noise ratio. For general multiple-input multiple-output systems with perfect channel knowledge at the transmitter, the Shannon capacity is [16]

$$\mathrm{SE} = \sum_{k=1}^{\min(n_T,\, n_R)} \log_2\left(1 + \mathrm{SNR}_k\right), \tag{2}$$

where $n_T$ and $n_R$ are the number of transmit and receive antennas, respectively, and $\mathrm{SNR}_k$ is the SNR of the $k$th spatial subchannel. In practice, real implementations are below the theoretical limit given by (2). Thus, the real SE of the limited set of MCSs specified in the standard can be better approximated by the Truncated Shannon Bound (TSB) formula [24] suggested in [19]:

$$\mathrm{SE}(\gamma) = \begin{cases} 0, & \gamma < \gamma_{\min}, \\ \mathrm{BW}_{\mathrm{eff}} \cdot \eta \cdot \log_2\left(1 + \gamma / \gamma_{\mathrm{eff}}\right), & \gamma_{\min} \le \gamma \le \gamma_{\max}, \\ \mathrm{SE}_{\max}, & \gamma > \gamma_{\max}, \end{cases} \tag{3}$$

where $\gamma$ is the SINR of the link, $\gamma_{\min}$ is a lower limit on SINR below which SE is zero, $\gamma_{\max}$ is an upper limit on SINR associated with the SE of the highest implemented MCS (e.g., 64 QAM, rate 4/5, in this work), $\mathrm{SE}_{\max}$, $\mathrm{BW}_{\mathrm{eff}}$ is the system bandwidth efficiency that accounts for different overheads (pilots, cyclic prefix, control channels, etc.), and $\eta$ is a correction factor to reflect implementation losses. The values of $\gamma_{\mathrm{eff}}$ (the SINR efficiency parameter of [24]) and $\eta$ for different antenna configurations and packet scheduling schemes are presented in [24].
The SE estimate in (3) is still an optimistic value of the link SE. Classical LA schemes based on adaptive thresholds (i.e., OLLA + ILLA) suffer from slow convergence with strongly biased CQI reporting [25,26]. Such slow convergence is a major issue in current LTE networks due to the prevalence of short connections [27]. Even if more realistic values of SE could be obtained from simulations, these cannot capture all possible factors, which vary greatly from cell to cell and change dynamically with time. As a result, SE and throughput measurements are much lower than expected in live networks [28].
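The truncated Shannon bound can be evaluated numerically with a few lines of code. The Python sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, the mid-range expression is assumed to be BW_eff · η · log2(1 + γ/γ_eff) as in [24], γ_eff is assumed to be in linear units, and the default values are the TSB-MIMO settings quoted in the analysis setup. The upper SINR limit is handled implicitly by capping the result at SE_max.

```python
import math


def truncated_shannon_se(sinr_db, bw_eff=0.927, eta=0.62, sinr_eff=1.4,
                         sinr_min_db=-10.0, se_max=4.15):
    """Spectral efficiency (bps/Hz) from a truncated Shannon bound.

    Assumption: sinr_eff is a linear (not dB) value, and the upper SINR
    limit is the point where the curve reaches se_max.
    """
    if sinr_db < sinr_min_db:
        return 0.0  # below the lower SINR limit, no transmission
    sinr = 10.0 ** (sinr_db / 10.0)  # dB -> linear
    se = bw_eff * eta * math.log2(1.0 + sinr / sinr_eff)
    return min(se, se_max)  # cap at the SE of the highest MCS
```

A planning tool would typically call such a function once per location, using the SINR predicted for that location.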
Network planning is negatively affected by the overestimation of SE, as this parameter controls the expected demand of network resources. Thus, underestimating the average cell load during network coverage planning might lead to an overly optimistic cell radius derived from unrealistic cell-edge performance. Likewise, underestimating cell load might give an inadequate amount of the traffic resources needed per cell during network capacity planning. All these problems can be solved by deriving a more realistic SINR-to-SE mapping from connection traces.

Connection Traces
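Data for managing a radio access network includes (a) Configuration Management data (CM), consisting of current network parameter settings, (b) Performance Management data (PM), consisting of counters reflecting the number of times some event has happened per network element and Reporting Output Period (ROP), and (c) Data Trace Files (DTFs), consisting of multiple records (known as events) with radio-related measurements stored when some event occurs for a single User Equipment (UE) or a base station. DTFs can be further classified into User Equipment Traffic Recording (UETR) and Cell Traffic Recording (CTR) [29]. UETRs are used to single out a specific user, while CTRs are used to monitor cell performance by monitoring all (or a random subset of) anonymous connections [30]. The former are used for network troubleshooting, whereas the latter are used for network planning and optimization purposes.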
Depending on the involved network entities, events can be classified into external or internal events. External events include signaling messages that eNBs exchange with other network elements (e.g., UE or eNB) through the Uu, X2, or S1 interfaces [31][32][33]. Internal events include vendor-specific information about the performance of the eNB.

Trace Collection.
Figure 2 depicts the reference architecture for trace collection in LTE [30]. CTR collection starts by the operator preparing a Configuration Trace File (CTF) in the Operation Support System (OSS), with (a) the event(s) to be monitored, (b) the cells and the ratio of calls for which traces are collected (i.e., UE fraction), (c) the ROP (typically, 15 minutes), (d) the maximum number of traces activated simultaneously in the OSS, and (e) the time period when trace collection is enabled. After enabling trace collection, UEs transfer their event records to their serving eNB. When the ROP is finished, the eNB generates CTR files, which are then sent to the OSS asynchronously.

Trace Preprocessing.
Trace files are binary files encoded in ASN.1 format [29]. The structure of events consists of a header and a message container including different attributes (referred to as event parameters). The header contains general attributes associated with the event description, such as the timestamp, the eNB, the UE, the message type, or the event length, while the message container includes specific attributes associated with the message type.
Trace decoding is performed by a parsing tool that extracts the information contained in the event fields. In most cases, the output is one file per event type, eNB, and ROP. Then, traces are synchronized by merging files from different eNBs by event type and ROP and ordering events by the timestamp attribute. Thus, it is possible to link simultaneous events of the same type from different eNBs (e.g., incoming and outgoing handover events).
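The synchronization step described above (merging decoded files from different eNBs and ordering events by timestamp) can be sketched as follows. This is a minimal illustration, not the vendor parsing tool: the `Event` record and its field names are hypothetical stand-ins for the header attributes listed above, and the input is assumed to be already decoded into one list per eNB for a given event type and ROP.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """Hypothetical decoded event header (field names are assumptions)."""
    timestamp: float  # seconds since the start of the ROP
    enb: str
    ue: str
    event_type: str


def merge_traces(per_enb_events: Iterable[List[Event]]) -> List[Event]:
    """Merge per-eNB event lists into one list ordered by timestamp."""
    # Sort each list first, since heapq.merge expects sorted inputs.
    sorted_lists = [sorted(events, key=lambda e: e.timestamp)
                    for events in per_enb_events]
    return list(heapq.merge(*sorted_lists, key=lambda e: e.timestamp))
```

Once events are in a single time-ordered list, events of the same UE reported by different eNBs appear next to each other in time, which is what makes it possible to pair, for instance, outgoing and incoming handover events.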

Estimating Spectral Efficiency from Traces
A method for building a link-layer abstraction model for LTE downlink from network measurements is proposed here. The model relates SINR to SE based on signal strength, traffic, and radio resource measurements obtained from the live network. Such measurements are generated by the UE and the eNB, and later uploaded to the OSS in the form of connection traces. The inputs to the algorithm are CTR files with the following events: the UE Traffic Report and the Radio Resource Control (RRC) Measurement Report. The latter can be configured to be reported periodically or event-triggered. In the former case, each connection can comprise many records of this event. A measurement report is said to belong to a given connection if it is reported during such connection. Figure 3 illustrates an example of how these events are distributed within a call. A call starts with a connection setup and ends with a connection release. While in a call, the UE may perform a handover between cells. The term "connection" refers to the time spent by a UE in a cell, until a handover is executed or the call is finished. Therefore, a call may contain more than one connection. A UE traffic event is reported at the end of each connection, while RRC measurements are generated periodically along a connection. Tables 1 and 2 present the most relevant parameters in the UE Traffic Report and RRC Measurement Report events [35].
Figure 4 shows the flow diagram of the proposed algorithm. In stage 1, the time distribution of cell load is calculated per cell as the percentage of used Resource Elements (REs) during a fixed time period based on the information in UE Traffic Report events. In stage 2, the average SINR per connection is calculated as the ratio between the average received power from the serving cell and the sum of the interference power plus background noise (in linear units). To estimate interference levels, Reference Signal Received Power (RSRP) samples in RRC Measurement Report events are combined with cell load estimates computed in stage 1. In stage 3, the average SE per connection is calculated as the ratio between the total carried traffic volume and the amount of used REs based on the information in UE Traffic Report events. In stage 4, a fitting curve is built relating average SINR and average SE estimates from stages 2 and 3. All these operations are described in more detail in the following paragraphs.

Stage 1: Estimation of Cell Load Distribution over Time.
In this work, cell load is defined as the ratio of REs occupied for transmission. In the network, cell load changes every Transmission Time Interval (TTI). As the number of REs used per connection is only available at the end of the connection, cell load cannot be calculated on a TTI basis. Alternatively, cell load is estimated with a lower resolution by defining a fixed time granularity of several TTIs. Then, the total number of REs used by a connection is evenly distributed across the equally spaced time intervals from the start to the end of the connection. First, the average resource usage rate (in RE/s) in cell $c_i$ from the $i$th connection (where $c_i$ is the serving cell of that connection) is computed as

$$r_i = \frac{R_i}{t_i^{(1)} - t_i^{(0)}}, \tag{4}$$

where $R_i$ is the total amount of resources used by the $i$th connection (in REs), and $t_i^{(0)}$ and $t_i^{(1)}$ are the start and end times of the $i$th connection (in s), respectively, as illustrated in Figure 5.

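Second, the amount of resources used by the $i$th connection in the $n$th time interval, starting at time $t_n$, is obtained from the overlap between the connection and the interval as

$$R_i(t_n) = r_i \cdot \max\left(0,\ \min\left(t_i^{(1)},\, t_n + \Delta t\right) - \max\left(t_i^{(0)},\, t_n\right)\right), \tag{5}$$

where $r_i$ is the resource usage rate for the $i$th connection, $t_i^{(0)}$ and $t_i^{(1)}$ are the start and end points of the $i$th connection (in s), and $\Delta t$ is the sampling period defining the time resolution (in s).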
Finally, the sampled average load distribution of cell $c$, $L(c) = \{L(c, t_1), \ldots, L(c, t_{N_t})\}$, is calculated as the ratio between the sum of resources used by connections and the total amount of available resources in that cell in the $n$th period, as

$$L(c, t_n) = \frac{1}{\rho(c)} \cdot \frac{\sum_{i \,:\, c_i = c} R_i(t_n)}{N_{RE}\, \Delta t / T_{slot}}, \qquad N_{RE} = N_{sc}\, N_{RB}\, N_{sym}, \tag{6}$$

where $R_i(t_n)$ is the amount of resources used by the $i$th connection in the $n$th period, $N_{RE}$ is the total number of available REs per time slot, $N_{sc}$ is the number of subcarriers per Physical Resource Block (PRB), set to 12, $N_{RB}$ is the number of PRBs in the cell, given by the system bandwidth, $N_{sym}$ is the number of OFDM symbols per slot (7 or 6 for normal or extended cyclic prefix, resp.), $T_{slot}$ is the slot duration (0.5 ms), and $\Delta t$ is the time interval duration (i.e., the sampling period). Also, $1/\rho(c)$ is a correcting factor, where $\rho(c)$ represents the traced connection ratio configured by the operator. If all connections are traced in the network (i.e., UE fraction is 100%), then $\rho(c) = 1$.
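Stage 1 can be sketched in a few lines. The Python function below is an illustrative reading of the procedure, not the authors' implementation (which was written in R): the function name, the tuple layout of `connections`, and the 0-based interval index are assumptions. It spreads the REs of each connection evenly over the sampling intervals it overlaps and normalizes by the REs available per interval, scaled by the traced connection ratio.

```python
from collections import defaultdict


def cell_load(connections, n_rb, delta=1.0, n_sc=12, n_sym=7,
              t_slot=0.5e-3, traced_ratio=1.0):
    """Return {cell: {interval index: load}}.

    connections: iterable of (cell, t_start, t_end, used_res) tuples,
    with times in seconds and used_res in REs (hypothetical layout).
    Interval n covers [n * delta, (n + 1) * delta).
    """
    used = defaultdict(lambda: defaultdict(float))
    for cell, t0, t1, res in connections:
        if t1 <= t0:
            continue  # skip zero-length or malformed connections
        rate = res / (t1 - t0)  # average usage rate in RE/s
        n = int(t0 // delta)
        while n * delta < t1:
            # Time the connection spends inside interval n
            overlap = min(t1, (n + 1) * delta) - max(t0, n * delta)
            used[cell][n] += rate * overlap
            n += 1
    # REs available in one sampling interval
    available = n_sc * n_rb * n_sym * delta / t_slot
    return {cell: {n: u / (traced_ratio * available) for n, u in per.items()}
            for cell, per in used.items()}
```

Intervals with no traced connection simply do not appear in the output, so a caller would treat a missing entry as zero load.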

Stage 2: Estimation of Average SINR per Connection.
The SINR is defined as the ratio between the received power from the serving cell and the sum of the interference power (i.e., received power from adjacent cells) plus background noise (in linear units). In LTE, different REs transmit different signals, so not all REs in a resource block experience the same SINR. 3GPP specifications do not standardize how SINR is measured, so the actual definition is vendor-specific. It can be measured in data REs or in reference signal REs. However, SINR is generally calculated on the REs carrying reference signals [36]. In our case, the average SINR (in natural units) for the $j$th measurement report can be estimated as

$$\gamma_j = \frac{\mathrm{RSRP}_j^{(1)}}{\sum_{k=2}^{N_{c,j}} L\left(c_j^{(k)},\, t_j^{(MR)}\right) \mathrm{RSRP}_j^{(k)} + N_0}, \tag{7}$$

where $\mathrm{RSRP}_j^{(1)}$ is the RSRP (in mW) of the serving cell in the $j$th measurement report, $\mathrm{RSRP}_j^{(k)}$ is the RSRP (in mW) of the $k$th neighbor cell in the $j$th measurement report, $N_{c,j}$ is the number of cells in the $j$th measurement report, $L(c_j^{(k)}, t_j^{(MR)})$ is the average load of the $k$th interfering cell in report $j$ at the time interval $t_j^{(MR)}$ when the $j$th measurement report was sent, and $N_0$ is the background noise (in mW).
As previously stated, the UE may send more than one RRC Measurement Report per connection. Therefore, it is necessary to obtain an average SINR per connection. The average SINR for the $i$th connection is obtained as

$$\bar{\gamma}_i = \frac{1}{N_{MR,i}} \sum_{j \,:\, i(j) = i} \gamma_j, \tag{8}$$

where $i(j)$ is the connection to which the $j$th measurement report belongs and $N_{MR,i}$ is the number of measurement reports in the $i$th connection.
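The two steps of stage 2 can be illustrated with the sketch below. It is a hedged example rather than the authors' code: function names and the data layout are hypothetical, the serving cell is assumed to be the first entry of each report, and a neighbor cell with no load estimate is assumed to contribute no interference.

```python
import math


def report_sinr(rsrp_mw, cells, load_at_report, noise_mw):
    """Linear SINR for one measurement report.

    rsrp_mw[0] / cells[0] refer to the serving cell (assumption);
    load_at_report maps cell id -> load in [0, 1] at the report time.
    """
    interference = sum(load_at_report.get(cell, 0.0) * power
                       for cell, power in zip(cells[1:], rsrp_mw[1:]))
    return rsrp_mw[0] / (interference + noise_mw)


def connection_sinr(report_sinrs):
    """Average (linear) SINR over the reports of one connection."""
    return sum(report_sinrs) / len(report_sinrs)


def to_db(value):
    """Convert a linear power ratio to dB."""
    return 10.0 * math.log10(value)
```

Averaging is done in linear units and the conversion to dB is applied afterwards, in line with the text, where the per-report SINR is expressed in natural units.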

Stage 3: Estimation of Average SE per Connection.
SE is defined as the data rate that can be transmitted over a given bandwidth in a communication system. Based on the UE Traffic Report, the average SE (in bps/Hz) in REs assigned to the $i$th connection can be estimated as

$$\mathrm{SE}_i = \frac{8\, V_i}{R_i\, B_{sc}\, T_{slot} / N_{sym}}, \tag{9}$$

where $V_i$ is the traffic volume in the $i$th connection (in bytes), $R_i$ is the total amount of resources used in the $i$th connection (in REs), $B_{sc}$ is the subcarrier bandwidth (15 kHz in LTE), $N_{sym}$ is the number of OFDM symbols per slot (7 or 6 for normal or extended cyclic prefix, resp.), and $T_{slot}$ is the slot duration (0.5 ms). Note that (9) is restricted to data REs. Thus, it considers the loss of SE due to cyclic prefix, but does not take into account other factors such as (a) the limited bandwidth occupancy to satisfy the Adjacent Channel Leakage Ratio (ACLR), (b) the pilot overhead due to CRSs, and (c) the dedicated and common control channel overhead. All these factors can be added later if needed for planning purposes, based on the values suggested in [19].
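As a quick check of the per-connection SE estimate, the sketch below divides the carried bits by the time-bandwidth product of the used REs, assuming each RE occupies one subcarrier for one OFDM symbol period (slot duration divided by symbols per slot). The function and argument names are hypothetical.

```python
def connection_se(volume_bytes, used_res, b_sc=15e3, n_sym=7, t_slot=0.5e-3):
    """Average spectral efficiency (bps/Hz) of one connection.

    volume_bytes: traffic volume carried in the connection (bytes)
    used_res: number of data REs used by the connection
    """
    # Time-bandwidth product of one RE (Hz * s), cyclic prefix included
    re_time_bandwidth = b_sc * t_slot / n_sym
    return 8.0 * volume_bytes / (used_res * re_time_bandwidth)
```

For instance, 1000 REs carrying 4.8 information bits each (64 QAM with rate 4/5) correspond to 600 bytes and give 4.48 bps/Hz with normal cyclic prefix, which matches the product 6 · 14/15 · 4/5 used later in the analysis setup before the bandwidth efficiency factor is applied.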

Stage 4: Construction of Link-Level Mapping Curves.
The SINR-to-SE curve is computed by regression analysis of the scatter plot built with the average SINR, $\bar{\gamma}_i$, and average SE, $\mathrm{SE}_i$, estimated on a per-connection basis. Depending on the aggregation level, the output of the regression analysis is a single mapping curve for the whole network or a set of curves constructed on a per-cell basis. In principle, any regression method could be applied as long as it provides good fitting. Previous studies suggest a logarithmic fitting, based on the expression of the Shannon bound [19], or an arctangent-based approach, based on empirical results [37]. In this work, a simple polynomial regression from logarithmic SINR values is used for simplicity and flexibility, as it is included in most statistical analysis packages and does not presume any shape of the mapping function.
Several factors may add dispersion to the SINR-to-SE estimates, causing two connections with the same average SINR to have different average SE. A first reason is instantaneous SINR fluctuations due to fading, multipath, and other propagation phenomena, which are not reflected in SINR averages. A second reason is the limited time resolution of RRC measurements, which may cause the average SINR estimate to not reflect the true average SINR of the connection. Another reason is the service type, as the LA scheme requires a certain time to converge, which might not be available in short connections. All these factors degrade regression performance.
To increase the robustness of regression, several actions are taken. To improve the accuracy of SINR measurements per connection, regression analysis is carried out over connections with more than 1 RRC Measurement Report. Likewise, piecewise regression is used to prevent the most populated SINR values from dominating the regression equation. Thus, SINR measurements are divided into bins of 1 dB, centered at integer SINR values (i.e., . . ., −3, −2, −1, 0, . . . dB). Then, a single SE value is computed per bin by averaging the SE of all connections in the bin. Bins with fewer than 50 samples (connections) are discarded for the regression analysis.
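The binning and fitting steps can be sketched as follows. This is an illustrative, dependency-free Python version, not the R code used in this work: the function names are hypothetical, and the polynomial fit solves the normal equations directly, which is adequate for the low polynomial orders and small number of bins considered here but is not numerically robust for high orders.

```python
import math
from collections import defaultdict


def bin_average(sinr_db, se, min_samples=50):
    """Average SE per 1-dB SINR bin centred at integer dB values.

    Returns a sorted list of (bin centre, mean SE); sparse bins are dropped.
    """
    bins = defaultdict(list)
    for x, y in zip(sinr_db, se):
        bins[math.floor(x + 0.5)].append(y)  # nearest integer dB
    return sorted((b, sum(v) / len(v))
                  for b, v in bins.items() if len(v) >= min_samples)


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    m = degree + 1
    # Augmented normal-equation matrix [X^T X | X^T y]
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[r][m] / a[r][r] for r in range(m)]
```

A typical use would be to call `bin_average` on the per-connection estimates and then pass the resulting bin centres and mean SE values to `polyfit`, so that every SINR bin has the same weight in the fit regardless of how many connections it contains.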
It should be pointed out that the output of the method is a curve relating the average SINR and SE of a connection. This is the information needed by a network planning tool, where SINR and SE are calculated per location in the form of averages. Thus, the resulting curve might differ from the curves used in system-level simulators, where the link-layer model considers instantaneous SINR and SE values.

Results
The proposed method is tested with trace datasets taken from a live LTE network. For clarity, the analysis methodology is first described and results are presented later. Finally, implementation issues are discussed.

Analysis Setup.
Two trace datasets are used in the analysis, taken from different networks (referred to as Network 1 and Network 2). Table 3 describes their main parameters. The bulk of the analysis is carried out on Network 1, and Network 2 is only used to check the impact of the network configuration and service mix. Even if traces include both downlink and uplink measurements, the analysis presented here is restricted to the downlink.
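The proposed trace-based approach to derive the SINR-SE curves is compared with a theoretical bound in the absence of a dynamic system-level simulator that captures the diversity of services, radio environments, and features in the real network. Specifically, the following approaches are evaluated:
(a) TSB-MIMO: modified truncated Shannon bound adjusted for best fit to link-level simulation curves for 2 × 2 multiple-input multiple-output antenna configuration with Alamouti Space Time Coding under Typical Urban (TU) channel at 3 km/h and Proportional Fair-Time Dependent Packet Scheduling (PF-TDPS) [38]; it corresponds to transmission mode 3 (open-loop spatial multiplexing) with Rank 2; specifically, $\mathrm{BW}_{\mathrm{eff}} = 0.927$, $\eta = 0.62$, $\gamma_{\mathrm{eff}} = 1.4$, $\gamma_{\min} = -10$ dB, and $\mathrm{SE}_{\max} = 4.15$ bps/Hz;
(b) TB-N: the proposed trace-based approach applied to the complete set of traces, resulting in a single mapping curve valid for the whole network;
(c) TB-C: the proposed trace-based approach applied to the traces of a single cell, resulting in a mapping curve per cell.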
Method assessment is carried out by comparing the shape of the SINR-SE curves. For a fair comparison, the SE of all methods is restricted to data REs. Thus, the bandwidth efficiency parameter in TSB-MIMO only considers the loss due to cyclic prefix; that is, $\mathrm{BW}_{\mathrm{eff}} = 0.927$ for long prefix. Hence, the maximum achievable SE in antenna configurations with 1 spatial stream, $\mathrm{SE}_{\max}$, corresponding to the highest MCS (i.e., 64 QAM, rate 4/5), is 6 bits/symbol ⋅ 14 symbols/(ms⋅subc.) ⋅ 1 subc./15 kHz ⋅ 4/5 ⋅ 0.927 = 4.15 bps/Hz.
Figure 6 shows the simplified scatter plot obtained by discretizing SINR values and computing a piecewise regression of order 0 (denoted as piecewise regression). To aid comparison, the curves obtained by polynomial regression on the original and simplified data are also superimposed (denoted as original and piecewise, resp.) and the x-axis is restricted to the range of [−10, 30] dB. From the figure, it is clear that the regression curve derived from the points computed by piecewise regression better captures the average SE trend. This is confirmed by the large value of $R^2$.
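The value of the maximum SE quoted above can be reproduced with simple arithmetic. The snippet below only restates the figures given in the text (variable names are ours); it relies on the fact that bits per millisecond per kilohertz is numerically the same as bps/Hz.

```python
bits_per_symbol = 6          # 64 QAM
symbols_per_ms = 14          # OFDM symbols per ms and subcarrier
subcarrier_khz = 15          # subcarrier bandwidth in kHz
code_rate = 4 / 5            # highest MCS considered in this work
bw_eff = 0.927               # bandwidth efficiency (cyclic prefix loss only)

# bits/ms/kHz is numerically equal to bps/Hz
se_max = bits_per_symbol * symbols_per_ms / subcarrier_khz * code_rate * bw_eff
se_max_rounded = round(se_max, 2)
```

Without the bandwidth efficiency factor, the same product gives 4.48 bps/Hz.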
To show the benefit of using real traces, Figure 7 compares the TSB-MIMO (theoretical) and TB-N (practical) approaches. For TB-N, 95% confidence intervals for the average SE in each SINR band are included. Note that both methods result in a single curve for the whole network. It is observed that SE values in traces are consistently below the maximum theoretical values suggested by TSB-MIMO. This gives clear evidence of the need for computing SINR-SE curves from real connection traces.
The reasons for such differences are the link adaptation process and the transport protocol. In [28], it was shown that connection length has a strong impact on user throughput. Short connections, prevailing in current LTE networks, suffer from reduced user throughput. This is due to the slow convergence of the Outer Loop Link Adaptation (OLLA) process and the slow-start feature of the Transmission Control Protocol (TCP), which causes throughput to ramp up gradually. Figure 8 confirms this observation by showing the SE curve obtained by TB-N for short and long connections. In this work, a connection with fewer than 20 ACKs + NACKs is classified as a short connection. Conversely, a connection with more than 100 ACKs + NACKs is classified as a long connection. In the figure, it is observed that the maximum SE for long connections is more than three times larger than for short connections (1.45 versus 0.45 bps/Hz). This is mainly due to OLLA convergence issues, as traffic burstiness caused by TCP ramp-up should not affect the selected MCS. By comparing Figures 7 and 8, it can be deduced that, even for long connections, the theoretical curve is a loose upper bound for the average SE with good radio link conditions.
To show the benefit of computing a curve per cell, Figure 9 compares the output of the trace-based approach executed on a cell basis, TB-C, for two cells in the system. The network-wide curve obtained by TB-N is also included as a reference. It is observed that SE values may differ from cell to cell by up to 150% for the same SINR value. A closer analysis (not presented here) shows that this is due to the fact that the ratio of long connections is 41% in Cell A but only 20% in Cell B. Recall that connection length has a strong impact on user throughput due to OLLA convergence issues and the TCP slow-start feature. Thus, the connection length distribution in a cell strongly influences the spectral efficiency curve measured for that cell. The observed differences justify the need for deriving SE curves on a cell basis.
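The connection length classification used in this analysis can be written as a small helper. The thresholds are those stated in the text; the function name and the "medium" label for connections between both thresholds are assumptions added for illustration, since the text does not name that intermediate group.

```python
def classify_connection(n_ack, n_nack, short_below=20, long_above=100):
    """Classify a connection by its number of ACKs + NACKs."""
    total = n_ack + n_nack
    if total < short_below:
        return "short"
    if total > long_above:
        return "long"
    return "medium"  # hypothetical label for the in-between group
```

Splitting the per-connection SINR and SE estimates with such a rule before regression yields separate curves for short and long connections.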
Finally, Figure 10 compares the results of the trace-based method in the two datasets from different networks. For brevity, the analysis is restricted to the network-wide solution for long connections. In the figure, even if trends are similar, Network 1 has a lower SE than Network 2 for the same SINR. This might be due to the different service mix in the two networks. To back up this statement, a deeper analysis of radio network measurements is done. On the one hand, traces show that 50% of long connections in Network 1 have fewer than 300 ACKs + NACKs, compared to only 10% in Network 2. Thus, the probability that OLLA has reached steady state before the end of a connection is higher in Network 2. On the other hand, network counters show that the percentage of active TTIs where the user buffer is emptied (i.e., last TTI transmissions [34]) is 41% for Network 1 and only 25% for Network 2. In last TTI transmissions, some REs in the PRBs assigned to the user might not carry data because there is not enough data, decreasing the link SE. Thus, the number of underutilized resources for this reason should be larger in Network 1. Both effects indicate that traffic in Network 1 is more bursty than in Network 2. These differences justify the need for deriving a specific SE curve for each network.

Implementation Issues.
The method is designed as a centralized scheme that can be integrated in a commercial radio network planning tool. Its low computational load makes it a good candidate for improving measurement-based replanning algorithms. The worst-case time complexity is linear in the product of the number of cells and trace collection periods. In practice, the most time-consuming process is parsing and synchronizing the traces, which can be done with trace processing tools provided by OSS vendors. The rest of the method can be developed in any programming language (in this work, R [39]). Specifically, the total execution time for the Network 1 dataset on a laptop with a 2.6-GHz quad-core processor is less than 780 s (3 s per 1000 connections).

Conclusions
Link spectral efficiency is a key parameter when designing and optimizing cellular networks. Unfortunately, such

(a) Configuration Management data (CM), consisting of current network parameter settings, (b) Performance Management data (PM), consisting of counters reflecting the number of times some event has happened per network element and Reporting Output Period (ROP), (c) Data Trace Files (DTFs), consisting of multiple records (known as events) with radio-related measurements stored when some event occurs for a single User Equipment (UE) or a base station.

Figure 5: Temporal distribution of resources in a connection.
(a) TSB-MIMO: modified truncated Shannon bound adjusted for best fit to link-level simulation curves for 2 × 2 multiple-input multiple-output antenna configuration with Alamouti Space Time Coding under Typical Urban (TU) channel at 3 km/h and Proportional Fair-Time Dependent Packet Scheduling (PF-TDPS) [38]; it corresponds to transmission mode 3 (open-loop spatial multiplexing) with Rank 2; specifically, $\mathrm{BW}_{\mathrm{eff}} = 0.927$, $\eta = 0.62$, $\gamma_{\mathrm{eff}} = 1.4$, $\gamma_{\min} = -10$ dB, and $\mathrm{SE}_{\max} = 4.15$ bps/Hz;
(b) TB-N: the proposed trace-based approach applied to the complete set of traces, resulting in a single mapping curve valid for the whole network;

Figure 6: Influence of SINR discretization in trace-based approach.
confirms this observation by showing the SE curve obtained by TB-N for

Figure 7: Comparison between theoretical and real spectral efficiency curves.

Figure 8: Impact of connection length on spectral efficiency.

Figure 10: Impact of selected network on spectral efficiency.

Table 1: Parameters in UE Traffic Report event.
$t_i^{(1)}$: End time of $i$th connection
$c_i$: Unique cell identifier of cell serving $i$th connection
In the tables, subindex $i$ refers to the traffic report (i.e., connection), and subindex $j$ refers to the RRC Measurement Report. In Table 1, it is worth noting that $R_i$ only counts REs used for user data transmission in the Physical Downlink Shared CHannel (PDSCH) and thus excludes REs used for Cell Reference Signals (CRS) and other signaling information (e.g., Physical Downlink Control CHannel, PDCCH).

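Table 2: Parameters in RRC Measurement Report event.
$\mathrm{RSRP}_j$: Vector with RSRP values in $j$th measurement report associated with vector of cell identifiers $c_j = \{c_j^{(1)}, \ldots\}$, so that $\mathrm{RSRP}_j^{(k)}$ is the RSRP measured from cell $c_j^{(k)}$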