Hybrid ARIMA and Neural Network Modelling Applied to Telecommunications in Urban Environments in the Amazon Region

This study explores the use of hybrid Autoregressive Integrated Moving Average (ARIMA) and Neural Network modelling to estimate the electric field along vertical paths (buildings) close to Digital Television (DTV) transmitters. The work was carried out in Belém, one of the most urbanized cities in the Brazilian Amazon, and includes a case study of the application of this modelling to the subscenarios found in Belém. Its results were compared with ITU-R Recommendation P.1546-5 and proved to be better in every subscenario analysed. In the worst case, the estimate of the model was approximately 65% better than that of the ITU. We also compared this modelling with a classic modelling technique: the Least Squares (LS) method. In most situations, the hybrid model achieved better results than the LS.


Introduction
The growth in the population density of big cities has made residential and commercial buildings a common feature of the urban landscape, as they optimize the spatial exploitation of a terrain. Sometimes, though, these buildings are located near transmitter towers. Depending on the operating frequency and received power intensity of these transmitters, there is a real concern about the excessive exposure of the general public to nonionizing radiation (NIR), which in particular can affect citizens, building constructors, and supervisory agencies.
This article sets out a hybrid modelling system to estimate the propagation of the electric field along vertical paths close to transmitter towers that operate around 600 MHz. We chose this frequency range because it is adopted by Brazilian Digital Television (DTV) and because, at least in Belém and similar cities, VHF/UHF transmitters operating at high power are among the most significant sources of outdoor/indoor NIR exposure.
Finding a definitive solution to this problem, for example, fitting frequency filters to the walls of the buildings or relocating the transmitter towers, requires great political will and comes at a high financial cost. In the meantime, people are being continuously exposed to NIR in their own homes and offices. This subject is still under discussion and has led to various approaches, such as those in [1][2][3][4][5][6]. Hamiti et al. [1] describe a study of exposure to NIR involving multiple frequencies in the city of Prishtina. Koprivica et al. [2] carry out a statistical analysis of electromagnetic radiation, but are restricted to GSM frequencies in Serbia. Bernhardt [3] addresses general subjects related to NIR. Belpomme et al. [4] examine some of the expected effects of long-term exposure to NIR. Kuzniar et al. [5] conduct a study of the biological effects of short-term exposure to NIR with three frequencies (50 Hz, 2.1 GHz, and 5.8 GHz) in mammalian cells. Bernabò et al. [6] carry out a survey of the scientific literature with regard to the effects of NIR exposure on fertility (the authors examined 104 papers published from 1989 to 2017).
There have been suspicions about the health effects of continuous NIR exposure on these people, which have arisen because of the uncertainty of the results obtained so far. In light of this, the aim of this study is to provide a degree of comfort to the general public by helping to estimate satisfactorily the level of exposure to NIR (i.e., electric field exposure) in vertical path scenarios. With the aid of this information and on the basis of the exposure safety standards for NIR, ordinary citizens can check if their level of exposure to NIR (at a selected frequency) exceeds the permitted limit, which may mean their health is at risk. The modelling proposed in this article is designed to help involved parties to make an assessment of their risk of exposure to NIR. The system used in this work is a hybrid ARIMA and neural network model. This type of model is well known for its ability to address problems that have both linear and nonlinear components in their mathematical formulation, since single ARIMA models are widely used to represent linear problems and neural network models can represent nonlinear problems satisfactorily. The purpose of this hybrid approach is to use the advantages of both modelling techniques to tackle a single problem.
Neural network modelling is a technique inspired by how the human brain operates, with regard to the synaptic connections of its neurons. It is widely used (either alone or combined with other techniques) in many areas, including for the solution of electromagnetic propagation problems.
This wide range of applications is the most important advantage of a neural network.
Models such as ARIMA, that is, based on the time series theory, are widely used in areas such as economics and biology, as well as for any problem that can be modelled using a series of time-based data as its main variable. However, this type of modelling is rarely used to meet the requirements of electromagnetic propagation, which is the main problem in this work.
Some studies closely related to the model set out in this work are [7][8][9][10][11][12]. Wang et al. [7] analyse the phenomenon of tuberculosis incidence by means of a hybrid ARIMA model and nonlinear autoregressive neural network and compare its estimates to those of a single ARIMA model. In [8], a hybrid seasonal ARIMA (SARIMA) and neurofuzzy system network is used to predict the monthly inflow of water, as it is an extremely important variable in water resource planning.
This model was compared with a combined SARIMA and neural network model and achieved better results for its purpose. Naveena et al. [9] employ a joint hybrid ARIMA and neural network model to predict the price of "Robusta Coffee" in India and compare this with single ARIMA and neural network models. On the basis of the chosen evaluation metric, the hybrid model predicted the prices more accurately. The main benchmark for the model used in this work is [10]. A large number of studies which use similar hybrid models have been influenced by [10]. Finally, Khandelwal et al. and Aladag et al. [11,12] provide variations of the methodology devised by [10], applied to literature-based data.

NIR Problem and General Methodology.
This particular problem of human exposure to NIR was analysed in the central zone of Belém (located in the Brazilian state of Pará, in the Brazilian Amazon region), where the transmitters of its main television/radio stations are located. Some of these towers are as close as 50 m to residential buildings. An aggravating factor is that, instead of having repeater stations throughout the city operating at a lower power, there is only one transmitter for each station, which usually operates at an excessively high power. For instance, the highest transmission power level in Belém is around 20 kW, and this tower is located in one of the most urbanized neighbourhoods of the city, where a residential building almost as high as the tower is situated approximately 100 m from it. Table 1 shows the limits of exposure to NIR for the general public adopted in Brazil. In this work, we take into account the values of electric field exposure limits, since the measurements were of electric field intensity. It should be noted that the Brazilian limits for NIR exposure follow the public exposure guidelines defined by the International Commission on Non-Ionizing Radiation Protection (ICNIRP) for the frequency range of 8.3 kHz to 300 GHz.
As the transmitter analysed in this work operates at 600 MHz, the electric field intensity exposure limit equals 33.6805 V/m. This is the limit for general public exposure and should not be exceeded by the mean value of the electric field over a period of 6 minutes. The assessment of human exposure to NIR at any given point can be made by using the quadratic relationship following [13]:

$$ER = \sum_{i} \left( \frac{E_{m,i}}{E_{L,i}} \right)^{2}, \tag{1}$$

where $E_{m,i}$ is the measured electric field at the given point at frequency $i$ and $E_{L,i}$ is the exposure limit for frequency $i$. Equation (1) shows that the effect of exposure to NIR is cumulative if more than one source is present. As only one frequency is considered in this work, the sum in equation (1) becomes a single quadratic ratio for each point, as variable $i$ has a single value for every measured point (see equation (2)). The value of the ER ratio must not be higher than unity, that is, $ER \leq 1$. If the ER value exceeds 1, it means there is an excessive exposure to NIR in the considered location:

$$ER = \left( \frac{E_{m}}{E_{L}} \right)^{2} \leq 1. \tag{2}$$

The studied scenario and its measurement campaigns resulted in three datasets, namely, "data1," "data2," and "data3." They represent a single scenario: a building close (less than 250 m) to a transmitter tower operating at approximately 600 MHz. These three datasets can be regarded as three subscenarios, that is, three similar situations. Figure 1 shows a representation of the studied scenario. All three datasets were obtained through measurement campaigns and consist of values of electric field intensity (V/m) on every floor of each measured building. The receiver equipment was always positioned so that it directly faced the transmitter tower, and the data was acquired for 6 minutes, as recommended by [13], which resulted in a series of values of the electric field for each floor. The mean value of each series was taken to represent the received intensity of the electric field on each floor. Table 2 provides information on the variables in Figure 1.
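The exposure check described above can be illustrated with a short sketch. It assumes that the general-public reference level between 400 and 2000 MHz is 1.375·√f V/m (f in MHz), which reproduces the 33.6805 V/m limit quoted for 600 MHz; the function names are illustrative and are not taken from the original work. The value 10.6899 V/m is the largest measurement reported later in this article.

```python
import math


def e_field_limit_v_per_m(freq_mhz):
    """Assumed general-public E-field reference level for 400-2000 MHz."""
    if not 400.0 <= freq_mhz <= 2000.0:
        raise ValueError("formula assumed valid only for 400-2000 MHz")
    return 1.375 * math.sqrt(freq_mhz)


def exposure_ratio(measured_fields, limits):
    """Sum of squared measured/limit ratios, one term per frequency."""
    return sum((e_m / e_l) ** 2 for e_m, e_l in zip(measured_fields, limits))


limit_600 = e_field_limit_v_per_m(600.0)      # about 33.68 V/m
er = exposure_ratio([10.6899], [limit_600])   # single-frequency case
exceeds_limit = er > 1.0
```

For the 10.6899 V/m value, the squared ratio is about 0.10, while the plain (non-squared) ratio 10.6899/33.6805 is about 0.3174; both are below 1.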
The proposed model was developed from a hybrid methodology that combines an ARIMA model and a neural network, inspired by [10][11][12]. Let $E$ be the value of the electric field intensity. This can be written as the sum of two components, one containing its linear part ($L$) and the other containing its nonlinear part ($N$), that is,

$$E = L + N. \tag{3}$$

In this work, the linear component $L$ is first adjusted by an ARIMA model. Secondly, the difference between the measured data ($E$) and the ARIMA estimate $\hat{L}$, i.e., the residuals of the ARIMA estimates, is adjusted by the neural network.
Datasets representing other scenarios (different from those found in Belém) must be acquired to ensure that the model operates effectively in these new scenarios, as it is an empirically based model. New datasets can be acquired through measurement campaigns and/or simulations. These are some of the continuous upgrades planned for this work, as described in Section 4. Another advantage of this model is that it can be deployed to address a wide range of problems, as it can represent both linear and nonlinear systems. The necessary calculations and programs were carried out in MATLAB [14], by means of internal functions, both for the ARIMA model and for the adjusted neural network.

ARIMA Fitting Methodology.
The ARIMA adjustment was made by adopting the usual strategies from time series theory, mainly autocorrelation function (ACF) and partial autocorrelation function (PACF) analysis of the original, or (somehow) transformed, time series, as in [15]. The neural network is a radial basis network with two layers. It should be noted that, when using the ARIMA model, the (usual) variable "time" is replaced with the "height from ground" variable, which characterizes a vertical path. In other words, it is assumed that the electric field intensity on one floor of a building is determined by the values from the floors below (the exact number of values depends on the order of the model). The standard Cartesian coordinate system is used, with the height on the y-axis (zero at the base and increasing as it rises).

The analysis and the results of this work were divided into two groups: (1) Original Measured Data and (2) Interpolated Measured Data. As the names suggest, the results of Group 1 were obtained from the datasets with their original number of samples. With regard to the second group, the results were obtained after adjusting the model to the interpolated measured data. We did this to increase the number of samples of each measured dataset, which gives the ARIMA model more samples to work with and thus refines the linear fitting. We used a shape-preserving piecewise cubic interpolation (SPPCI) to increase the number of samples of each dataset from 25 (datasets "data3" and "data1") or 31 (dataset "data2") to 200. In addition, the interpolated group of datasets was able to simulate a "nonstop" measurement campaign, which is more desirable than a "stop-and-go" campaign. Instead of having to stop at each floor of a building to carry out the measurement, we could use a device that allows continuous measurements to be made (a receiver attached to a drone moving at constant speed, for example) with no stops. However, our measurement campaigns were of the "stop-and-go" type.

Figure 1: Representation of the studied scenario. The abbreviation Tx stands for "transmitter tower." Variable $d$ is the horizontal distance between the measured building and the transmitter tower (not to be confused with the distance of the receiver equipment from the transmitter source). Variable $h_1$ is the height of the transmitter tower and $h_2$ represents the height of the measured building.

There are some steps that have to be taken to adjust the ARIMA model. Firstly, it is necessary to determine if a nonlinear transformation (e.g., a logarithm transformation) of the original series is needed to stabilize its variance. Secondly, the tendency of the original data must be calculated and isolated, if necessary, so that the adjustment of the series can proceed without its tendency. Then, on the basis of the ACF and PACF analysis, it must be decided whether the series should be differenced. Finally, the type/order of the ARIMA model is obtained once the ACF and PACF graphs behave appropriately after the necessary interventions in the original series. All these stages follow the standard approach of [15] to modelling with time series, in which the series to be adjusted must be stationary (or "close" to it). The graphs of the original series for the three datasets are shown in Figure 2, while the graphs for the interpolated datasets are shown in Figure 3.
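The resampling from 25 (or 31) samples to 200 can be sketched as follows. This is a simplified monotone cubic Hermite scheme in the spirit of shape-preserving piecewise cubic interpolation, not the exact MATLAB routine: interior slopes use a weighted harmonic mean of the neighbouring secants, while the end slopes are simply set to the end secants (an assumption made here for brevity). The heights and field values are synthetic placeholders, not measured data.

```python
import math
from bisect import bisect_right


def shape_preserving_slopes(x, y):
    """Node slopes that avoid overshoot between neighbouring samples."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]  # simplified end condition (assumption)
    for k in range(1, n - 1):
        if d[k - 1] * d[k] > 0.0:  # same sign: weighted harmonic mean
            w1 = 2.0 * h[k] + h[k - 1]
            w2 = h[k] + 2.0 * h[k - 1]
            m[k] = (w1 + w2) / (w1 / d[k - 1] + w2 / d[k])
        # otherwise the slope stays 0 (local extremum or flat segment)
    return m


def hermite_eval(x, y, m, xq):
    """Evaluate the cubic Hermite interpolant at a single point xq."""
    i = min(max(bisect_right(x, xq) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = (1.0 + 2.0 * t) * (1.0 - t) ** 2
    h10 = t * (1.0 - t) ** 2
    h01 = t * t * (3.0 - 2.0 * t)
    h11 = t * t * (t - 1.0)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]


def resample(x, y, n_out):
    """Resample (x, y) onto n_out equally spaced points over [x[0], x[-1]]."""
    m = shape_preserving_slopes(x, y)
    step = (x[-1] - x[0]) / (n_out - 1)
    xs = [x[0] + k * step for k in range(n_out)]
    return xs, [hermite_eval(x, y, m, q) for q in xs]


# Synthetic example: 25 floors, 3 m apart (placeholder values only).
heights = [3.0 * k for k in range(1, 26)]
fields = [2.0 + 0.3 * k + 0.5 * math.sin(k) for k in range(25)]
dense_heights, dense_fields = resample(heights, fields, 200)
```

Because each segment is monotone between its two end samples, the resampled series never leaves the range of the original data, which is the property that makes this family of interpolants attractive for field measurements.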
In the process of achieving the best result for the proposed modelling, we carried out a wide range of tests that followed different stages. The diagram in Figure 4 shows all the stages of the testing procedure followed in this study. The optional stages are applying a nonlinear transformation and differencing the original/interpolated series. These stages are indicated in the diagram by dashed lines.
As the diagram shows, the decision about the interpolation is made right at the beginning of the process. We did this to simulate the two scenarios under consideration (i.e., the "nonstop" and "stop-and-go" measurement campaigns) and to allow a more refined ARIMA estimate, as stated previously. We isolated the tendencies of the series, which is a standard procedure for the ARIMA fitting and was replicated in the LS fitting to make a fair comparison between the fitting methods. We then continued with the ARIMA fitting by analysing whether a nonlinear transformation was necessary (this stage is optional, hence the dashed lines). Following this, we analysed the ACF and PACF of the series under study and decided whether to take difference(s) or proceed to define the ARIMA model. After the ARIMA fitting, we proceeded with the neural network fitting, which is carried out on the difference between the measured data and the ARIMA estimates, i.e., the residuals from the linear estimate. Finally, we obtained the results of the modelling, which were ready for making comparisons and reaching conclusions. The optional stages in the diagram were tested, and the best results were obtained when dataset "data3" was used as a calibration set (for both the interpolated and original datasets) without a nonlinear transformation, but isolating the tendency and fitting the ARIMA model to the series without it. This means that the ARIMA coefficients were calculated by only using the "data3" group as an adjustment/training set. Groups "data1" and "data2" were only used for making comparisons and estimates.
In Sections 3.1 and 3.2, we briefly explain how we carried out the fitting by employing the LS method. This was done to act as a counterpoint to the proposed modelling, as LS is a usual procedure for tackling problems like this. We will later compare the proposed modelling with the LS and with ITU-R Recommendation P.1546-5 in Section 4 of this study. In addition, we explain in detail how we obtained the best linear results in Sections 3.3 and 3.4. As for the neural network fittings, these are described in Sections 3.5 and 3.6.

LS Fitting Methodology.
To ensure a fair comparison with the combined ARIMA and neural network fitting, "data3" will be used as a calibration dataset and the fitting procedure will be carried out on the series without its tendency.
The general problem was represented by choosing a recursive second-order polynomial, given by equation (4). This represents a situation where the current value of $f$, i.e., $f_h$, depends on the two preceding values (based on the adopted metric in the problem, of course), which is similar to how the ARIMA model makes its estimates.
Equation (5) expresses the system of equations representing each value of $f_h$ from $f_3$ to $f_n$. We consider values $f_1$ and $f_2$ to be the first two values of the training dataset. The system in equation (5) is inconsistent, but a least squares solution can be obtained by minimizing the sum of squares of the errors between the estimates and the (theoretically) correct values, i.e., its residuals. The system with the errors is given by equation (6). Our problem, hence, is to minimize $\sum_{k=0}^{n-3} \varepsilon_k^{2}$. It should be stressed that the original problem is linear, but, since the matrices of the system that have to be minimized are dynamic, we have a nonlinear LS problem to solve.
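As an illustration of this kind of fit, the sketch below estimates three coefficients of a second-order recursion by ordinary least squares. It assumes the form f_h = a1·f_(h-2) + a2·f_(h-1) + a3, which is a guess at the structure of equation (4), and it solves the one-step-ahead problem through the normal equations rather than with the Levenberg-Marquardt method used in the study. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_recursive_ls(f):
    """Least-squares estimate of (a1, a2, a3) for the assumed recursion."""
    rows = [[f[h - 2], f[h - 1], 1.0] for h in range(2, len(f))]
    targets = [f[h] for h in range(2, len(f))]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve_linear(ata, atb)


# Synthetic check: build a series from known coefficients and recover them.
true_a1, true_a2, true_a3 = -0.5, 1.2, 0.3
series = [0.0, 2.0]
for _ in range(23):
    series.append(true_a1 * series[-2] + true_a2 * series[-1] + true_a3)
a1, a2, a3 = fit_recursive_ls(series)
```

With noise-free synthetic data the coefficients are recovered almost exactly; with measured data the residuals would not vanish and the fitted values would depend on the chosen formulation.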

Neural Network Fitting.
A neural network can be used for fitting the difference between the ARIMA estimate and the original data. That is, let $\hat{L}$ be the ARIMA estimate of one measured dataset $Z$. We can write

$$Z = \hat{L} + N. \tag{7}$$

In equation (7), the nonlinear term of $Z$, which will be fitted by the neural network, is represented by $N$.
In this study, a radial basis function neural network with two layers is employed. The neurons of the first layer compute an element-wise product between the biases and the weights, and each neuron corresponds to a training point. The neurons of the second layer normalize the values previously found (see the MATLAB documentation on the newgrnn neural network [14]). The activation function of the neural network is a Gaussian function, given by equation (8), with $n$ being the number of inputs in the network. A general diagram of this kind of network is shown in Figure 5.

International Journal of Antennas and Propagation

In this work, there is one neuron in the network for each training point. The number of training points differs between the original datasets and the interpolated datasets. From the diagram in Figure 5, we conclude that the function for the nonlinear estimate $\hat{N}$ is given by equation (9), with $w_j$ being the weights of the neural network. The diagram of the network used in the fitting of the original datasets is shown in Figure 6.
In the original datasets, eight of the twenty-five original samples were used to train the network (as in Figure 6). With regard to the interpolated datasets, we used 24 of the 200 available samples. We proceeded in this way to avoid overtraining the neural network. The boundary and the central samples are always used as training points. The other points are chosen at random. We used 1 as the spread value of the neural network (the standard value in MATLAB). The output of the network is then interpolated (SPPCI) to ensure that the final output vector has the same number of elements as the measured data and the ARIMA vector. Finally, the estimated values from the neural network, $\hat{N}$, are added to the estimated ARIMA values, which gives the final model estimate for $Z$, that is, $\hat{Z} = \hat{L} + \hat{N}$.
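The residual-fitting step can be sketched with a minimal generalized regression network. This is an illustrative reimplementation rather than the MATLAB code used in the study: each training point acts as one Gaussian neuron, the bias is taken as sqrt(ln 2)/spread (about 0.8326/spread) so that a neuron's activation falls to 0.5 at a distance equal to the spread, and the second layer returns the activation-weighted average of the training targets. Using height as the network input, evaluating the network directly at every height (instead of interpolating its output), and all numeric values are assumptions made for the example.

```python
import math


def grnn_predict(train_x, train_t, query_x, spread=1.0):
    """Activation-weighted average of targets (Gaussian radial basis layer)."""
    bias = math.sqrt(math.log(2.0)) / spread
    acts = [math.exp(-((abs(query_x - p) * bias) ** 2)) for p in train_x]
    total = sum(acts)
    if total == 0.0:  # query far from every neuron: use the nearest target
        nearest = min(range(len(train_x)), key=lambda i: abs(query_x - train_x[i]))
        return train_t[nearest]
    return sum(a * t for a, t in zip(acts, train_t)) / total


def hybrid_estimate(heights, measured, linear_est, train_idx, spread=1.0):
    """Final estimate = linear (ARIMA-like) estimate + network fit of residuals."""
    residuals = [z - l for z, l in zip(measured, linear_est)]
    tx = [heights[i] for i in train_idx]
    tt = [residuals[i] for i in train_idx]
    return [l + grnn_predict(tx, tt, h, spread) for h, l in zip(heights, linear_est)]


# Synthetic placeholders: 25 floors, a linear estimate, and a nonlinear remainder.
heights = [3.0 * k for k in range(1, 26)]
linear_est = [2.0 + 0.2 * k for k in range(25)]
measured = [l + 0.4 * math.sin(0.5 * k) for k, l in enumerate(linear_est)]
train_idx = [0, 3, 6, 9, 12, 16, 20, 24]  # boundaries, centre, and five others
final = hybrid_estimate(heights, measured, linear_est, train_idx)
```

With a spread of 1 and training heights several metres apart, each query is dominated by its nearest training point, so the spread value strongly controls how smooth the nonlinear correction is.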

LS Fitting: Results of Original Samples.
We used the classical Levenberg-Marquardt [16] method to solve the system in equation (6). The values of the coefficients $a_j$, $j = 1, 2, 3$, for the original datasets were $a_1 = -0.1904$, $a_2 = 1.0187$, and $a_3 = 0.0002$. The Euclidean norm of the residuals was 0.3302. The graphs of the LS curves for the three original datasets (calibration/training set and comparison sets) are shown in Figure 7. The relative and RMS error values for this LS fitting are displayed in Table 3.
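The error figures reported in the tables can be reproduced along the following lines. The definitions below (root mean square error, mean absolute relative error in percent, and percentage improvement over a reference error) are assumptions, since the exact formulas are not spelled out in the text.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between two equally long series."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def mean_relative_error_pct(measured, estimated):
    """Mean absolute relative error, in percent (measured values nonzero)."""
    n = len(measured)
    return 100.0 * sum(abs(m - e) / abs(m) for m, e in zip(measured, estimated)) / n


def improvement_pct(model_error, reference_error):
    """How much smaller the model error is than a reference error, in percent."""
    return 100.0 * (reference_error - model_error) / reference_error


example_rmse = rmse([0.0, 0.0], [3.0, 4.0])
example_rel = mean_relative_error_pct([2.0, 4.0], [1.0, 5.0])
example_gain = improvement_pct(0.35, 1.0)
```

Under these definitions, a model whose relative error is 0.35 times that of a reference estimate would be described as 65% better.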

LS Fitting: Results of Interpolated Samples.
By analogy, we used a similar polynomial to that in equation (4) (recursive, of the second order) for the interpolated datasets. This polynomial is represented by $\tilde{f}_h$ (the symbol ∼ on other variables throughout this article indicates that they originate from interpolated data). The system analogous to equation (6), after minimization, resulted in the following values of the coefficients for $\tilde{f}_h$: $a_1 = 0.9641$, $a_2 = 1.9590$, and $a_3 = 0.0000$. The curves of each estimate for the interpolated datasets are displayed in Figure 8. The relative and RMS error values for this LS fitting are given in Table 4.
We also tested the LS fitting using higher-order polynomials. In light of the behaviour of the curves, this possibility is almost never considered. In the case of both the original and interpolated dataset curves, the second-order LS fitting obtained good results. However, when the order of $f$ was increased for the interpolated datasets, the LS could not find the optimal solution, no matter where the initial point was. Further comments on this will be made in Section 4.

ARIMA Fitting: Results of Original Samples.
Let $Z_3$ be the mathematical notation for the original measured series of the "data3" dataset. When analysing its "Amplitude vs. Mean" graph (see Figure 9), we decided not to apply a logarithm transformation to $Z_3$ before proceeding with the ARIMA adjustment, even though the slope of the best linear fitting is not zero (as shown in the graph).
In view of the behaviour of all the measurement datasets acquired, we decided to calculate and isolate the tendency of all three series. In other words, fitting is carried out on the $Z_3$ series without its tendency. The tendencies of the $Z_1$ and $Z_2$ series are also isolated, since the model estimates for both datasets are made for these series without their tendencies (i.e., they are reintegrated after the estimates to allow comparisons to be made with the measured data). We examined the measured data without seasonal components and concluded that $Z_3 = L_3 + N_3$, where the expression for $L_3$ involves $T_3$, the tendency of $Z_3$, and $\alpha_3$, the white noise.
We also took account of the polynomial trend line and estimated it by means of the linear least squares method, which resulted in $T_3$. Therefore, the series that must be estimated by the ARIMA model is represented by $Y_3 = Z_3 - T_3$. In this case, the tendency is represented by a first-degree polynomial. The coefficients of the polynomials representing the tendencies of $Z_1$, $Z_2$, and $Z_3$ are shown in Table 5. The studied series without tendency ($Y_3$, $Y_2$, and $Y_1$) are shown in Figure 10. The estimated series will be called $\hat{Y}_j$, $j = 1, 2, 3$. Now, we can analyse the ACF and PACF graphs for $Y_3$ (this is the training set) to determine if their behaviour is consistent with an ARIMA model. Figure 11 shows these graphs.
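Isolating and reintegrating a first-degree tendency can be sketched as below. The closed-form straight-line fit is standard linear least squares; the variable names and the sample values are placeholders, not the coefficients of Table 5.

```python
def fit_line(x, z):
    """Least-squares slope and intercept of a first-degree polynomial."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return slope, mean_z - slope * mean_x


def isolate_tendency(x, z):
    """Return (series without tendency, tendency evaluated at x)."""
    slope, intercept = fit_line(x, z)
    tendency = [slope * xi + intercept for xi in x]
    return [zi - ti for zi, ti in zip(z, tendency)], tendency


def reintegrate(y_est, tendency):
    """Add the tendency back to an estimate of the detrended series."""
    return [yi + ti for yi, ti in zip(y_est, tendency)]


# Placeholder series: heights in metres and field values in V/m.
heights = [3.0, 6.0, 9.0, 12.0, 15.0]
z = [2.1, 2.9, 4.2, 4.8, 6.1]
y, tendency = isolate_tendency(heights, z)
z_back = reintegrate(y, tendency)
```

The detrended series has zero mean by construction, and adding the tendency back to it returns the original values, which is the reintegration step applied after the ARIMA estimates.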
The main result at this stage is that the ACF plot moves to zero rapidly. This means that further transformation of the series is not necessary, and the order of the ARIMA model can be defined. The ACF also behaves like a damped sinusoidal function, as does the PACF graph. The ACF behaviour indicates that there is an autoregressive (AR) component in the model, and as it is infinite (i.e., exists at all lags), this is a sign that no moving average (MA) term is present. As the PACF moves near zero after lag 1, we have a first-order AR model. It can thus be concluded that the best linear adjustment possible is an ARIMA (1,0,0) model. Its ACF and PACF are shown in Figure 12.
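The identification rule used here (an ACF that decays over all lags together with a PACF that cuts off after lag 1 points to a first-order AR model) can be checked numerically. The sketch below computes the sample ACF and derives the PACF from it with the Durbin-Levinson recursion; it is a generic illustration, not the MATLAB routine used in the study.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 (Durbin-Levinson)."""
    pacf = []
    phi_prev = []
    for k in range(1, len(acf)):
        if k == 1:
            phi_kk = acf[1]
            phi = [phi_kk]
        else:
            num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical ACF of an AR(1) process: rho_k = phi ** k.
phi = 0.8587
theoretical_acf = [phi ** k for k in range(6)]
theoretical_pacf = pacf_from_acf(theoretical_acf)
```

For the theoretical AR(1) autocorrelations, the PACF equals the AR coefficient at lag 1 and is zero at every later lag, which is the cut-off pattern described above.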
The adjusted/estimated ARIMA model is represented by equation (14), with $\phi = 0.8587$ and $c = -0.0011$. The graph with the best adjustment for the "data3" dataset is shown in Figure 13.
This is the graph that originated from the estimation of the model in equation (14) when applied to its own adjustment dataset, i.e., $Z_3$, with the tendency reintegrated: $\hat{L}_3 = \hat{Y}_3 + T_3$. By analogy with $\hat{L}_3$, we can write $\hat{L}_2$ and $\hat{L}_1$. Figure 13 also shows the graphs of the estimates of the ARIMA model for the "data2" ($\hat{L}_2$) and "data1" ($\hat{L}_1$) datasets, respectively, that is, the comparative subscenarios. All these graphs also show the estimates of ITU-R P.1546-5 and LS for each subscenario. Tables 6-8 show the relative and RMS errors of the ARIMA, LS, and ITU estimates for every subscenario.
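A one-step-ahead estimate with the fitted coefficients can be sketched as follows. The sketch assumes the standard autoregressive form, in which the estimate for one floor is c plus the weighted values of the floors below; the detrended values in the example are placeholders, while φ and c are the fitted values quoted above.

```python
def ar_one_step(y, phis, c):
    """One-step-ahead AR(p) estimates for indices p..len(y)-1.

    Each estimate uses the p preceding (lower-floor) values of y.
    """
    p = len(phis)
    return [c + sum(phis[j] * y[h - 1 - j] for j in range(p))
            for h in range(p, len(y))]


phi, c = 0.8587, -0.0011               # AR(1) coefficients from the text
y_detrended = [1.0, 0.5, -0.2, 0.1]    # placeholder series without tendency
y_hat = ar_one_step(y_detrended, [phi], c)
```

The first estimate is 0.8587 × 1.0 − 0.0011 = 0.8576; the tendency would then be added back to obtain the estimate of the electric field itself.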

ARIMA Fitting: Results of Interpolated Samples.
We have the variables $\tilde{Z}_j$ and $\tilde{Y}_j$, $j = 1, 2, 3$, representing the interpolated measured series with and without their tendencies, respectively. The ARIMA adjustment process for the interpolated series gave, as its best result ($\tilde{Z}_3$ is the training series), an ARIMA (4,0,0) model with $\phi_1 = 3.1394$, $\phi_2 = -3.7423$, $\phi_3 = 2.0210$, $\phi_4 = -0.4191$, and $c = 8.7149 \times 10^{-6}$. The graphs of the best ARIMA adjustment to $\tilde{Z}_3$ and the comparisons with $\tilde{Z}_2$ and $\tilde{Z}_1$ are shown in Figure 14. The relative and RMS errors for the interpolated datasets are displayed in Tables 9-11.

Neural Network Fitting and Final Model Estimates: Original Samples.
Figure 15 shows the results for the adjusted final model versus the adjustment dataset ("data3") and its estimates for the "data2" and "data1" datasets, respectively. These figures also show the ITU-R P.1546-5 and LS estimates for all the subscenarios. Tables 12-14 show the relative and RMS error values of the combined model, the LS, and the ITU estimates for all the datasets.

Figure 10: Graph of (a) "data3" without tendency ($Y_3$), (b) "data2" without tendency ($Y_2$), and (c) "data1" without tendency ($Y_1$).

Figure 13: ARIMA estimation for (a) "data3" with tendency reintegrated ($\hat{L}_3$), (b) "data2" with tendency reintegrated ($\hat{L}_2$), and (c) "data1" with tendency reintegrated ($\hat{L}_1$).

Figure 15: Comparison between measured data, ITU estimation, LS estimation, and proposed modelling curves for (a) "data3" subscenario, (b) "data2" subscenario, and (c) "data1" subscenario.

Figure 16 shows the results for the fitting and estimates in the interpolated datasets, and their respective relative and RMS errors are shown in Tables 15-17.

General Public Exposure to NIR.
With regard to the general public exposure to NIR, the values found for the electric field do not exceed the limits currently in force in Brazil. On the contrary, they were significantly below the limit for the frequency that was analysed (33.6805 V/m for the electric field at 600 MHz). This means that the ER ratio value is less than 1, which can be regarded as a safe value by Brazilian standards. The greatest value of the ER ratio in all the scenarios (taking account of both the original and interpolated sets) was 0.3174 and was calculated through the ratio between the greatest value measured for the "data3" dataset (10.6899 V/m) and the limit of 33.6805 V/m. By using the proposed modelling, any similar requirement that might arise in the future does not necessarily involve the need for measurements, since the model represents, to a satisfactory extent, the electric field propagation in the scenario and at the frequency in question. However, in a situation where the distance between the estimates of the model and the exposure limits is equal to or smaller than the error of the model (whether RMS or relative) with regard to the training set, there is a need for measurements and a cautious approach to the situation. Without measurements being required in every situation that raises doubts among the public, the process of evaluating the exposure to NIR would be less expensive, faster, and more efficient. This helps in providing a precise and rapid response and in improving the service rendered by the regulatory agency.

Figure 16: Comparison between measured data, ITU estimation, LS estimation, and proposed modelling curves for (a) "data3" subscenario, (b) "data2" subscenario, and (c) "data1" subscenario.

Original Datasets.
On the basis of the results of the 25-sample datasets (not interpolated), we observed that our modelling system shows a significantly better result than the commonly used ITU-R estimates. In the case of the "data1" group, where the estimates of the model were worst, it was still approximately 65% better than its ITU estimate (taking into account the relative errors). It can, thus, be concluded that the hybrid modelling achieves its goals with a high degree of accuracy. However, when compared with the "single" ARIMA estimates, the combined ARIMA and neural network model achieves slightly worse results. This suggests that the additional feature (combining the ARIMA model with a neural network) may not be necessary to represent this particular problem precisely.
Although there is this slight difference between the "single" and "combined" ARIMA models (as confirmed by the relative/RMS error values), we think that the combined model should be used in all cases, in view of its applicability to other scenarios. In the case of this problem, for example, in our view, both approaches ("single" and "combined" ARIMA) are equivalent, as the difference in the errors is slight. The largest RMSE difference in all the tested scenarios is approximately 0.1, for the "data3" dataset without interpolation.
Since this work comprises a case study, we think it should be stressed that the "single" ARIMA model gave the best results in this particular scenario. We expect, however, that in other scenarios, the combined model may become necessary, as it can represent nonlinear features, whereas the "single" ARIMA modelling cannot. When compared with the LS estimates (except for the "data1" dataset), the combined modelling system achieved better results.

Interpolated Datasets.
On the basis of an analysis of the results obtained from 200 samples (interpolated datasets), the same conclusions can be reached as with the results from the 25 samples, but with a caveat: although the LS was able to find an optimal solution to this problem, owing to the number of samples there is a risk that this may not be feasible in other scenarios. There may be too many alternatives for the LS algorithm to search, and its result may "explode." This also suggests that, in a desired scenario of "nonstop" measurements, when a higher number of samples is naturally acquired, applying the classic LS fitting may not be the best alternative, since there is a risk that an optimal solution will not be found. Nonetheless, the combined ARIMA modelling system (or even the "single" one) does not have this limitation, and the neural network is able to model nonlinear features that the "single" ARIMA model cannot.

Future Improvements on the Model
As future improvements to this study, we recommend the following:
(1) Including other scenarios and frequencies through measurement campaigns: this will enable the proposed modelling to tackle a wider range of problems, i.e., help to generalize the model
(2) Adding datasets from near-field measurement campaigns: as it is, the proposed model cannot predict near-field propagation situations, since the training data consists of far-field measurement information
(3) Implementing this modelling system as an application on a mobile device, such as a smartphone: we believe this application could be used by the general public to estimate electric field intensity in scenarios similar to the one(s) studied here, when doubts about the exposure to NIR arise

Data Availability
The .xlsx ("data1," "data2," and "data3" datasets) and .txt (instructions on how to use the .xlsx files) data used to support the findings of this study are included within the supplementary information file.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments
This work was supported by the Federal University of Pará, by means of its infrastructure, and the Brazilian National Agency of Telecommunications, by means of measurement equipment and personnel. The authors are grateful to the Brazilian Coordination of Superior Level Staff Improvement (CAPES) for its financial support. The authors would also like to thank the Federal University of Pará (UFPA) for the technical assistance given to the research undertaken in this paper.

Supplementary Materials
The datasets, including the measurement data and the instructions on how to use this data in order to obtain the results in this work, are provided. The datasets are .xlsx files. The instructions on how to use them are described in a .txt file. (Supplementary Materials)