Non-Gaussian Hybrid Transfer Functions: Memorizing Mine Survivability Calculations

1 Institute of Systems Engineering, Faculty of Science, Jiangsu University, 301 Xuefu, Zhenjiang 212013, China
2 Department of Computer Science, Faculty of Applied Science, Kumasi Polytechnic, P.O. Box 854, Kumasi, Ghana
3 Computer Science and Technology, Suqian College, Jiangsu University, 399 South Huanghe, 223800, China
4 Department of Mathematics and Statistics, School of Applied Science, Kumasi Polytechnic, P.O. Box 854, Kumasi, Ghana
5 College of Finance and Economics, Jiangsu University, 301 Xuefu, Zhenjiang 212013, China


Introduction
Hybrid algorithms are used to optimize real-world implementations; that is, they are applied when the best optimization solution runs into challenges of implementation cost, time, and so forth that call for another technique. Hybrid algorithms have received significant interest in recent years and are increasingly used to solve real-world problems. These hybrid algorithms or models combine two or more algorithms, involving genetic algorithms (GA) [1], particle swarm optimization (PSO) [2], and other computational techniques such as artificial intelligence or neural networks, including but not limited to multilayer perceptrons (MLP) or sigmoid [3], radial basis functions (RBF) [4,5], fuzzy systems [6], and simulated annealing [7].
Artificial neural networks (ANNs) are artificial intelligence (AI) techniques that have the capability to learn from experience; they are robust [8] and improve performance by adapting to changes in the environment. The underlying advantages of ANNs are the efficient handling of large amounts of data and the ability to generalize the outcome. ANNs are largely used in applications involving classification or function approximation, and it has been proved that several classes of ANN are universal function approximators [9]. These include radial basis function (RBF) and multilayer perceptron (MLP) neural networks. Taking into consideration the great potential of these techniques, this paper aims to establish a hybrid model using a multilayer perceptron (MLP) network, also called a sigmoid basis function (SBF) network, and a radial basis function (RBF) network, both of which are feed-forward networks. The RBF and MLP networks are usually employed in the same kind of applications, for example, nonlinear mapping approximation and pattern recognition [10]; however, their internal calculation structures are different. In multilayer fully connected feed-forward networks, the nodal transfer function activation flows from the input layer through a hidden layer to the output layer [10]. For a typical processing node, this can be expressed as $y_i = f_i\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)$ [11], where $x_j$ is one of the $n$ inputs to processing node $i$, $w_{ij}$ is the connection weight between node $i$ and node $j$, $\theta_i$ is the bias for node $i$, and $y_i$ is the output from node $i$. Each neuron in one layer is connected in the forward direction to every nodal unit in the next layer. One disadvantage of most feed-forward layered neural networks is the high degree of nonlinearity in the parameters. Learning must be based on nonlinear optimization techniques (e.g., back-propagation), and the parameter estimate may become trapped at a local minimum of the selected optimization criterion during the learning procedure.
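The node computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `node_output` and `layer_output`, the choice of `math.tanh` as the default transfer function, and the numeric weights are all assumptions made for the example.

```python
import math

def node_output(inputs, weights, bias, transfer=math.tanh):
    """Output of one processing node: transfer(sum_j w_j * x_j + bias)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(activation)

def layer_output(inputs, weight_rows, biases, transfer=math.tanh):
    """Every node in the layer sees the full input vector (fully connected)."""
    return [node_output(inputs, w, b, transfer)
            for w, b in zip(weight_rows, biases)]

# Hypothetical two-input layer with two nodes.
hidden = layer_output([0.5, -1.0], [[1.0, 0.5], [-0.5, 2.0]], [0.0, 1.0])
print(hidden)
```

Feeding the list returned by one layer into the next call of `layer_output` gives the forward flow from input layer to output layer.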
An alternative to such neural networks is to use the radial basis function (RBF) as a transfer function. There is a strong connection between the RBF and neural networks, and it is reasonable to believe that a radial basis network (RBN) can offer approximation capabilities similar to other feed-forward, layered neural networks [12], provided that the hidden layer of the RBN is fixed appropriately. This belief is strongly supported by theoretical results on the RBF method as a multidimensional interpolation technique [13]. A radial basis function neural network has input, hidden, and output layers. The input layer is composed of an input vector. The hidden layer consists of RBF activation functions as the network's neurons. The net input to the RBF activation function is the vector distance between its weight vector and the input vector, multiplied by the bias. Detailed work has been done on the advantages of both sigmoid and radial basis functions [14]. Radial functions are a special class of functions whose value increases or decreases with the distance from a central point. There are different types of radial basis functions, but the most frequently used is the Gaussian function. It is well known that MLP networks have been applied successfully to several difficult problems. MLP networks work globally, and the network outputs are decided by all the neurons [15]. Radial basis function (RBF) networks act as local approximation networks, and their outputs are determined by specified hidden units in certain local receptive fields. RBF networks are simpler than MLP networks, which have more complex architectures, and in terms of generalization they respond well to patterns that were not used for training [15]. Compared with other neural networks and fuzzy inference systems, the RBF has the advantages of easy design, stable and good generalization ability, good tolerance to input noise, and online learning ability. RBF networks are strongly recommended as an efficient and reliable way of designing dynamic systems [15].
An important issue in RBF neural network applications is network learning, that is, the need to optimize the adjustable parameters (the center vectors, the variances or widths of the basis functions, and the linear output weights connecting the RBF hidden nodes to the output nodes) and to determine the network structure, or the number of RBF nodes [16]. The determination of the network size and the adjustment of the parameters on the continuous parameter space are closely coupled. Evolutionary algorithms have been used to address this problem; nonetheless, they are computationally very expensive to implement [16], which results in slow and premature convergence, and this has attracted attention in the literature. Center location and clustering techniques have been proposed [17]. An identical width can be set for all the basis functions if the input samples are uniformly distributed; otherwise a particular width has to be set for each individual basis function to reflect the input distribution [18]. Once the centers and the widths are determined, the linear output weights can be determined using Cholesky factorization, orthogonal least squares, or singular value decomposition [19]. In contrast to the conventional two-stage learning procedure, supervised learning methods aim to optimize all the network parameters [20]. Various techniques have been introduced to improve network convergence. These include hybrid algorithms that combine a gradient-based search for the nonlinear parameters (the widths and centers) of the RBF nodes with least squares estimation of the linear output weights [18], and methods combining the merits of fuzzy and crisp clustering [21].
Supervised learning is thought to be superior to conventional two-stage approaches, but it can be more demanding computationally. The Akaike information criterion has been used when dealing with different network sizes; however, it is equally computationally demanding [21]. With respect to the determination of the RBF neural network structure, a popular approach is to formulate it as a linear-in-the-parameters problem, where all the training patterns/samples are usually used as the candidate RBF centers. To improve the network generalization, the regularized forward selection algorithm has been proposed [22], which combines subset selection with zero-order regularization. Backward selection methods have also been used in RBF center selection [23]. Forward selection algorithms are thought to be superior to backward methods in terms of computational efficiency; however, these methods have several major disadvantages, such as being computationally too expensive or sometimes impossible to implement. The search for the optimal values of the nonlinear parameters (RBF centers and widths) is a continuous optimization problem. In order to optimize the RBF center and width parameters along with the network structure determination process, a sparse incremental regression (SIR) modeling method was proposed very recently to determine the network structure and the associated nonlinear parameters simultaneously [24]. This can deal with large datasets and improve the network significantly. Other approaches include moving k-means clustering to position the RBF centres with Givens least squares to estimate the weights [25] and a forward algorithm for RBF construction [9], to mention a few.
Different from existing methods in RBF neural network construction and multilayer perceptron, this paper proposes a novel hybrid (HSRF) feed-forward algorithm involving the multilayer perceptron (sigmoid) and non-Gaussian type radial basis transfer functions which is robust and performs parameter optimization within an integrated analytic framework, leading to two main technical advantages.
(1) The network can be significantly improved through the optimization of the nonlinear RBF parameters on the continuous parameter space.
(2) The speed of the multilayer perceptron and the simplicity and accuracy of the RBF are used to produce a fast and accurate model for rescue operations.
In addition, the paper uses a coded genetic algorithm to train the proposed hybrid algorithm. Finally, network outcomes including mean iteration, standard variation, standard deviation, convergence time, and optimized error are evaluated using a 5th-order polynomial.

Problem Statement and Objective.
There are generally heavy casualties, loss of human life, and tremendous loss of property in the event of accidents such as fire, rock fall, flooding, or poisonous gases in the mining sector [26]. This calls for a model that is fast and robust for monitoring survivors and guiding them to safety in times of accident. The justification of this work is that the focus of current research is moving from system analysis of small-world networks to networks of millions of nodes. This will demand large computers to process, and even if those computers are available, it will demand considerable time to run. This implies that there is a need for a fast prediction algorithm using a neural network (NN) to memorize precalculated results in order to deal with a large number of sensors (i.e., as sensors grow rapidly to thousands and millions, battery drain will not permit calculations on the spot of a problem). In addition, the base station can be destroyed in times of accident. A further justification for research like this is that simple imitations of the human brain (called neural network models) demonstrate fast and accurate learning and classification properties in problems that otherwise require human experts. Although such tools obviously cannot replace human experts, they are used as on-the-fly diagnostic tools and as supporting evidence in quick decision making. With these in mind, the main objective of this study is to investigate and improve upon the Gaussian radial basis function and to develop a non-Gaussian hybrid of the MLP (sigmoid or SBF) and the compact radial basis function (CRBF) with enhanced optimization features. From this, an optimized hybrid model trained with a genetic algorithm is assessed that has the highest predicted survival probability for an emergency rescue operation in an underground mine. The two main objectives examined in this paper are as follows.
(i) To investigate the Gaussian radial basis function model and remove its additional computational burden by suppressing (paralyzing) the power operation, yielding a compact (non-Gaussian) radial basis function, a simple but novel modification that reduces computational cost and increases processing efficiency.
The study focuses on the use of the absolute-value operation instead of the square operation. In Figure 1 the green outline (for online viewing) represents the additional requirement of resources in terms of time, cost, and so forth, assuming the resource is proportional to the calculated value, that is, zeros are not stored.
(ii) To develop an optimized hybrid neural network, called the HSRF model, with a nonlinear weight of negative cosine imposed on the new compact radial basis function. This nonlinear weight was introduced to further reduce the magnitude of the RBF element in the model, for accuracy and maintained speed.

Preliminary Consideration
2.1. Deployment of Sensors and Connection. Topological deployment of sensor nodes affects the performance of the routing protocol [27]. The ratio of communication range to sensing range, as well as the distance between sensor nodes, can affect the network topology. To get the right deployment, the following are essential.
(ii) State the relative rock hardness parameters; that is, the hardness or softness of the rock is chosen to be in the range 0.7 to 0.9, and 6 values are chosen to reflect the 6 types of rock obtained from the industry, listed in Table 1.
The sensor sequence location explains the shortest search path criteria of the robot. The robot's job is to make sure every sensor is working. It moves to the first level, searches the row(s) and the respective column(s), and proceeds to the next level to repeat the procedure.
The sensor sequence position is expressed in Matlab code as shown in Algorithm 1.
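Algorithm 1 is not reproduced in this text, so the following Python fragment is only a hedged sketch of the sweep described above (level by level, then rows, then columns). The function name `sensor_sequence` and the 1-based `(level, row, column)` ordering are assumptions for illustration.

```python
def sensor_sequence(levels, rows, cols):
    """Positions (level, row, col), 1-based, in the order visited by a
    level-by-level sweep that scans each row and its columns in turn."""
    return [(lv, r, c)
            for lv in range(1, levels + 1)
            for r in range(1, rows + 1)
            for c in range(1, cols + 1)]

# Hypothetical small field: 2 levels, 2 rows, 3 columns.
path = sensor_sequence(levels=2, rows=2, cols=3)
print(path[:4])
```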

Communication and Transmission Range.
The Through-the-Earth (TTE) communication system transmits voice and data through solid earth, rock, and concrete and is suitable for challenging underground environments such as mines, tunnels, and subways [28,29]. There are stationary sensor nodes as well as mobile sensors (humans and vehicles), distributed uniformly. The survivability of the mobile sensor is the main concern of this paper; we need to predict the survivability based on the rock type and the location that the sensor sees, using an NN that is trained ahead of time. Both stationary and mobile sensor nodes are connected to the access point (AP) and/or access point heads (AP heads) based on transmission range requirements. The AP heads serve as cluster leaders and are located in areas where the rock is relatively soft (a large resting and eating room) or has relatively better signal penetration, to ensure optimum transmission. The TTE is dropped through a hole drilled above the large room or bay, approximately 300 metres beneath the ground depending on the rock type. The depth and rock type determine the number of TTEs needed. Next, the DATA-mule is dispatched to carry items such as food, water, and equipment to the miners underground and to return with underground information for the rescue team. Minimizing the transmission range of wireless sensor networks is vital to efficient routing in the network, because the amount of communication energy that each sensor consumes is highly related to its transmission range or signal reach [30]. The node signal reach is defined as the absolute difference between the minimum and maximum signal reach of the nodes plus the minimum signal reach, taking into consideration the 6 cases of rock structure, whose hardness lies between 0.7 for soft rock and 0.9 for the hardest rock. Routing is also limited by the load of the nodes and the distance each node travels [31]. The minimum and maximum signal reach of the nodes and the node signal reach are calculated in the Matlab code below.
Initialize the connection matrix as a matrix of zeros, Mc = zeros(N, N), and loop over the rock cases and over the node indices. The relationship between rock hardness/softness and the signal reach is a complicated nonlinear function. It is related to the skin depth of the rock, where alternating currents are concentrated in the outer region of a conductor by opposing internal magnetic fields [32], as follows: skin depth $= \sqrt{2/(\sigma \cdot \omega \cdot \mu)}$, where $\sigma$ is the material conductivity, $\omega$ is the frequency, and $\mu$ is the magnetic permeability. The signal (B-field) is attenuated by the cube of the distance, which is expressed as signal reach (distance) = 3 * skin depth [33]. The common rocks identified for this work are presented in Table 1.
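The skin depth relation and the rule "signal reach = 3 * skin depth" can be evaluated directly. The Python sketch below assumes that the frequency in the formula is the angular frequency in rad/s; the conductivity and frequency values are hypothetical and are not taken from Table 1.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, H/m

def skin_depth(conductivity, angular_frequency, permeability=MU_0):
    """Skin depth sqrt(2 / (sigma * omega * mu)), in metres."""
    return math.sqrt(2.0 / (conductivity * angular_frequency * permeability))

def signal_reach(conductivity, angular_frequency, permeability=MU_0):
    """Signal reach taken as three skin depths, as stated in the text."""
    return 3.0 * skin_depth(conductivity, angular_frequency, permeability)

# Hypothetical rock: conductivity 0.01 S/m, signal at 1 kHz.
omega = 2.0 * math.pi * 1000.0
depth_m = skin_depth(0.01, omega)
reach_m = signal_reach(0.01, omega)
print(round(depth_m, 2), round(reach_m, 2))
```

Higher conductivity or higher frequency shortens the skin depth, which is one reason low frequencies are attractive for through-the-earth links.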
To depict how the deployed sensor nodes are connected, the connection matrix is introduced before data can be transmitted. Similarly, to depict the path along which data is transferred from source to destination and vice versa, the routing matrix is introduced. Let $C$ denote the connection matrix. A sensor node is named by its 3D integer coordinates $(x, y, z)$, where $1 \le x \le$ Length, $1 \le y \le$ Width, and $1 \le z \le$ Depth, and $N$ = Length ⋅ Width ⋅ Depth is the total number of nodes. Node $(x_1, y_1, z_1)$ is connected with node $(x_2, y_2, z_2)$ if the element on the $((x_1 - 1) \cdot W \cdot D + (y_1 - 1) \cdot D + z_1)$th row and the $((x_2 - 1) \cdot W \cdot D + (y_2 - 1) \cdot D + z_2)$th column is 1, and is not connected if it is 0, with $W$ the width and $D$ the depth; whether two nodes are connected depends on the node signal reach. The routing matrix R is limited by the total number of multipoint connections to be made. In arriving at the final optimized vector for transmission, each matrix was generated repeatedly, once for each of the cases iterated before producing the final survival rate vector. The number of point-to-multipoint connections (hub or switch) is less than or equal to 4 and is an even number, allowing bidirectional communication between the source and destination nodes.
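A connection matrix of this kind can be sketched as follows. The source equation for the connection rule is not available here, so this Python example assumes that two distinct nodes are connected when their Euclidean distance does not exceed the signal reach, and it assumes the row-major index order `(x - 1) * W * D + (y - 1) * D + z`. The names `node_index` and `connection_matrix` are illustrative.

```python
import math

def node_index(x, y, z, width, depth):
    """1-based row/column index of node (x, y, z) in the connection matrix."""
    return (x - 1) * width * depth + (y - 1) * depth + z

def connection_matrix(length, width, depth, reach):
    """N x N matrix of 0/1 entries; 1 marks a pair of distinct nodes whose
    Euclidean distance is within the signal reach (assumed rule)."""
    nodes = [(x, y, z)
             for x in range(1, length + 1)
             for y in range(1, width + 1)
             for z in range(1, depth + 1)]
    n = len(nodes)
    mc = [[0] * n for _ in range(n)]
    for a in nodes:
        for b in nodes:
            if a != b and math.dist(a, b) <= reach:
                row = node_index(*a, width, depth) - 1
                col = node_index(*b, width, depth) - 1
                mc[row][col] = 1
    return mc

# Hypothetical 2 x 2 x 2 field with unit spacing and unit reach.
mc = connection_matrix(2, 2, 2, reach=1.0)
print(sum(map(sum, mc)))
```

Because distance is symmetric, the resulting matrix is symmetric, which matches the bidirectional links described in the text.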

Fault Tolerant Considerations.
Fault tolerant considerations for hardware, software, and network security are critical in emergency situations, as they significantly affect the efficiency of communication, and many key management schemes have been proposed to mitigate the constraints [34,35]. In the event of an accident, let $p$ be the probability of the accident occurring; the routing path or matrix R would then be affected by $(1 - p)$, where $p$ is any random value within the range of relative rock hardness or softness that would cause an explosion. The explosion matrix is introduced, whose elements depict the level of damage to the sensor nodes or routing path caused by the explosion. The damage from the explosion will consequently result in failure of the routing path, and a further matrix is introduced to depict the signal status of the sensor nodes, which is used as the newly generated connection matrix. This signal status matrix is defined in terms of a lower and a higher accident impact threshold. For example, an element at or below the lower threshold implies that the number of nodes will be 0, 1, 2, or 3; an element at or above the higher threshold implies that the number of nodes is expected to be 7, 8, and above; and an element strictly between the two thresholds is expected to be 4, 5, or 6. Figure 2 explains the effect of the explosion on the transmission link in three regions. Region 1 of the figure implies that the links were not affected. Nodes with numbers 4, 5, and 6 in region 2 represent a probability that the links will be able to transmit/receive data, while nodes numbered 7 and above in region 3 signal dead links (Figure 2).
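The three regions can be illustrated with a small classifier. In this Python sketch the threshold values 3 and 7 are inferred from the node numbers quoted above and are assumptions, as is the function name `link_region`; the source defines the thresholds symbolically.

```python
def link_region(damage_level, lower=3, upper=7):
    """Return 1 (link unaffected), 2 (link may still work) or 3 (dead link)."""
    if damage_level <= lower:
        return 1
    if damage_level >= upper:
        return 3
    return 2

# Classify hypothetical damage levels 0..9.
regions = [link_region(d) for d in range(10)]
print(regions)
```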
To find the matrix with the maximum survival probability, a new set of routing paths is defined after the damage. To optimize transmission after the damage caused by an explosion, the hope matrix is introduced, whose elements indicate the status of hope of signals for reorganizing the routing path. To describe the success rate of the signal from each node to the sink(s), the exit matrix is introduced. In most practical applications more than one sink is used, and the sink node is reached either through the fiber or through the earth (TTE).
The hope matrix and the exit matrix are defined in terms of the routing matrix R and the signal status matrix, given in (4) and (6), respectively, the total number of safe exits available for use, and the probability of additional error that miners may commit in trying to escape danger.

Hardware and Software Considerations.
In real rescue situations, software and hardware, including radio frequency identification (RFID) [36], can fail as a result of accidents or explosions, which can significantly affect the routing path and thwart the efforts of the rescue team. Equally, miners and other personnel can make mistakes in the face of an accident that compound the existing problems, especially where miners find themselves more than 4,000 feet underground, as is the case in one of the minefields used for this study. In addition, personnel may fail to cope with the stress that comes with an accident. It is therefore imperative to consider such failures in developing rescue models. A software survival rate matrix is used to describe the survival rate of the software or relational database management system (RDBMS), including effects from bugs or attacks. Its element in row $i$ and column $j$ is defined in terms of the total number of nodes deployed and the corresponding element of a matrix of random rock hardness generated according to the geometric distribution. The maximum probability of survivability for rescuing miners is defined by a vector; to obtain it, it is assumed that each miner will have an RFID. To describe the probability of the survival rate, including the risk of running out of battery, a row vector is introduced whose $j$th element (i.e., the survival rate of each miner to be rescued) is defined from the $j$th element of a vector of random rock hardness generated according to the geometric distribution. To measure the hardware survival rate of miners to be rescued, a further vector is introduced; it is closely related to the exit matrix given by expression (7) and also depends on the total number of hubs used. To ensure that a reliable system is in place for emergencies, the final survival rate vector is introduced using the mean, and it represents the highest survival probability to ensure the miners' safety.

MLP (Sigmoid Basis Function).
A sigmoid basis function (SBF) is a mathematical function having an "S" shape (sigmoid curve); it is related to brain reasoning, and its structure is favoured in computational modeling. Often, sigmoid function refers to the special case of the logistic function. It is used in modeling systems that saturate at large input values, for example, the ogee curve used in the spillway of some dams [37]. A wide variety of sigmoid functions have been used as activation functions of neurons, including the logistic and hyperbolic tangent functions. Sigmoid curves are also common in statistics, for example, as integrals of the logistic distribution, normal distribution, and Student's probability density functions [37]. The generalized function and the sigmoid output function are given in [10]. In that formulation, the initial inputs $x_1, x_2, \ldots, x_n$ are weighted by $w_1, w_2, \ldots, w_n$, the weights of the neuron for the respective inputs, and summed together; $a$ is the activation level, which is scaled according to the output function $f(a)$ to give the actual output $y$ of the neuron. A positive constant $k$ controls the slope or steepness of the sigmoid; the sigmoid function amplifies small activation levels and limits high activation levels. In practice, the output $y$ lies between 0 and 1 [38]; however, outputs requiring negative values use the hyperbolic tangent. The initial stage of the sigmoid basis function (SBF) grows approximately exponentially while the output is near the zero line; the growth then slows towards saturation and stops at maturity as the output goes to 1.
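The effect of the slope constant and the difference between the two output ranges can be seen numerically. This Python sketch uses the standard logistic form `1 / (1 + exp(-k * a))` and `tanh(k * a)`; these are common textbook forms assumed here, and the exact expressions in [10] may be written differently.

```python
import math

def logistic_sigmoid(a, k=1.0):
    """Logistic output in (0, 1); k > 0 controls the steepness."""
    return 1.0 / (1.0 + math.exp(-k * a))

def bipolar_sigmoid(a, k=1.0):
    """Hyperbolic tangent output in (-1, 1), for outputs needing negative values."""
    return math.tanh(k * a)

# Compare a gentle slope (k=1) and a steep slope (k=3) at a few activations.
for a in (-4.0, 0.0, 4.0):
    print(a,
          round(logistic_sigmoid(a), 4),
          round(logistic_sigmoid(a, k=3.0), 4),
          round(bipolar_sigmoid(a), 4))
```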

Gaussian RBF.
The use of RBF kernels, mainly the Gaussian, and their global acceptance in various applications cannot be overemphasized. However, the model carries an additional computational burden from the power operation, which translates into cost. The objective is to explore and remove this power computation burden and apply the result in an emergency rescue system. This section discusses work related to the proposed approach, including the Gaussian radial basis function (GRBF) neural network. A radial basis function neural network consists of the input layer, the hidden layer, and the output layer. The inputs of the hidden layer are linear combinations of scalar weights and the input vector, where the scalar weights are usually assigned unit values; that is, the whole input vector is presented to each neuron in the hidden layer. The incoming vectors are mapped by the radial basis functions in each hidden node. The output layer yields a vector by linear combination of the outputs of the hidden nodes to produce the final output [39]. The structure of an RBFNN with $n$ inputs and $m$ outputs is $y_j(x) = \sum_{i=1}^{k} w_{ij}\,\phi_i(x)$, for $j = 1, \ldots, m$ (14), where $x = \{x_1, x_2, \ldots, x_n\}$ denotes the input vector for $n$ inputs and $y = \{y_1, y_2, \ldots, y_m\}$ denotes the output vector for $m$ outputs, $w_{ij}$ denotes the weight between the $i$th hidden node and the $j$th output node, $k$ is the total number of hidden nodes, and $\phi_i(\cdot)$ denotes the radial basis function of the $i$th hidden node. The final output of the $j$th output node, $y_j(x)$, is the linear combination of all hidden nodes. Using the summation as the denominator, expression (14) can be normalized as $y_j(x) = \sum_{i=1}^{k} w_{ij}\,\phi_i(x) \big/ \sum_{i=1}^{k} \phi_i(x)$, for $j = 1, \ldots, m$.
A multidimensional function RBF describing the distance between a given input vector and a predefined center vector is given as
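As a hedged illustration of a Gaussian RBF network output and its normalized variant, the Python sketch below uses one common Gaussian form, `exp(-||x - c||^2 / (2 * width^2))`. That form, the function names, and the numeric centers, widths, and weights are assumptions for the example and may differ from the expression used in the paper.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian response to the squared distance between x and the center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_network(x, centers, widths, weights, normalize=False):
    """weights[i][j] links hidden node i to output node j."""
    phi = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    n_outputs = len(weights[0])
    outputs = [sum(w_row[j] * p for w_row, p in zip(weights, phi))
               for j in range(n_outputs)]
    if normalize:
        total = sum(phi)  # summation of hidden responses as denominator
        outputs = [o / total for o in outputs]
    return outputs

centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [[1.0], [2.0]]
plain = rbf_network([0.0, 0.0], centers, widths, weights)
normalized = rbf_network([0.0, 0.0], centers, widths, weights, normalize=True)
print(plain, normalized)
```

With normalization the output becomes a weighted average of the output weights, so it always lies between the smallest and largest weight.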

Training Methods
In a genetic algorithm (GA), a population of candidate solutions (individuals or creatures) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its chromosomes or genotype) which can be mutated and transformed. Traditionally, solutions are represented in binary as strings of "0s" and "1s" [40]. The evolution usually starts from a population of randomly generated individuals and is an iterative process, with the population in each iteration called a generation [41]. In each generation, the fitness of every individual in the population is evaluated; the fitness is usually the value of the objective function in the optimization problem being solved. The fitter individuals are stochastically selected from the current population, and each individual's genome is modified (recombined and possibly randomly mutated) to form a new generation [42,43]. The new generation of candidate solutions is then used in the next iteration of the algorithm. Generally, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population.
A typical genetic algorithm requires a genetic representation of the solution domain and a fitness function to evaluate the solution domain. A standard representation of each candidate solution is an array of bits. Arrays of other types and structures can be used in essentially the same way, as in Hasan [44]. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations. Variable length representations may also be used, but crossover implementation is more complex in this case. Treelike representations are explored in genetic programming, graph-form representations are explored in evolutionary programming, and a mix of both linear chromosomes and trees is explored in gene expression programming [45]. Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion, and selection operators. Parent chromosomes are selected with a probability related to their fitness. Chromosomes with high fitness have a higher probability of being selected for mating than chromosomes with lower fitness. The mutation, crossover, and reproduction chart is displayed by Wright [46] in Figure 3. Here, genetic manipulation processes consisting of two steps are carried out. These include the crossover operation, which combines the bits (genes) of each two selected strings (chromosomes).
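The generational loop described above can be sketched with a bit-string GA. This Python example is a generic illustration, not the coded GA used in the paper: fitness-proportional selection, single-point crossover, bit-flip mutation, and keeping the best individual unchanged (elitism) are assumptions, as are the parameter values and the toy fitness function (the number of ones in the string).

```python
import random

def evolve(fitness, n_bits=12, pop_size=20, generations=40,
           mutation_rate=0.1, seed=0):
    """Evolve bit strings toward higher fitness and return the best one."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Small offset keeps every selection weight strictly positive.
        scores = [fitness(ind) + 1e-9 for ind in pop]
        children = [max(pop, key=fitness)[:]]  # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=scores, k=2)
            cut = rng.randint(1, n_bits - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve(fitness=sum)
print(best, sum(best))
```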
However, there are several limitations to the use of a genetic algorithm compared to alternative optimization algorithms. Repeated fitness function evaluation for complex problems is often the most prohibitive and limiting segment of artificial evolutionary algorithms. Finding the optimal solution to complex, high dimensional, multimodal problems often requires very expensive fitness function evaluations. In real-world problems such as structural optimization, a single function evaluation may require several hours to several days of complete simulation. Typical optimization methods cannot deal with such types of problem. In this case, it may be necessary to forgo an exact evaluation and use an approximated fitness that is computationally efficient. Combining approximate models may be one of the most promising approaches to realistically using GAs to solve complex real-life problems. Genetic algorithms do not scale well with complexity. That is, when the number of elements exposed to mutation is large, there is often an exponential increase in the size of the search space. This makes it extremely difficult to use the technique on problems such as designing an engine, a house, or a plane. In order to make such problems tractable to evolutionary search, they must be broken down into the simplest representation possible. The second problem of complexity is the issue of how to protect parts that have evolved to represent good solutions from further destructive mutation, particularly when their fitness assessment requires them to combine well with other parts. The best solution is only the best in comparison to other solutions.

Mathematical Problems in Engineering
As a result, the stop criterion is not clear in every problem. In many problems, GAs may have a tendency to converge towards local optima or even arbitrary points rather than the global optimum of the problem [42]. Various types of crossover operators are used, such as the single-point and two-point crossover operators, among others [47].
In this paper the coded genetic algorithm was used to further optimize the parameters.
Research has shown that a small population size with a relatively large mutation rate is far superior to large population sizes with low mutation rates [48]. That work argued that the best mutation rate for GAs falls between 5% and 20%, while the population size should be less than 16. However, in choosing the population size that optimizes the GA, a number of options were considered. The total number of input elements chosen is 300, which corresponds to the matrix size or the dimensions of the field space given by depth, length, and width, that is, 10, 6, and 5, respectively. To balance speed and accuracy, we took the square root of the total number of elements of the matrix to obtain an optimum size of 17.32. Guided by this figure, a population size of 20 is used. In addition, 10% of the matrix size (i.e., 30) is also used as a basis for comparison.
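The arithmetic behind these population sizes is short enough to check directly; the variable names in this Python fragment are illustrative.

```python
import math

depth, length, width = 10, 6, 5
total_elements = depth * length * width      # size of the field space
sqrt_guideline = math.sqrt(total_elements)   # square-root guideline
comparison_size = total_elements // 10       # 10% of the matrix size

print(total_elements, round(sqrt_guideline, 2), comparison_size)
```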

Proposed Hybrid Non-Gaussian Model Based on Compact Radial Basis Function (CRBF)
As already stated, this proposed novel hybrid involving a special mix of the multilayer perceptron and radial basis transfer functions is an empirical study to initiate and determine such guidelines (e.g., how much of the MLP or RBF should be included and vice versa) of mixing two major transfer functions.It will also examine several parameter combinations to find the one that will give the right mix to keep the error at minimum, maintain speed and accuracy, and optimize various parameters well, in one engineering design problem.The Gaussian-like kernel was adapted because it has gained global acceptance.The paper paralyzed the power parameter arbitrary but novelly as expressed below: which is in the form   (,   ) = exp(−abs(−  )), ,   ∈ R  and   (,   ) = log sig(,   ), ,   ∈ R  , respectively.The function MLP and RBF are given as where log sig() is the function for multilayer perceptron or sigmoid (SBF) and exp(−abs()) is the function for RBF, respectively (in Matlab code).The  is the optimum set of the routing table with the maximum survival probability which is used as an input for the neural network;  1 and  2 are the transfer functions for the new sigmoid and compact radial basis functions, respectively.The hybrids of the transfer function are summarized; with a weight of −cos(R) on CRBF, the HSRF − cos is given as The study also investigated a parameter between the Gaussian models (GRBF) and proposed compact radial function (CRBF) model (with a power of 2 and 1 resp.), for simplicity we referred to it as a reduced parameter (ZRBF) Finally a hybrid of Gaussian RBF using our model is also investigated as follows: The GA-neural network (GA-NN) is used to train the neurons such that the initial error will be minimized and make the model more reliable [49].Now it must be noted that the MLP is also referred to as SBF is a complete function and so is the RBF, and therefore adding these will be an abuse of usage of the transfer functions.Other options include 
considering 50% of each transfer function, or some other percentage of each. The conceptual view of the model can be seen at the top of the semantic structure in Figure 4. To examine the position of each neuron, let PN denote the position of the $i$th genome; PN is then expressed in terms of the numbers of neurons at the input, hidden, and output layers. The initial population is made up of randomly determined parameters within specified boundaries. These parameters are called the genes of the chromosome or the genes of an individual [50,51]. Assuming a population of three random individuals with genes that represent the values of three design variables, the selection of an individual is done by evaluating the performance of each one and then ranking them from best to worst. Finally, an individual is chosen for reproduction [52]. The performance of an individual, in this case, is the difference between the measured (or desired) value of the spectrum and the calculated value, measured using the mean squared error. Generally, there are three measures for diagnosing fitness: the mean cubic error (MCE), the mean absolute error (MAE), and the mean squared error (MSE). The MCE allows fast convergence at the expense of accuracy, making the process unstable, while the MAE is stable but converges slowly [53]. The MSE lies midway between the two. The goodness of fit in this study is therefore diagnosed using the MSE rather than the other two. In addition, other parameters, including the mean iteration, standard variance, standard deviation, and convergence time, are also evaluated. The MSE is computed from $t_{i,j}$, the ideal value of the $i$th sample at the $j$th output, and $y_{i,j}$, the actual value of the $i$th sample at the $j$th output, where $N$ is the number of samples and $m$ is the number of neurons at the output layer.
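The transfer functions and the fitness measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the HSRF-cos hybrid is assumed to be the sigmoid plus the CRBF weighted by −cos(R), and the MSE is assumed to be normalized by both the number of samples and the number of outputs.

```python
import math

def sbf(r):
    """Log-sigmoid transfer function (Matlab's logsig): 1 / (1 + exp(-r))."""
    return 1.0 / (1.0 + math.exp(-r))

def crbf(r):
    """Compact radial basis function with power 1: exp(-|r|)."""
    return math.exp(-abs(r))

def grbf(r):
    """Gaussian radial basis function with power 2: exp(-r**2)."""
    return math.exp(-r ** 2)

def mix(r, share_sbf=0.5):
    """Percentage mix of SBF and CRBF; share_sbf=0.5 gives 50% of each."""
    return share_sbf * sbf(r) + (1.0 - share_sbf) * crbf(r)

def hsrf_cos(r):
    """ASSUMED hybrid form: SBF plus CRBF weighted by -cos(r)."""
    return sbf(r) - math.cos(r) * crbf(r)

def mse(targets, outputs):
    """Mean squared error over N samples and m outputs (ASSUMED 1/(N*m) scaling)."""
    n_samples = len(targets)
    n_outputs = len(targets[0])
    total = sum(
        (t - y) ** 2
        for t_row, y_row in zip(targets, outputs)
        for t, y in zip(t_row, y_row)
    )
    return total / (n_samples * n_outputs)
```

Calling `mix(r)` with the default share corresponds to the 50% option mentioned in the text, while other values of `share_sbf` cover the "some percentage of each" case.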
The individual with the best performance receives a relative weight (RW) of $n^{\alpha}$, where $n$ is the number of individuals in the population and $\alpha$ is the weight, which is usually between 1.0 and 1.5 [51]. The next best individual receives an RW of $(n-1)^{\alpha}$, and so forth, until the worst individual receives an RW of 1. The probability of reproduction of each individual is given as $\mathrm{PR}_i = \mathrm{RW}_i/(1/n)\sum \mathrm{RW}$, where $\mathrm{PR}_i$ is the probability of reproduction for individual $i$, and $i = 0, 1, \ldots, n$. The selected individuals are made to reproduce. Reproduction involves randomly selecting two parents to form the reproduction pool, which are crossed to create offspring. The performance of each offspring is evaluated and the best offspring is selected. Mutation permits the introduction of extra variability into the population. Here the study permits 20% of those with worse performance to mutate for faster convergence; a greater percentage would, in turn, slow convergence. The semantic structure of the proposed hybrid neural network with GA is presented in Figure 4.
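The rank-based weighting described above can be illustrated with a short Python sketch. The function names are hypothetical, the reproduction expression is read here as the ratio of each weight to the mean weight, and drawing two distinct parents is an assumption made for the illustration rather than something the text specifies.

```python
import random

def relative_weights(n, alpha=1.2):
    """RW for individuals ranked best to worst: n**alpha, (n-1)**alpha, ..., 1."""
    return [float(n - rank) ** alpha for rank in range(n)]

def reproduction_rates(weights):
    """RW_i divided by the mean weight (one reading of the expression in the text)."""
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]

def select_parents(population, weights, rng=random):
    """Draw two distinct parents with probability proportional to RW (ASSUMED scheme).

    `population` must already be sorted from best to worst performance.
    """
    indices = list(range(len(population)))
    first = rng.choices(indices, weights=weights, k=1)[0]
    rest = [i for i in indices if i != first]
    second = rng.choices(rest, weights=[weights[i] for i in rest], k=1)[0]
    return population[first], population[second]
```

With three individuals and a weight of 1.0, the best individual is three times as likely to be drawn first as the worst one, which matches the ranking logic in the text.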
The performance of the model is measured using a polynomial or curvilinear trendline to examine the contributions of the models as a whole as well as of the model parameters. A trendline, also called regression analysis, estimates the relationship between variables so that a given variable can be predicted from one or more other variables. By using regression analysis, a trendline in a chart can be extended beyond the actual data to predict future values. A polynomial trendline is used for this work: the closer the $R^2$ value is to 1, the better the trend; for example, an $R^2$ value of 0.979 indicates that the trendline is a good fit to the data. A polynomial or curvilinear trendline is given by $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_6 x^6$, which calculates the least-squares fit through the points, where $a_0, a_1, a_2, \ldots, a_6$ are constants. The 5th-order polynomial is used to examine the performance among the individual parameters in each model because it has a higher predictive power than the 3rd and 2nd orders and is the best fit for the proposed model, especially where the environments are highly controlled and observations are made to a specified level of tolerance. In addition, the 5th order gives a more reliable result than the 2nd and 3rd degrees. The 5th-order expression is $f(x_1) = a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_5 x_1^5$, where $x_1$ is time (seconds) and $a_k$ is the coefficient of the polynomial, $k = 0, 1, \ldots, 5$.
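A least-squares polynomial trendline and its $R^2$ value can be computed as in the following sketch. It is written in plain Python so that it is self-contained; the function names are hypothetical, and solving the normal equations directly is a simplification chosen for brevity (library routines are usually more numerically stable for high orders).

```python
def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_d*x**d using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def poly_fit(xs, ys, degree):
    """Least-squares coefficients [a_0, ..., a_degree] from the normal equations."""
    size = degree + 1
    # Build (V^T V) a = V^T y, where V[i][k] = xs[i] ** k.
    lhs = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    rhs = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(lhs[r][col]))
        lhs[col], lhs[pivot] = lhs[pivot], lhs[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, size):
            factor = lhs[r][col] / lhs[col][col]
            for c in range(col, size):
                lhs[r][c] -= factor * lhs[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * size
    for r in range(size - 1, -1, -1):
        known = sum(lhs[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (rhs[r] - known) / lhs[r][r]
    return coeffs

def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - poly_eval(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

Calling `poly_fit(times, values, 5)` followed by `r_squared(times, values, coeffs)` gives the 5th-order trendline and its goodness of fit for one model parameter, where `times` and `values` stand for whatever series is being examined.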

Simulation Results and Discussion
The simulation results are based on the following settings. The total number of nodes used for the simulation is 300, with underground mine dimensions of 10, 6, and 5 for depth (level), length (row), and width (column), respectively. The intervals between sensors in all directions are 100 meters. The number of multipoint connections (hubs/switches) used is 4. It is assumed that 2 exits are safe for miners to escape after an accident. The GA training used a population size of 20. The two thresholds are 3 and 6, respectively. Six cases of routing paths are estimated based on relative rock hardness/softness between 0.7 and 0.9. The number of neurons is 6.
Note that, in presenting the matrices, only 6 nodes are used, with dimensions of 3, 2, and 1.
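The settings listed above can be collected in a small configuration object, which also makes the relation between the dimensions and the node count explicit. The class and field names below are hypothetical, and the assignment of 3, 2, and 1 to depth, length, and width in the reduced case is assumed to follow the same order as the full case.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationSettings:
    depth: int = 10                 # levels
    length: int = 6                 # rows
    width: int = 5                  # columns
    sensor_spacing_m: float = 100.0 # interval between sensors in all directions
    hubs: int = 4                   # multipoint connections (hubs/switches)
    safe_exits: int = 2
    population_size: int = 20       # GA population
    threshold_low: int = 3
    threshold_high: int = 6
    routing_cases: int = 6
    hardness_range: tuple = (0.7, 0.9)
    neurons: int = 6

    @property
    def total_nodes(self) -> int:
        """Number of sensor nodes implied by the grid dimensions."""
        return self.depth * self.length * self.width

full_run = SimulationSettings()
matrix_demo = SimulationSettings(depth=3, length=2, width=1)
```

Here `full_run.total_nodes` reproduces the 300 nodes of the simulation and `matrix_demo.total_nodes` the 6 nodes used when presenting the matrices.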
After simulation, the sensor location sequence $S_{eq}$ is generated; its three rows represent the depth (level), length (row), and width (column), respectively, in the underground mine space. As indicated earlier, this explains the shortest-search-path criteria of the robot. The connection matrix and the routing matrix R are also generated. An element "1" in the connection matrix or the routing matrix represents signal reach or routing (a transmission link), while "0" means that there is no signal reach or that the transmission link is down.
Elements in the explosion matrix show the effect resulting from an accident or explosion. A node value of 7 and above implies that the link is totally down; 3 and below means that the link is good, with 100% assurance that transmission can go on; and 4, 5, and 6 represent probabilities of the link(s) being able to transmit. Likewise, in the failed matrix, the entries "0", "1", and fractions indicate that the link is down ("0"), good ("1"), or, for a fraction such as 0.75, able to transmit with that probability. Consider, for example, the entries 1 5 1 2 2 9 0 1 6 1 0 0 4 6 2 2 2 0 2 1 0 6 0 3.
As discussed already for Figure 2, the node values represent the effect of the explosion on the transmission of data. Values 0, 1, 2, and 3 in region 1 of the figure imply that the link(s) are not affected. Values 4, 5, and 6 in region 2 represent a probability that the links will be able to transmit/receive data, while values of 7 and above in region 3 mean that the links are totally dead, as indicated on the curve of the impact of explosion/accident on the transmission link. The remnant matrix, called the hope matrix, is optimized for transmission. The nodes in the hope matrix indicate hope of reorganizing the routing path to obtain the optimized matrix, or a new routing path, in which the elements 0.4, 0.48, and 0.8 are strongly connected and give the probability of being able to transmit and receive data, while "0" means not able to transmit or receive data. Consider, for example, the entries 1 .6 1 0 0 0 1 1 .5 1 0 0 0 .5 1 1 0 0 0 1 1 .5 1 0 0 0 0 .
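The three regions of the explosion matrix can be expressed as a simple classification. The sketch below uses the thresholds 3 and 6 from the simulation settings and the explosion-matrix entries quoted in the text; the function name and the region labels are hypothetical, and the mapping from a region-2 value to a specific transmission probability is not attempted because the text does not give it.

```python
def classify_link(value, low=3, high=6):
    """Map an explosion-matrix value to the region described in the text."""
    if value <= low:
        return "good"            # region 1: link not affected
    if value <= high:
        return "probabilistic"   # region 2: link may still transmit
    return "down"                # region 3: link totally dead

# Entries quoted in the text (the matrix layout is not recoverable here).
entries = [1, 5, 1, 2, 2, 9, 0, 1, 6, 1, 0, 0, 4, 6, 2, 2, 2, 0, 2, 1, 0, 6, 0, 3]

# Count how many quoted entries fall into each region.
counts = {}
for value in entries:
    label = classify_link(value)
    counts[label] = counts.get(label, 0) + 1
```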
In the exit matrix, the resulting safe routes have the values 0.72, 0.864, and 1.44, while the "0" values represent unsafe routes. The values in the remaining infrared, hardware, and software vectors and matrices represent the probabilities of survival of the infrared, hardware, and software, respectively. As noted already, the number of iterations in generating the routing path is 6, representing the six types of rock identified for the study. The mean of all 6 cases is used as the input to the neural network. The elements (0.5377 0.8597 0.8741 0.8644 0.8143 0.5615) represent the maximum survival probability for a total of 6 nodes deployed, and each row of the cases matrix represents a vector at one iteration; that is, the elements represent probabilities of 54%, 86%, 87%, 86%, 81%, and 56% that each node successfully transmits data to and from its source and destination, respectively. The vector describes the success rate from each node to the sink(s). In most practical applications more than one sink is used, and a sink node is reached through either a fiber or a TTE connection; in this study TTE is used. The size of the vector depends on the dimensions of the field. It assists decision makers in choosing whether to send data through one or more nodes or to send each message twice.
Figures 6 and 7 represent the MSE convergence curves from the simulation results for the proposed hybrid and the Gaussian hybrid, that is, for the two hybrids of CRBF and Gaussian. This is so because the fitness value, or error, is obtained at convergence. These figures reveal three major areas: (1) the MSE convergence curves, plotted as fitness value against iteration (top of each figure), (2) the fitness values or optimized error from the same curve, and (3) the survival probabilities and scalability, plotted as survival probability against rock hardness. The results shown in Figures 5 and 6 represent just one run in each case. For example, in Table 2, the details of each parameter for the proposed model are recorded from 10 runs in order to obtain a fair value, since some runs converge earlier than others. Note that Figures 6(a) and 6(b) are the MSE curves for the hybrid models HSRF-cos and HSGF-cos, respectively. It is worth noting that the 3D case with a total matrix size of 6 has the MSE convergence curve displayed in Figure 7.
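The survival vector quoted above and its use in deciding whether to send a message twice can be illustrated as follows. The function names are hypothetical, and the "sent twice" estimate assumes that the two transmission attempts succeed or fail independently, which the text does not state.

```python
# Maximum survival probabilities for the 6 deployed nodes, as quoted in the text.
survival = [0.5377, 0.8597, 0.8741, 0.8644, 0.8143, 0.5615]

def as_percent(probabilities):
    """Round each probability to a whole percentage."""
    return [round(100 * p) for p in probabilities]

def mean_of_cases(cases):
    """Element-wise mean of the per-case vectors (one row per rock-type case)."""
    count = len(cases)
    return [sum(column) / count for column in zip(*cases)]

def success_if_sent_twice(p):
    """Probability that at least one of two attempts succeeds (ASSUMES independence)."""
    return 1.0 - (1.0 - p) ** 2

percentages = as_percent(survival)
boosted = [success_if_sent_twice(p) for p in survival]
```

Under the independence assumption, the weakest nodes gain the most from a second transmission, which is the kind of trade-off the vector is meant to inform.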
Details of the survival probability and the scalability of the GA algorithms are given in Table 3 for 30, 120, 200, and 300 nodes, respectively. The initial survival probability for the GA-SBF is between 87.3% and 94.2% in soft rock layers, declining to 68.7%-84.5% in harder rock layers (Figure 6(a)); for the GA-CRBF it is 88.9%-100% in the soft rock layers and 66.0%-86.6% in the harder rock layers. The average performance trends of each parameter of each model after a number of simulations are displayed in Table 3. The performance of the model parameters using the 5th-order polynomial reveals that SBF and CRBF are almost at par, with $R^2$ values of 0.8228, 0.8513, 0.8337, 0.6544, 0.6373, and 0.9293 for SBF and 0.8463, 0.8642, 0.8025, 0.6025, 0.7230, and 0.8457 for CRBF (Table 2); these values correspond to the mean iteration (MI), standard variance (SD VAR), standard deviation (SD), convergence time (conv. time), optimized error (ERROR), and central processing unit time (CPU), respectively. Note that the $R^2$ values indicate the strength or contribution, as well as the weakness, of the model or model parameters; that is, the closer the value is to 1, the better the trend. The GA-ZRBF and GA-GRBF followed in decreasing order of performance. Computationally, the SBF has an $R^2$ value of 0.9292 and the CRBF an $R^2$ value of 0.9022, and the rest follow in descending order in Table 3.
Again, the optimized errors recorded from the MSE convergence curves were, in order of best performance, HSRF-GRBF (0.01076), HSRF-CRBF (0.01109), ZRBF (0.012183), GRBF (0.012291), SBF (0.0126), and CRBF (0.0129). The trends of the optimized errors of the basic models are displayed in Figure 8 using the fifth-order polynomial. The scalability of the basic model is displayed in Table 4 for 30, 120, 200, and 300 nodes, and the results indicate strong scalability with respect to rock types and rock hardness.
The relationship between the various models with respect to central processing unit (CPU) time was profiled over different runs to assess scalability (Table 5). The proposed GA-SBF and GA-CRBF make better use of CPU time and optimize their parameters with $R^2 = 1$. This was very comparable to the PSO-CRBF; in addition, the PSO-CRBF was outstanding among the swarm-search models, while the SBF was outstanding among the GA models. Comparing the GA-trained models in this work with the PSO-trained models [54], the GA-SBF has the better utilization of resources (e.g., processing time) among the basic models in the genetic algorithm, whilst the PSO-CRBF was outstanding in keeping the error at a minimum. The GA-SBF shows outstanding performance in terms of processing time, which was slightly higher than that of the CRBF, as in Figures 9 and 10. Again, the results reveal that both GA-SBF and GA-CRBF with the absolute operation utilize resources more efficiently than the Gaussian RBF. It has been ascertained that the RBF produces more accurate predictions than other models, such as MLP and RBFNN-MQ, when applied to other applications. This is also true of the basic PSO models displayed in [54]. Similar work on evacuation procedures, including that of [55], has been done; however, underground characteristics and the limited choice of safe exits make it impractical.
Furthermore, the results indicate that both the PSO and the GA have strong survival probabilities and are scalable. The future trends of the models for the PSO and the GA are good. In addition, the model as a whole in both cases had a trend with an $R^2$ value of 1, which implies that both can stand the test of time. However, the PSO-trained models are more accurate in terms of the optimized errors, that is, CRBF (0.0108), SBF (0.016), GRBF (0.013), ZRBF (0.012109), and CRBF with the nonlinear-weight hybrid of negative cosine (0.009) [56]. On the other hand, the optimized errors of the GA-trained models are CRBF (0.012923), SBF (0.0126), GRBF (0.01229), ZRBF (0.012183), and the nonlinear hybrid of negative-cosine-weighted CRBF (0.012291). In terms of training, the GA trains better than the PSO, but the PSO's signal transmission is better in harder rock areas. Generally, the errors were higher in the GA than in the PSO. For more information on the particle swarm models and the hybrids, readers may refer to [54,56].
Note that the SBF performed well with the genetic algorithm. This was demonstrated in the computational efficiency in Figures 8 and 9. The GA's values were 31%, 28%, 23%, and 18% for SBF, CRBF, ZRBF, and GRBF, respectively, while the PSO values were 31%, 26%, 27%, and 16% for CRBF, SBF, ZRBF, and GRBF, respectively. The PSO favoured the CRBF and the GA favoured the SBF.
To verify the optimal parameters, the study simulated the results with population sizes of both 20 and 30. The simulated results (CPU time, SD, and error) for the two population sizes, and the differences between them, are displayed in Table 6. The population size of 20 is outstanding in all three outcomes, with a CPU time, SD, and error of 223.807, 0.0087163, and 0.01233273, respectively. These results reveal that a population size of 30 places a much greater demand on CPU time, with an error more than twice the set target.

Conclusion
This study has constructed a hybrid neural network for an underground rescue system using the absolute operation with non-Gaussian and compact radial basis transfer functions. This novel hybrid transfer function model shows outstanding performance over the non-Gaussian models. The study has discussed the new model using GA training and has compared it with the PSO training in previous work. The proposed model is very competitive with, and efficient relative to, existing models such as the Gaussian. The results of the GA show an alternative training method for rescue systems for emergency operations in mining grounds and tunnels, where rock hardness is relatively constant. It can also be used for hospitals, building surveillance, and evacuation situations. The processing efficiency of the models is better than that of the Gaussian model.
The challenge is that the optimized errors obtained from the models trained with the genetic algorithm are high (though only in comparison with the PSO), and future work should address this limitation.
For future work, there is a need to use the trained NN to predict survivability from instant inputs of complete/correct and/or incomplete/incorrect location and rock type from a randomly picked mobile sensor, to further verify the accuracy and robustness of the prediction. In addition, more hybrids with both linear and nonlinear weights will need to be investigated with the GA to further optimize the error, as well as to investigate the signal sensitivity of the model.
It is envisioned that this model is a good alternative for the mining industry and can go a long way toward helping save lives. It is robust, fault tolerant, and competitive with the Gaussian and non-Gaussian models.

Figure 1: The effect of using power operation instead of absolute operation.

Figure 2: Impact of explosion/accident on transmission link. Region 2 indicates the probability of a link being able to transmit data.

Figure 3: The mutation, crossover, and reproductive cycle of GA.

Figure 4: The conceptual view with GA semantic structure of the proposed hybrid neural network.

Figure 6: GA convergence and optimized error for hybrids.

Figure 8: The optimized error for GA trained model.

Figure 10: Computational efficiency of PSO trained model.

Table 1: Common rocks found in typical mines in relation to hardness or softness.

Table 3: Efficiency of CPU time and other parameters for the Genetic Algorithm.

Table 5: Average performance of the models CPU time (GA).

Table 6: Comparing the two population sizes.