Evolutionary Optimization of Electric Power Distribution Using the Dandelion Code



Introduction
Electric power distribution must satisfy user needs within a given geographic area. For this purpose, an electric network is used in which a primary and a secondary distribution network can be identified. In the primary network, electric power is distributed in a tree topology, an activity in which electric companies invest approximately 50% of the transport network's capital [1]. This gives rise to the requirement of efficient planning of the distribution network, taking into account the minimization of power loss. One way of approaching the planning is through mathematical programming models, which, although useful for solving various real situations [2,3], have limitations when dealing with large problems.
The feeders allow power to be transferred from the substations to places close to the final consumers, using a voltage level that can satisfy the requirements of each consumer. The power flows radially along conductors, which, because of their nature, offer resistance to the flow, causing part of this power to be dissipated as heat. These conductors have different costs depending on their cross sections and types of materials, and, therefore, the flow must be adjusted to minimize the loss. This paper considers the design problem that also involves minimizing the amount of materials and installation costs.
Energy distribution from a substation to the consumption points can be represented by a tree in which the substation corresponds to the root and the tree's nodes represent the consumers. This tree can be coded as an array of integer numbers that represent the labels of the nodes. If we consider the bijection between a Cayley tree [4] and a string of n − 2 characters, the number of possible trees, as follows from the Prüfer code [5], is n^(n−2). Furthermore, because each connection can be established with different types of conductors, the combinatorial degree of the problem can reach dimensions that make it computationally unmanageable.
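To give a sense of this growth, the short sketch below evaluates the tree count for a few sizes. The second function is an illustrative assumption that is not stated in the source: it supposes that each of the n − 1 branches independently takes one of k conductor types.

```python
def labeled_tree_count(n: int) -> int:
    """Number of labeled trees on n nodes (Cayley's formula, n^(n-2))."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 if n <= 2 else n ** (n - 2)


def design_count(n: int, k: int) -> int:
    """Assumed count of designs: every one of the n-1 branches picks one of k conductors."""
    return labeled_tree_count(n) * k ** (n - 1)


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        digits = len(str(design_count(n, k=4)))
        print(f"n={n:3d}: {labeled_tree_count(n)} trees, about 10^{digits - 1} designs with 4 conductors")
```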
The optimum generation of electric power distribution trees is a problem that has been studied in the literature and is known as the distribution tree problem (DTP) [6]. One way of facing this challenge is through heuristic and metaheuristic methods. Under this approach, a comparative analysis of computational performance between simulated annealing and tabu search has shown that it is possible to obtain good quality solutions for a group of instances that, although larger than previously reported in the literature, are still far from the sizes that are actually handled [7].

Journal of Electrical and Computer Engineering
Various authors have studied the DTP by considering genetic algorithms (GAs), which are stochastic search methods based on principles of evolution and natural selection [8]. Typically, they use coding based on integer numbers (binary in some cases) to represent a tree, but this type of representation involves the problem of having to define adequate genetic operators to visit only feasible solutions. In this respect, Najafi et al. [9] use a binary coding scheme that allows solving a large distribution network that, in practice, is reduced to 90 nodes. Considering a similar representation, Li and Chang [10] propose a new GA supported by a minimal spanning tree algorithm and illustrate the method with a small network. The computational performance of a GA appears to become worse when the networks are larger, requiring even longer computing time to solve problem instances of approximately 500 nodes [11]. Real networks can have hundreds of thousands of nodes [12], and thus, the algorithms described above may require extremely long computing times, making it difficult to construct a computational system to support decision making in this field. The greatest computational expense in the GA is associated with the calculation of the power flow at each new solution visited and with the handling of the operators that control the feasibility of the solution. One way of reducing this expense involves using a coding based on trees, which has been explored in the literature by considering other optimization problems in graphs [13].
There are several ways of representing trees within a GA, but there is little consensus as to which of these representations is "the best" [14]. Several researchers have proposed that the representation of a tree must have certain properties for the implemented algorithm to operate efficiently [13,15,16]. One way of generating this representation involves using the dandelion code [17], which belongs to the Cayley family of codes [5]. The results show that the dandelion code satisfies the properties identified by Palmer and Kershenbaum [16] and therefore constitutes an interesting representation for an evolutionary search. When using the dandelion code for the DTP, each distribution tree can be evaluated without concern for the feasibility when applying the cross-over and mutation operators. In this way, the computational effort is reduced exclusively to the evaluation of the power flow. This paper tackles the problem of the optimal generation of large electric power distribution trees through a GA, using the dandelion code to represent trees.
The second section of the paper describes the main components of the model, and the third section presents the results. The last section presents the conclusion of the study.

Problem Modeling and Representation
Let Γ = {τ_1, τ_2, ..., τ_m} be defined as the set of all possible trees with n nodes, and let S be the set of all the integer strings of length n − 2. The dandelion code establishes a relationship between a code D ∈ S and a tree τ ∈ Γ such that, to decode the tree τ ∈ Γ associated with D, a function ϕ(·) is applied.

Representation of the Problem.
To represent a distribution tree, all of the substations are labeled with natural numbers from 1 to N, and the consumer nodes are labeled with natural numbers from N + 1 to M; an artificial node "0" is created to represent the source of supply for all of the substations. In turn, each substation feeds the consumer points following a tree structure. Thus, a dandelion code associated with a set of distribution trees can be identified with an array of n − 2 elements. Moreover, to ensure that each substation is the root of its own tree, the positions corresponding to the substations have the value "0," which represents the artificial node.
As an example, consider an electric distribution system with N = 3 substations and 9 consumer points, so that M = 12. Every set of distribution trees can be represented by a string of 10 natural numbers in {1, 2, 3, ..., 12}. To decode a given string, such as C = (4, 6, 2, 2, 4, 8, 7, 6, 11, 12), we first define a string A_c with consecutive numbers from 2 to M − 1 (in this case, A_c = (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)) such that the first N − 1 positions correspond to the labels of the substations. The first two positions of C are therefore set to 0, and matrix R is constructed with the elements of A_c in the first row and the elements of C in the second row:

\[
R = \begin{pmatrix}
2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\
0 & 0 & 2 & 2 & 4 & 8 & 7 & 6 & 11 & 12
\end{pmatrix}
\]

Function ϕ(·), corresponding to the dandelion code function, maps each element of the first row to the element directly below it in the second row. In this example, ϕ(2) → 0; ϕ(3) → 0; ϕ(4) → 2; ϕ(5) → 2; ϕ(6) → 4; ϕ(7) → 8; ϕ(8) → 7. At this point the cycle (7, 8) is detected; it is separated from ϕ(·) and stored in a list ordered from smallest to largest, as in the regular dandelion decoding algorithm. We then continue with the remaining elements of the first row, ϕ(9) → 6, ϕ(10) → 11, and ϕ(11) → 12, until the elements of A_c are exhausted.

The tree is built as follows. All the nodes that represent substations are connected with the artificial node; in this example, nodes 1, 2, and 3 are connected with node 0. A substation is then chosen on which to hang the cycles and the last node: the substation that appears the fewest times in C is selected and, in the case of a tie, the one with the smallest label is chosen. In this case substation 1 is selected, so 7, 8, and 12 are added to substation 1. The remaining nodes are then added to the tree as indicated by function ϕ(·). In this way, the code's bijection is retained, and good locality is ensured.
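The decoding steps of this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `decode_d_dtp` is hypothetical, substation positions are overwritten with 0 inside the function, and all nodes lying on cycles are simply chained in ascending label order between the chosen substation and the last node M. That simplification reproduces the worked example but is not claimed to preserve the bijection of the regular dandelion decoder.

```python
from collections import Counter


def decode_d_dtp(code, n_substations):
    """Return {node: parent} for nodes 1..M, with the artificial node 0 as root."""
    m = len(code) + 2                       # M, the label of the last node
    subs = range(1, n_substations + 1)
    # phi is defined on A_c = (2, ..., M-1); substation positions are forced to 0.
    phi = {i: (0 if i <= n_substations else code[i - 2]) for i in range(2, m)}

    # Collect every node that lies on a cycle of phi.
    on_cycle, state = set(), {}
    for start in phi:
        path, node = [], start
        while node in phi and node not in state:
            state[node] = "open"
            path.append(node)
            node = phi[node]
        if state.get(node) == "open":       # the walk re-entered the current path
            on_cycle.update(path[path.index(node):])
        for v in path:
            state[v] = "done"

    # Host substation: fewest occurrences in the code, ties broken by lowest label.
    counts = Counter(phi.values())
    host = min(subs, key=lambda s: (counts[s], s))

    parent = {s: 0 for s in subs}           # every substation hangs from node 0
    prev = host
    for v in sorted(on_cycle) + [m]:        # assumed chaining: host -> cycle nodes -> M
        parent[v] = prev
        prev = v
    for i, p in phi.items():                # remaining nodes follow phi directly
        parent.setdefault(i, p)
    return parent


if __name__ == "__main__":
    tree = decode_d_dtp([4, 6, 2, 2, 4, 8, 7, 6, 11, 12], n_substations=3)
    for node in sorted(tree):
        print(node, "->", tree[node])
```

Because every string of the right length decodes to some set of trees rooted at the substations, operators acting on the string never need a repair step.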
Figure 1 shows the electric distribution tree corresponding to code C.
Any set of electric distribution trees that solves the DTP can be represented by the proposed model, with N substations and M − N consumer points. We call this representation D-DTP. In the GA, code C plays the role of a chromosome: each chromosome represents a set of electric distribution trees, and each set of trees represents a solution to the DTP.

Initial Population and Objective Function.
The initial population is generated using a modified Prim algorithm [18]. Each solution of the initial population is represented by the D-DTP, which, in the GA, is equivalent to an individual. To evaluate the fitness of each individual, (2) is used, which accounts for both the construction cost and the power loss cost to be minimized. In (2), f_1(y) is the investment cost in equipment and f_2(y, x) is the power loss cost. The construction cost f_1 is proportional to the Euclidean distance of each arc between the nodes that form the tree, multiplied by the cost of the conductor used in each section. The cost of the losses f_2 is proportional to the square of the current that flows along each section and is determined using a power flow algorithm [19].
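A rough sketch of how the two cost terms might be evaluated for a decoded tree is given below. Several elements are assumptions introduced only for illustration: the function name `cost_terms`, the `catalog` mapping of conductor type to (price per unit length, resistance per unit length), the `loss_cost_factor` that converts losses to money, and the branch currents, which are taken here as already computed by a power flow routine. Branches to the artificial node 0 are skipped because that node has no geographic position.

```python
import math


def cost_terms(parent, coords, current, conductor, catalog, loss_cost_factor=1.0):
    """Return (f1, f2): construction cost and power loss cost of a tree.

    parent[child] is the node feeding `child`; current[child] and conductor[child]
    describe the branch that ends at `child` (hypothetical data layout).
    """
    f1 = f2 = 0.0
    for child, par in parent.items():
        if par == 0:                        # link to the artificial source: no physical line
            continue
        length = math.dist(coords[child], coords[par])
        price, resistance = catalog[conductor[child]]
        f1 += length * price                                     # distance times conductor cost
        f2 += loss_cost_factor * resistance * length * current[child] ** 2   # I^2 R losses
    return f1, f2


if __name__ == "__main__":
    parent = {1: 0, 2: 1}
    coords = {1: (0.0, 0.0), 2: (3.0, 4.0)}
    f1, f2 = cost_terms(parent, coords, {2: 2.0}, {2: "a"}, {"a": (10.0, 0.5)})
    print(f1, f2)
```

How the two terms are combined into a single fitness value is defined by (2) in the source; the sketch returns them separately so that no particular weighting is implied.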

Experimental Design.
To perform the experiment, the test instances are generated first. Each instance has the following input data: active power P, reactive power Q, and (X_1, X_2), the position of each node in geographic space. Instances of 35,000, 40,000, and 45,000 points, which correspond to the consumption nodes, are created. Additionally, for each instance, 20 points corresponding to the substations are considered. P is generated randomly for each node, with values between 0 and 1. Q is generated to satisfy Q = P tan(θ), where θ is generated such that cos(θ) > 0.8. The active power and the reactive power of the substations are generated similarly, with the additional requirement that the total power of the substations be greater than the sum of the powers at the consumption nodes, including the losses. The geographic locations of the nodes are generated randomly with nominal values between 0 and 1. A conductor table (Table 1) is also required; it contains the resistance, the impedance, the current capacity, and the price of each conductor.
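The generation procedure can be sketched as follows. The sampling details are assumptions, since the source does not give them: the power factor cos(θ) is drawn uniformly above 0.8, coordinates are uniform in the unit square, and the substation powers are rescaled by a hypothetical `margin` of 1.2 so that supply exceeds demand with some room for losses.

```python
import math
import random


def generate_instance(n_consumers, n_substations=20, min_pf=0.8, margin=1.2, seed=None):
    """Return (substations, consumers) as lists of dicts with keys P, Q, xy."""
    rng = random.Random(seed)

    def random_node():
        p = rng.random()                                   # active power in [0, 1)
        pf = 1.0 - rng.random() * (1.0 - min_pf)           # cos(theta), above min_pf up to rounding
        q = p * math.tan(math.acos(pf))                    # Q = P tan(theta)
        return {"P": p, "Q": q, "xy": (rng.random(), rng.random())}

    substations = [random_node() for _ in range(n_substations)]
    consumers = [random_node() for _ in range(n_consumers)]

    # Assumed rescaling: total substation power = margin * total consumer demand.
    demand = sum(c["P"] for c in consumers)
    supply = sum(s["P"] for s in substations)
    scale = margin * demand / supply
    for s in substations:
        s["P"] *= scale
        s["Q"] *= scale                                    # keeps Q = P tan(theta)
    return substations, consumers


if __name__ == "__main__":
    subs, cons = generate_instance(500, seed=1)
    print(len(subs), len(cons), round(sum(s["P"] for s in subs), 3))
```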

Parameter Calibration.
The main parameters of a GA are the population size, the cross-over probability, and the mutation probability. In this paper, we use the parameters proposed by Grefenstette [20]: 30 to 50 individuals for the population size, 0.90 to 0.95 for the cross-over probability, and 0.01 to 0.05 for the mutation probability. The termination criterion is the number of generations, which is 1000 in this case.
Although it is recommended to use a low mutation probability [21], experiments prior to our work have shown that when the probability increases, the results improve. For this reason, a calibration of this parameter is made. The set of values studied for the mutation probability is the following: 0.005, 0.01, 0.02, 0.04, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20, and 0.22.
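To make the role of the mutation probability concrete, the sketch below shows one possible pair of operators acting directly on the code. The source does not specify which operators are used, so one-point cross-over and random-reset mutation are assumptions, as are the function names. Substation positions (the first N − 1 entries) are left at 0, and every other gene may take any label from 1 to M.

```python
import random


def one_point_crossover(a, b, rng):
    """Swap the tails of two parent codes at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(code, n_substations, p_mut, rng):
    """Reset each consumer gene to a random label in 1..M with probability p_mut."""
    m = len(code) + 2
    child = list(code)
    for pos in range(n_substations - 1, len(child)):   # skip the substation positions
        if rng.random() < p_mut:
            child[pos] = rng.randint(1, m)
    return child


if __name__ == "__main__":
    rng = random.Random(0)
    parent_a = [0, 0, 2, 2, 4, 8, 7, 6, 11, 12]
    parent_b = [0, 0, 5, 1, 1, 3, 9, 9, 2, 4]
    child, _ = one_point_crossover(parent_a, parent_b, rng)
    print(mutate(child, n_substations=3, p_mut=0.16, rng=rng))
```

Neither operator checks feasibility, since any string with zeros in the substation positions is a valid code.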
The calibration process was executed 5 times with an instance of 500 nodes and 20 substations. The network's data are shown in Table 2. As an example, for a node with P = 1 and a base power of 1000 kVA, the P value of this node becomes 1000 kW. The results of the calibration show that a mutation probability of 0.16 achieves the best results, as shown in bold numbers in Table 3.

Results
For each instance, the following input data were considered: 20 substations, base power 1000 kVA, base voltage 12 kV, and an operating time of 10 years. Figure 2 shows that the algorithm starts with good initial solutions, then moves away from the initial solution, and finally converges to a good-quality solution. The good initial solutions are explained by the fact that the initial population is generated with Prim's algorithm.

To validate the results obtained with the GA, they are compared with a lower bound. This bound is obtained with a Prim algorithm that seeks the minimum construction cost of the network. The comparison validates the results obtained with the GA in all of the test instances (Table 4).
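One way such a bound could be computed is sketched below with a dense O(n^2) version of Prim's algorithm. The source does not detail the construction, so the following are assumptions: the first `n_substations` points are the substations, substations are joined to each other at zero cost (standing in for the artificial node 0), and the total length is priced with the cheapest conductor, passed in as the hypothetical parameter `min_price`.

```python
import math


def prim_lower_bound(points, n_substations, min_price=1.0):
    """Minimum spanning tree length, with free links between substations, times min_price."""
    n = len(points)
    if n == 0:
        return 0.0

    def dist(u, v):
        if u < n_substations and v < n_substations:
            return 0.0                       # substations share the artificial source
        return math.dist(points[u], points[v])

    in_tree = [False] * n
    best = [math.inf] * n                    # cheapest known link into the growing tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = dist(u, v)
                if d < best[v]:
                    best[v] = d
    return total * min_price


if __name__ == "__main__":
    pts = [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0), (9.0, 0.0)]
    print(prim_lower_bound(pts, n_substations=2))
```

Under these assumptions the value cannot exceed the construction cost of any feasible set of distribution trees, because adding the free links between substations to a feasible forest yields a spanning tree of the same graph.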

Conclusions
This paper proposes a solution for the DTP that has a real application in the optimization of electric distribution networks. The problem is approached using GAs, proposing a model that considers construction and power loss costs. To perform the search in the distribution tree space, the dandelion code is used. The proposed approach shows that the code used is efficient in the representation of trees, and its use allows real problems to be solved using GAs.
The approach also finds solutions for extremely large instances; for example, the instance of 45,000 nodes requires a computing time of 118:59:16.