A DECOMPOSITION-BASED CONTROL STRATEGY FOR LARGE, SPARSE DYNAMIC SYSTEMS

We propose a new output control design for large, sparse dynamic
systems. A graph-theoretic decomposition is used to cluster the states, inputs, and outputs,
and to identify an appropriate bordered block diagonal structure for the gain matrix. The resulting control law can be easily implemented in a multiprocessor environment with a minimum of communication. A large-scale problem is considered to demonstrate the validity of the proposed approach.


Introduction
The control of large-scale dynamic systems has attracted a great deal of attention over the past few decades (e.g., [15,16]). Systems of this type are characterized by high dimensionality, information structure constraints, and uncertainty, all of which pose significant theoretical and practical challenges. The central problem in this context is to develop control strategies that are computationally efficient and can ensure robustness with respect to uncertainties and nonlinearities in the system. The obtained feedback laws must also be easy to implement, preferably in a multiprocessor environment and with a minimal exchange of data.
A critical step in any large-scale design is the identification of a decomposition that can fully exploit the sparsity of the system matrix. The advantages of working with sparse matrices have been recognized as early as the 1960s, when it was established that an appropriate permutation can dramatically reduce the computational effort needed for solving large systems of linear equations [11,19]. Since that time, a substantial body of literature has become available on this subject, including a number of excellent surveys [4,6,7].
With the advent of parallel computing, several new objectives emerged for sparse matrix ordering, perhaps the most important one being the identification of structures that are suitable for a multiprocessor environment. The simplest example of such a structure would be a matrix that can be expressed as

A = A_0 + εA_C,

where A_0 has the standard block diagonal form shown in Figure 1.1, ε is a small positive number, and all the elements of A_C are smaller than one in magnitude. A matrix with this property is said to have an epsilon decomposition, and the necessary permutation can be easily determined even for very large systems [13,14]. One of the main advantages of this approach stems from the fact that discarding the term εA_C can significantly sparsify the problem. This aspect of the decomposition has been studied extensively, both in the context of parallel computing [2,20] and in decentralized control (see [16] and the references therein). An important extension of epsilon decomposition arises when A_0 has the form shown in Figure 1.2. In that case, matrix A is said to have an overlapping epsilon decomposition [14,16,22]. Problems of this type can be parallelized effectively in the expanded space, in which the overlapping blocks appear as disjoint. Solutions in the original and expanded spaces can be related using the inclusion principle and appropriate rectangular transformation matrices [10,16]. The most general structure for matrix A_0 is the nested bordered block diagonal (BBD) form shown in Figure 1.3, whose potential for parallel processing has been acknowledged by numerous authors (e.g., [5,9,12]). Although any sparse matrix can be reordered in this way (at least in principle), identifying an appropriate permutation matrix turns out to be a difficult graph-theoretic problem. Among the many heuristic schemes that have been developed for this purpose, we single out the algorithm proposed in [21], since it represents an essential ingredient of the decomposition that will be developed
in Section 2. This method was found to be effective over a wide range of nonzero patterns, and for matrices as large as 500,000 × 500,000.
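To fix ideas, the epsilon decomposition described above can be sketched in a few lines. The sketch below is a minimal illustration, not the algorithm of [13,14]: it simply treats entries with |a_ij| ≤ ε as part of εA_C, symmetrizes the remaining pattern, and lets the connected components define the diagonal blocks of A_0; the test matrix is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def epsilon_decomposition(A, eps):
    """Permutation grouping the states into the diagonal blocks of A_0.

    Entries with |a_ij| <= eps are treated as part of eps*A_C and
    discarded; the connected components of the remaining symmetrized
    pattern define the blocks of A_0.
    """
    strong = np.abs(A) > eps
    strong = strong | strong.T                 # structural symmetrization (A + A^T)
    n_blocks, labels = connected_components(csr_matrix(strong), directed=False)
    perm = np.argsort(labels, kind="stable")   # cluster the states block by block
    return n_blocks, perm

# Two weakly coupled 2x2 blocks, deliberately interleaved
A = np.array([[1.0, 0.05, 2.0, 0.0],
              [0.05, 1.0, 0.0, 3.0],
              [2.0, 0.0, 1.0, 0.05],
              [0.0, 3.0, 0.05, 1.0]])
k, p = epsilon_decomposition(A, eps=0.1)
A0 = A[np.ix_(p, p)]   # permuted matrix; off-diagonal blocks contain only eps-terms
```

After the permutation, the off-diagonal blocks of the reordered matrix hold only entries of magnitude at most ε, which is exactly the structure that may be discarded to sparsify the problem.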
While a considerable amount of research has been devoted to exploiting BBD forms in parallel computing, there have been virtually no applications to control. With that in mind, in this paper we propose a new output control strategy in which the gain matrix has a BBD structure.
We begin by developing an efficient graph-theoretic decomposition that is capable of identifying BBD structures with a suitable input and output assignment. Given such a decomposition, in Section 3 we propose an LMI-based optimization strategy that produces output feedback laws which can be implemented in a multiprocessor environment. A numerical example is provided in Section 4 to illustrate the effectiveness of this approach.

Input/output-constrained BBD decomposition
We consider the standard linear system

ẋ = Ax + Bu, y = Cx, (2.1)

where x ∈ R^n is the state of the system, u ∈ R^m is the input vector, y ∈ R^q is the output vector, and A = (a_ij), B = (b_ij), and C = (c_ij) are large, sparse matrices of dimension n × n, n × m, and q × n, respectively. In this section, our objective will be to develop a decomposition that simultaneously permutes matrix A into the BBD form and secures a compatible block diagonal structure for matrices B and C. The proposed algorithm represents a modification of the balanced BBD decomposition method [21], with the added requirement that each diagonal block must have at least one input and one output associated with it. In the following, we will refer to such a decomposition as an input/output-constrained BBD decomposition.
The graph-theoretic nature of the decomposition problem requires that we associate an undirected graph G(V,E) with system (2.1), where V is a set of vertices v_i and E represents a set of edges e_ij. We will assume that V = X ∪ U ∪ Y consists of three disjoint sets of vertices, the state set X = {x_1,...,x_n}, the input set U = {u_1,...,u_m}, and the output set Y = {y_1,...,y_q}. Then, (x_i, x_j) ∈ E if and only if a_ij ≠ 0, (x_i, u_j) ∈ E if and only if b_ij ≠ 0, and (x_i, y_j) ∈ E if and only if c_ji ≠ 0. We can further assume without loss of generality that matrix A is structurally symmetric; if that were not the case, we could simply use A + A^T to obtain the decomposition.
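The construction of G(V,E) from the sparsity patterns can be sketched directly. The vertex labels (x0, u0, y0, and so on) and the small test matrices below are illustrative choices, not taken from the paper.

```python
import numpy as np

def system_graph(A, B, C):
    """Undirected graph G(V, E) associated with system (2.1).

    Vertices: states x0..x{n-1}, inputs u0..u{m-1}, outputs y0..y{q-1}.
    Edges follow the sparsity patterns: (x_i, x_j) for a_ij != 0 (after
    structural symmetrization), (x_i, u_j) for b_ij != 0, and
    (x_i, y_j) for c_ji != 0.
    """
    n, m = B.shape
    q = C.shape[0]
    V = [f"x{i}" for i in range(n)] + [f"u{j}" for j in range(m)] \
        + [f"y{j}" for j in range(q)]
    S = (A != 0) | (A.T != 0)           # work with the pattern of A + A^T
    E = {(f"x{i}", f"x{j}")
         for i in range(n) for j in range(i + 1, n) if S[i, j]}
    E |= {(f"x{i}", f"u{j}")
          for i in range(n) for j in range(m) if B[i, j] != 0}
    E |= {(f"x{i}", f"y{j}")
          for j in range(q) for i in range(n) if C[j, i] != 0}
    return V, E

A = np.array([[1.0, 0.5], [0.0, 1.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])
V, E = system_graph(A, B, C)   # edges: (x0,x1), (x0,u0), (x1,y0)
```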
The proposed algorithm can now be described by the following sequence of steps.
Step 2.1. The decomposition is initialized by identifying the shortest path from every input vertex to an output vertex. If any outputs remain unvisited at this point, the procedure is reversed with these outputs as the starting vertices. The initial graph is then formed as a union of the obtained paths. A typical situation after this stage of the decomposition is shown in Figure 2.1, for a system with four inputs and six outputs.
Step 2.2. The second step begins by forming incidence sets for each u_i ∈ U and y_i ∈ Y. The union of these sets is then added to the graph obtained in Step 2.1. If there are any elements x_i, x_j ∈ X such that (x_i, x_j) ∈ E, these edges are added to the graph as well.
Once the new graph Γ is formed, its disjoint components Γ_i (i = 1,2,...,k) are identified (note that by construction each set Γ_i contains at least one input and one output vertex). A possible scenario corresponding to Figure 2.1 is shown in Figure 2.2, in which all edges added in Step 2.2 are indicated by dashed lines. It is important to recognize at this point that all vertices x_i ∈ X that are incident to some input or output have been included in sets Γ_i (i = 1,2,...,k). Vertices x_i ∉ Γ constitute set X_C, which represents a separator for the graph (the term implies that the removal of X_C from the overall graph results in disjoint sets Γ_i). A schematic representation of this property is shown in Figure 2.3.
Step 2.3. In most problems, the separator X_C is large and needs to be reduced in order to obtain a meaningful BBD decomposition. To perform the necessary reduction, we first need to identify a subset of inputs and outputs that we wish to associate with the border block. For simplicity, let those inputs and outputs be contained in sets Γ_{k-1} and Γ_k. We then form the initial border as the union of sets X_C, Γ_{k-1}, and Γ_k (this initial border has been shaded in Figure 2.3). The remaining sets Γ_i (i = 1,2,...,k-2) obviously represent the corresponding initial diagonal blocks, and the vertices of X_C are now reconnected to these blocks one by one. The order in which the reconnection proceeds reflects a "greedy" strategy, whose objective is to achieve a minimal increase in the sizes of the diagonal blocks in each step. The reconnection continues for as long as at least two diagonal blocks remain. The situation at the end of this stage is shown in Figure 2.4, in which the final border is shaded.
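The shortest-path search that initializes Step 2.1 can be sketched with a breadth-first search. The adjacency-dictionary representation and the toy graph below are assumptions made for illustration; the actual implementation in [21] uses more elaborate data structures.

```python
from collections import deque

def shortest_path_to_targets(adj, source, targets):
    """BFS shortest path from `source` to the nearest vertex in `targets`.

    `adj` maps each vertex to a list of its neighbours. Returns the path
    as a list of vertices, or None if no target is reachable (Step 2.1).
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v in targets:
            path = []
            while v is not None:      # walk back to the source
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None

# Toy graph: input u0 reaches output y0 through states x0 and x1
adj = {
    "u0": ["x0"],
    "x0": ["u0", "x1"],
    "x1": ["x0", "y0"],
    "y0": ["x1"],
}
path = shortest_path_to_targets(adj, "u0", {"y0"})  # ['u0', 'x0', 'x1', 'y0']
```

The initial graph of Step 2.1 is then simply the union of the edges along the paths obtained for all inputs (and, in the reversed pass, for any unvisited outputs).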
A. I. Zečević and D. D. Šiljak

Several comments are in order at this point.
Remark 2.4. Reconnecting any further vertices from the border would require merging blocks 1 and 2. In that sense, the obtained border is minimal.
Remark 2.5. The decomposition ensures that each of the diagonal blocks has at least one input and one output associated with it, since only vertices belonging to X_C are reconnected. In addition, the preselected inputs and outputs remain associated with the border block.
Remark 2.6. The reconnection scheme uses the same data structures and tie-breaking criteria as the one described in [21]. For further details, the reader is referred to [21].
Remark 2.7. The fact that each diagonal block has at least one input associated with it does not automatically guarantee controllability. As shown in [16], it is possible to construct a graph that is input reachable, but generically uncontrollable due to dilation. However, this type of situation occurs very rarely.
Remark 2.8. The decomposition proceeds in a nested manner, by separately applying Steps 2.1, 2.2, and 2.3 to each diagonal block. The algorithm terminates when one of the following two conditions is encountered: (1) a block has no more than two inputs and/or two outputs associated with it, making a further decomposition impossible; (2) the resulting border set is unacceptably large. In the latter case, the block cannot be decomposed in a meaningful way (or, to put it differently, the block is not sufficiently sparse).
The proposed decomposition produces a permutation matrix P that corresponds to the obtained graph-theoretic partitioning. The generic structure of matrix A_BBD = P^T A P is the one shown in Figure 1.3 (possibly with additional levels of nesting), while B_D = P^T B and C_D = CP consist of diagonal blocks compatible with those of A_BBD.
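Applying the permutation is a purely mechanical step: the product P^T A P is equivalent to a simultaneous row and column reindexing. A small sketch (the ordering p and the 3 x 3 matrix are arbitrary placeholders):

```python
import numpy as np

# A permutation p (the state ordering returned by the decomposition),
# represented as a matrix P and applied to a small 3-state example.
p = np.array([2, 0, 1])          # new ordering of the states
P = np.eye(3)[:, p]              # permutation matrix: P[i, j] = 1 iff p[j] = i
A = np.arange(9.0).reshape(3, 3)
A_bbd = P.T @ A @ P              # identical to the index form A[np.ix_(p, p)]
```

In practice only the index form is used, since forming P explicitly would waste memory for large, sparse systems; B_D = P^T B and C_D = CP amount to the same row (respectively column) reindexing.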
It should also be observed that the proposed decomposition can easily be extended to the form

A = A_BBD + ε_1 A_C, B = B_D + ε_2 B_C, C = C_D + ε_3 C_C, (2.4)

where ε_1, ε_2, and ε_3 are small positive numbers, and all elements of A_C, B_C, and C_C are smaller than one in magnitude. This represents a combination of epsilon decomposition and input/output-constrained BBD decomposition. The structure in (2.4) can be obtained by discarding all elements of A, B, and C such that |a_ij| ≤ ε_1, |b_ij| ≤ ε_2, and |c_ij| ≤ ε_3 prior to executing the decomposition. In many cases, this can significantly increase the sparsity of A and produce a more flexible BBD structure. It also guards against poorly controllable or observable blocks, by discarding small entries in matrices B and C.

Output feedback design for a multiprocessor environment
We consider a nonlinear system described by the differential equations

ẋ = Ax + B_D u + h(x), y = C_D x, (3.1)

where x ∈ R^n is the state of the system, u ∈ R^m is the input vector, and y ∈ R^q is the output vector. We will assume that A is an n × n sparse matrix that has been permuted into a BBD form, with diagonal blocks of dimension n_i × n_i (i = 1,2,...,N). As described in Section 2, the same permutation ensures that matrices B_D = diag{B_1,...,B_N} and C_D = diag{C_1,...,C_N} consist of n_i × m_i and q_i × n_i diagonal blocks, respectively. The nonlinear function h : R^n → R^n is piecewise-continuous in x, satisfying h(0) = 0. This term is assumed to be uncertain, but bounded by a quadratic inequality

h^T(x) h(x) ≤ α^2 x^T H^T H x, (3.2)

where H is a constant matrix and α > 0 is a scalar parameter.
As shown in [17,18], a centralized state feedback law u = Kx can be obtained for the system in (3.1) by solving an LMI optimization problem in the variables γ, κ_Y, κ_L, Y, and L; in what follows, this optimization is referred to as Problem 3.1.
If the optimization is feasible, the gain matrix computed as K = LY^{-1} stabilizes the closed-loop system for any nonlinearity that satisfies (3.2). From that standpoint, the obtained value for parameter α can be viewed as a measure of robustness.
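The gain-recovery step K = LY^{-1} is easy to illustrate on a toy double integrator. The matrices Y and L below are hand-picked placeholders standing in for the output of an actual LMI solver, and the nonlinearity is omitted; the sketch only demonstrates the recovery and the closed-loop eigenvalue check.

```python
import numpy as np

# Hypothetical "LMI solution" for a double integrator: with Y = I the
# gain is simply K = L Y^{-1} = L.  These numbers are illustrative and
# are not produced by solving Problem 3.1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Y = np.eye(2)                  # a feasible Y must be symmetric positive definite
L = np.array([[-1.0, -2.0]])
K = L @ np.linalg.inv(Y)       # gain recovery, K = L Y^{-1}
closed_loop = A + B @ K
eigs = np.linalg.eigvals(closed_loop)   # both eigenvalues at -1
```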
Since our objective is to design output feedback that can be implemented in a multiprocessor environment, it will be necessary to introduce several modifications to optimization Problem 3.1. We will consider three possible scenarios.
A. Decentralized output feedback. In this case, the control law utilizes only locally available output information. To secure the desired decentralized output feedback structure, the following additional requirements need to be incorporated into Problem 3.1.
Requirement 3.2. Matrix Y must have the form Y = Y_0 + U_D Y_C U_D^T,
where Y_0 and Y_C are unknown n × n and q × q block diagonal matrices, respectively. The diagonal blocks of Y_0 have dimension n_i × n_i, and those of Y_C have dimension q_i × q_i.
Requirement 3.3. Matrix U_D is a user-defined block diagonal matrix of dimension n × q, consisting of n_i × q_i blocks.
Requirement 3.4. Matrix Y_0 must satisfy the equality constraint

Y_0 C_D^T = U_D;

this condition is automatically satisfied by any matrix Y_0 of the form

Y_0 = Q_D Y_Q Q_D^T + U_D (C_D U_D)^{-1} U_D^T,

where Q_D is an n × (n - q) block diagonal matrix such that C_D Q_D = 0. In that case, we need to compute an (n - q) × (n - q) matrix Y_Q, which is symmetric and block diagonal, with (n_i - q_i) × (n_i - q_i) blocks.
Requirement 3.5. Matrix L must have the form L = L_C U_D^T, where L_C is a block diagonal m × q matrix with blocks of dimension m_i × q_i.
To understand the rationale for Requirements 3.2, 3.3, 3.4, and 3.5, we should recall that Y^{-1} can be expressed using the Sherman-Morrison formula as (e.g., [8])

Y^{-1} = Y_0^{-1} - S R U_D^T Y_0^{-1},

where

S = Y_0^{-1} U_D, R = (Y_C^{-1} + U_D^T Y_0^{-1} U_D)^{-1}. (3.11)

It is now easily verified that Requirement 3.4 implies LY^{-1} = K_D C_D, with

K_D = L_C (I - U_D^T S R).

Requirements 3.2, 3.3, and 3.5 also ensure that K_D is a block diagonal matrix, with blocks K_i of dimension m_i × q_i. This gives rise to a decentralized output feedback law of the form

u_i = K_i y_i (i = 1,2,...,N).

B. BBD output control. The feedback in this case is assumed to have the form u = K_BBD y, where K_BBD is an m × q gain matrix with the BBD structure shown in (3.15). A control law of this type can be obtained in the same way as decentralized output feedback, the only difference being that matrix L_C in Requirement 3.5 must now be an m × q BBD matrix with a block structure that is identical to that of the gain matrix K_BBD in (3.15). This property follows directly from the fact that I - U_D^T S R is a block diagonal matrix with q_i × q_i diagonal blocks, by virtue of Requirements 3.2, 3.3, and 3.4.

Remark 3.6. It is important to note that a BBD output control law is easily implemented in a multiprocessor environment. Indeed, any such structure can be efficiently mapped onto a tree-type parallel architecture, which guarantees a low communication overhead (e.g., [3]).
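The structural argument can be checked numerically. The sketch below assumes the following specific forms, which are assumptions made for illustration: Y = Y_0 + U_D Y_C U_D^T with Y_0 = Q_D Y_Q Q_D^T + U_D (C_D U_D)^{-1} U_D^T and C_D Q_D = 0, L = L_C U_D^T, S = Y_0^{-1} U_D, and R = (Y_C^{-1} + U_D^T Y_0^{-1} U_D)^{-1}. Under these forms, both the Sherman-Morrison (Woodbury) expansion and the factorization LY^{-1} = K_D C_D can be verified directly.

```python
import numpy as np
from scipy.linalg import null_space

# Toy case with n = 3, m = 1, q = 1.  All specific values are
# illustrative; the structured forms satisfy the stated requirements
# by construction.
C_D = np.array([[1.0, 2.0, 0.0]])
U_D = C_D.T                                   # user-defined choice U_D = C_D^T
Q_D = null_space(C_D)                         # columns span ker(C_D), so C_D Q_D = 0
Y_Q = np.diag([1.0, 2.0])                     # symmetric free block
Y_0 = Q_D @ Y_Q @ Q_D.T + U_D @ np.linalg.inv(C_D @ U_D) @ U_D.T
assert np.allclose(Y_0 @ C_D.T, U_D)          # equality constraint of Requirement 3.4

Y_C = np.array([[1.0]])
Y = Y_0 + U_D @ Y_C @ U_D.T                   # structured Y
L_C = np.array([[1.0]])
L = L_C @ U_D.T                               # structured L

# Sherman-Morrison (Woodbury) expansion of Y^{-1}
Y0_inv = np.linalg.inv(Y_0)
S = Y0_inv @ U_D
R = np.linalg.inv(np.linalg.inv(Y_C) + U_D.T @ Y0_inv @ U_D)
Y_inv = Y0_inv - S @ R @ U_D.T @ Y0_inv
assert np.allclose(Y_inv, np.linalg.inv(Y))

# The gain factorizes as L Y^{-1} = K_D C_D, i.e., static output feedback
K_D = L_C @ (np.eye(1) - U_D.T @ S @ R)
assert np.allclose(L @ Y_inv, K_D @ C_D)
```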

C. Proportional plus integral (PI) BBD output control.
Proportional plus integral BBD output control represents a natural extension of the decentralized feedback strategy proposed in [1]. In this case, the control law is assumed to have the form

u(t) = K_BBD y(t) + F_BBD z(t), (3.18)

where

z(t) = ∫_0^t y(τ) dτ.

This type of control obviously corresponds to a pair of BBD gain matrices K_BBD and F_BBD, and can be represented in compact form as u = [K_BBD  F_BBD] [y; z]. In order to determine such a control, we first define auxiliary matrices Ã, B̃, and H̃ as

Ã = [A 0; C_D 0], B̃ = [B_D; 0], H̃ = [H 0], (3.19)

where the nonlinear term enters the extended state equation through [I_n; 0], and I_n represents an n × n identity matrix. We now propose to solve LMI Problem 3.1 using matrices Ã, B̃, and H̃ instead of A, B, and H, with the following additional requirements.
Requirement 3.7. Matrix Y must have the form

Y = Y_0 + U_D Y_C U_D^T,

where

Y_0 = [Y_11 0; 0 Y_22]. (3.21)

In (3.21), Y_11 is an unknown n × n block diagonal matrix with n_i × n_i blocks, while Y_22 and Y_C are unknown q × q block diagonal matrices with q_i × q_i blocks.
Requirement 3.8. Matrix U_D is a user-defined block diagonal matrix of dimension (n + q) × q, which can be partitioned as

U_D = [U_1; U_2]. (3.22)

In (3.22), U_1 is an n × q block diagonal matrix with n_i × q_i blocks, and U_2 is a q × q block diagonal matrix with q_i × q_i blocks.
Requirement 3.9. Matrix Y_11 must satisfy the equality constraint

Y_11 C_D^T = U_1. (3.23)

As discussed earlier, this condition is automatically satisfied if U_1 is chosen as U_1 = C_D^T.

Requirement 3.10. Matrix L must have the form L = L_C U_D^T, where L_C is an m × q BBD matrix with a block structure identical to that of the matrices K_BBD and F_BBD in (3.18).
If optimization Problem 3.1 is feasible under these conditions, the closed-loop system is guaranteed to be stable for any h(x) satisfying

h^T(x) h(x) ≤ α^2 x^T H^T H x. (3.26)

Furthermore, Requirements 3.7, 3.8, 3.9, and 3.10 ensure that the product LY^{-1} can be factorized along the lines of the Sherman-Morrison construction used for Requirements 3.2-3.5, where S and R are the matrices introduced in (3.11) (now computed with the auxiliary matrices of (3.19)). Since U_1^T Y_11^{-1} = C_D by virtue of Requirement 3.9, we can partition LY^{-1} into a pair of gain matrices K_BBD and F_BBD (3.29), and it is not difficult to verify that both of these matrices have a BBD structure identical to (3.15). If we now use these matrices to form the control law proposed in (3.18), the closed-loop system (3.1) is guaranteed to be stable for all h(x) such that

‖(∂h/∂x) v‖ ≤ α ‖H v‖ for all v ∈ R^n, (3.30)

where ‖·‖ denotes the Euclidean norm. To see why this is so, we differentiate the state and output equations in (3.1). Setting v = ẋ, we obtain an extended system in v and y; with the control (3.18), the closed-loop system takes the form (3.25). Since the extended nonlinearity satisfies a quadratic bound of the form (3.2) by virtue of (3.19) and (3.30), it follows that the origin is a stable equilibrium of (3.25). This further implies that the corresponding closed-loop equilibrium of (3.1) must be stable as well. We should note that this equilibrium need not be at the origin, since lim_{t→∞} u(t) is generally nonzero for the control proposed in (3.18).
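The effect of the integral term can be seen on a scalar plant. The plant, the gains, and the augmented closed-loop matrix [A + B K C, B F; C, 0] used below are the standard textbook PI form, chosen by hand for illustration rather than taken from the paper's design procedure.

```python
import numpy as np

# Scalar plant xdot = -x + u, y = x, with PI output feedback
# u = K y + F z and zdot = y.  Gains are hypothetical.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
K = np.array([[0.0]])                  # proportional gain
F = np.array([[-1.0]])                 # integral gain
top = np.hstack([A + B @ K @ C, B @ F])
bot = np.hstack([C, np.zeros((1, 1))])
cl = np.vstack([top, bot])             # d/dt [x; z] = cl @ [x; z]
eigs = np.linalg.eigvals(cl)           # complex pair with real part -0.5
```

Because the augmented state includes the integrator z, the steady-state control need not vanish, which is precisely why the closed-loop equilibrium of the original system need not sit at the origin.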

An illustrative example
In this section, we will compare the three proposed output feedback strategies from the standpoint of robustness and computational complexity. As a typical test case, we consider a sparse system with 55 states, 8 inputs, and 8 outputs. The decomposition was performed for ε = 0.1 and ε = 0.35, and the corresponding nonzero patterns of the permuted matrix A are shown in Figures 4.1 and 4.2, respectively. It should be noted that the matrix in Figure 4.1 has two diagonal blocks of dimension 25 × 25 and a 5 × 5 border block. The blocks shown in Figure 4.2 are substantially smaller, because more elements are discarded when ε is set to 0.35: this additional sparsification reduces the dimension of all border blocks to 5 × 5, while the remaining diagonal blocks are 10 × 10. We should also note in this context that the proposed combination of BBD and epsilon decompositions allows for a straightforward adaptation of the problem to a prescribed number of processors. This property, known as scalability, is of considerable practical importance in parallel computing. In order to compare the three designs and the two decompositions, we utilized the proposed LMI optimization to obtain the appropriate output control laws, and established the maximal robustness bound α. As a measure of computational complexity, we also recorded the number of LMI variables associated with matrices L and Y (and also F in the case of PI BBD control). A summary of the obtained results is shown in Tables 4.1 and 4.2.
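Counts of this kind can be estimated directly from the block sizes. The accounting below (m_i q_i free entries per block of L_C, triangular counts for the symmetric blocks of Y_Q and Y_C) and the example partition are assumptions made for illustration; the exact bookkeeping depends on the chosen parametrization and differs between the three designs.

```python
def lmi_variable_count(blocks):
    """Rough count of free LMI variables for the decentralized design.

    `blocks` is a list of (n_i, m_i, q_i) tuples.  Each block of L_C
    contributes m_i * q_i variables; the symmetric blocks of Y_Q and
    Y_C contribute triangular counts.  This accounting is an
    illustrative assumption, not the paper's exact tally.
    """
    total = 0
    for n_i, m_i, q_i in blocks:
        total += m_i * q_i                    # free entries of an L_C block
        r = n_i - q_i
        total += r * (r + 1) // 2             # symmetric block of Y_Q
        total += q_i * (q_i + 1) // 2         # symmetric block of Y_C
    return total

# Hypothetical partition (not the one used in the tables)
count = lmi_variable_count([(10, 2, 2), (10, 2, 2), (5, 1, 1)])
```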
A comparison of Tables 4.1 and 4.2 shows that the PI BBD control for ε = 0.35 requires fewer LMI variables than any of the three designs corresponding to ε = 0.1. It is also readily observed that the degradation in α as ε increases is smallest in the PI BBD case. This suggests that PI BBD output control provides the best balance between robustness and computational complexity. Such a conclusion has been supported by a number of additional numerical experiments.

Conclusions
In this paper, we developed a new output control strategy that is suitable for large, sparse dynamic systems. The method is based on an efficient input/output-constrained decomposition that simultaneously clusters the states, inputs, and outputs. The resulting bordered block diagonal form of matrix A induces the structure of the gain matrix, which is computed using LMI optimization. The obtained feedback law can be easily mapped onto a tree-type multiprocessor architecture, which guarantees low communication overhead. A large-scale example was provided to demonstrate the validity of the proposed scheme.
