Wind Pressure Coefficients Zoning Method Based on an Unsupervised Learning Algorithm

Damage to cladding structures usually starts in the wind-sensitive parts, so the damage conditions can vary markedly between areas, especially on a large roof surface. Cladding design therefore needs to be optimized for these differences in wind load by defining more accurate wind pressure coefficient (WPC) zones according to a wind vulnerability analysis. The existing wind pressure coefficient zoning methods (WPCZM) have been used successfully to characterize simple roof shapes, but the solutions for complex and irregular roof shapes generally rely on empirical judgment, which is a weakness for wind loading analysis. In this study, a classification concept for WPC values on the roof surface is presented based on an unsupervised learning algorithm; it is not limited by the roof geometry and can realize multitype WPC zoning more accurately. As a typical unsupervised learning algorithm, an improved K-means clustering is used to develop a new WPCZM that verifies this concept. A method to determine the optimal K-value is presented that uses a K-means clustering test and clustering validity indices to overcome the difficulty of obtaining the cluster number in traditional methods. As one example, the most unfavorable pressure and suction WPC zones are studied on a flat roof structure under a single wind direction and under the full range of wind directions, based on data obtained from a wind tunnel test. As another example, the mean pressure coefficient zones are studied on a saddle roof structure under 0- and 45-degree wind directions, also based on wind tunnel test data. The proposed WPCZM is thereby illustrated and verified.


Introduction
During recent storms with strong wind conditions [1,2], cladding was found to suffer significantly more damage than the main structure. Therefore, the importance of cladding safety under wind loads has attracted increasing attention. At the same time, with economic development and improved construction quality, "beauty" has gradually gained importance in building construction. "New," "odd," and "special" shaped roofs have been widely built, and these bring new challenges to wind-resistant cladding design. Damage to cladding often begins in wind-sensitive areas such as the corners, edges, and other parts of large-scale roof structures. Different parts of the roof surface therefore have different wind vulnerability, and the roof corners and edges in particular are more sensitive to wind. To evaluate the wind loading of the cladding more objectively and rationally, different zones must be defined according to the degree of wind vulnerability so that the cladding can be designed for different wind loads.
Studies have extensively investigated wind pressure coefficient (WPC) zoning of traditional low-rise building roofs such as flat roofs, double-sloping roofs, and four-sloping roofs. Most of their results have already been incorporated into several national load codes [3][4][5]. WPC zones are generally divided into edge, corner, and central zones according to the distribution of WPCs on the roof surface. The edge zone size is defined as the distance required for the maximum negative peak or mean pressure coefficient at the leading edge to reach 70% of its total value, or the distance to the point where the flow reattaches [6][7][8]. This gives three types of WPC zones: edge, corner, and interior. The zoning method adopted in the ASCE/SEI7-10, NBCC-2005, and AIJ-2004 codes is mainly aimed at buildings with a rectangular plan and common, representative roof shapes. However, it does not provide objective WPC zoning results for irregular and complex roof shapes. For roof shapes not covered by the codes, most studies zoned the WPC subjectively according to the geometric features of the roof and the characteristics of the WPC distribution on the roof surface. Some researchers also proposed WPC zoning methods from different angles. Dong and Ye [9] proposed a WPC zoning method based on the separation bubble theory that is only applicable to the average WPC zone. The WPC zoning method proposed by Sun et al. [10] was based on the Gaussian characteristics of the WPC, and that proposed by Ke et al. [11] was based on WPC correlation. However, these WPC zoning methods are limited to a specific building geometry, a specific wind direction, or a specific WPC type. Thus, an objective and widely applicable WPC zoning method remains lacking.
To address the subjectivity, limited practicability, and limited applicability of the current wind pressure coefficient zoning method (WPCZM), this study proposes classifying WPC values on a roof surface with an unsupervised learning algorithm to realize WPC zoning of the roof cladding. Because classification is performed according to the WPC magnitude alone, the zoning is not limited by the roof geometry and objective multitype WPC zoning can be achieved. Based on this classification, the dynamic clustering theory is introduced for WPC zoning and a fast WPCZM based on K-means clustering is proposed. The K-means clustering method requires the user to provide the K-value in advance. However, in actual WPC zoning, it is often impossible to determine the number of zoning classes in advance. Thus, the current method is improved by limiting the search range of the K-value in advance and by using multiple cluster validity indicators to determine the optimal K-value. The WPCZM is then illustrated and verified by considering the most unfavorable WPC zoning on a flat roof surface for single and full wind directions as examples.

Wind Pressure Coefficient Zoning Method Based on Unsupervised Learning Algorithm
As an advanced and promising data analysis technique, machine learning offers opportunities for solving a range of problems in civil engineering [12], where it is increasingly applied [13,14]. For example, Sohn et al. [15] proposed a baseline-free damage classification for carbon fiber-reinforced polymer using cluster analysis. Supervised and unsupervised learning are the two main learning procedures. In unsupervised learning, only the input data are presented, and the learning network organizes itself internally to create categories by correlating the information available in the input data. Dynamic clustering is the most common unsupervised learning method for dividing sets of multiple objects into several classes. Cluster analysis divides data into multiple classes without using predetermined decision boundaries [16]. The dynamic clustering method developed from the static classification paradigm, in which the model remains unchanged during the recognition phase. In practice, many tasks are dynamic, and the training and recognition phases are interlinked. Dynamic clustering should therefore consider the incrementality of the learning method used to devise the clustering model and the self-adaptation of the learned model, so that dynamic aspects of the clustering model can be captured by adapting the current model. K-means is a common clustering algorithm that uses the Euclidean distance and assigns each data point to the cluster whose centroid is closest [17]. K-means clustering, originally proposed in [18], has been a key technique in the field of clustering owing to its applicability to large-scale data, its easily understood results, and its low computational requirements.
Therefore, a clustering-based WPCZM using K-means clustering is proposed to illustrate the concept of wind pressure zoning based on an unsupervised learning algorithm.
For designers, the number of WPC clusters cannot be determined accurately from the characteristics of the WPC distribution on the roof surface; only an approximate range can be given based on past experience or intuition. In this situation, the requirement of the K-means clustering algorithm that users provide the K-value in advance is a limitation. Considering the zoning requirements of WPCs for roof claddings, the construction cost of claddings should be minimized while ensuring cladding safety. Therefore, the current method is improved by limiting the search range of the K-value in advance and using cluster validity indicators (CVI) to determine the optimal K-value.
Other clustering algorithms exist, including Gaussian mixture models, subspace clustering, and self-organizing maps, which differ mainly in how they compute the distance between data samples and how they assign a data point to a specific cluster. A comparison of different clustering techniques is outside the scope of this study.

Search Range of Cluster Number.
The K-means clustering algorithm works in two parts. The first selects a K-value, where K is the number of clusters. The second considers the distance between each data point and the nearest center. The K-means algorithm therefore requires the number of clusters (K-value) in the data to be specified in advance, and the performance of the clustering is affected by the chosen K-value. However, finding an appropriate K-value is difficult and is generally a trial-and-error process in which what constitutes a "correct" clustering is decided subjectively. In this study, a method to determine the optimal K-value is proposed that uses a K-means clustering test and clustering validity indices to overcome the difficulty of obtaining the cluster number in traditional methods. The number of WPC zones affects both the accuracy with which the WPC on the roof surface of the cladding is described and the convenience of designing and constructing the cladding. Too few zones lead to an inaccurate description of the WPC on the roof surface, and too many zones make design and construction inconvenient. The maximum number of zoning classes $k_{\max}$ for WPC zoning of the roof is less than or equal to the square root of the number of measured points $n$ on the entire roof; that is, $k_{\max} \le \sqrt{n}$ [19,20]. Pal and Bezdek [19] noted that there is little theoretical justification for this choice of $k_{\max}$; however, many researchers use this rule in practice based on experience. Yang et al. [21] demonstrated the rationality of this empirical rule from different perspectives.
WPC zoning of a roof aims to create a clear boundary between zones and to fully separate zones with larger WPC values from those with smaller ones. The wind loads on these zones are assigned and designed separately to increase the convenience of design and construction and to ensure safety.
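As a small worked example of the empirical upper bound on the cluster number, the following Python sketch evaluates the largest integer $k_{\max}$ that satisfies $k_{\max} \le \sqrt{n}$ for the three measuring-point counts used in the examples of this paper (210, 120, and 265). The function name `max_cluster_number` is illustrative only and does not come from the original study.

```python
import math


def max_cluster_number(n_points: int) -> int:
    """Largest integer k satisfying k <= sqrt(n_points)."""
    return math.isqrt(n_points)


# Measuring-point counts quoted in the flat roof and saddle roof examples.
for n_points in (210, 120, 265):
    k_max = max_cluster_number(n_points)
    print(f"n = {n_points}: sqrt(n) = {math.sqrt(n_points):.2f}, "
          f"search K in [2, {k_max}]")
```

Using the integer square root avoids any rounding ambiguity when $\sqrt{n}$ is close to an integer.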

Clustering Validity Indices
This requires maximizing the distance between different classes and minimizing the distance within each class during clustering. Many clustering validity indices can satisfy these requirements. Table 1 summarizes 14 clustering validity indices that reflect intra- and interclass information and also consider the geometric characteristics, statistical characteristics, similarities, and amount of data.
In this study, the selected CVI should cover two aspects. The first is the difference between zones. The second is the internal consistency within each zone. At the same time, the number of zones should meet the engineering requirements, so that a single representative value can adequately represent the wind load in each zone. With a larger number of zones, this single value represents the wind load in each zone more accurately.
The structural design becomes safer as the number of zones increases. With fewer zones, the scatter of the wind loading within a single zone is greater. Although design and construction become more convenient, reducing the number of zones may make the design unsafe or force a more conservative value to be adopted for each zone. An optimal wind pressure zoning is therefore required. It must show obvious differences between zones and high consistency within each zone. In engineering applications, it should reduce the difficulty of design and construction while fully ensuring structural safety, and a single representative value should be able to represent the wind load in each zone. The clustering validity indices should be effective in determining this optimal wind pressure zoning.

Table 1: Clustering validity indices and the rule each uses to select the optimal number of clusters.
No. | Index | Optimal number of clusters | Reference
1 | ... | Maximum value of the index | Caliński and Harabasz [22]
2 | C index, $C_{\mathrm{index}} = (S_w - S_{\min})/(S_{\max} - S_{\min})$, $S_{\min} \ne S_{\max}$, $C_{\mathrm{index}} \in (0, 1)$ | Minimum value of the index | Hubert and Levin [23]
3 | ... | Maximum value of the index | Baker and Hubert [24]
4 | ... | Minimum value of the index | Davies and Bouldin [25]
5 | ... | Maximum difference between hierarchy levels of the index | Hartigan [26]
6 | ... | Maximum difference between hierarchy levels of the index | Scott and Symons [27]
7 | ... | ... | ...
8 | ... | Maximum value of second differences between levels of the index | ...
9 | ... | Maximum difference between hierarchy levels of the index | Friedman and Rubin [29]
10 | Rubin index, $\mathrm{Rubin} = \det(T)/\det(W_q)$ | Minimum value of second differences between levels of the index | Friedman and Rubin [29]
11 | ... | Maximum value of the index | Krzanowski and Lai [30]
12 | Silhouette index | ... | ...
13 | ... | ... | ...
14 | ... | Minimum value of the index | Halkidi et al. [33]

The variables in Table 1 are described as follows. $n$ is the number of observations, $p$ is the number of variables, and $q$ is the number of clusters.
$X = \{x_{ij}\}$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, p$, is the $n \times p$ data matrix of $p$ variables measured on $n$ independent observations, $\bar{X}$ is the $q \times p$ matrix of cluster means, and $\bar{x}$ is the centroid of the data matrix $X$. $n_k$ is the number of objects in cluster $C_k$, $c_k$ is the centroid of cluster $C_k$, and $x_i$ is the $p$-dimensional vector of observations of the $i$th object in cluster $C_k$. $B_q = \sum_{k=1}^{q} n_k (c_k - \bar{x})(c_k - \bar{x})^{T}$ is the between-group dispersion matrix for data clustered into $q$ clusters.
$N_t = n(n - 1)/2$ is the total number of pairs of observations in the data set.
$N_w = \sum_{k=1}^{q} n_k(n_k - 1)/2$ is the total number of pairs of observations belonging to the same cluster.
$N_b = N_t - N_w$ is the total number of pairs of observations belonging to different clusters.
A WPCZM based on an improved K-means clustering is proposed. The existing K-means clustering is limited in that it requires the K-value (cluster number) to be given in advance. To reduce the difficulty faced by engineers in setting the K-value, a method to determine the optimal K-value is proposed in this study. First, the maximum K-value is limited, and a K-means clustering test is performed for all values in the range $[2, k_{\max}]$. Second, the optimal K-value is obtained by applying the clustering validity indices to the test results of the first step.
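To illustrate how one of the validity indices in Table 1 can be evaluated, the sketch below computes the C index together with the pair counts $N_t$, $N_w$, and $N_b$ for scalar WPC values. It assumes the usual definitions of the three sums, which are not spelled out in the text: $S_w$ is the sum of the $N_w$ within-cluster pairwise distances, and $S_{\min}$ and $S_{\max}$ are the sums of the $N_w$ smallest and $N_w$ largest pairwise distances in the whole data set. The function name and the six sample values are hypothetical.

```python
from itertools import combinations
from math import fsum


def c_index(values, labels):
    """Return (C index, N_t, N_w, N_b) for scalar data and a cluster labelling."""
    # One entry per pair of observations: (distance, same-cluster flag).
    pairs = [(abs(values[i] - values[j]), labels[i] == labels[j])
             for i, j in combinations(range(len(values)), 2)]
    n_t = len(pairs)                                # N_t = n(n - 1)/2
    n_w = sum(1 for _, same in pairs if same)       # pairs inside one cluster
    n_b = n_t - n_w                                 # pairs across clusters
    distances = sorted(d for d, _ in pairs)
    s_w = fsum(d for d, same in pairs if same)      # within-cluster distance sum
    s_min = fsum(distances[:n_w])                   # N_w smallest distances
    s_max = fsum(distances[n_t - n_w:])             # N_w largest distances
    if s_max == s_min:
        raise ValueError("C index is undefined when S_min equals S_max")
    return (s_w - s_min) / (s_max - s_min), n_t, n_w, n_b


# Hypothetical suction coefficients forming two obvious groups.
wpc = [-2.1, -1.9, -2.0, -0.5, -0.4, -0.6]
good = c_index(wpc, [0, 0, 0, 1, 1, 1])   # labels follow the two groups
bad = c_index(wpc, [0, 1, 0, 1, 0, 1])    # labels mix the two groups
print(good, bad)
```

A labelling that follows the natural groups gives a C index near zero, whereas a labelling that mixes them gives a larger value, which is consistent with the "minimum value of the index" rule listed for this index in Table 1.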
The WPCZM based on the improved K-means clustering algorithm has the following steps.
Step 1. Input the data set of WPCs at the measuring points, $C = \{c_i \mid i = 1, 2, \ldots, n\}$, where $c_i$ is the WPC at the $i$th measuring point and $n$ is the number of measuring points.
Step 2. Determine the upper bound of the cluster number according to the empirical rule $k_{\max} \le \sqrt{n}$.
Step 3. Perform clustering for every cluster number $K \in [2, k_{\max}]$ using the K-means algorithm as follows.
① Randomly select K objects from $C$ as the initial cluster centers $m_1, m_2, \ldots, m_k$.
② Calculate the distance from the WPC of each measuring point to each cluster center.
③ Classify the WPC of each measuring point according to the principle of closest distance; that is, $c_i$ is assigned to class $j$ if $|c_i - m_j| < |c_i - m_l|$ for $j, l = 1, 2, \ldots, k$, $l \ne j$.
④ Take the mean of the WPCs in each class as the new center of that class.
⑤ According to the new central positions, recalculate the distances between the WPC of each measuring point and the new cluster centers and reclassify.
⑥ Repeat steps ④ and ⑤. The algorithm converges and the program terminates when the centers of the newly formed classes are equal to the centers of the previous classes. To prevent an infinite loop when the termination condition is never satisfied, the maximum number of iterations is set to 1,000,000.
Step 4. Calculate the 14 clustering validity indices for each cluster number K.
Step 5. Determine the optimal number of clusters indicated by each of the 14 validity indices. The most frequently repeated optimal cluster number is taken as the WPC classification number $k^*$, and the corresponding WPC classification results are recorded.
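The steps above can be sketched in a few dozen lines of plain Python. This is a minimal illustration under several assumptions, not the authors' implementation: only three validity indices (Caliński-Harabasz, Davies-Bouldin, and silhouette, in their standard textbook forms) stand in for the 14 indices of Table 1; each K is run from several random starts and the run with the smallest within-class scatter is kept, which is an addition not described in the paper; ties in the vote are resolved in favor of the smallest cluster number, as done later for the saddle roof; and all function names, the optional `init` argument, and the sample data are hypothetical.

```python
import math
import random
from collections import Counter
from statistics import fmean


def kmeans_1d(values, k, rng, init=None, max_iter=1000):
    """K-means on scalar WPC values; returns (labels, centers)."""
    # Step 3(1): random initial centers drawn from the distinct data values.
    centers = list(init) if init is not None else rng.sample(sorted(set(values)), k)
    labels = []
    for _ in range(max_iter):
        # Steps 3(2)-(3): assign each point to its closest center.
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in values]
        # Step 3(4): new center = class mean (an empty class keeps its center).
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            new_centers.append(fmean(members) if members else centers[j])
        if new_centers == centers:  # Step 3(6): centers unchanged, converged.
            break
        centers = new_centers       # Step 3(5): reclassify with new centers.
    return labels, centers


def _groups(values, labels):
    groups = {}
    for x, lab in zip(values, labels):
        groups.setdefault(lab, []).append(x)
    return list(groups.values())


def calinski_harabasz(values, labels):  # larger is better
    g, grand = _groups(values, labels), fmean(values)
    between = sum(len(c) * (fmean(c) - grand) ** 2 for c in g)
    within = sum((x - fmean(c)) ** 2 for c in g for x in c)
    if within == 0:
        return math.inf
    return (between / (len(g) - 1)) / (within / (len(values) - len(g)))


def davies_bouldin(values, labels):  # smaller is better
    g = _groups(values, labels)
    cen = [fmean(c) for c in g]
    sc = [fmean([abs(x - m) for x in c]) for c, m in zip(g, cen)]
    return fmean([max((sc[a] + sc[b]) / abs(cen[a] - cen[b])
                      for b in range(len(g)) if b != a) for a in range(len(g))])


def silhouette(values, labels):  # larger is better
    g, scores = _groups(values, labels), []
    for a_idx, own in enumerate(g):
        for x in own:
            if len(own) == 1:
                scores.append(0.0)
                continue
            a = sum(abs(x - y) for y in own) / (len(own) - 1)
            b = min(fmean([abs(x - y) for y in other])
                    for b_idx, other in enumerate(g) if b_idx != a_idx)
            scores.append((b - a) / max(a, b))
    return fmean(scores)


def zone_wpc(values, seed=0, restarts=10):
    """Steps 2-5: search K in [2, k_max] and vote on the optimal K."""
    rng = random.Random(seed)
    k_max = math.isqrt(len(values))                      # Step 2
    runs = {}
    for k in range(2, k_max + 1):                        # Step 3
        trials = [kmeans_1d(values, k, rng) for _ in range(restarts)]
        runs[k] = min(trials, key=lambda r: sum(
            (x - r[1][lab]) ** 2 for x, lab in zip(values, r[0])))
    scores = {k: (calinski_harabasz(values, lab), davies_bouldin(values, lab),
                  silhouette(values, lab)) for k, (lab, _) in runs.items()}  # Step 4
    votes = [max(scores, key=lambda k: scores[k][0]),    # Step 5: one vote per index
             min(scores, key=lambda k: scores[k][1]),
             max(scores, key=lambda k: scores[k][2])]
    counts = Counter(votes)
    top = max(counts.values())
    k_star = min(k for k, c in counts.items() if c == top)
    return k_star, runs[k_star][0], votes


# Hypothetical WPC values at 16 measuring points.
wpc = [-2.05, -2.03, -2.01, -1.99, -1.97, -1.95,
       -1.04, -1.02, -1.00, -0.98, -0.96,
       -0.24, -0.22, -0.20, -0.18, -0.16]
k_star, zone_labels, votes = zone_wpc(wpc)
print(k_star, votes)
```

Because the WPC at each measuring point is a single scalar, the Euclidean distance reduces to an absolute difference, which keeps the sketch free of external dependencies.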

Definition of Wind Pressure Coefficient (WPC)
The pressure coefficient is a dimensionless number that describes the relative pressures throughout a flow field in fluid dynamics and is widely used in aerodynamics. Every point in a fluid flow field has its own pressure coefficient.
In many situations in aerodynamics, the pressure coefficient at a point near a body is independent of body size. The pressure coefficients of an engineering model can therefore be measured in a wind tunnel at critical locations around the model and used to predict the fluid pressure at those locations on the full-size structure. The pressure coefficient is expressed as follows:

$$C_{p,i}(t) = \frac{P_i(t) - P_\infty}{\frac{1}{2}\rho v_{zT}^{2}},$$

where $P_i(t)$ is the measured pressure time series at the $i$th measurement point, $P_\infty$ is the static pressure at the reference height, $\rho$ is the air density, and $v_{zT}$ is the mean wind speed at the reference height. The mean WPC $\bar{C}_p$ can be obtained by averaging one full sample cycle of WPCs (corresponding to a length of 10 min at full scale). The peak WPC can be calculated by averaging the 10 peak values of 10 different samples:

$$\hat{C}_p = \frac{1}{N}\sum_{i=1}^{N} C_{pi},$$

where $N$ is the size of the sample and $C_{pi}$ is the extreme pressure coefficient of the $i$th sample.
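A minimal sketch of these definitions follows, assuming the usual normalization of the pressure difference by the reference dynamic pressure $\frac{1}{2}\rho v_{zT}^{2}$. The function names, the air density, the wind speed, and the pressure values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from statistics import fmean


def pressure_coefficients(pressures, p_static, rho, v_ref):
    """Convert a pressure time series (Pa) at one tap into a WPC time series."""
    q_ref = 0.5 * rho * v_ref ** 2          # reference dynamic pressure
    return [(p - p_static) / q_ref for p in pressures]


def mean_wpc(cp_series):
    """Mean WPC over one full sample cycle."""
    return fmean(cp_series)


def peak_wpc(samples, suction=True):
    """Average the extreme WPC of each sample (10 samples in this paper)."""
    extremes = [min(s) if suction else max(s) for s in samples]
    return fmean(extremes)


# Hypothetical tap record: rho = 1.25 kg/m^3 and v_ref = 10 m/s give q_ref = 62.5 Pa.
cp = pressure_coefficients([-62.5, -125.0, 0.0], p_static=0.0, rho=1.25, v_ref=10.0)
print(cp, mean_wpc(cp))
print(peak_wpc([[-1.0, -2.0], [-0.5, -3.0]], suction=True))
```

For the most unfavorable suction the minimum of each sample is averaged, and for the most unfavorable pressure the maximum is used instead.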

Examples of Simple Structures
A flat roof structure under two working conditions is used as an example to illustrate and verify this zoning method. Two types of incoming flow conditions are considered in this study: single wind direction and full wind direction. Single wind direction inflow means that the building is affected by only one wind direction. The angle between the building (rectangular plan) and the wind direction can be classified as normal or oblique. Full wind direction inflow means that the building is affected by wind from all directions (360°). The maximum or minimum value over all directions is taken as the most unfavorable pressure or suction at the measuring point.

WPC Zoning under Single Wind Direction Condition.
The maximum clustering number $k_{\max}$ is the integer part of the square root of the number of measuring points $n$ on the flat roof to be zoned. In the present case, $n = 210$ gives $\sqrt{n} = \sqrt{210} \approx 14.5$; that is, $k_{\max} = 14$. The traditional K-means algorithm is run for all cluster numbers $K \in [2, k_{\max}]$ to cluster the most unfavorable pressure (maximum WPC, positive pressure) and suction (minimum WPC, negative pressure) on the flat roof at 0° and −45°, and the clustering validity indices are calculated for each cluster number. Table 2 lists the optimal cluster numbers determined with each index. For the most unfavorable pressure at 0°, four, three, and two clustering validity indices give optimal cluster numbers of 2, 13, and 3, respectively, and the other five clustering validity indices give 4, 5, 10, 11, and 12. The value 2 is therefore repeated most often, so the most unfavorable pressure zoning at a wind direction of 0° is obtained with an optimal cluster number of 2; the corresponding WPC zones are shown in Figure 4(a). Similarly, the proposed method determines the optimal cluster numbers of the most unfavorable suction at 0°, the most unfavorable pressure at −45°, and the most unfavorable suction at −45° as 3, 3, and 5, respectively. For validation, the WPC distribution on the flat roof surface is compared with the optimal zoning results. The results show that the zoning matches the WPC distribution on the surface of the cladding structure, as shown in Figure 5.

Table 7: Optimal cluster number determined by each of the 14 clustering validity indices for the mean pressure on the saddle roof.
0°: 15, 16, 3, 3, 11, 3, 3, 11, 11, 11, 7, 2, 3, 3
45°: 14, 4, 2, 2, 3, 3, 3, 14, 14, 14, 6, 2, 2, 3

WPC Zoning under Full Wind Direction Condition.
Because the plan shape of the roof is biaxially symmetric, one-fourth of the roof would normally suffice for analysis. However, for the purposes of verification and comparison, the right half of the roof surface is used for WPC zoning. For $n = 120$ measuring points on the right half surface (including the midline) of the flat roof, $\sqrt{n} = \sqrt{120} \approx 10.95$; that is, $k_{\max} = 10$. The traditional K-means algorithm is run for all cluster numbers $K \in [2, k_{\max}]$ to cluster the most unfavorable pressure and suction for the full wind direction. The clustering validity indices are calculated for each cluster number, as shown in Tables 3 and 4. According to Table 1, the optimal number of clusters determined by each index is shown in Table 5.
As shown in Table 3, for the most unfavorable pressure in the full wind direction, four clustering validity indices give an optimal cluster number of 10, three each give 3 and 7, two give 8, and the other two give 5 and 6, respectively. The value 10 is thus determined by four validity indices and has the highest number of repetitions. However, because the distribution of the most unfavorable pressure has a gentle gradient, three zoning classes, which have the second highest number of repetitions, are selected for convenience of design and construction. Figure 6(a) shows the corresponding optimal zoning. The proposed method determines the optimal cluster number for the most unfavorable suction in the full wind direction as 4, as shown in Figure 6(b). To verify the effectiveness of the WPC zoning, the WPC contour map and the optimal zoning on the right side of the roof are compared. The results show that the zoning matches the WPC distribution, as shown in Figure 7.
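The vote count for the most unfavorable pressure under the full wind direction can be tabulated as in the sketch below. The counts are those stated in the text; the order in which the individual indices voted is not given, so the order of the list is a placeholder, and the function name is hypothetical.

```python
from collections import Counter

# Optimal K reported by the 14 validity indices (order is a placeholder).
votes = [10] * 4 + [3] * 3 + [7] * 3 + [8] * 2 + [5, 6]


def rank_candidates(per_index_optimal_k):
    """Sort candidate cluster numbers by votes (descending), then by K (ascending)."""
    counts = Counter(per_index_optimal_k)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_candidates(votes)
for k, n_votes in ranking:
    print(f"K = {k:2d}: {n_votes} of {len(votes)} indices")
```

Such a ranked list makes explicit that K = 10 leads the vote and that K = 3 is the smallest of the runner-up candidates, which is the value adopted here on engineering grounds.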

Wind Tunnel Tests.
The test on a saddle roof structure was conducted in a wind tunnel at Beijing Jiaotong University. Figure 8 shows the mean wind speed and turbulence intensity profiles. The saddle roof model measures 60 cm × 60 cm in plan with a lowest height (H) of 20 cm, and the rise-span ratio is 1/8. There are 265 pressure taps on the top of the saddle roof model, as shown in Figure 9. The geometric scale, speed scale, and time scale of the wind tunnel test are 1/100, 3.125/10, and 1/31.25, respectively. The sampling frequency is 312.5 Hz, and the sampling time corresponding to 10 min at full scale is 19.2 s.
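The quoted scales are mutually consistent, as the following arithmetic sketch shows. It relies on the standard similarity relation that the time scale equals the geometric scale divided by the speed scale; the variable names are illustrative, and the number of samples per record is derived here rather than stated in the text.

```python
from fractions import Fraction

length_scale = Fraction(1, 100)            # geometric scale 1/100
velocity_scale = Fraction(3125, 10000)     # speed scale 3.125/10
time_scale = length_scale / velocity_scale # model time / full-scale time

full_scale_duration_s = 600                # 10 min at full scale
model_duration_s = full_scale_duration_s * time_scale

sampling_rate_hz = Fraction(3125, 10)      # 312.5 Hz
samples_per_record = model_duration_s * sampling_rate_hz

print(f"time scale = 1/{float(1 / time_scale)}")
print(f"model sampling time = {float(model_duration_s)} s")
print(f"samples per record = {int(samples_per_record)}")
```

Exact fractions are used so that the comparison with the quoted values 1/31.25 and 19.2 s is not affected by floating-point rounding.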

Mean WPC Zoning.
The maximum clustering number $k_{\max}$ is the integer part of the square root of the number of measuring points $n$ on the saddle roof to be zoned. In the present case, $n = 265$ gives $\sqrt{n} = \sqrt{265} \approx 16.28$; that is, $k_{\max} = 16$. The traditional K-means algorithm is run for all cluster numbers $K \in [2, 16]$ to cluster the mean pressure on the saddle roof at 0° and 45°, and the clustering validity indices calculated for each cluster number are shown in Table 6. According to Table 1, the optimal number of clusters determined by each index is shown in Table 7.
For the mean pressure at 0°, six and four clustering validity indices give optimal cluster numbers of 3 and 11, respectively, and the other four clustering validity indices give 2, 7, 15, and 16. The optimal cluster number 3 is therefore repeated most often, so the mean pressure zoning at a wind direction of 0° is obtained with an optimal cluster number of 3; the corresponding WPC zones are shown in Figure 10(a). For the mean pressure at 45°, the optimal cluster numbers 2, 3, and 14 are each given by four clustering validity indices, and the other two clustering validity indices give 4 and 6. Because 2, 3, and 14 have the same number of repetitions, the smallest of them, 2, is selected. The mean pressure zoning at a wind direction of 45° is thus obtained with an optimal cluster number of 2, and the corresponding WPC zones are shown in Figure 10(b).
For validation, the WPC distribution on the saddle roof surface is compared with the optimal zoning results. The results show that the zoning matches the WPC distribution on the surface of the saddle roof, as shown in Figure 10.
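The selection rule used for the saddle roof, the most frequent optimal cluster number with ties resolved toward the smallest value, can be written as a short function. The two lists are the per-index optimal cluster numbers for the 0° and 45° mean pressure cases as they survive in the extracted table data of this paper; the function name is hypothetical.

```python
from collections import Counter


def vote_optimal_k(per_index_optimal_k):
    """Most frequent optimal K; ties are resolved toward the smallest K."""
    counts = Counter(per_index_optimal_k)
    top = max(counts.values())
    return min(k for k, c in counts.items() if c == top)


# Optimal K per validity index for the saddle roof mean pressure.
row_0deg = [15, 16, 3, 3, 11, 3, 3, 11, 11, 11, 7, 2, 3, 3]
row_45deg = [14, 4, 2, 2, 3, 3, 3, 14, 14, 14, 6, 2, 2, 3]

print(vote_optimal_k(row_0deg), vote_optimal_k(row_45deg))
```

Choosing the smallest tied value favors fewer zones, which matches the stated preference for convenient design and construction.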

Conclusions
This study presents a new concept for classifying WPC values on a roof surface based on an unsupervised learning algorithm. The dynamic clustering theory is introduced for WPC zoning, and a fast WPCZM based on K-means clustering is proposed. Because the proposed classification is based only on the WPC magnitudes, the zoning process is not limited by the roof geometry and objective multitype WPC zoning can be achieved. The proposed WPCZM is illustrated and verified, and the following conclusions can be drawn from this study.
(1) Classifying WPC values with an unsupervised learning algorithm to achieve WPC zoning provides an effective way to overcome the strong subjectivity, weak practicability, and limited scope of the existing WPCZM.
(2) Based on the improved K-means clustering method, the WPCZM can realize multitype WPC zoning more accurately and adequately for different geometric structures, different flow field characteristics, and different types of WPCs.
(3) The layout of the wind tunnel test points may have some influence on the zoning results. It is suggested that the pressure measuring points be laid out as evenly as possible when using this method.
(4) Although the proposed WPCZM was developed on a large-area flat roof and a saddle roof, it is applicable not only to roof claddings but also to other similar structures.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.