Effects of Regional Trade Agreement to Local and Global Trade Purity Relationships

Abstract. In contrast to the rapid integration of the world economy, many regional trade agreements (RTAs) have also emerged since the early 1990s. This seeming contradiction has encouraged scholars and policy makers to explore the true effects of RTAs on both regional and global trade relationships. This paper defines synthesized trade resistance and decomposes it into natural and artificial factors. We separate out the influence of geographical distance, economic volume, and overall increases in transportation and labor costs, and use the expectation maximization algorithm to optimize the parameters and quantify the trade purity indicator, which describes the true global trade environment and relationships among countries. The results indicate that although global and most regional trade relations gradually deteriorated during the period 2007-2017, RTAs bring closer trade relations among members, especially contributing to the relative prosperity of EU and NAFTA countries. In addition, we apply network methods to reflect the purity of the trade relations among countries. The effects of RTAs can be analyzed by comparing typical trade unions with trade communities, which are obtained from an empirical network structure. This analysis shows that the community structure is quite consistent with some trade unions, and the representative RTAs constitute the core structure of the international trade network. However, the role of trade unions has weakened, and multilateral trade liberalization has accelerated in the past decade. This means that more countries have recently tended to expand their trading partners outside of these unions rather than limit their trading activities to RTAs.

Introduction
As of 2020, the World Trade Organization (WTO) had 164 members representing 98 percent of world trade. However, alongside this extensive multilateral trading system, the world has also witnessed an unprecedented proliferation of regional trade agreements (RTAs) since the 1990s [8]. In 2013, 546 notifications of RTAs were received by the General Agreement on Tariffs and Trade (GATT)/WTO [19]. The role of RTAs raises questions among scholars and policy makers: what drives an increasing number of countries to join regional trade unions, and how will this affect regional trade patterns and globalization processes? Trade creation and trade diversion have been proposed to describe the effects of RTAs [14,35]. Trade creation refers to new trade arising between member countries due to the reduction of tariffs, while trade diversion means that imports from a low-cost outsider country are replaced by imports from a higher-cost member country because of the RTA [59]. Some have advocated for RTAs by arguing that, unlike multilateral trade liberalization, they promote "deeper" integration [9].
Despite the controversy in the literature, previous studies usually focus on the influence of RTAs on countries in given regions rather than on quantitative analyses at a global scale. A common approach is to operationalize RTA membership as a categorical independent variable and analyze the influence of trade unions on bilateral trade using a gravity model [57,50,29,39]. However, the roles of RTAs in regional and global trade differ, as the definitions of trade creation and trade diversion already suggest, so studying the two levels separately gives an incomplete picture. International trade is a complex system with global characteristics and regional structures, and the effects of RTAs should be analyzed in both regional and global trade environments. This calls for quantitative models and network methods that treat global trade as a whole, because the influence of other countries should not be ignored when discussing the trade flow between any two countries.
RTAs are usually signed between neighboring countries, so their effects on regional trade are coupled with geographical distance and other factors. The innovation of this paper is to study and describe the trade purity relationships of countries after separating out other typical factors, such as economic volume, geographical distance, and overall increases in transportation and labor costs. In contrast to the existing literature, which keeps adding observable variables to quantify trade costs [20,55,41], we define synthesized trade resistance [2], decompose it into natural and artificial factors, and propose a trade purity indicator (TPI) to describe the true trade environment and relationships between countries. The role of RTAs can be studied by comparing the TPI and its evolution within and outside a trade union. We apply the expectation maximization (EM) algorithm to optimize the parameters and quantify the TPI. Compared with the exogenous parameter estimation in the existing research on trade cost quantification [5,34,15,18], this method estimates the parameters from the data themselves, and it can be extended to discuss the effects of RTAs on a large number of countries around the world.
Furthermore, international trade is a system that involves numerous countries and trade relations, and complex network modeling has the advantage of analyzing a number of entities and complex relationships [54,60,62]. Additionally, network theory can also facilitate the examination of both local and global properties [63], which is consistent with the goal of our work. However, trade flows are a direct result of trade openness, and related studies usually apply trade flows to weight the network [54,50]. Since trade flows could be influenced by a country's economic volume, geographical factors and artificial barriers, we prefer trade resistance, which removes the impact of the economy, to reflect the purity of the trade relationship between countries. In addition, communities in the international trade network are represented by clusters of countries where trade relations between countries in the same community are closer than those in different communities [50]. Therefore, comparing the members of typical trade unions and trade communities in the global trade network could facilitate research on the effectiveness of RTAs.
The paper is organized as follows. Section 2 briefly describes the data sources and the gravity model with synthesized trade resistance, and establishes a maximum likelihood function to simultaneously estimate the unobserved parameters and quantify the trade purity indicator. Section 3 presents the results. We focus on six typical RTAs: the Belt and Road Initiative (BRI), the European Union (EU), the North American Free Trade Agreement (NAFTA), the Organisation of African Unity (OAU), the Caribbean Free Trade Association (CARIFTA), and the Association of Southeast Asian Nations (ASEAN). We discuss the evolution of the TPI at both the regional and global trade levels and analyze the effects of RTAs during the period 2007-2017. In addition, we discuss the evolution of trade communities based on network methods. The results show that the representative RTAs constitute the core structure of the international trade network, but the role of trade unions has weakened and multilateral trade liberalization has accelerated in the past decade. Finally, Section 4 provides the conclusion and discussion.

Data Source
In this paper, we use trade data from the UN Comtrade Database, which includes 198 countries/districts. We choose the "Goods" type of product and use the annual total of all HS (Harmonized Commodity Description and Coding System) commodities. Because of differences in timing and statistical standards, the flow data reported by the importer and the exporter are not always the same. We use the importer's report and supplement it with the exporter's report when the data are missing.
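The importer-first rule described above can be sketched in a few lines. This is a minimal illustration, assuming flows are stored in dictionaries keyed by (exporter, importer); the function name, the data layout, and the sample values are hypothetical and are not taken from the UN Comtrade interface.

```python
def merge_trade_reports(importer_reported, exporter_reported):
    """Prefer the importer's figure for each (exporter, importer) pair and
    fall back to the exporter's figure only when the importer's is missing."""
    merged = {}
    for pair in set(importer_reported) | set(exporter_reported):
        value = importer_reported.get(pair)
        if value is None:  # a missing key and an explicit None both count as missing
            value = exporter_reported.get(pair)
        if value is not None:
            merged[pair] = value
    return merged


# Illustrative values only (not real trade data).
importer_side = {("CHN", "USA"): 500.0, ("DEU", "FRA"): None}
exporter_side = {("CHN", "USA"): 480.0, ("DEU", "FRA"): 70.0, ("BRA", "ARG"): 20.0}
flows = merge_trade_reports(importer_side, exporter_side)
```

Keeping the rule in one function makes it easy to check how many flows rely on the exporter's figure.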
For GDP (current US$), we use the World Bank national accounts data and OECD National Accounts data files. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources. There are several methods for calculating geographical distance. As some countries have many import and export ports, we do not use the coordinates of the capital; instead, we use the mean longitude and latitude of each country to calculate the distance. A full description of the data sources is provided in Table 1.
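As an illustration of the distance calculation, the sketch below computes a great-circle distance between two mean positions with the haversine formula. The text does not state which distance formula is used, so the haversine formula, the Earth radius of 6371 km, and the function name are all assumptions.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = math.sin(d_phi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the valid asin range.
    return 2.0 * radius_km * math.asin(min(1.0, math.sqrt(h)))
```

The inputs would be the mean latitude and longitude of each country rather than the coordinates of its capital.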

Quantifying Trade Resistance with a Gravity Model
The gravity model is one of the most successful empirical methods in the field of social science [2]. Specifically, Isard and Tinbergen were pioneers in applying the gravity model to describe the patterns of bilateral aggregate trade flows among countries [36,56]. Their work spawned a vast empirical literature that appears to perform well at modeling trade flows and exploring the factors influencing them [2,33,40], as 80-90% of the variation in the flows can be captured by the fitted relationship [1].
Scholars have introduced possible explanatory variables and performed regressions with panel data to confirm whether trade growth or loss is more significant [59,17,14,4]. However, it is impossible to include all the relevant factors, so the estimation of effects might be biased and inconsistent due to omitted variables, with the possibility of significant over- or underestimation [50].
In Tinbergen's gravity model, distance $d_{i,j}$ is not limited to geographical distance; it can be broadly construed to include all factors that might create trade resistance [56,40]. More recently, some papers have estimated synthesized trade costs or resistance from the observed pattern of production and trade across countries [16,48,6] and performed analyses based on quantified trade costs.
Based on the defined trade resistance $r_{i,j}$, the improved model used in this paper is given by equation 1, where $F_{i,j}$ is the trade flow between countries $i$ and $j$; $m_i$ and $m_j$ are the gross domestic products of countries $i$ and $j$; $r_{i,j}$ is a defined composite variable; $\alpha$ is the parameter to be estimated with the expectation maximization algorithm as the latent parameter in section 2.3; and $\varepsilon_{ij}$ is the error term. If we consider $r_{i,j}$ to be symmetric, the mechanism described in equation 1 is similar to Anderson's structural gravity model [3,5] but with a simpler expression. Here, $r_{i,j}$ represents trade resistance, a composite of all the factors other than the countries' GDPs that affect trade volumes. Equation 1 indicates that the trade amount $F_{i,j}$ is proportional to $m_i$ and $m_j$ but inversely proportional to the integrated effective distance between them, denoted $r_{i,j}$.
In contrast to the traditional gravity model, the geographical distance $d_{i,j}$ between two countries is replaced with the trade resistance $r_{i,j}$. The new model captures not only geographical proximity but also the true and comprehensive relationships between entities in the system, which is significant for understanding the global economy, politics and culture [58].
In the literature, the trade cost measure can be derived from a broader range of models [3], which differ in their methods and results for parameter estimation, such as the elasticity of substitution $\sigma$ [5], the Fréchet parameter $\vartheta$ [22], and the Pareto parameter $\gamma$ [34,15,18]. With the estimated parameters and the observed trade flow $F_{i,j}$, $m_i$ and $m_j$, the symmetrical trade resistance can be obtained from equation 1 using the least squares method.
However, exogenous parameter estimation introduces unnecessary errors and raises doubts about validity. Further analysis of trade resistance inevitably involves the estimation of latent variables or parameters, and here we use the EM algorithm from machine learning. In addition, bilateral trade data contain many zero values, a problem that has long puzzled researchers [37,51,28,24]. We use the pseudo maximum likelihood (PML) method to preprocess the zero-value flows; for details, see Appendix B.
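The displayed form of equation 1 can be sketched from the description above. The version below is one reading and not necessarily the authors' exact specification: it assumes that $\alpha$ enters as the exponent on resistance and that the error term is multiplicative, neither of which is stated explicitly in the text.

```latex
F_{i,j} = \frac{m_i \, m_j}{r_{i,j}^{\alpha}} \, \varepsilon_{ij}
\quad\Longleftrightarrow\quad
\ln F_{i,j} = \ln m_i + \ln m_j - \alpha \ln r_{i,j} + \ln \varepsilon_{ij}
```

Under this reading, a larger resistance lowers the expected flow for fixed GDPs, which matches the stated proportionality, and the logarithmic form is linear in $\ln r_{i,j}$.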
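To make the least squares step concrete, the sketch below backs out a symmetric log resistance from the two directed flows of a country pair. It assumes the gravity form $F_{i,j} = m_i m_j / r_{i,j}^{\alpha}$ with a multiplicative error, which is an assumed reading of equation 1; the function name and the default value of `alpha` are hypothetical.

```python
import math


def symmetric_log_resistance(f_ij, f_ji, m_i, m_j, alpha=1.0):
    """Return ln r that minimizes the squared log residuals of both directed
    flows under the assumed form F = m_i * m_j / r**alpha."""
    # With one shared r for both directions, the least squares solution in
    # logs uses the mean of the two log flows.
    mean_log_flow = 0.5 * (math.log(f_ij) + math.log(f_ji))
    return (math.log(m_i) + math.log(m_j) - mean_log_flow) / alpha
```

Zero flows cannot be passed to this function because of the logarithm, which is why a separate treatment of zero-value flows is needed before this step.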

Decomposing Trade Resistance through the Expectation Maximization Algorithm (EM)
For each pair of countries $i$ and $j$, trade resistance $r_{i,j}$ is quantified by equation 1, and we assume that trade resistance can be separated into two components. The data $R = \{\ln r_{1,2}, \ldots, \ln r_{i,j}, \ldots\}$ can be divided into two categories: category I is mainly related to natural factors such as geographical distance $d_{i,j}$, and category II is affected more by artificial barriers than by natural factors.
Here, $a$ and $b$ are constants, and $\eta_{i,j}$ and $\xi_{i,j}$ are normally distributed random variables with different means and standard deviations, $\eta_{i,j} \sim N(0, \sigma_1)$ and $\xi_{i,j} \sim N(\mu, \sigma_2)$. How should one estimate the parameters $\Theta = \{\mu, \sigma_1, \sigma_2, a, b\}$ based on the observed data $R$ and place each $\ln r_{i,j}$ into the appropriate category?
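One way to write this two-category assumption is as a two-component mixture for $\ln r_{i,j}$. The sketch below is an assumed reading: it takes category I to be log-linear in distance and category II to be independent of distance, and it introduces a mixing weight $\pi_1$ (a hypothetical symbol that does not appear in the text) for the share of pairs in category I. $\phi(\cdot \mid m, s)$ denotes the normal density with mean $m$ and standard deviation $s$.

```latex
\begin{aligned}
\text{Category I:}\quad  & \ln r_{i,j} = a + b \ln d_{i,j} + \eta_{i,j}, \\
\text{Category II:}\quad & \ln r_{i,j} = \xi_{i,j}, \\
p(\ln r_{i,j}) &= \pi_1 \, \phi\!\left(\ln r_{i,j} \mid a + b \ln d_{i,j},\ \sigma_1\right)
  + (1 - \pi_1) \, \phi\!\left(\ln r_{i,j} \mid \mu,\ \sigma_2\right)
\end{aligned}
```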
To solve the parameter problem of two mixed distributions, we apply a commonly used method, namely, the EM algorithm. In statistics, the EM algorithm is an iterative method to find the maximum likelihood or maximum a posteriori (MAP) estimates of the parameters in statistical models, where the algorithm depends on unobserved latent variables [21,38,45,32].
The EM algorithm seeks to obtain the maximum likelihood estimate (MLE) of the marginal likelihood by iteratively applying the expectation step (E step) and the maximization step (M step), with $t = 1, 2, \ldots$ indexing the iterations. The detailed process is as follows:
1. Expectation step (E step): In iteration $t$, based on the previous parameter estimate $\hat{\Theta}^{(t-1)}$, calculate the expected probability $\tau^{(t)}$ that each observation $\ln r_{i,j}$ belongs to category I and, separately, to category II.
2. Maximization step (M step): Based on the $\tau^{(t)}$ obtained from the E step, find the parameter estimate $\hat{\Theta}^{(t)}$ that maximizes the likelihood. The likelihood function $L$ of $R$ weights each trade resistance by its expected category probabilities, and the updated estimates follow from maximizing that function.
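The two steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: category I is taken as $\ln r = a + b \ln d + \eta$, category II as $\ln r = \xi$, and a mixing weight `pi1` is added as a hypothetical extra parameter. The function names, the starting values, and the fixed iteration count are also assumptions.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_categories(log_d, log_r, n_iter=200, min_sd=1e-6):
    """Fit the assumed two-category mixture. Returns (params, tau), where
    tau[k] is the probability from the final E step that pair k is in category I."""
    n = len(log_r)
    mean_y = sum(log_r) / n
    sd_y = max(math.sqrt(sum((y - mean_y) ** 2 for y in log_r) / n), min_sd)
    # Crude start: a flat line for category I and a higher mean for category II.
    p = {"a": mean_y, "b": 0.0, "sigma1": sd_y,
         "mu": mean_y + sd_y, "sigma2": sd_y, "pi1": 0.5}
    tau = [0.5] * n
    for _ in range(n_iter):
        # E step: posterior probability of category I for every observation.
        for k in range(n):
            l1 = p["pi1"] * normal_pdf(log_r[k], p["a"] + p["b"] * log_d[k], p["sigma1"])
            l2 = (1.0 - p["pi1"]) * normal_pdf(log_r[k], p["mu"], p["sigma2"])
            tau[k] = l1 / (l1 + l2) if l1 + l2 > 0.0 else 0.5
        # M step: weighted least squares for category I, weighted moments for II.
        w1 = max(sum(tau), 1e-12)
        w2 = max(n - sum(tau), 1e-12)
        x_bar = sum(t * x for t, x in zip(tau, log_d)) / w1
        y_bar = sum(t * y for t, y in zip(tau, log_r)) / w1
        sxx = sum(t * (x - x_bar) ** 2 for t, x in zip(tau, log_d))
        sxy = sum(t * (x - x_bar) * (y - y_bar) for t, x, y in zip(tau, log_d, log_r))
        b = sxy / sxx if sxx > 0.0 else 0.0
        a = y_bar - b * x_bar
        var1 = sum(t * (y - a - b * x) ** 2 for t, x, y in zip(tau, log_d, log_r)) / w1
        mu = sum((1.0 - t) * y for t, y in zip(tau, log_r)) / w2
        var2 = sum((1.0 - t) * (y - mu) ** 2 for t, y in zip(tau, log_r)) / w2
        p = {"a": a, "b": b, "sigma1": max(math.sqrt(var1), min_sd),
             "mu": mu, "sigma2": max(math.sqrt(var2), min_sd),
             "pi1": min(max(w1 / n, 1e-9), 1.0 - 1e-9)}
    return p, tau
```

A production version would monitor the change in log-likelihood between iterations instead of running a fixed number of steps, and would try several starting points because EM can stop at a local optimum.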

Exploring Community Evolution based on the Extracted Backbone Trade Network
Here, we regard countries as the nodes, and the relationship between two nodes is described by an edge. The reciprocal of trade resistance is the weight of the edge. Since trade resistance is symmetric for each country pair $(i, j)$, the network is also symmetric. For node $i$, the node cluster coefficient $C_i$ is calculated by the equation below [31]:
$$C_i = \frac{2 e_i}{k_i (k_i - 1)}$$
where $e_i$ is the number of edges among the nodes adjacent to node $i$ and $k_i$ denotes the number of nodes that are adjacent to node $i$. The cluster coefficient of the network is the mean of the cluster coefficients of all nodes.
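The cluster coefficient can be computed directly from an adjacency structure. The sketch below assumes the standard unweighted form $C_i = 2 e_i / (k_i (k_i - 1))$ and an undirected network stored as a dictionary of neighbour sets; the function names and the convention of returning 0 for nodes with fewer than two neighbours are assumptions.

```python
def node_clustering(adj, i):
    """Cluster coefficient of node i; adj maps each node to its set of neighbours."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here: undefined cases count as zero
    # Count each edge between two neighbours of i exactly once.
    e = sum(1 for u in neighbours for v in neighbours if u < v and v in adj[u])
    return 2.0 * e / (k * (k - 1))


def network_clustering(adj):
    """Mean of the node cluster coefficients over all nodes."""
    return sum(node_clustering(adj, i) for i in adj) / len(adj)


# A triangle A-B-C with one pendant node D attached to A.
example = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
```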
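To make the community classification more efficient, we apply the disparity filter method to obtain a backbone network [52]. An edge is kept in the backbone when
$$\alpha_{ij} = 1 - (k - 1) \int_0^{p_{ij}} (1 - x)^{k-2} \, dx < \alpha_s$$
where $\alpha_{ij}$ is the probability, under the null hypothesis, of an edge between nodes $i$ and $j$ carrying a normalized weight of at least $p_{ij}$; $k$ is the degree of the given node; $p_{ij}$ is the normalized weight of the edge; and $\alpha_s$ is the significance level for the null hypothesis.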
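A compact sketch of the disparity filter follows. It uses the closed form of the integral, $\alpha_{ij} = (1 - p_{ij})^{k-1}$, and keeps an edge when it is significant for at least one of its endpoints. The data layout (one dictionary entry per undirected edge), the function name, the default significance level of 0.05, and the choice to skip degree-one endpoints are assumptions for illustration.

```python
def disparity_backbone(weights, alpha_s=0.05):
    """weights: {(i, j): w} with one entry per undirected edge.
    Returns the set of edges that pass the disparity filter."""
    strength, degree = {}, {}
    for (i, j), w in weights.items():
        for node in (i, j):
            strength[node] = strength.get(node, 0.0) + w
            degree[node] = degree.get(node, 0) + 1
    kept = set()
    for (i, j), w in weights.items():
        for node in (i, j):
            k = degree[node]
            if k < 2:
                continue  # the test is not informative for degree-one nodes
            p = w / strength[node]  # normalized weight seen from this endpoint
            if (1.0 - p) ** (k - 1) < alpha_s:
                kept.add((i, j))
                break
    return kept


# A hub with one dominant edge and four weak ones (illustrative weights).
star = {("H", "A"): 100.0, ("H", "B"): 1.0, ("H", "C"): 1.0,
        ("H", "D"): 1.0, ("H", "E"): 1.0}
```

In this toy network only the dominant edge survives, which is the intended behaviour: the filter keeps edges that carry a disproportionate share of a node's total weight.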
After extracting the backbone network, we classify it into several communities with the Louvain community detection algorithm [10] and evaluate the result using the modularity index $Q$ [47]:
$$Q = \frac{1}{2m} \sum_{i,j} \left[ w_{ij} - \frac{A_i A_j}{2m} \right] \delta(c_i, c_j)$$
where $w_{ij}$ is the weight of the edge between nodes $i$ and $j$, $A_i = \sum_j w_{ij}$ is the sum of the weights of the edges attached to node $i$, $c_i$ is the community to which node $i$ belongs, and $\delta(c_i, c_j)$ is 1 if $c_i = c_j$ and 0 otherwise. $m = \frac{1}{2} \sum_{i,j} w_{ij}$ is the sum of the edge weights. Based on the quantified trade resistances during the period 2007-2017, we can construct the backbone network of global trade for each year and attempt to explore the community classification of the network.
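For a given partition, the modularity $Q$ can be evaluated without a graph library. The sketch below assumes the standard weighted form $Q = \frac{1}{2m} \sum_{i,j} [w_{ij} - A_i A_j / (2m)] \delta(c_i, c_j)$, an undirected network without self-loops, and one dictionary entry per edge; it only scores a partition and does not perform the Louvain search itself. The function name and the sample network are hypothetical.

```python
def modularity(weights, community):
    """weights: {(i, j): w}, one entry per undirected edge, no self-loops.
    community: {node: label}. Returns the weighted modularity Q."""
    m = float(sum(weights.values()))  # equals (1/2) * sum over ordered pairs
    strength = {}
    for (i, j), w in weights.items():
        strength[i] = strength.get(i, 0.0) + w
        strength[j] = strength.get(j, 0.0) + w
    # Weight of edges whose two endpoints share a community.
    internal = sum(w for (i, j), w in weights.items() if community[i] == community[j])
    community_strength = {}
    for node, a in strength.items():
        label = community[node]
        community_strength[label] = community_strength.get(label, 0.0) + a
    # Sum over same-community ordered pairs of A_i * A_j, divided by 2m.
    null_term = sum(a * a for a in community_strength.values()) / (2.0 * m)
    # Each undirected internal edge appears twice in the ordered-pair sum.
    return (2.0 * internal - null_term) / (2.0 * m)


# Two triangles joined by a single bridge, all weights equal to 1.
edges = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 1.0,
         (4, 5): 1.0, (5, 6): 1.0, (4, 6): 1.0, (3, 4): 1.0}
labels = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
```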

Results and Discussion
Alienation of Global Trade Relationships

Trade Purity Indicator for Countries.
Based on the extended gravity model, we can quantify the international trade resistance $r_{i,j}$ for 198 entities (Figure 1). We suppose that most trade resistance can be divided into two categories. The first has low expected barriers, which are mainly related to natural factors such as geographical distance; the second includes country pairs with relatively high artificial trade barriers, such as trade restrictions, border blockades, cultural differences and political policies. Figure 1 shows that most of the trade relations of the United States (red dot) and China (green dot) with other countries belong to the first category; that is, most of the trade resistances are positively related to geographical distance, so they are concentrated near the blue dotted line (Figure 1 (a,c,e)). For the United States and China, only a small number of bilateral trade relations are affected by more artificial barriers.
Using the EM algorithm and the defined latent parameter $\theta = [a, b, \mu, \sigma_1, \sigma_2]$, we can fit the distribution of trade resistance $r_{i,j}$ well and obtain the characteristics of the two categories [38]. The fits of the distribution for 2007, 2012 and 2017 all pass the Kolmogorov-Smirnov test, and the parameters converge efficiently to the optimal values (Figure 1 (b,d,f)), which supports our hypothesis of two categories of trade relations.
Here, the trade resistance of each pair has a probability of belonging to the limited trade resistance group (natural barriers, or category I). For each country $i$, we define the trade purity indicator $TPI_i$ by summing the probability that its trade relation $r_{i,j}$ belongs to category I and dividing by the number of countries:
$$TPI_i = \frac{1}{N} \sum_j P(z_{ij} = 1 \mid \theta)$$
where $N$ is the number of countries, and $z_{ij}$ equals 1 when the trade relation between $i$ and $j$ belongs to category I and 0 otherwise. The TPI provides a quantitative measure of the openness of a country's trade environment. Considering that global trade resistance could be affected by the growth of transportation costs or other factors, we also analyze the trend of the trade purity indicator from a more rigorous perspective. The distribution of the trade purity indicator (Figure 2(b)) also indicates the alienation of the global trade network: the mean TPI decreased from 2007 to 2017.
The alienation of global trade is thought-provoking. In recent years, some scholars have highlighted this trend in international trade [13,46]. To protect trade interests, some countries seek to maintain friendly regional trade relations by signing trade agreements and creating trade unions. Since the 1990s, RTAs have proliferated, including regional unions with members that are geographically near one another (e.g., EU, NAFTA) and countries or regional blocs with diverse and geographically distant partners (e.g., ASEAN and BRI) [27,7]. The impact of RTAs has always interested politicians and scholars. Can RTAs adapt to such an international trade environment? Why might a government be willing to compromise its sovereignty and sign an agreement? The answer is interdependence. Based on the quantified TPI, we attempt to analyze the effects of regional trade unions in the following sections.
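As a concrete illustration of the TPI defined above, the sketch below averages category I probabilities per country. It assumes the probabilities are stored once per unordered country pair and that $N$ is passed in explicitly; the function name, the data layout, and the sample probabilities are hypothetical.

```python
def trade_purity_indicator(prob_category_one, n_countries):
    """prob_category_one: {(i, j): P(z_ij = 1)}, one entry per unordered pair.
    Returns {country: TPI}, the sum of its pair probabilities divided by N."""
    totals = {}
    for (i, j), prob in prob_category_one.items():
        # A symmetric relation contributes to the indicator of both countries.
        totals[i] = totals.get(i, 0.0) + prob
        totals[j] = totals.get(j, 0.0) + prob
    return {country: total / n_countries for country, total in totals.items()}


# Illustrative probabilities for three countries (not estimated values).
probs = {("A", "B"): 0.9, ("A", "C"): 0.3, ("B", "C"): 0.6}
tpi = trade_purity_indicator(probs, n_countries=3)
```

A TPI within a union and a TPI beyond it can be obtained the same way by restricting the pairs to members or to non-members before calling the function.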

Effects of Regional Trade Agreements (RTAs)
The policies imposed by any government could affect the wellbeing not only of its own citizens but also those in other countries. Trade creation and trade diversion are common effects of RTAs identified in the recent literature [23,49], and in empirical work, their mixed effects are more complex; the results are difficult to quantify [11,42]. This paper attempts to describe the effects of RTAs on both global and local trade relationships through a quantitative model and empirical analysis.

Relatively Closer Trade Relationships between Union Members.
Here, we analyze six typical RTAs, including the one among the 28 EU countries. As shown in Table 2, the average trade resistance between member countries is lower than that with countries outside the unions. This demonstrates that the member countries of a union generally have closer trade relations with one another. In addition, Figure 3 plots the TPI within the union on the x-axis and the TPI beyond the union on the y-axis. The size of each dot increases with proximity to the present (the TPI in 2007 has the smallest radius, and the TPI in 2017 has the largest), and the three solid spots of the same color indicate the TPI in 2007, 2012 and 2017. Most spots are located below the diagonal, which means that the relationships between union members are closer than those with countries outside the union.
This indicates that all six trade unions help to lower average trade resistance and create closer trade relations among members than with other countries.

Decreasing Trend in Trade Relationships for Union Members.
These six unions can be divided into two types. Specifically, the EU and NAFTA are type one, and most countries in these unions are developed countries. For these two RTAs, the spots move vertically over time (Figure 3). The TPI inside the unions barely changes, but the TPI outside the unions fluctuates and tends to increase. BRI, OAU, CARIFTA and ASEAN are type two, and the spots of these unions move towards the bottom left. In brief, comparing the TPIs in 2007 and 2017, the TPIs within unions all declined except for the EU and NAFTA. The trade environments of the EU and NAFTA are friendlier than those of the other four unions.
In Figure 6 (in the Appendix), these two types of unions can be seen more clearly. Red labels indicate a trade deficit, blue labels indicate a trade surplus, and the size of each spot represents the net trade flow (exports minus imports). The EU and NAFTA (Figure 6(a)(e)) have fewer member countries, higher economic development and surplus trade flows, so their dots are highly concentrated. The other unions (BRI, OAU, CARIFTA and ASEAN; Figure 6(b)(c)(d)(f)) are more uneven: their dots are distributed from low TPI to high TPI, and some member countries have a trade surplus while others have a trade deficit. In addition, countries with surplus trade flows (blue labels) have a higher TPI both inside and outside their unions.

Comparison of Trade Unions and Trade Communities
Trade unions are formed through agreements signed by countries. With the development of globalization, it is worth further exploring whether they can reflect real trade affinity. As mentioned in section 2.4, we extract the backbone of the global trade network based on quantified trade resistance and classify it into several communities. Trade communities are obtained from the analysis of the network structure, which can objectively describe the trade relationships between countries.
Communities in the Global Trade Network.
In Figure 4, the nodes that share the same color are assigned to the same community. The modularity of the classification is $Q = 0.780$ in 2007 and $Q = 0.769$ in 2017, which means that the classification is credible. There were some structural changes between 2007 and 2017. First, in 2007, the communities show significant regionality. The map (Figure 4(a)) shows that countries on the same continent are more likely to be clustered in the same community, confirming that geographical characteristics play an important role in forming trade patterns. For the most part, the six trade unions are signed among countries in the same region, and given their relatively close trade relations, it is not surprising that most members of an RTA are clustered in the same community. In 2017, the distribution of community members was more dispersed. With the development of globalization, trade between countries is less restricted by geographical or transportation factors.
Second, from the empirical results, over these ten years, the network density decreased, which means that countries in the global trade network are connected more loosely (the cluster coefficient changed from 0.1370 to 0.1006).
There is another interesting phenomenon. Some countries are on the same continent and belong to the same RTA but are more closely related to countries in other unions than the members of their RTAs. Most African countries have multiple RTA memberships [30], and the continent's east and west coasts belong to different marine routes in the global marine transport network [61]. Therefore, it is easy to understand why eastern and western Africa are clustered into different communities. France, Spain, Portugal, and Belgium are EU countries, but they are classified into the community where most members are African countries. This shows that they have closer trade relations with African countries than with other EU members, which may be due to language, culture, colonial influence and their trade structures.
Here, we apply the external-internal index (E-I index) and compare regional trade cohesion and global trade cohesion as follows:
$$\text{E-I}_{\text{degree}} = \frac{EK - IK}{EK + IK}, \qquad \text{E-I}_{\text{weight}} = \frac{EW - IW}{EW + IW}$$
External edges connect nodes from different communities, while internal edges connect two nodes belonging to the same community. $EK$ and $IK$ are the sums of external and internal degrees over all nodes; $EW$ and $IW$ are the sums of external and internal weights over all nodes. Based on the results of the backbone network in 2007 and 2017, the E-I index (degree) dropped from 0.2711 to 0.1000, and the E-I index (weight) increased from -0.1019 to 0.0281. The relationships in the global trade network are more diversified, but trade intensity is concentrated in local communities.
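The two E-I indices can be computed in one pass over the edges. The sketch below assumes the standard form (external minus internal, divided by their sum) and one dictionary entry per undirected edge; because every edge adds to the degree and weight of both endpoints, the factor of two cancels and per-edge counts suffice. The function name and the sample network are hypothetical.

```python
def ei_indices(weights, community):
    """weights: {(i, j): w}, one entry per undirected edge.
    Returns (E-I index by degree, E-I index by weight)."""
    external_count = internal_count = 0
    external_weight = internal_weight = 0.0
    for (i, j), w in weights.items():
        if community[i] == community[j]:
            internal_count += 1
            internal_weight += w
        else:
            external_count += 1
            external_weight += w
    by_degree = (external_count - internal_count) / (external_count + internal_count)
    by_weight = (external_weight - internal_weight) / (external_weight + internal_weight)
    return by_degree, by_weight


# Two internal edges and one heavy external edge (illustrative weights).
toy_edges = {("a", "b"): 3.0, ("c", "d"): 1.0, ("b", "c"): 4.0}
toy_labels = {"a": 0, "b": 0, "c": 1, "d": 1}
```

Both indices range from -1 (all ties internal) to 1 (all ties external), so the degree and weight versions can be compared directly.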

Correlation and Evolution of Unions and Communities.
We have identified the trade unions resulting from negotiations between countries, and the trade communities clustered from the empirical data. What is the correlation between them? Do the members of trade unions truly have closer trade relations? Which trade unions have no obvious effect on restraining and helping member countries? To answer these questions, we measure the correlation coefficient between the members of trade unions and trade communities. Figure 5 shows the matrix of the Jaccard similarity coefficient of six trade unions and ten typical communities. 'Others' indicates countries that do not belong to the six unions. Green color means a greater correlation and a higher commonality of members between trade unions and communities. In contrast, yellow color means that the members of the union and community are basically different.
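The Jaccard similarity behind Figure 5 is the size of the intersection of two member sets divided by the size of their union. The sketch below builds such a matrix for arbitrary unions and communities; the function names and the toy member sets are hypothetical and do not reproduce the paper's data.

```python
def jaccard(members_a, members_b):
    """Jaccard similarity of two collections of country codes."""
    a, b = set(members_a), set(members_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(unions, communities):
    """Return {(union_name, community_id): Jaccard similarity} for every pair."""
    return {(u, c): jaccard(u_members, c_members)
            for u, u_members in unions.items()
            for c, c_members in communities.items()}


# Toy member sets for illustration only.
toy_unions = {"U1": {"A", "B", "C"}}
toy_communities = {0: {"B", "C", "D"}, 1: {"E"}}
matrix = similarity_matrix(toy_unions, toy_communities)
```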
In general, the similarity matrices of 2007 and 2017 have similar structures. Each trade union has only one or two strongly green cells, which indicates that some trade unions and communities have high consistency in membership. Several very green cells are shown in Figure 5(a), which mark the overlap of ASEAN and community 6, BRI and community 0, CARIFTA and community 4, OAU and community 1, etc. The EU and NAFTA are relatively 'free' trade unions: their members are not limited to one or two communities but are spread across several separate communities. In addition, in 2007, the 'Others' countries that do not belong to the six trade unions are also relatively concentrated in three communities, which is quite different from the results for 2017.
The similarity was higher in 2007; that is, the trade unions were more similar to the actual trade clustering result. In 2017, the role of the trade unions weakened. For most trade unions, the maximum matching of members to communities decreased. Here, the EU and NAFTA remain exceptions, having relatively higher similarity with communities 1 and 8. This might be due to their mature trading background. In addition, the TPIs of the EU and NAFTA remained stable, while the TPIs of other trade unions decreased (Figure 3). The TPI indicates a trade-friendly relationship with other countries, while communities also reflect the different trade relations between countries inside and outside the community. Therefore, it is reasonable that the EU and NAFTA stand out in both results. Compared with 2007, the community structure of the 'Others' countries has become more decentralized.
In short, RTAs appear to have an impact that strengthened the formation of true trade relations [50]. Based on the similarity matrix, each trade union is mainly concentrated in one or two communities. However, in 2017, this kind of consistency clearly weakened, and multilateral trade liberalization has accelerated over the past decade.

Conclusion and Discussion
The innovation of this paper is to study and describe the trade purity relationships between countries after separating out other typical factors, such as economic volume, geographical distance, and overall increases in transportation and labor costs. In addition, instead of exogenous parameter estimation, we define latent parameters and use the likelihood function and the EM algorithm to quantify and analyze the trade purity indicator from the data. In brief, the extended model advances the development of the gravity model in theoretical research on international trade.
In the empirical analysis, some unobserved characteristics of the trade relationships can simultaneously be defined and optimized, and the analysis uses a trade purity indicator to describe the trade environments and positions of countries in both regional and global trade relationships.
With the data from the UN Comtrade Database, we quantify the international trade resistance of 198 countries/districts. This analysis shows that the trade relationships of the 198 entities can be divided into two categories. The trade resistance of countries in category I has an approximate log-linear relation with geographical distance, and these countries have a relatively open and friendly trade environment, where the main trade barriers are natural factors. The countries in category II have higher artificial trade barriers, and countries with poor trading environments frequently fall into category II. Here, we obtain well fitted results using the EM algorithm from machine learning. All latent variables converge rapidly to optimal points, which validates the extended gravity model proposed in this paper.
In addition, this paper defines and identifies a trade purity indicator for different RTA countries during the period 2007-2017. It can describe the true trade environment and relationships. Countries with higher indicators have friendly trade environments and obtain large trade flows, such as the United States, China, Japan, Korea, South Africa, Singapore, Australia, and Malaysia. For these countries, most trade partnerships are mainly related to natural factors such as geographical distance, and they have no obvious trade barriers. The analysis of the indicator and its evolution could help to research the characteristics and trends of international trade. This indicates that although the global and most regional trade relations gradually deteriorated over the period 2007-2017, the RTAs bring closer trade relations between members, especially contributing to the relative prosperity of the EU and NAFTA.
Finally, based on the trade resistance matrix, we build a network mapping the relationships among the 198 countries/districts. The Louvain community detection method identifies several communities in the global trade network. We analyze the effects of RTAs by comparing the members of trade unions and communities. The results show that the representative RTAs constitute the core structure of the international trade network, but the role of trade unions has weakened and multilateral trade liberalization has accelerated in the past decade. This means that more countries have recently tended to expand their trading partners outside their unions rather than limit their trading activities to their RTAs.
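One simple way to quantify how closely a detected community matches a trade union is the Jaccard index of their member sets. This is an illustrative measure chosen here, not necessarily the comparison used in this paper, and the member labels below are placeholders rather than real data.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two member collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0          # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def best_matching_community(union_members, communities):
    """Return (index, score) of the community that overlaps most with the union.

    `communities` must be a non-empty list of member collections.
    """
    scores = [jaccard(union_members, c) for c in communities]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Placeholder labels (hypothetical): one union and three detected communities.
union = {"A", "B", "C", "D"}
communities = [{"A", "B", "C", "E"}, {"D", "F"}, {"G"}]

best_index, best_score = best_matching_community(union, communities)
print(best_index, best_score)
```

A score near 1 would indicate that the union and the community almost coincide, while a low best score would suggest that the union's members are spread across several communities.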
The authors declare no conflict of interest, including any financial, personal, or other relationships with other people or organizations.

Funding Statement
This work was supported by the Chinese National Natural Science Foundation (71701018, 61673070), the National Social Sciences Fund, China (14BSH024), and the Beijing Normal University Cross-Discipline Project.

Data Availability Statement
The empirical data used in this paper can be downloaded from the sources listed in Table 1, or requested from Dr. Xiaomeng Li (lixiaomeng@bnu.edu.cn).

Figure 6 (Appendix) presents some detailed information. The x-coordinate expresses the TPI between a specific country and other countries in the same union, while the y-coordinate expresses the TPI between a specific country and other countries outside the union. The size of each dot is proportional to the net trade flow, measured as the absolute value of the difference between exports and imports; a red label means that the country had a trade deficit, while a blue label means a trade surplus. Most dots are below the diagonal, which means that the TPIs of most countries inside the union are lower than those outside the union. In addition, countries with a trade surplus (blue labels) have a higher TPI both inside and outside the union.

B Pretreatment of Zero-Value Flows
In the gravity model (equation 1), $F_{i,j}$ is the trade flow from country $i$ to country $j$; $m_i$ and $m_j$ are the sizes of their economies; and $r_{i,j}$ is the trade resistance to be quantified. It is generally believed that the model cannot describe zero flows because gravity is universal [25]: even if two countries are very small and the geographical distance or trade resistance between them is very large, as long as the volume $m_i \cdot m_j$ is nonzero and the resistance $r_{i,j}$ is finite, the trade flow between them may be very small but is not zero.
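The argument can be checked numerically. The sketch below assumes the simplest gravity form, $F_{i,j} = m_i m_j / r_{i,j}$; equation 1 of this paper may include additional constants or exponents, so both the functional form and the function names are assumptions made for illustration.

```python
import math


def gravity_flow(m_i, m_j, r_ij):
    """Predicted flow under the assumed form F = m_i * m_j / r_ij."""
    return m_i * m_j / r_ij


def implied_resistance(m_i, m_j, flow):
    """Resistance implied by an observed flow under the same assumed form.

    A zero (or negative) flow can only be matched by infinite resistance.
    """
    if flow <= 0.0:
        return math.inf
    return m_i * m_j / flow


# Tiny economies and a huge but finite resistance still give a positive flow.
tiny_flow = gravity_flow(1e-3, 1e-3, 1e9)
print(tiny_flow > 0.0)

# An observed zero flow cannot be reproduced with any finite resistance.
print(implied_resistance(2.0, 3.0, 0.0))
```

This is why a deterministic gravity equation cannot reproduce the zero entries in the data without sending the resistance to infinity.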
However, zero-value flows are very common in the empirical data, accounting for around 50% of the global trade network [34], and they create an additional problem for the log-linear form of the gravity equation (including the traditional and structural gravity models in trade studies). In early studies, scholars often handled zero trade observations by truncation, for example by deleting them entirely or substituting a small positive constant [26,12]. This is not rigorous [25]. In practice, zero-value trade flows are generally considered to be unobservable or to result from rounding errors in measurement. Stochastic versions of the equation are therefore used in empirical studies [53,34]. Here we add an error term $\varepsilon_{ij}$ and assume that it is positive and follows a lognormal distribution [53], i.e., $\ln \varepsilon_{ij} \sim N(\mu, \sigma^2)$ in equation 10.
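A short sketch shows why zeros break the log-linear form and how a positive lognormal error term avoids the problem. The parameter values `mu = 0.5` and `sigma = 0.3` and the function names are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math
import random


def log_flow(flow):
    """Logarithm of a flow, as required by the log-linear gravity equation."""
    return math.log(flow)   # raises ValueError when flow == 0


def observed_flow(true_flow, mu, sigma, rng):
    """Observed flow = true flow + error, with ln(error) ~ N(mu, sigma^2).

    The lognormal error is strictly positive, so the result exceeds true_flow.
    """
    return true_flow + rng.lognormvariate(mu, sigma)


# A recorded zero flow has no logarithm.
try:
    log_flow(0.0)
    zero_is_loggable = True
except ValueError:
    zero_is_loggable = False
print("log of a zero flow is defined:", zero_is_loggable)

# With the stochastic version, every simulated observation is positive.
rng = random.Random(42)
samples = [observed_flow(0.0, 0.5, 0.3, rng) for _ in range(1000)]
logs = [log_flow(y) for y in samples]
print("smallest simulated observation:", min(samples))
```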
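For clarity, we let $X = \varepsilon_{ij}$ and $Y = X + F_{i,j}$. Since $\ln X \sim N(\mu, \sigma^2)$, the probability density function of the random variable $X$ is

$$f_X(x) = \begin{cases} \dfrac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\dfrac{(\ln x - \mu)^2}{2\sigma^2}\right), & x > 0 \\ 0, & x \le 0 \end{cases}$$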
Because $Y$ is $X$ shifted by the constant $F_{i,j}$, the probability density function of $Y$ is

$$f_Y(y) = \begin{cases} \dfrac{1}{(y - F_{i,j}) \sigma \sqrt{2\pi}} \exp\left(-\dfrac{(\ln(y - F_{i,j}) - \mu)^2}{2\sigma^2}\right), & y > F_{i,j} \\ 0, & y \le F_{i,j} \end{cases}$$

If we assume that trade resistance is bilateral, then we can deduce $r_{i,j}$ for each pair of countries by the least squares method.

Different kinds of Pseudo Maximum Likelihood (PML) methods have proved effective in dealing with zero-valued trade flows and the logarithmic transformation [43,44,53]. The method in this paper is not exactly the same as the gravity model; the main difference is that we replace the geographical distance with $r_{i,j}$, which needs to be quantified. We therefore adopt the idea of PML but modify the likelihood function. We then maximize the probability of $E(Y) = E(X) + F_{i,j}$ under the defined likelihood function $L$. Using maximum likelihood estimation, we optimize the parameters $\mu$ and $\sigma$ to obtain $\max_{\mu,\sigma}(L)$, which makes $E(Y) = E(X) + F_{i,j}$ the most likely to occur in reality.
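As a point of reference, the standard maximum likelihood estimates for a lognormal sample have a closed form, and the lognormal mean gives $E(X) = \exp(\mu + \sigma^2/2)$. The sketch below uses this textbook estimator, not the modified likelihood function defined in this paper; the function names and the sample values are assumptions made for illustration.

```python
import math


def lognormal_mle(samples):
    """Closed-form MLE of (mu, sigma) for positive samples with ln X ~ N(mu, sigma^2)."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    mu = sum(logs) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_mean(mu, sigma):
    """E(X) for a lognormal variable: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma ** 2)


def expected_observed_flow(flow, mu, sigma):
    """E(Y) = E(X) + F for Y = X + F with a fixed flow F."""
    return flow + lognormal_mean(mu, sigma)


# Hypothetical error samples whose logarithms are 0, 1 and 2.
mu_hat, sigma_hat = lognormal_mle([1.0, math.e, math.e ** 2])
print(mu_hat, sigma_hat)
print(expected_observed_flow(10.0, mu_hat, sigma_hat))
```

For these three samples the estimator gives a log-mean of 1 and a log-standard deviation of $\sqrt{2/3}$, which can be verified by hand from the logarithms 0, 1 and 2.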
The optimized parameters are listed in Table 3, and Figure 7 shows the distribution of the random error $\varepsilon_{ij}$ during 2007-2017. The mean of the random error is roughly 1 to 2 and the variance is relatively small, which is consistent with the basic assumption of statistical error in trade flows.