Green Technology Collaboration Network Analysis of China’s Transportation Sector: A Patent-Based Analysis

Shanghai Financial Technology Research Centre, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China; School of Financial Technology, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China; School of Management, Shanghai University, Shanghai 200444, China; School of Artificial Intelligence and Law, Shanghai University of Political Science and Law, Shanghai 201701, China; Department of Computer Engineering and Science, Shanghai University, Shanghai 200444, China


Introduction
Not surprisingly, China is growing into one of the world's most crucial transportation markets. A Forbes.com report shows that ten Chinese transportation enterprises made the Forbes Global 2000, each rising significantly in rank (Forbes Global 2000: The World's Largest Transportation Companies 2018, https://www.forbes.com/sites/antoinegara/2018/06/06/forbes-global-2000-the-worlds-largest-transportation-companies/#6e40011b100f). According to the World Bank database, China's clear advantage in transportation infrastructure has lifted its rank in the logistics performance index. The latest report on the logistics performance index shows that China has become the top performer among upper-middle-income economies [1]. With the booming development of China's transportation market, research on China's green transportation technologies (GTTs) is also gradually increasing. Wang et al. [2] applied the transportation mode-technology-energy-CO2 model to analyse the energy consumption and CO2 emissions in China's transportation sector. Wang et al. [3] investigated the impact of three different policies on the implementation of electric vehicle technology. Liu et al. [4] employed the DEA approach to measure the technological progress and environmental efficiency of China's road transportation industry.
As a complex innovation activity that impacts public transportation and trade flow, transportation technologies (TTs) are undergoing revolutionary upgrades to cope with changing market demand [5]. Among them, green transportation technologies (GTTs) are regarded as practical solutions for improving the sustainable performance of the transportation sector amid an increasingly severe environmental crisis [6]. In line with sustainability requirements, a growing body of research and applications has explored the creation, verification, and adoption of GTTs [7]. For instance, Pelletier et al. [8] presented an overview of the technologies and marketing research in the field of green transportation represented by electric vehicles. Perboli and Rosano [9] employed simulation optimization technology to study business and operational models of traditional and green couriers. The OECD promoted the environmentally sustainable transport (EST) initiative to build a common worldwide understanding of the basic concepts of green transportation [10]. McKinsey & Company released a study on the critical factors of efficient urban transportation in 24 world cities to help leaders understand the knowledge and technologies needed to improve public health [11].
Furthermore, scholars, enterprises, and organizations widely participate in related innovation activities to boost the development of GTTs. The role of collaboration in stimulating the growth of the transportation market has been gradually emphasized. Marra et al. [12] studied green technology companies in San Francisco, New York, and London to identify their fields of specialization and collaboration in the transportation sector and to forecast potential emerging technologies. The available literature on GTTs collaborative innovation works at various levels, such as regional case studies [13], advanced model simulation [14], and green operation strategy [15]. Luan et al. [16] constructed a framework that analysed the effect of collaboration among traffic information service providers, local governments, and users. Sun and Rahwan [17] investigated the co-authorship network of scientific collaboration in transport research by using publication metadata.
It is worth noting that the above outputs about China's transportation sector and GTTs are achieved through cooperation and collaboration between individuals and/or organizations. In the context of the emerging development of the transportation industry, it is crucial to identify the major players in GTTs collaboration activities in China's transportation sector and to understand how collaboration among different players influences green collaborative innovation performance. However, few studies have analysed the collaborative activities aimed at transportation technology innovation, especially for GTTs in China. Moreover, existing research has focused on individual segments of the transportation market or on specific technologies [18,19] rather than on the whole transportation industry [20].
Thus, these literature gaps motivate an exploration of the current performance and future trends of GTTs collaborative innovation activities in China's transportation sector. From the methodological perspective, the social network analysis approach, which is widely used in bibliometric analysis and complex social analysis, has also been employed to present the collaboration relationships among different organizations and individuals. Key nodes and links in the collaboration can be detected accordingly. Al-Tabbaa and Ankrah [21] applied the social network analysis method to uncover the dynamics of social capacity for university-industry collaboration. Liu et al. [22] used social network analysis to investigate the evolutionary course of the global nanotechnology collaboration network. Chai et al. [24] empirically examined the intensity and structure of the entire city network in the Yellow River Basin using the social network analysis method and ArcGIS software. From the perspective of research data, patents are recognized as a valid form of transforming knowledge into technology [23]. More and more individuals and organizations invest heavily in the research of new technology and the filing of patents, especially green technology patents. Technically, the World Intellectual Property Organization (WIPO), the international organization for intellectual property services, released a guidance list called the IPC (International Patent Classification) green inventory to facilitate patent searches and applications relating to environmentally sound technologies (ESTs). The list has been applied in energy technology studies [25] and in macroanalysis of green technology innovation [26]. Moreover, the "collaboration effect" of cooperative patent applications by multiple entities has been widely demonstrated and implemented in patent research [27-29].
Motivated by the above, our research aims to investigate GTTs collaboration in China's transportation sector based on the social network analysis method and GTTs patent data. This study contributes to the transportation field through the following research objectives.

Research Objectives.
(i) To measure the evolution and current performance of GTTs collaborative innovation in China's transportation sector with the IPC green inventory and patent data
(ii) To identify the participants involved in the collaborative innovation of GTTs patents and the collaboration relations in China's transportation sector
(iii) To provide relevant policy implications based on these results
The rest of this study is arranged as follows. In Section 2, a detailed research framework is depicted. Section 3 provides the results and discussion. The conclusion and future research direction are summarized in Section 4.

Research Framework.
To clarify the research flow of this study, a detailed four-step research framework is illustrated in Figure 1.
Step 1. First, the GTTs-related collaboration patents in the SIPO (State Intellectual Property Office of China) database were collected with a purpose-built Python crawler tool (the source code is available on GitHub at https://github.com/huangPark/IPC_patent_collection.git). The SIPO database is the official patent database of China; it contains the complete patent information in China and is frequently used in innovation and technology studies [30]. The survey period is set to 2007-2018. WIPO's green IPC inventory list is used to filter GTTs.
Step 2. A multiattribute index system is constructed to preprocess the patent data and analyse the characteristics of GTTs. Attribute indicators include IPC code classification, patent applicant, approval time, and region information.
Step 3. Subsequently, using these patent documents, we constructed the patent collaboration networks with the social network analysis tool Gephi.
Step 4. Finally, the network structure was assessed and policy suggestions for the collaboration activities of GTTs innovations were derived via the statistical analysis of patent information and the social network analysis of patent collaboration.

Data Source.
The research data in this study for GTTs analysis are patent data, collected from the SIPO database through a purpose-built web crawler tool. The transportation category of the IPC green inventory (Table S1 in Appendix A) was selected as the green transportation technology list for this study. Five first-level classifications (vehicles in general, vehicles other than rail vehicles, rail vehicles, marine vessel propulsion, and cosmonautic vehicles using solar energy) and 57 second-level classifications of IPCs are regarded as the codes of GTTs. In practice, two search methods, IPC taxonomy and keyword searching, are widely used by scholars in the field of patent investigation. Both have drawbacks.
The IPC approach may result in duplication of patent data, as a patent can often belong to multiple IPC classifications. The keyword searching approach can retrieve patents containing specific information. However, keyword searching is often subjective, and keyword coverage is usually incomplete. Here, we choose the IPC taxonomy method because of its wide recognition. After all the transportation patents in the IPC green inventory have been crawled, duplicate records are deleted according to the patent application number so that no patent is collected twice.
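The deduplication step described above can be sketched as follows. This is a minimal illustration, not the authors' crawler code: the record layout (`application_number`, `ipc_codes`, `applicants`), the sample application numbers, and the sample IPC subclasses are assumptions made for the example.

```python
def deduplicate_patents(records):
    """Merge records sharing an application number, keeping every IPC code seen.

    The field names used here are hypothetical; a real crawl may use others.
    """
    merged = {}
    for rec in records:
        key = rec["application_number"]
        if key not in merged:
            # First time this patent is seen: copy it and store codes as a set.
            merged[key] = {**rec, "ipc_codes": set(rec["ipc_codes"])}
        else:
            # Same patent retrieved under another IPC class: only add the code.
            merged[key]["ipc_codes"] |= set(rec["ipc_codes"])
    return list(merged.values())


# Hypothetical crawl output: one patent retrieved under two IPC classes.
raw = [
    {"application_number": "CN0001", "ipc_codes": ["B60L"], "applicants": ["A", "B"]},
    {"application_number": "CN0001", "ipc_codes": ["B61D"], "applicants": ["A", "B"]},
    {"application_number": "CN0002", "ipc_codes": ["B63H"], "applicants": ["C"]},
]
unique = deduplicate_patents(raw)
```

Keeping the union of IPC codes, rather than dropping the second record outright, preserves the multi-class membership that caused the duplicate in the first place.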
There are three types of patent rights in China, namely, invention, utility model, and design. The "utility model" patent is regarded by the majority of scholars as the main type for patent analysis and technology innovation evaluation [28,31]. Utility models appeal to some users because they provide more accessible, cheaper, and faster patent protection than the traditional invention patent system. Prud'homme [32] developed an evaluation system, including six institutional calibration strategies, to investigate the regime and innovation of utility model patents. Zhang et al. [33] employed utility model patent data in the field of China's offshore wind power to examine technological progress and conduct statistical analysis of its evolution.

Collaborative Identification.
The focus of this research is the collaborative innovation of patented technologies. Here, patents with two or more applicant entities are identified as cooperative patents [31]. To characterize the ordering of the multiple members of a cooperative patent, the first applicant entity is regarded as the leader node and the second and subsequent applicant entities as follower nodes. The reason for this setting is that the first applicant for a patent often makes a more significant contribution to the patent [34]. Four types of partners are distinguished, namely, business enterprises (B), individuals (C), research institutions (I), and universities (U), mainly following the principles of the previous literature on innovation collaboration [28]. Note that the node type attribute represents the category of the applicant organization or individual, and it is an indicator that needs further confirmation. The type of an organization (other than an individual client) is determined from the information corresponding to the organization name in the official enterprise database (National Enterprise Credit Information Publicity System, http://www.gsxt.gov.cn/index.html). The static pattern analysis of GTTs patents per year is based on the patents registered between 2007 and 2018, while the evolution pattern analysis is based on the patent time series, which is divided into three 4-year periods (2007-2010, 2011-2014, and 2015-2018). Finally, a directed collaboration network based on the GTTs patents is constructed.
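The leader-follower rule and the three 4-year periods can be sketched as follows. This is an illustrative sketch only: the applicant names are hypothetical, and the direction of each link (from leader to follower) is an assumption, since the text states only that the network is directed.

```python
from collections import Counter


def stage_of(year):
    """Map a year in 2007-2018 to stage 1, 2, or 3 (three 4-year periods)."""
    if not 2007 <= year <= 2018:
        raise ValueError("year outside the 2007-2018 survey period")
    return (year - 2007) // 4 + 1


def build_directed_links(patents):
    """Count leader -> follower links; each patent is an ordered applicant list."""
    links = Counter()
    for applicants in patents:
        if len(applicants) < 2:
            continue  # a single applicant is not a cooperative patent
        leader, *followers = applicants
        for follower in followers:
            if follower != leader:
                links[(leader, follower)] += 1  # weight = number of joint patents
    return links


# Hypothetical applicant lists, first applicant listed first.
patents = [
    ["Firm X", "University Y"],
    ["Firm X", "University Y", "Institute Z"],
    ["Person W"],
]
links = build_directed_links(patents)
```

Under this construction the weight of a link is the number of cooperative patents in which the same leader and follower appear together.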

The Social Network Analysis Method.
The proposed research framework applies social network analysis to the GTTs collaboration patents. A social network is a set of nodes (e.g., companies, scholars, or other social entities) and links (e.g., topic, cooperation, or other social relations) [35]. As a practical channel for transferring resources and information between nodes, the social network plays a significant role [36]. Social network analysis (SNA) is the process of investigating social structures using networks and graph theory, and of mapping and measuring the relationships and flows between people, groups, organizations, and other connected information/knowledge entities. The SNA method is employed to analyse the fundamental nature and structure of network nodes and network links. This analysis method can help to identify the global and local patterns in a social network and to recognize the influential entities and relationships in a network. Once time-series patent data are collected, the dynamics of the network can also be illustrated. Owing to its interdisciplinary nature, social network analysis has been widely employed in policy analysis [37], risk analysis [38], and industrial innovation analysis [39].

Network Structure Analysis
Network Density. Network density characterizes how densely the nodes in the network are interconnected. It is defined as the ratio of the number of actual links in the network to the maximum number of links that the network can accommodate. The calculation formula of network density is as follows:
$$D = \frac{T}{n(n-1)},$$
where T refers to the number of links in the network, n refers to the number of nodes in the network, and n(n − 1) refers to the maximum possible number of links in the network.
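As a worked illustration of this ratio for a directed network, the following minimal sketch divides the number of links by n(n − 1); the function name and the sample counts are illustrative only.

```python
def network_density(num_nodes, num_links):
    """Density of a directed network: actual links over n(n - 1) possible links."""
    if num_nodes < 2:
        return 0.0  # no link is possible with fewer than two nodes
    return num_links / (num_nodes * (num_nodes - 1))


# Four nodes allow 4 * 3 = 12 directed links; three of them are present.
density = network_density(4, 3)
```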
Network Average Degree. The degree of a node refers to the number of links connected to the node. The average degree of the network is the average of the degrees of all nodes in the network. The formula for calculating the network average degree is
$$\bar{k} = \frac{1}{N}\sum_{i=1}^{N} k_i,$$
where N is the number of nodes in the network and $k_i$ is the degree of node i.
Network Average-Weighted Degree. The weight of a link indicates the number of times the link occurs between a pair of nodes. The weighted degree of a node is the sum of the weights of the links connected to it, rather than a simple count of those links, because the weight of each link differs. The average weighted degree of the network is the average of the weighted degrees of all nodes in the network.
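The two averages can be sketched together as follows. The sketch assumes that a node's degree counts both its incoming and outgoing links and that only nodes appearing in at least one link are counted; network tools may apply a different convention for directed graphs, so the values need not match a given tool's report.

```python
from collections import defaultdict


def average_degrees(weighted_links):
    """Return (average degree, average weighted degree) over linked nodes.

    weighted_links maps a (source, target) pair to the weight of that link.
    """
    degree = defaultdict(int)
    weighted_degree = defaultdict(int)
    for (source, target), weight in weighted_links.items():
        for node in (source, target):
            degree[node] += 1               # one more link touches this node
            weighted_degree[node] += weight  # add the weight of that link
    n = len(degree)
    if n == 0:
        return 0.0, 0.0
    return sum(degree.values()) / n, sum(weighted_degree.values()) / n


# Hypothetical weighted links among four nodes.
sample_links = {("A", "B"): 3, ("A", "C"): 1, ("D", "A"): 2}
avg_degree, avg_weighted_degree = average_degrees(sample_links)
```

Here node A touches three links with total weight 6, while B, C, and D each touch one link, giving an average degree of 1.5 and an average weighted degree of 3.0.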

Network Diameter.
The diameter of the network is the longest of all the shortest paths calculated in the network. In other words, it is the shortest distance between the two furthest nodes in the network.
Network Average Clustering Coefficient. A node may have K neighbor nodes. The clustering coefficient of the node is the actual number of links between its K neighbor nodes divided by the maximum possible number of links between them, $C_K^2 = K(K-1)/2$. The average clustering coefficient of the network is the average of the clustering coefficients of all nodes in the network.
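The definition above can be sketched for an undirected view of the network as follows. Treating links as undirected and assigning a coefficient of zero to nodes with fewer than two neighbors are assumptions of this sketch; the adjacency data are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Links among a node's K neighbors divided by K(K - 1)/2."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pair exists, so the ratio is taken as zero
    possible = k * (k - 1) / 2
    actual = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return actual / possible


def average_clustering(adj):
    """Mean clustering coefficient over all nodes in the adjacency mapping."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)


# Hypothetical undirected network: a triangle A-B-C plus a pendant node D.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
coeff_a = clustering_coefficient(adj, "A")
mean_coeff = average_clustering(adj)
```

Node A has three neighbors and therefore three possible neighbor pairs, of which only B-C is linked, so its coefficient is 1/3.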
Network Average Path Length. The average path length is another essential characteristic measure of the network. It is the average of the shortest distances between all node pairs in the network. Here, the distance between two nodes refers to the minimum number of edges that must be traversed to get from one node to the other, and the maximum distance between any two nodes is the diameter of the network. Average path length and diameter measure the transmission performance and efficiency of the network.
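Both distance-based measures can be obtained from breadth-first search, as in the following minimal sketch. It assumes an unweighted, undirected view of the network and averages only over node pairs that are reachable from each other, which is one common way to handle disconnected networks.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_average_path_length(adj):
    """Longest and mean shortest-path length over reachable ordered pairs."""
    lengths = []
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                lengths.append(d)
    if not lengths:
        return 0, 0.0
    return max(lengths), sum(lengths) / len(lengths)


# Hypothetical chain A - B - C - D.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
diameter, avg_path_length = diameter_and_average_path_length(chain)
```

For the four-node chain, the furthest pair (A, D) is three hops apart, and the twelve ordered pairs have a total distance of 20.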

Network Centrality Analysis.
In the field of social network analysis, network centrality is a vital index applied to measure the importance of nodes in a network. Based on the different centrality algorithms, there are different centrality evaluations for the nodes, as described below.

[Figure 1: Research framework (Step 1: data collection; Step 2: data processing; Step 3: visualization; Step 4: analysis). Figure labels: SIPO database, IPC green inventory, Python crawler tool, IPC green code, node type, time, region, Gephi software, patent documents, network structure, strategy and policy.]
Degree Centrality. Degree centrality is the most direct metric for describing node centrality in network analysis. The larger the degree of a node, the higher its degree centrality and the more valuable the node is in the network. Generally, such nodes are at the centre of the network being studied and have a greater influence on other nodes. If the research object is a directed network, that is, each link points from one node to another, then a node has two different types of degree. In-degree is the number of links pointing into the node. Out-degree is the number of links pointing out of the node. Here, the calculation formula of degree centrality is as follows:
$$d(n_i) = \sum_{j=1}^{n} x_{ji}, \quad i \neq j,$$
where $d(n_i)$ refers to the degree centrality of node i, and $\sum_{j=1}^{n} x_{ji}$ counts the direct links between node i and the other nodes j (i ≠ j, excluding the relation of node i to itself).
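For a directed network, in-degree and out-degree can be counted as in the following sketch. The link list is hypothetical, and reporting the total as in-degree plus out-degree is an assumption of the sketch.

```python
from collections import Counter


def degree_centrality(links):
    """Count in-degree, out-degree, and their sum for each node.

    links is an iterable of (source, target) pairs, one per distinct link.
    """
    in_degree, out_degree = Counter(), Counter()
    for source, target in links:
        out_degree[source] += 1
        in_degree[target] += 1
    nodes = set(in_degree) | set(out_degree)
    return {
        node: {
            "in": in_degree[node],
            "out": out_degree[node],
            "total": in_degree[node] + out_degree[node],
        }
        for node in nodes
    }


# Hypothetical directed links.
centrality = degree_centrality([("A", "B"), ("A", "C"), ("D", "A")])
```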
Betweenness Centrality. Betweenness centrality refers to the number of times a node acts as a bridge on the shortest path between two other nodes. The more often a node acts as an "intermediary," the higher its betweenness centrality. Here, the calculation formula of betweenness centrality is as follows:
$$C_B(n_i) = \sum_{j<k} b_{jk}(i),$$
where $b_{jk}(i)$ indicates the power of node i to control the link between nodes j and k, that is, the share of the shortest paths between j and k that pass through node i.
Closeness Centrality. Closeness centrality is based on the length of the shortest paths from a node to all the other nodes. In other words, the closer a node is to the other nodes, the more central it is. For instance, facilities that need to be used by as many people as possible are usually located relatively close to the centre. Here, the calculation formula of closeness centrality is as follows:
$$C_C(n_i) = \left[\sum_{j=1}^{n} d_{ij}\right]^{-1},$$
where $d_{ij}$ represents the distance between nodes i and j.
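The two path-based centralities can be sketched with breadth-first search on a small example. This is an illustrative sketch, not the computation performed by Gephi: it assumes an undirected, unweighted network, takes betweenness as the sum over node pairs of the share of shortest paths passing through the node, and takes closeness as the reciprocal of the summed distances to reachable nodes. It favors readability over speed.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return hop distances and numbers of shortest paths from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness(adj):
    """Sum over pairs (j, k) of the share of j-k shortest paths through i."""
    info = {s: bfs_counts(adj, s) for s in adj}
    nodes = sorted(adj)
    score = {v: 0.0 for v in adj}
    for pos, j in enumerate(nodes):
        dist_j, sigma_j = info[j]
        for k in nodes[pos + 1:]:
            if k not in dist_j:
                continue  # j and k are not connected
            for i in nodes:
                if i == j or i == k or i not in dist_j:
                    continue
                dist_i, sigma_i = info[i]
                # i lies on a shortest j-k path when the distances add up.
                if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                    score[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return score


def closeness(adj):
    """Reciprocal of the summed distances from each node to reachable nodes."""
    result = {}
    for i in adj:
        total = sum(bfs_counts(adj, i)[0].values())
        result[i] = 1 / total if total else 0.0
    return result


# Hypothetical star network: A is linked to B, C, and D.
star = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
bc = betweenness(star)
cc = closeness(star)
```

In the star example the hub A sits on the only shortest path of each of the three leaf pairs, so its betweenness is 3, and it is one hop from every other node, so its closeness is 1/3.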

Results and Discussion
In this section, two kinds of analysis results are provided. First, Section 3.1 presents an overview of the development of GTTs patents in China. The scale of collaborative GTTs patents and the growth trends of the various GTTs patents are demonstrated. Second, Section 3.2 provides the structure analysis of the GTTs collaboration network, including network evolution, network properties, key nodes, and link analysis.

Patent Number.
A search based on the IPC green inventory list shows that the total number of utility models for GTTs-related patents from 2007 to 2018 is 59,809, which includes 4,467 cooperative patents. Figure 2 presents two histograms of the selected patents for China's GTTs during the investigation period. During the study period, the total number of GTTs and the number of cooperative GTTs grew steadily. It is worth noting that the number of GTTs and cooperative GTTs increased significantly in 2012. Importantly, the number of approved cooperative GTTs remains relatively high. In general, the green transportation innovation activities represented by GTTs and cooperative GTTs are active during the investigation period, as shown especially by the significant growth of total GTTs.

Patent Collaboration Classification.
According to the classification principle of cooperative entities in Subsection 2.2.2, the 4,467 cooperative GTTs patents are divided into four leadership groups: type B (business organization lead), type C (individual lead), type U (university organization lead), and type I (institute lead). Results in Figure 3 show that type B and type C cooperative patent applications account for the majority of the total, while the numbers of type U and type I remain at a lower level. On the one hand, Figure 3(a) illustrates that the dominance of type C and type B changed significantly during the study period. Type C lead was still much larger than type B lead in the first two years (2008-2009). After 2011, the trend changed radically: type B accounts for more than 50% every year. On the other hand, type U and type I have no advantage in absolute quantity compared with type B and type C. Type I has a slightly higher proportion than type U. In summary, type B (business organization) has gradually grown into a significant leader in the field of cooperative GTTs.

Subsector Classification.
Subsection 2.2 describes the subsector classification of the transportation categories in the IPC green inventory; this subsection analyses the growth trends of cooperative GTTs patents in the different subsectors over time. Figure 4 shows the GTTs outputs of the various transportation subsectors during the investigation period. For the most part, the number of cooperative GTTs patents in the field of rail vehicles has led the other subsectors, and it continues to grow. Unexpectedly, there was a significant spike in the volume of approved GTTs for vehicles in general in 2012, which explains, from the subsector perspective, the sharp increase in 2012 shown in Figure 2. By contrast, GTTs in the field of marine vessel propulsion have remained at a low level, and green cooperative innovation activities in that subsector are not active.

Geographic Classification.
The cooperative patent data record the province of the first patent applicant. Patent data and geographic information are combined to form Figure 5, which represents the distribution of cooperative GTTs patents. As shown in Figure 5, the number of cooperative GTTs in most coastal provinces of China is far higher than that in other regions. Notably, the GTTs output in Beijing and Jiangsu is outstanding. The outstanding performance of green collaborative innovation activities in both places may benefit from the large number of local universities and enterprises. Such an innovation atmosphere is also conducive to the output of collaborative innovation.

Network Evolution.
GTTs patent cooperation in China is divided into three stages: S1 (2007-2010), S2 (2011-2014), and S3 (2015-2018). By constructing the GTTs cooperation network from the 4,467 cooperative patents in the three stages, the evolution of the GTTs cooperation network over time is presented. Figure 6 shows the patent collaboration networks in the three stages. Table 1 provides the key parameters obtained with the social network analysis method. Cooperative innovation is becoming more and more active, which indicates a steady evolution of the patent network. In detail, the numbers of nodes and links in the network are constantly increasing. Among the two main node types and two main link types, the proportions of B type nodes and B lead links are increasing, while the proportions of C type nodes and C lead links are decreasing. The results show the key role of B type nodes and B lead links in collaborative GTTs. The increase in average degree and average-weighted degree indicates that a single node is connected to more other nodes, and the GTTs collaboration network is becoming more closely knit. The increase in diameter and average path length suggests that the speed at which knowledge is transformed into technology between nodes may slow down and that the cost of conversion might increase.

Node and Link Analysis.
In addition to showing the evolution process of the network, the social network analysis method can also identify critical nodes and critical links in the network. Here, the two characteristics of node centrality and link weight are used for analysis. Table 2 provides the top ten nodes by degree centrality, betweenness centrality, and closeness centrality, respectively. The results show that State Grid ranks first on all three centrality indicators, which reflects the critical position of this entity in the GTTs network. Besides, many CRRC subsidiaries appear in the table, which shows that CRRC is also an important node of the network. Table 3 provides the top 10% of weighted links. The results show that B-B links occur most frequently, illustrating the close cooperation between business enterprises and the importance of this type of cooperation link in the GTTs network.
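The selection of the most heavily weighted links and the tally of their type pairs can be sketched as follows. The node names, their types, and the weights are hypothetical, and rounding the cut-off up to a whole number of links is an assumption of the sketch rather than a rule stated in the text.

```python
import math
from collections import Counter


def top_weighted_links(link_weights, fraction=0.10):
    """Return the heaviest links, keeping the given fraction (rounded up)."""
    ranked = sorted(link_weights.items(), key=lambda item: item[1], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


def link_type_counts(links, node_type):
    """Tally leader-follower type pairs such as 'B-B' or 'U-B'."""
    return Counter(f"{node_type[u]}-{node_type[v]}" for (u, v), _ in links)


# Hypothetical nodes with types B (business), C (individual), U (university).
node_type = {"E1": "B", "E2": "B", "E3": "B", "P1": "C", "U1": "U"}
weights = {
    ("E1", "E2"): 9,
    ("E1", "E3"): 7,
    ("U1", "E1"): 4,
    ("P1", "E2"): 1,
    ("E2", "E3"): 1,
}
# A larger fraction than 10% is used here only because the example is tiny.
top_links = top_weighted_links(weights, fraction=0.5)
type_counts = link_type_counts(top_links, node_type)
```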

Networks of Different Leadership Types.
In the previous two subsections, a collaboration network containing all four leading entity types was analysed. Another interesting question is what the structural characteristics are of the innovation collaboration networks led by each of the four entity types separately. Table 4 illustrates the attributes of the four networks. Figure 7 shows the networks of the four different leadership types.
As given in Table 4, the type C network has the most nodes, while the type B network has the most links. The type U network has the lowest percentage of its nodes, which indicates that this type of network has the highest willingness to cooperate. The result for the type C network is the opposite of that for the type U network.

From a network topology perspective, the type U network has the highest network density, which means that its members are the most closely connected. The type I network has the highest average degree and average-weighted degree, indicating that the nodes in this type of network have the best connectivity with each other. On the last three network characteristic indicators, the type B network has the highest values, which indicates that the spatial scale of this type of network is the largest. These two networks are the most complex on a physical scale. In general, the type B network and the type C network are the two most complex types of GTTs collaborative innovation networks.

Conclusions
GTTs are critical driving forces for the sustainable development of the transportation industry, in which technical collaboration plays an active role. This study investigated GTTs collaboration in China's transportation sector based on the social network analysis method and GTTs patent data. The following conclusions are offered.
(1) The collaboration patent data selected in this study are an effective indicator of GTTs' progress. The results show that cooperative GTTs continued to grow from 2007 to 2018. The growth of GTTs in the railway subindustry has been particularly marked. The geographic information contained in the patents reflects the substantial GTTs cooperative innovation activities in Beijing and Jiangsu.
(2) The SNA method is a simple and efficient way to understand the structure and characteristics of the technical cooperation network. The research framework proposed in this study is feasible. The results show that B type leadership has gradually become the main form of cooperation among different entities. State Grid and CRRC are the main nodes of cooperative GTTs. B-B is the main mode of GTTs collaboration.
(3) Some policy suggestions can also be derived from the results. The collaborative innovation of GTTs has maintained a high level. Each innovation entity plays a role in the innovation cooperation network. Organizations or individuals need to choose specific areas based on their abilities and decide where to lead or follow. The government should also formulate policies to actively guide organizations or individuals to participate in cooperative innovation activities.
This study mainly focuses on patent data on green transportation technologies, and it still has limitations. To measure and evaluate the innovation activities in the field of transportation comprehensively, data from more dimensions are needed, such as the literature review [40], green transportation research [41,42], and sustainable green technology inventions [26]. A broader range of data may further improve the robustness of the study.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Scientific Programming