Collaborative Production Structure of Knowledge Sharing Behavior in Internet Communities

1 Zhongshan Institute, University of Electronic Science and Technology of China, Guangdong 528402, China
2 South China University of Technology, Guangzhou 510640, China
3 College of Business Administration, Dongguan University of Technology, Guangdong 523808, China
4 Law School, Nankai University, Tianjin 300071, China
5 School of Business, Dalian University of Technology, Panjin 124221, China
6 China Europe International Business School, Shanghai 201206, China


Introduction
Peer production is a new mode of knowledge production that has recently emerged from software and content production. Online encyclopedias are one typical application of peer production. Their primary purpose is to accumulate and share knowledge collaboratively through the effort of large numbers of volunteers [1]. The most distinctive difference between traditional and online encyclopedias is that anyone can edit, search, or browse the content of online encyclopedias for free. Knowledge sharing occurs when internet users are motivated to take the time and effort to edit articles. Online encyclopedias facilitate collaborative document editing, which relies on the contributions of multiple authors in concurrent systems. Knowledge can then be viewed as a public good and knowledge sharing as a public good phenomenon [2]. Hence, the collective action of volunteers in creating and improving articles is a vital resource underlying community sustainability.
Online encyclopedias allow relationships to be established over a range of contribution actions such as creation and improvement [3]. Social network analysis provides a research paradigm for unraveling patterns of relationships among individuals. The different types of explicit and implicit relationships that arise from collaborative effort in online encyclopedias pave the way for network analysis. For example, Korfiatis et al. used social network measurements to evaluate contributions to Wikipedia [3]. Brandes et al. used network analysis to describe and analyze the collaboration among Wikipedia editors [4]. Iba et al. converted the edit flow among contributors into a temporal social network and then used a dynamic social network method to analyze the editing patterns of Wikipedia contributors [5]. In 2011, Silva et al. used a network method to analyze cross-citation on Wikipedia [6]. In 2013, Jankowski-Lorek et al. modeled administrator (admin) elections using multidimensional behavioral social networks derived from Wikipedia edit histories [7]. In 2010, Singh took a social network approach to investigating the impact of community-level network structure on member developer productivity in an OSS community [8].
The patterns, structure, and compositional values of relationship networks help the user understand the basic structure and properties of those networks and even explain their behavior. Some studies have shown that the network structure of relationships affects individuals' knowledge sharing behavior [9,10] and community performance [11]. In 2004, Brandes et al. showed that structural network indicators are correlated with the quality labels of their associated Wikipedia articles [4]. In 2010, Velden et al. found that different types of coauthor-linking patterns exist between author clusters, representing different forms of cooperative behavior on Wikipedia [12]. In 2013, Grewal et al. examined the effects of network embeddedness, that is, the nature of the relationships among projects and developers, on the success of open-source projects, another type of peer production in a similar environment [11]. However, most existing research has focused only on individual behavior. The addition of information to an existing network not only influences the network structure but can also be propagated throughout that structure [13]. For example, in 2013, Nastase and Strube considered how to automatically discover new relationships and determine their rate of formation from the existing network of categories and articles, in order to characterize the dynamics of network structure in online encyclopedias [13].
The network structures mentioned above are in general constructed by treating the entries as nodes and the citations among the entries as edges. The agents and ties in the network represent entities at various levels of collectivity and the different types of relationships among actors. Basic measurements, such as clustering coefficients, modular structures, density, cores, degree distribution, and betweenness, are then used to investigate the characteristics of entire networks. However, interesting questions arise regarding whether these network measures are related to individuals' contribution behavior in online encyclopedias and what specific overall characteristics these network structures have. Unlike in studies of Boolean network structures, heterogeneity is introduced here to analyze the collaborative actions in encyclopedias, and a heterogeneous network model is proposed as a means of studying the collaborative production structure in Baidu Encyclopedia. As a new measurement of network structure reflecting global network characteristics, a network spectrum and a quantum mapping are used to investigate these collaborative production structures. These transform the spectrum of the heterogeneous network structure into an energy spectrum in a quantum system.
This study makes two main innovative contributions: (1) heterogeneous network analysis is presented to realistically reflect the heterogeneity of these large numbers of volunteers; (2) a heterogeneous network spectrum is presented to investigate the characteristics of the network structure. In this way, the methods used in this paper can provide new insight into the collaborative production structures involved in creating and sharing knowledge in internet communities.

Research Methodology
The study was reviewed and approved by the Institutional Review Board, Zhongshan Institute, University of Electronic Science and Technology of China (Ethics Committee). This section provides a brief introduction to the network spectrum used for measurement and then introduces a heterogeneous network mapping model, which transforms the network structure into a quantum system.
2.1. Network Spectrum Method. In network analysis, various measures, such as size, density, diameter, average distance, degree and degree distribution (e.g., power-law degree distribution), clustering coefficient (also called transitivity), betweenness centrality, the small-world phenomenon, community structure, and fractals, have been developed to indicate the structure and characteristics of a network from different perspectives, all of which help the user understand and use the information on network structure [14][15][16]. These measures either look for clusters of individuals who are tightly connected to one another or look for sets of individuals whose patterns of relationships with the rest of the network are similar. Most topological features, such as network diameter, node degree, degree of robustness, the presence of cohesive clusters, long paths, and bottlenecks, and how random the network is, are related to the network spectrum. The spectrum of a finite network is by definition the spectrum of its adjacency matrix, that is, its set of eigenvalues together with their multiplicities [17]. The purpose of the network spectrum is to construct a continuous multidimensional representation of the network in which the coordinates of the individuals can be analyzed further to produce a variety of information about these structures and their relationships with the rest of the network [18]. Like the Fourier (or Laplace) transform, the spectral method can be viewed as a transform. Some properties of a network are conveniently represented in the topology domain, by a graph consisting of a set of nodes connected by a set of links, while other properties are more conveniently dealt with in the spectral domain, specified by the set of eigenvalues and eigenvectors obtained through spectral decomposition. Hence, spectral analysis of networks can extract a large amount of information about the topological structure and diffusion processes. In this way, the network spectrum can be proposed as a fingerprint of the network [19,20].
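As a concrete illustration of the definition above (the eigenvalues of the adjacency matrix together with their multiplicities), the following minimal sketch computes the spectrum of a small undirected network. The function name `adjacency_spectrum`, the rounding tolerance, and the toy star network are assumptions made for illustration only; they are not taken from the study's data or code.

```python
import numpy as np


def adjacency_spectrum(adj, decimals=8):
    """Return [(eigenvalue, multiplicity), ...] for a symmetric adjacency matrix."""
    a = np.asarray(adj, dtype=float)
    if not np.allclose(a, a.T):
        raise ValueError("expected an undirected (symmetric) network")
    # eigvalsh is meant for symmetric matrices: real eigenvalues, ascending order.
    eigenvalues = np.linalg.eigvalsh(a)
    # Round so that numerically equal eigenvalues are grouped; +0.0 removes -0.0.
    rounded = np.round(eigenvalues, decimals) + 0.0
    values, counts = np.unique(rounded, return_counts=True)
    return list(zip(values.tolist(), counts.tolist()))


# Toy network (illustrative): a star with hub 0 and three leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
spectrum = adjacency_spectrum(star)
for value, multiplicity in spectrum:
    print(f"eigenvalue {value:+.4f}  multiplicity {multiplicity}")
```

For this star the eigenvalue 0 appears twice and the two extreme eigenvalues are symmetric about zero, which is the kind of structural signature the spectrum is meant to expose.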

2.2. Heterogeneity. Collective action in online encyclopedias is what occurs when more than one individual is required to contribute to the improvement of an article. Intuitively, the content of an online encyclopedia is subject to social dilemmas during both its creation and its maintenance [21]. Large numbers of individuals are likely to enjoy the content without contributing. Too much of this free riding, especially during the early stages, can doom a voluntary collective action to failure [22]. However, some online encyclopedias, such as Wikipedia and Baidu Encyclopedia, have survived the startup problem and have almost achieved sustainability. Critical mass theory explains this phenomenon by reasoning that heterogeneity is the basic and key condition required for these collective actions to emerge [23]. Olson [21] and Russell [24] argued that group heterogeneity is favorable for collective action. If an interest group is heterogeneous, there may be some highly interested or highly resourceful people available to form a critical mass even when the mean interest or resource level is rather low. Thus, the heterogeneity of the population, specifically the number of such deviants and the extremity of their deviance, is one key to predicting the probability, extent, and effectiveness of collective action. Heterogeneity is also an effective condition for solving the biggest challenges in forecasting and fostering an online encyclopedia. Heterogeneity and an accelerating production function cause collective action to begin and remain sustainable. For this reason, the nodes representing the contributors and the edges between these nodes in the network model are all heterogeneous. A heterogeneous network model is therefore more appropriate for characterizing the relationships among these heterogeneous contributors.

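2.3. Quantum Mapping Model. The model of $N$ electrons whose behavior is governed by a Hamiltonian in a quantum system can be written as follows [25]:

$$\hat{H} = \sum_{n=1}^{N}\left[-\frac{\hbar^{2}\nabla_{n}^{2}}{2m} + U(\mathbf{r}_{n}) + \frac{1}{2}\sum_{m\neq n} V(\mathbf{r}_{n}-\mathbf{r}_{m})\right]. \tag{1}$$

Here, $-\hbar^{2}\nabla^{2}/2m$ is the electrons' kinetic energy, $V(\mathbf{r}_{n}-\mathbf{r}_{m})$ is the potential describing the interactions between electrons, and $U(\mathbf{r}_{n})$ is an external potential (including interactions with ions or nuclei, which may be considered stationary on most of the time scales relevant to electronic processes).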
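The fermionic field operators $\hat{\psi}^{+}(\mathbf{r})$ and $\hat{\psi}(\mathbf{r})$ create and annihilate fermions at position $\mathbf{r}$ and must satisfy anticommutation relations. They can be expanded in a basis set of states $\{\varphi_{\alpha}(\mathbf{r})\}$ as follows:

$$\hat{\psi}^{+}(\mathbf{r}) = \sum_{\alpha}\varphi_{\alpha}^{*}(\mathbf{r})\,\hat{c}_{\alpha}^{+}, \tag{2}$$

where $\hat{c}_{\alpha}$ ($\hat{c}_{\alpha}^{+}$) annihilates (creates) a fermion in the state $\varphi_{\alpha}(\mathbf{r})$. Clearly, these operators also obey fermionic anticommutation relations; that is, $\{\hat{c}_{\alpha},\hat{c}_{\beta}^{+}\}=\delta_{\alpha\beta}$ and $\{\hat{c}_{\alpha},\hat{c}_{\beta}\}=0$. In terms of the second quantization operators in formula (2), the Hamiltonian in formula (1) can be rewritten as follows:

$$\hat{H} = -\sum_{\alpha\beta} t_{\alpha\beta}\,\hat{c}_{\alpha}^{+}\hat{c}_{\beta} + \frac{1}{2}\sum_{\alpha\beta\gamma\delta} V_{\alpha\beta\gamma\delta}\,\hat{c}_{\alpha}^{+}\hat{c}_{\beta}^{+}\hat{c}_{\gamma}\hat{c}_{\delta}. \tag{3}$$

Here $t_{\alpha\beta} = -\int d^{3}r\,\varphi_{\alpha}^{*}(\mathbf{r})\left(-\hbar^{2}\nabla^{2}/2m + U(\mathbf{r})\right)\varphi_{\beta}(\mathbf{r})$, $V_{\alpha\beta\gamma\delta} = \int d^{3}r_{1}\int d^{3}r_{2}\,\varphi_{\alpha}^{*}(\mathbf{r}_{1})\varphi_{\beta}^{*}(\mathbf{r}_{2})V(\mathbf{r}_{1}-\mathbf{r}_{2})\varphi_{\gamma}(\mathbf{r}_{2})\varphi_{\delta}(\mathbf{r}_{1})$, and the labels $\alpha$, $\beta$, $\gamma$, and $\delta$ are taken to define the spin and the basis function. The Hückel model is one of the simplest approaches that avoids calculating $t_{\alpha\beta}$ and $V_{\alpha\beta\gamma\delta}$ directly from the finite basis set, in the context of molecular systems [26] and crystals. It uses the approximation [27] $V_{\alpha\beta\gamma\delta} = 0$, $\forall\,\alpha,\beta,\gamma,\delta$. Then, as with atoms in a molecule, formula (3) becomes

$$\hat{H} = -\sum_{ij\sigma} t_{ij}\,\hat{c}_{i\sigma}^{+}\hat{c}_{j\sigma}. \tag{4}$$

Here, $\hat{c}_{i\sigma}$ ($\hat{c}_{i\sigma}^{+}$) annihilates (creates) an electron with spin $\sigma$ in an orbital centered at site $i$. The standard notation is $t_{ii} = -\alpha_{i}$, $t_{ij} = -\beta_{ij}$ if site $i$ and site $j$ are connected by a chemical bond, and $t_{ij} = 0$ otherwise. Note that $\alpha_{i}$ refers to the species of atom, and the values of $\alpha_{i}$ will clearly differ between species; $\beta_{ij}$ depends on the species of each of the atoms between which the electrons move. In this way, formula (4) can be rewritten as follows:

$$\hat{H} = \sum_{i\sigma}\alpha_{i}\,\hat{c}_{i\sigma}^{+}\hat{c}_{i\sigma} + \sum_{\langle ij\rangle\sigma}\beta_{ij}\,\hat{c}_{i\sigma}^{+}\hat{c}_{j\sigma}. \tag{5}$$

Here, $\langle ij\rangle$ means that the sum is only over those pairs of atoms joined by a chemical bond. Note that a network with $N$ nodes is always described by an associated adjacency matrix $A = (a_{ij})_{N\times N}$, $a_{ij}\in\mathbb{R}^{+}$. A network with $N$ nodes can then be viewed as a large molecule with $N$ atoms: the nodes are considered atoms and the edges $\operatorname{sgn}(a_{ij})$ chemical bonds between atoms. The structure of the network can then be mapped into a quantum system using the Hamiltonian in formula (5); that is,

$$\hat{H} = \sum_{i\sigma}\alpha\,\hat{c}_{i\sigma}^{+}\hat{c}_{i\sigma} + \sum_{\langle ij\rangle\sigma}\hat{A}_{ij}\,\hat{c}_{i\sigma}^{+}\hat{c}_{j\sigma}. \tag{6}$$

Here $\hat{A}_{ij}$ represents the hopping integral between atoms and is the scaled element of the associated adjacency matrix. In this way, the structure of a network represented by its associated adjacency matrix is embedded in the Hamiltonian of a quantum system. The associated adjacency matrix of a network is then related to the Hamiltonian of the quantum system in formula (6); that is,

$$\hat{H} = \alpha\hat{I} + \hat{A}. \tag{7}$$

If it is assumed that $E$ is an eigenvalue of the Hamiltonian (i.e., a level of the energy spectrum) and the corresponding eigenvector (i.e., the quantum state) is $\psi(\mathbf{r})$ (also written as $|\psi\rangle$), then $\hat{A}\psi(\mathbf{r}) = (E-\alpha)\psi(\mathbf{r})$ with $\hat{H}\psi(\mathbf{r}) = E\psi(\mathbf{r})$. The energy spectrum of the quantum system can therefore be obtained by a parallel shift of the eigenvalues of the adjacency matrix.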
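The closing remark, that the energy spectrum is a parallel shift of the adjacency eigenvalues, can be checked numerically. The sketch below assumes a Hamiltonian of the form `alpha * I + beta * A` with a single on-site energy `alpha` shared by all nodes and a scale factor `beta`; the function name, both parameter values, and the small weighted matrix are hypothetical choices for illustration, not quantities reported in the paper.

```python
import numpy as np


def energy_spectrum(adj, alpha, beta=1.0):
    """Eigenvalues of H = alpha * I + beta * A for a symmetric weighted matrix A.

    alpha (uniform on-site energy) and beta (scale of the hopping terms) are
    illustrative assumptions, not values from the study.
    """
    a = np.asarray(adj, dtype=float)
    hamiltonian = alpha * np.eye(a.shape[0]) + beta * a
    return np.linalg.eigvalsh(hamiltonian)


# Hypothetical weighted (heterogeneous) network with three contributors.
weighted = np.array(
    [
        [0.0, 2.0, 1.0],
        [2.0, 0.0, 3.0],
        [1.0, 3.0, 0.0],
    ]
)
alpha = -1.5

plain = np.linalg.eigvalsh(weighted)         # spectrum of the adjacency matrix
shifted = energy_spectrum(weighted, alpha)   # spectrum of the mapped Hamiltonian

print("adjacency eigenvalues:", plain)
print("energy levels        :", shifted)
print("level spacings equal :", np.allclose(np.diff(plain), np.diff(shifted)))
```

Because a uniform shift leaves the differences between consecutive levels untouched, any statistic built on level spacings can be computed from the adjacency spectrum directly.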
Networks in which all nodes are identical are called homogeneous networks. Yang et al. used a similar mapping model to map homogeneous networks into quantum systems [28][29][30]. However, the nodes in the current network model cannot all be identical. Hence, the associated adjacency matrix of this heterogeneous network is a general matrix, and homogeneous networks with all identical nodes are just a special case of the model in the current paper. Energy spectra play an important role in quantum systems by reflecting the state of the system in a dimensionally reduced way. Random matrix theory may be used to understand the energy levels of complex quantum systems [31,32]. One of the most important concepts in random matrix theory is the nearest neighbor level spacing (NNLS) distribution. A general picture has emerged from experiments and theory: if the classical motion of a dynamical system is regular, the NNLS distribution of the corresponding quantum system follows a Poisson distribution; if the corresponding classical motion is chaotic, the NNLS distribution follows the Wigner-Dyson ensembles [33]. For this reason, the NNLS distribution of the quantum system is used here to reflect the dynamical properties of the corresponding classical system, that is, the collaborative production network structure.

Analytical Data
Undoubtedly, Wikipedia is the best-known example of a wiki application, and it has attracted considerable academic attention due to its popularity and unconventional operating mechanisms [34]. Baidu Encyclopedia, like Wikipedia, is also a typical wiki application. It is the largest online encyclopedia in China. However, Baidu Encyclopedia is a combination of a self-organized community and other types of organizations, while the Wikipedia community has been considered a self-organized community. Baidu Encyclopedia is also heavily self-censored in accordance with special regulations. Some lower-level positions are open to all volunteers, but higher-level administrative positions are given out by appointment. All articles written and edited by registered users must be reviewed by behind-the-scenes administrators before publication. Most significantly, there is no formal way to contact the administrators. For this reason, the current research focuses on knowledge sharing behavior in Baidu Encyclopedia.
Baidu Encyclopedia classifies all eligible articles and allocates every contributor a personal contribution space while recording that contributor's contribution history. Almost all articles are edited more than once by more than one contributor. Fortunately, Baidu Encyclopedia also keeps a revision history for every article. The revision history records detailed information for the article, such as the contributor, the time of each edit, and the content of each edit. Because knowledge contribution takes place when a contributor spends time editing an article, the collective cooperation structure can be extracted from the revision histories of articles. With the aid of category navigation, all revision histories of articles in the "physics" subclass and the "mathematics" subclass were chosen. The mined data are given as "Article-Edited Time-User" records. The global information on the data in the two subclasses is shown in Table 1.
In addition, the contributors' personal information was mined from their personal contributor pages. Personal information includes career and educational attainment. A good part of the heterogeneity of users can then be explained by these two factors (Figures 1 and 2).
Contributors' cooperation production networks can be produced by transforming the original two-mode networks into single-mode networks, in which contributors are the nodes and the weight of an edge is the number of articles that the two contributors have both edited. The two subclasses were integrated into a larger subclass and an original network was constructed. This collective cooperation production network was produced solely by the behavior of contributors; that is, it did not contain any declared information regarding social relationships but was based completely on edit histories.
To preserve the temporal information, the total data were then divided into seven sections by quarter to study the dynamic relationship. The network structure of the first section and its maximum connected subnetwork are shown in Figure 3.
Notice that many solitary nodes surround a central maximum connected subnetwork. These solitary nodes represent inactive contributors who edited an article by chance. It is therefore more reasonable to adopt the maximum connected subnetwork to represent each section of the cooperation network. The present work focuses on the spectral features of the maximum connected subnetwork in each section of the cooperation structure.
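The transformation from the two-mode (article, contributor) data to a single-mode contributor network can be sketched as follows. This is a minimal sketch that assumes each mined record is an (article, edited time, user) triple, as the "Article-Edited Time-User" format suggests; the function name `project_to_contributors` and the sample records are hypothetical and are not drawn from the Baidu Encyclopedia data.

```python
from collections import defaultdict
from itertools import combinations


def project_to_contributors(records):
    """Project (article, edited_time, user) records onto a contributor network.

    Returns {(user_a, user_b): weight}, where weight is the number of articles
    that both contributors edited. Repeated edits of one article by the same
    user count once.
    """
    editors_of = defaultdict(set)
    for article, _edited_time, user in records:
        editors_of[article].add(user)

    weights = defaultdict(int)
    for users in editors_of.values():
        # Every pair of co-editors of this article gains one shared article.
        for user_a, user_b in combinations(sorted(users), 2):
            weights[(user_a, user_b)] += 1
    return dict(weights)


# Hypothetical records, for illustration only.
records = [
    ("Entropy", "2012-01-03", "u1"),
    ("Entropy", "2012-01-09", "u2"),
    ("Entropy", "2012-02-11", "u1"),
    ("Prime number", "2012-01-20", "u1"),
    ("Prime number", "2012-03-02", "u2"),
    ("Prime number", "2012-03-05", "u3"),
]
edges = project_to_contributors(records)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Keeping the weights, rather than reducing every edge to 0 or 1, is what distinguishes this weighted projection from a Boolean network.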
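The two preprocessing steps described here, assigning edits to quarterly sections and keeping only the maximum connected subnetwork, can be sketched as below. The helper names `quarter_of` and `largest_component` and the toy edge list are assumptions for illustration; the paper does not specify how its sections or subnetworks were computed.

```python
from collections import defaultdict, deque
from datetime import date


def quarter_of(day):
    """Return (year, quarter) for a date, with quarters numbered 1 to 4."""
    return (day.year, (day.month - 1) // 3 + 1)


def largest_component(edges):
    """Return the node set of the largest connected component.

    edges is an iterable of (u, v) pairs. Solitary contributors never appear
    in an edge, so they are excluded automatically.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen, best = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical section: one three-node cluster and one separate pair.
section_edges = [("a", "b"), ("b", "c"), ("d", "e")]
core = largest_component(section_edges)
print(sorted(core), quarter_of(date(2012, 5, 17)))
```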

Spectrum Analysis of Collaborative Production Network Structure
After the cooperation network structure is mapped into the corresponding quantum system, the spectrum of the network can be obtained from the associated adjacency matrix. The nearest neighbor level spacing distribution is then used to identify the corresponding classical dynamics of the complex network. Note that the scaling in the mapping does not change this distribution. Thus the spectrum of the adjacency matrix can be used to analyze the nearest neighbor level spacing distribution directly. The process of determining the NNLS distribution from the spectrum of the adjacency matrix is briefly reviewed as follows [28].
Let the spectrum of the adjacency matrix be denoted by $\{E_{i}\mid i = 1, 2, \ldots, N\}$, where $N$ is the total number of energy levels. The initial spectrum must be mapped into a new variable called the "unfolded energy level" to ensure that the distances between the energy levels are expressed in common units.
Dividing the accumulative function $G(E)$ into two components, that is, a smooth term $G_{\mathrm{av}}(E)$ and a fluctuation term $G_{f}(E)$, the unfolded energy levels can be obtained as $\xi_{i} = G_{\mathrm{av}}(E_{i})$, where the smooth term is obtained by fitting a polynomial to the accumulative function. The nearest neighbor level spacing was defined as follows:

$$s_{i} = \kappa\,\frac{\xi_{i+1}-\xi_{i}}{\sigma_{\varepsilon}}, \quad i = 1, 2, \ldots, N-1. \tag{9}$$

Here $\kappa$ is a factor that keeps the values of the NNLS within a conventional range so that the fitting results are reliable, and $\sigma_{\varepsilon} = \sqrt{\sum_{i=1}^{N-1}\varepsilon_{i}^{2}/(N-1)}$ with $\varepsilon_{i} = \xi_{i+1}-\xi_{i}$. Because the Brody distribution indicates that the corresponding classical complex system is in a soft chaotic state and reduces to the Poisson and Wigner distributions at the extreme values of its structure parameter, the Brody distribution was used to describe the NNLS as follows [31]:

$$P(s) = \frac{\beta}{\eta}\,s^{\beta-1}\exp\left[-\left(\frac{s}{\eta}\right)^{\beta}\right]. \tag{10}$$

Here $\beta$ is the Brody parameter. The Brody distribution reduces to the Poisson and Wigner distributions for $\beta = 1$ and $\beta = 2$, respectively. To determine whether the NNLS follows this distribution, an accumulative function is introduced to transform the Brody distribution in formula (10): $Q(s) = \int_{0}^{s}P(x)\,dx$. The Brody distribution can then be transformed using the following formula:

$$\ln R(s) = \ln\left(\ln\left(\frac{1}{1-Q(s)}\right)\right) = \beta\ln s - \beta\ln\eta. \tag{11}$$

This produces reliable values of the parameters $\beta$ and $\eta$ by linear fitting in log-log coordinates. The process described above was performed on each collaborative network section, and a high-order polynomial was used to fit the accumulative function to produce the smooth terms. The results for the first and second sections are presented in Figure 4.
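The procedure described above (unfold the spectrum with a polynomial fit, take nearest neighbor spacings, then fit the linearized distribution) can be sketched as follows. Several choices here are assumptions rather than the authors' settings: the polynomial degree of 9, the plotting positions `(i - 0.5) / n` used for the empirical accumulative function, the omission of the scaling factor on the spacings (rescaling the spacings shifts the fitted line but leaves its slope unchanged), and fitting over all points instead of only the main linear trend region. The random symmetric matrix is synthetic test data.

```python
import numpy as np


def unfold(levels, degree=9):
    """Unfolded levels: a polynomial fit of the staircase N(E_i) = i, evaluated at E_i."""
    e = np.sort(np.asarray(levels, dtype=float))
    staircase = np.arange(1, e.size + 1)
    x = (e - e.mean()) / e.std()  # zero mean, unit variance
    smooth = np.polynomial.Polynomial.fit(x, staircase, degree)
    return smooth(x)


def spacings(unfolded):
    """Nearest neighbor spacings; non-positive values are dropped before taking logs."""
    s = np.diff(unfolded)
    return s[s > 0]


def fit_brody(s):
    """Least-squares fit of ln ln(1 / (1 - Q(s))) = beta * ln(s) - beta * ln(eta)."""
    s = np.sort(np.asarray(s, dtype=float))
    n = s.size
    q = (np.arange(1, n + 1) - 0.5) / n  # empirical Q(s), strictly inside (0, 1)
    y = np.log(np.log(1.0 / (1.0 - q)))
    slope, intercept = np.polyfit(np.log(s), y, 1)
    beta = slope
    eta = np.exp(-intercept / beta)
    return beta, eta


rng = np.random.default_rng(0)

# Check the fit on synthetic spacings drawn with known beta = 2 and eta = 1.
sample = rng.weibull(2.0, size=5000)
beta, eta = fit_brody(sample)
print(f"recovered beta = {beta:.2f}, eta = {eta:.2f}")

# Apply the whole pipeline to the spectrum of a synthetic symmetric matrix.
m = rng.normal(size=(200, 200))
levels = np.linalg.eigvalsh((m + m.T) / 2)
xi = unfold(levels)
beta_net, eta_net = fit_brody(spacings(xi))
print(f"synthetic matrix: beta = {beta_net:.2f}, eta = {eta_net:.2f}")
```

A fitted slope near 1 points toward Poisson-like spacings and a slope near 2 toward Wigner-like spacings, which is how the fitted parameters are read in the remainder of this section.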
As in the method used by Yang et al. for determining the values of the parameters with formula (11), the present work focused on whether the NNLS distribution follows a Brody distribution in the main linear trend region [28]. The fitting results were checked by comparing the theoretical results with the actual ones in the main linear trend regions. The fitting results for the first section are shown in Figure 5.
The other six pairs of structure parameters were obtained using the same process. The structure parameters of the seven consecutive network sections are summarized in Table 2.
If a quantum system is in a chaotic state, the corresponding states in classical dynamics are collective motion modes, just like phonons in regular lattices [28]. In this way, the state induced by the structure of a heterogeneous network is attributable to collective behavior rather than to the individual properties of each atom. The dynamics on these networks should be in a state of collective chaos. Table 2 shows that the main structure parameters in the first, second, and fifth sections were all approximately 1, those in the other network sections were clearly less than 1, and the fitted parameter in the fourth network section was significantly smaller than in the other network sections.
According to the relationship between a quantum system and the corresponding classical system, these parameters mean that the collective state induced by the network structure in the first, second, and fifth sections was an ordered state, while the collective state induced by the other network sections was a softly chaotic state. In addition, the structure parameters of the heterogeneous networks were almost entirely consistent with those of the homogeneous networks. Figure 6 also shows that only in the fourth network section was there a significant difference between the two kinds of networks.
To further study the difference between the two kinds of network structure, a measure of network heterogeneity based on the variance of the connectivity, initially proposed by Horvath [35], is discussed here. The heterogeneities of the network structure sequence are shown in Figure 7. Notice that the heterogeneity in the fourth network section was significantly more pronounced than in the other sections. Further, the largest difference in heterogeneity between the two kinds of networks also appears in the fourth network section. This shows that heterogeneity may have an important influence on the collective state induced by the structure.
From the analysis given above, the NNLS distributions of the collaborative networks followed the Poisson or Brody distribution. However, the NNLS distributions of biological networks follow the Wigner form [36]. There are thus significant differences between the collaborative networks and biological networks. Because the dynamic collaborative network sections are completely attributable to voluntary contribution behavior, all of these voluntary contributors can be viewed as agents. In socioeconomic complex systems, it is necessary to emphasize that it is the agents' intelligence and suitability that give the entire system a collective order. Coincidentally, this phenomenon may be verified by spectrum analysis of collaborative relationships.
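A heterogeneity measure based on the variance of the connectivity can be sketched as below. The specific form used here, the standard deviation of the connectivity divided by its mean, is an assumption about the cited measure, as is the use of the population variance and of weighted degree as "connectivity"; the function names and the two toy networks are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def connectivity(edges):
    """Weighted degree of every node, from {(u, v): weight} edges."""
    k = {}
    for (u, v), weight in edges.items():
        k[u] = k.get(u, 0) + weight
        k[v] = k.get(v, 0) + weight
    return k


def heterogeneity(edges):
    """Assumed form: sqrt(variance of connectivity) / mean connectivity."""
    k = list(connectivity(edges).values())
    return sqrt(pvariance(k)) / mean(k)


# Illustrative networks: a triangle (every node alike) and a star (one hub).
triangle = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1}
star = {("hub", "a"): 1, ("hub", "b"): 1, ("hub", "c"): 1}

h_triangle = heterogeneity(triangle)
h_star = heterogeneity(star)
print(f"triangle: {h_triangle:.4f}, star: {h_star:.4f}")
```

A network whose nodes all have the same connectivity scores zero, and the score grows as a few contributors account for a larger share of the links.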

Conclusion
Online encyclopedias have provided a platform for volunteers with different interests, motivations, and knowledge to create and share knowledge. These sharing behaviors involve potential social interactions. Many individuals participate in these internet communities to make professional contributions. Identifying the cooperative structure in the process of knowledge sharing in these internet communities would help both academics and practitioners gain insight into knowledge sharing processes in online encyclopedias. Heterogeneous social network analyses were used to explore the collective cooperation structures across various contributors in the process of creating and sharing knowledge on Baidu Encyclopedia. By introducing the Hückel model used in quantum systems, the cooperation network was mapped into a quantum system, and the spectra of these networks were then considered in the quantum system. The results showed that the collective state induced by the network structure in the first, second, and fifth sections was an ordered state, but the collective state induced by the other network sections was a softly chaotic state. These collective states induced by the structure of the collaborative network were completely different from the individual dynamic states of the agents in the networks; that is, the classical dynamical process on a network can display collective order even if some nodes are in chaotic states, and vice versa.
Due to the openness of Baidu Encyclopedia, objective data were mined from two representative subclasses. The heterogeneity of contributors was determined quantitatively using the contributors' career and educational attainment. Spectrum analysis can be used to analyze the overall properties of these network structures and helps us investigate some of their properties in the spectral domain. Energy spectra play an important role in quantum systems by reflecting the state of the system. The Hückel model was introduced to map the network into the quantum system. Spectrum analyses were used to provide an abstract but complete means of measuring structures.
In interpreting the findings, a few limitations should be noted. First, our analysis of heterogeneity focuses on the contributors' career and educational attainment. More factors could be used to explain the heterogeneity of contributors. Second, to study the dynamic relationship, we divided the total data into seven sections by quarter. The main reason is that the quarter is a common unit for dividing time series. In practice, the total data could be divided into different sections based on different standards. Third, the relationship between the heterogeneity and the collective state induced by the structure remains to be examined; richer insights may be derived from additional studies.
In summary, these successful online communities provide a fruitful basis for understanding social mechanisms and contribution behavior. More research on knowledge-sharing behavior and the cooperation network structure should be considered. In particular, the heterogeneous network mapping model presented in this paper can be used for any kind of network, and the NNLS distribution can be used further to study the dynamical characteristics of other kinds of networks.

Figure 3: First section cooperation network and its maximum connected subnetwork.
First, the cumulative density function was defined as $G(E) = N\int_{-\infty}^{E}\rho(x)\,dx$, where $\rho(x)$ is the density function of the initial energy level spectrum. It can be shown that $G(E)\,|_{E_{i+1} > E \geq E_{i}} = i$. Then the cumulative density function was normalized so that the mean was set to zero and the variance was set equal to one.

Figure 4: Cumulative density functions of the spectra in the first and second sections. The fitting curves in the fourth section are 24th-order polynomial functions, while the fitting curves in the other sections are 19th-order polynomial functions.

Figure 5: Determination of the values of the parameters for the first network section in the main region (a). In the main region, the theoretical results fit the actual ones closely.

Figure 7: The heterogeneities in network section sequence.

Table 1: Information on data based on the revision history of articles.

Table 2: The fitted parameters of the Brody distribution obtained by linear fitting in log-log coordinates.