Research on the Effect of DPSO in Team Selection Optimization under the Background of Big Data

Team selection optimization is the foundation of enterprise strategy realization and is of great significance for maximizing the effectiveness of organizational decision-making. The study of team selection and team foundation has therefore long been a hot topic. With the rapid development of information technology, big data has become a significant technical means and has played a key role in many studies. Combining big data with team selection is a frontier of team selection research and has great practical significance. Taking strategic equilibrium matching and dynamic gain as association constraints and revenue maximization as the optimization goal, a Hadoop enterprise information management platform is constructed to discover the external environment, organizational culture, and strategic objectives of the enterprise and to discover the potential of the customer. To promote the renewal of production and cooperation modes, a team selection optimization model based on DPSO is built. A simulation experiment is used to qualitatively analyze the main parameters of particle swarm optimization. By comparing the iterative results of the genetic algorithm, the ordinary particle swarm algorithm, and the discrete particle swarm algorithm, it is found that the DPSO algorithm is effective and preferred for team selection in the context of big data.


Introduction
1.1. The Purpose and Significance of the Research on Team Selection Optimization under the Background of Big Data. The team selection optimization problem is a decision problem of finding the optimal team combination. Solving it requires analyzing the construction status of the enterprise team on the basis of reasonable evaluation and technical analysis and proposing a solution, from both economic and strategic perspectives, that suits work on a reasonable resource platform. Amid increasingly fierce market and data competition, how to choose the most promising team members from a large number of candidate teams and apply them to the implementation of project cooperation is one of the important factors for the success of an enterprise or group.
1.2. Related Literature Review. Nowadays, big data resources have been introduced into team selection optimization by many scholars. Scholars at home and abroad have studied team optimization and selection from the perspectives of establishing team member selection index systems, qualitative analysis, and quantitative analysis. Research on member selection indices is mainly focused on qualitative analysis. Liu et al. design a method for selecting virtual team members on a cloud service platform and put forward a selection index system for team members' comprehensive performance from two aspects, individual performance and cooperative performance [1]. Jiang et al. focus on comprehensive quality indicators and professional skill indicators when selecting cross-functional team members [2]. Liu et al. propose an optimal selection method for virtual network team members based on members' comprehensive performance. Through an analysis of the virtual collaborative network process, a selection index system based on the comprehensive performance of members was established and solved with the strength Pareto evolutionary algorithm, and the validity of the algorithm was verified [3]. Quantitative research on the optimal selection of team members mainly lies in decision analysis. Xiaohong et al. have studied the selection of virtual innovation team members and put forward a three-dimensional index system and a multilevel extension comprehensive evaluation model to further study the problem of virtual team members [4]. Yanping and Pannen have constructed a team member selection method based on indicator expectation and cross-functionality. Through an analysis of the comprehensive quality index and professional skill index of the team members, an objective function of maximum satisfaction was established, and the model was optimized [5]. Hsu et al. proposed screening working members based on the team members' skills and performance. Considering that a work team is a complex nonlinear system, they study the complexity and diversity of team member selection using an agent-based model (ABM) and verify it with a project instance [6]. Starineca and Voronchuk outline the importance of team ability in the selection of project team members, summarize the task as the ability set of the project team members, describe the process of competency-based selection, and apply the analytic hierarchy process to make competency-based selection more reasonable [7].
We can conclude from the literature above that most research so far emphasizes the qualitative analysis of member selection indices and the quantitative study of the optimal selection of team members. However, few studies have introduced big data resources into team selection optimization. The big data era has brought opportunities and challenges for enterprise cooperation, and enterprises should pay attention to the impact of big data when choosing engineering partners and selecting team members. This is exactly the purpose of this paper.
1.3. Key Research and Contribution in this Paper. In previous studies, the genetic algorithm and the analytic hierarchy process were used to evaluate the effectiveness of team selection. In today's big data era, corporate teams generate huge amounts of data every day, and genetic algorithms can no longer solve such large-scale computational problems; they are prone to premature convergence. The process of determining weights in the analytic hierarchy process is strongly subjective, there is relatively little research on big data platforms in this field, and few factors and data are considered in index selection. To solve the problem of team selection and optimization, an optimization model is constructed based on the equilibrium matching and dynamic gain of enterprise members and is solved using the genetic algorithm and the discrete particle swarm algorithm, based on the data of the Hadoop enterprise information management platform. The optimal DPSO parameters are determined, and the genetic algorithm, the traditional particle swarm algorithm, and the discrete particle swarm algorithm are compared iteratively to find the best algorithm for team selection.
This paper has seven parts. The first is the introduction. Section 2 introduces the relationship between enterprise information management optimization and team selection under the background of big data and the establishment of a Hadoop-based big data analysis platform for enterprise information management. Section 3 presents the theoretical basis, including the genetic algorithm and the PSO algorithm. Section 4 demonstrates the principles of optimization in team selection and matching, namely the strategic balanced matching principle and the principle of resource gain effect. Section 5 presents the establishment of the team selection optimization model under the background of big data, and Section 6 compares several algorithms through a case analysis to find the most effective one. The final part presents the preferred optimization method for team selection in the context of big data and the conclusion.

Team Selection of Big Data Analysis Platform Framework Based on Hadoop
The big data era has brought opportunities and challenges for people and has a great impact on the existing enterprise management system and team building model. How to catch up with the wave of the big data era and advance research on team selection and optimization has become a focus of attention for domestic and foreign enterprises and researchers.
In the era of big data, huge amounts of data are produced every day, and this data resource has promoted research in various fields. In "Big Data Age" (2013), Schonberger proposed that the era of big data will have a huge impact on our thinking model, management model, and business model. He believes that big data affects team selection optimization in a company's subsequent construction process mainly by influencing the company's data management library, knowledge management library, and corporate decision-making environment [8]. Yunhai and Lanqiu focus on the impact of big data information on e-commerce, take cloud computing technology into consideration, study big data processing technology, and find that the development of new technology will promote the upgrading and optimization of enterprise management teams and decisions [9]. Dan et al. study the role of big data in the enterprise team and propose that big data promotes innovation in the enterprise's business model mainly by discovering the external environment of the enterprise, discovering the potential value of the customer, and promoting the renewal of the mode of production and cooperation [10]. The development and utilization of big data technology requires huge human, financial, and material resources, and it is difficult for a single company to bear all development needs. Therefore, it is necessary to establish a reasonable team alliance to achieve the construction goals and the sharing of data resources. The general steps of the team selection mode under the big data background are shown in Figure 1.

The Significance of the Research on Team Selection and Optimization under the Background of Big Data.
Under the background of big data, the boundaries between enterprises are becoming vaguer, and team building is developing in a nonlinear and irregular way. Through cooperation and sharing, enterprises can expand their living space and development space, help create common values, and promote their own development [11]. Big data is affecting the internal and external environment of enterprises, including partner selection, strategic decision-making, and operation mode. In a market where data is developing rapidly, it is of great significance for enterprises to make rational use of data resources and data analysis techniques to achieve team selection and optimization.

The Framework of Hadoop Large Data Analysis Platform.
Hadoop is an open-source software framework whose core idea is to apply simple computing models and the distributed processing capabilities of computer clusters to handle big data. It relies mainly on clusters of thousands of inexpensive servers, and on cooperative work within the cluster, to complete big data computation. Its high reliability, high scalability, high efficiency, high fault tolerance, and low cost make Hadoop the most popular big data analysis system [12]. HDFS is the distributed file system of Hadoop. Its high fault tolerance and high throughput enable Hadoop clusters to process large data sets on a large number of inexpensive machines. HDFS is the foundation of Hadoop, and the follow-up experiments and the enterprise management data used for team selection are also built and obtained on the basis of HDFS. The typical Hadoop big data platform architecture is shown in Figure 2.
The application data in Hadoop system often comes from a variety of data sources, including running business system databases, unstructured data generated by a large number of third-party Web applications, professional databases, and IT system data based on the Hadoop architecture.

The Design of Analysis Platform for Enterprise Information Management Big Data Based on Hadoop
2.3.1. Characteristics of Hadoop Enterprise Information Management Big Data Platform Architecture. Under the trend of big data, the management core of enterprises is gradually shifting to data assets. Data has become a new source of value, and different enterprises create, use, and share data with others. In the process of team member selection, the data assets of enterprises need to be evaluated first. The data sources of the Hadoop enterprise information management big data platform are diverse. The platform needs to organize the existing information resources effectively; carry out information processing around enterprise strategy, management, production, and so on; and provide the required information for all levels of the enterprise.
The big data system platform needs to integrate with BI (business intelligence) systems and KM (knowledge management) systems to describe data through metadata and improve the availability of enterprise information assets. The Hadoop-based enterprise information management big data platform needs to gather data from different platforms and enterprises and then organize and manage it uniformly. The Hadoop-based enterprise information platform is an independent system rather than a component or subsystem of the original system. It needs to extract data from the original system and establish an independent data warehouse; it needs to be able to compute over large-scale data while keeping the system responsive; and it needs to interact with the original system and provide it with decision support. The focus of Hadoop building energy management platform is to analyze and mine historical data of cross-system enterprise development.

Hadoop Architecture of Large Data Platform for Enterprise Information Management.
Through an analysis of the connotation of the enterprise information management system and the typical Hadoop architecture, this paper proposes a Hadoop-based big data platform architecture for enterprise information management, as shown in Figure 3.
The platform architecture is logically divided into 5 layers. The bottom layer is the data collection layer, which mainly contains the various types of acquisition equipment deployed in the enterprise. These devices aggregate data into collection terminals. A collection terminal saves data for a period of time using embedded storage technology and then sends data packets to the remote server on schedule over the network to complete the preliminary collection of real-time data [13]. All collected data are input to the general data storage layer, and after the corresponding structured processing, the general enterprise information management database is constructed. The general storage layer isolates the sensor network from the Hadoop platform, which ensures the operation of the existing system and improves the data quality and acquisition efficiency of the Hadoop data source. At the bottom of the Hadoop data storage layer is a Hadoop distributed file system deployed on the Hadoop cluster.
Based on this, a data warehouse is built. The whole Hadoop platform provides unified configuration management for each layer in the longitudinal direction, so as to achieve "high cohesion" and "low coupling" among the components of the system.

Function Design of Hadoop Enterprise Information Management Big Data Platform.
The function design of the Hadoop big data platform for enterprise information management targets the application layer of the platform. This paper divides the platform functions into four aspects, as shown in Figure 4.
(1) Data Display. After the enterprise management information is analyzed and managed, enterprise operation data is visualized online in the form of charts. Data visualization on the Hadoop platform should focus on displaying the historical management data of multienterprise teams. Thanks to the big data support of the Hadoop platform, statistical reports give enterprises more horizontal references, and analysis reports with higher data dimensions should be provided. In addition, big data processing tasks submitted to the Hadoop platform should expose visual status information that is convenient for maintenance personnel and users.
(2) Data Mining and Analysis. The core function of the Hadoop platform is data mining and analysis of enterprise management information big data. The purpose is to add value to the platform data. Enterprise information management data mining mainly focuses on five aspects: enterprise strategic target information, enterprise financial status information, enterprise technology level information, enterprise organizational culture information, and enterprise market capability information. In addition, the platform must build generalized and highly reusable model libraries to manage data mining models and algorithms in a unified manner.
(3) Cluster Management. Good system operation and data security in the cluster are key to the Hadoop platform. The platform configures and manages the components and nodes in the Hadoop system through the Hadoop Zookeeper component; builds a parameter library and configuration management system to implement plugin management for application layer functional components; provides unified UI pages for cluster monitoring; and performs cluster load balancing control, fault detection, and system security management. A scheduling strategy is used to schedule and manage the tasks in the cluster.
(4) Openness. The Hadoop enterprise information management big data system is an independent system that coexists with existing energy management platforms at all levels and logically shares a common data storage layer with the existing systems. The Hadoop platform's ability to analyze and process big data can be opened to platforms at all levels, which gives full play to the value of the platform.

Theoretical Basis
3.1. Genetic Algorithm. The genetic algorithm is a method of random global search and optimization that uses the evolutionary and genetic theories of Darwin and Mendel to imitate the mechanism of natural biological evolution. Following the principle of "survival of the fittest," genetic operators such as replication, crossover, mutation, dominance, and inversion are used to optimize the performance of the final population and to ensure that good individuals are inherited by the next generation. The algorithm has the advantages of group search and intrinsic heuristic random search, and it does not easily fall into a local optimum [14].
3.1.1. Basic Operation. The basic operation of the genetic algorithm takes all individuals of the biological population as research objects and sets an appropriate fitness function to meet the quality requirements of the genetically optimized individual. The selection operator, crossover operator, and mutation operator are used to carry out the next step of genetic operation [15].
(1) Selection. According to the individuals' fitness values, good individuals are selected from the previous generation to be inherited.
(2) Crossover. As an important genetic operation, randomly paired individuals exchange chromosome segments to exchange information; the probability of occurrence is the crossover probability.
(3) Variation. An allele is transformed by changing the individual's gene value; the probability of occurrence is the mutation probability. The variation probability is generally taken in the range 0.0001-0.1.

Algorithm Flow
Step 1. Determine the number of parameters, select the encoding method, and randomly generate the initial population for each parameter, with each individual represented as a gene-encoded chromosome. Determine the size of the population and the maximum evolutionary generation; see Figure 5.
Step 2. Select an appropriate objective function and fitness function to calculate individual fitness.
Step 3. Determine whether the maximum evolutionary generation has been reached. If so, the algorithm terminates; otherwise, go to Step 4.
Step 4. Select. According to each individual's fitness, select some excellent individuals from the current population to be inherited by the next generation.
Step 5. Termination judgment. If the termination condition is satisfied, the individual with the greatest fitness obtained during evolution is output as the best solution, and the operation terminates. Otherwise, iteratively execute Step 2 to Step 5.
Optimization by the genetic algorithm can, to a certain extent, avoid local optima while keeping a fast convergence speed of the population. At the same time, it works well in combination with intelligent algorithm systems for optimizing parameters [16].
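The selection, crossover, and variation operators described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function name `run_ga`, the binary encoding, roulette-wheel selection, single-point crossover, and all default parameter values are assumptions made for illustration, and the fitness function is assumed to be non-negative.

```python
import random

def run_ga(fitness, n_bits, pop_size=30, generations=50,
           p_cross=0.8, p_mut=0.01, seed=0):
    """Toy binary-coded genetic algorithm (illustrative assumptions only).

    fitness must return a non-negative number; pop_size is assumed even
    and n_bits is assumed to be at least 2.
    """
    rng = random.Random(seed)
    # Random initial population of bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        # Selection: roulette wheel, probability proportional to fitness.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        nxt = []
        # Crossover: single-point exchange between consecutive parents.
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            nxt += [a, b]
        # Variation (mutation): flip each bit with a small probability.
        for ind in nxt:
            for j in range(n_bits):
                if rng.random() < p_mut:
                    ind[j] = 1 - ind[j]
        pop = nxt
        # Remember the best individual seen so far.
        cand = max(pop, key=fitness)
        if fitness(cand) > fitness(best):
            best = cand[:]
    return best
```

For example, `run_ga(sum, n_bits=12)` searches for a 12-bit string with as many ones as possible. The loop stops only on the maximum generation count, which corresponds to the termination check in Step 3.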
Particle Swarm Optimization (PSO). Particle swarm optimization takes its inspiration from a model of bird flocking and uses it to solve optimization problems. Each solution of the optimization problem is a bird in the search space, called a "particle." All particles have a fitness value determined by the problem being optimized, and each particle has a velocity that determines the direction and distance of its flight. The particles follow the current optimal particle through the solution space and move toward good areas according to their fitness to the environment, and each particle's velocity is adjusted according to its own flight experience and that of its companions.
The basic PSO procedure consists of five steps, as shown in Figure 7. PSO is initialized with a group of random particles (random solutions). Particle $i$ is represented as $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ and has $D$ dimensions, and its velocity is $v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The optimal solution is then found by iteration. In each iteration, a particle updates itself by tracking two "extremes." The first is the optimal solution found by the particle itself, called the individual extremum $P_{best}$. The other is the best solution found so far by the whole population, the global extremum $g_{best}$. Alternatively, only part of the population may serve as the particle's neighborhood, in which case the best value among all neighbors is the local extremum. The iteration generally terminates when the maximum number of iterations is reached or when the optimal position found by the entire population satisfies a predetermined minimum fitness threshold.
When searching for the two optimal values, particle $i$ updates its velocity $v_i$ and its position $x_i$ in dimension $d$ using the following formulas:
$$v_{id}^{t+1} = w v_{id}^{t} + C_1 \mathrm{rand}_1 (P_{best,d} - x_{id}^{t}) + C_2 \mathrm{rand}_2 (g_{best,d} - x_{id}^{t}),$$
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$$
where $v_{id}^{t}$ is the current velocity of particle $i$ in dimension $d$, whose absolute value is less than or equal to $v_{max}$; $x_{id}^{t}$ is the current position of particle $i$ in dimension $d$; and $v_{id}^{t+1}$ and $x_{id}^{t+1}$ are the values at the next moment.
$P_{best,d}$ and $g_{best,d}$ are defined above. $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random numbers strictly between zero and one. $w$ is an inertia constant, and $C_1$ and $C_2$ are two acceleration constants.
Particle swarm optimization can accomplish the optimization task well, and its main point is modifying both the flight velocity and the position in each generation. The velocity update formula consists of three parts. The part $w v_i$ keeps particles moving with their original flight inertia. The part with $C_1 \mathrm{rand}_1$ gives particles a tendency to fly toward their own best position. The part with $C_2 \mathrm{rand}_2$ gives particles a tendency to fly toward the current global optimum. The values of $w$, $C_1$, $C_2$, and $V_{max}$ have a great impact on the results and efficiency of the algorithm.
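As a minimal sketch of the three-part velocity update and the position update described above, the following routine minimizes a test function with the basic continuous PSO. The function name `pso_minimize`, the search bound, and the default values of `w`, `c1`, `c2`, and `v_max` are illustrative assumptions, not parameter settings taken from this paper.

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, v_max=1.0, bound=5.0, seed=0):
    """Basic continuous PSO for minimizing f (illustrative defaults)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-bound, bound) for _ in range(dim)]
         for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)]
         for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]          # individual extremum of each particle
    p_val = [f(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p_best[g][:], p_val[g]  # global extremum
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel = (w * v[i][d]                              # inertia part
                       + c1 * r1 * (p_best[i][d] - x[i][d])     # own best
                       + c2 * r2 * (g_best[d] - x[i][d]))       # global best
                v[i][d] = max(-v_max, min(v_max, vel))          # |v| <= v_max
                x[i][d] += v[i][d]                              # position update
            val = f(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
                if val < g_val:
                    g_best, g_val = x[i][:], val
    return g_best, g_val
```

Calling `pso_minimize(lambda p: sum(c * c for c in p), dim=2)` applies the sketch to the sphere function, a common test function whose global minimum is at the origin. The run here stops on a fixed iteration count; a fitness threshold could be added as a second termination condition.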
The basic PSO algorithm requires only a few user-determined parameters and is simple to operate and easy to use. However, it also has problems, such as easily falling into local extreme points. Therefore, some scholars have studied the parameters in formula (4) and made corresponding adjustments [19], and a series of hybrid particle swarm algorithms have been proposed to solve optimization problems encountered in practical applications [20], such as the adaptive PSO algorithm [21] and the hybrid PSO algorithm based on genetic algorithm ideas [22].

Discrete Particle Swarm Optimization (D-PSO).
The basic particle swarm optimization algorithm is used to solve continuous optimization problems. However, many practical engineering problems are described as combinatorial optimization problems, so Kennedy and Eberhart proposed the binary discrete particle swarm algorithm [23]. They proposed a probabilistic model of binary decisions in which true equals 1 and false equals 0.

$$P(x_{id}^{t+1} = 1) = f(x_{id}^{t}, v_{id}^{t}, p_{i,best}, g_{best}). \quad (5)$$
In a discrete binary space, the particle position $x_{id}^{t+1}$ takes the value 1 or 0. The parameter $v_{id}^{t}$ determines a probability threshold for this choice: if $v_{id}$ is higher, the particle is more likely to choose 1; if $v_{id}$ is lower, the particle is more likely to choose 0. The threshold must lie in [0, 1], and the sigmoid function meets this requirement: $\mathrm{sigmoid}(v_{id}) = 1/(1 + \exp(-v_{id}))$. The sigmoid function is often used in neural network theory. In addition, $v_{id}$ must be limited to a bounded range so that $\mathrm{sigmoid}(v_{id})$ does not come too close to 0.0 or 1.0.
In this way, the chance of changing the bit value $x_{id}$ is increased, and the search is less likely to fall into a local extremum.
The adjustment of the particle position is similar to that of the basic PSO algorithm (1). To move toward the optimal location found by the particle itself and by the group, the adjusted formula in the discrete particle swarm algorithm is as follows:
$$\text{if } p_{id}^{t+1} < \mathrm{sigmoid}(v_{id}^{t+1}), \text{ then } x_{id}^{t+1} = 1; \text{ else } x_{id}^{t+1} = 0, \quad (6)$$
where $p_{id}^{t+1}$ is a random number between 0.0 and 1.0 and the other parameters are the same as in the basic PSO.
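The binary position rule can be sketched in a few lines. The helper names `sigmoid` and `update_bits` are hypothetical, and the clamp `v_max=4.0` is an assumed value chosen only to keep the sigmoid away from 0.0 and 1.0; the paper does not state the bound it uses.

```python
import math
import random

def sigmoid(v):
    """Map a velocity to a probability threshold in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def update_bits(v_row, v_max=4.0, rng=random):
    """Binary PSO position update for one particle.

    v_row holds the already-updated velocities, one per dimension.
    v_max is an assumed clamp, not a value given in the text.
    """
    bits = []
    for v in v_row:
        v = max(-v_max, min(v_max, v))   # limit the velocity amplitude
        p = rng.random()                 # random number in [0.0, 1.0)
        bits.append(1 if p < sigmoid(v) else 0)
    return bits
```

With the assumed clamp of 4.0, the probability of choosing 1 stays between about 0.018 and 0.982, so every bit keeps a small chance of flipping in each iteration.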
PSO has the disadvantages of being easily trapped in a local optimum when solving multimodal functions and of poor local search in the later stage. In DPSO, several detecting particles are randomly selected from the population, and the detecting particles use the newly proposed velocity formula to search along spiral trajectories. Together, the detecting particles and the common particles carry out a high-performance search. DPSO tries to improve the performance of PSO in terms of swarm diversity, speed of convergence, and the ability to escape local optima.

The Principle of the Optimization in Team Selection and Matching
There are obvious differences in knowledge level, management mode, and professional background among the various corporate teams involved in collaborative cooperation. To ensure the orderly and efficient completion of cooperative projects, several principles should be taken into consideration when selecting team members. The main principles include goal identity, knowledge professionalism, knowledge complementarity, communication, and effectiveness [24]. The process of selecting team members in the context of big data is shown in Figure 8. From the team subgroups of different professional functions, and according to the overall strength and focus of the team, the most suitable team members are selected to collaborate on the project. Depending on the overall task and the overall level of each member, the task can be decomposed, and the expected goal can be achieved effectively.
4.1. Strategic Balanced Matching Principle. The principle of strategic equilibrium matching means that the strategic contributions of the selected team members are balanced in all dimensions and meet the strategic needs of the team as far as possible [25]. The matching principle is as follows. Suppose that the strategic goal of the team is $B = \{(x_1, u_1), (x_2, u_2), \ldots, (x_n, u_n)\}$, where the ordered pair $(x_i, u_i)$ indicates that the team has a fuzzy demand $x_i$ for strategic need $u_i$. The strategic contribution of candidate $i$ to the team is $G_i = \{(y_{i,1}, u_1), (y_{i,2}, u_2), \ldots, (y_{i,n}, u_n)\}$. Suppose that the number of chosen team members is $g$ and that $\pi$ represents a fuzzy synthesis operator, which combines the contributions of the $g$ selected members into their overall contribution $G_{total}$. Strategic equilibrium matching aims to make $G_{total}$ and $B$ as close to each other as possible. This degree of similarity is called the strategic equilibrium matching degree, denoted $zlz$ [26]. The degree of closeness follows improved fuzzy theory, using $\rho_q(A, B) = 1 - d_q(A, B)$ to express the closeness between fuzzy variables $A$ and $B$, where $d_q(A, B)$ is a distance between them; when $q = 2$, $d_q(A, B)$ is the weighted Euclidean distance [27]. The concept of approach degree proposed in this paper has directional requirements: when the overall strategic contribution of the selected team members is higher than the overall strategic needs of the team, it promotes the strategic development of the enterprise team; when it is lower than the overall strategic needs of the team, there is only a slight boost. To make up for this shortcoming of the existing closeness concept, we introduce the above approach degree and the below approach degree.
In the above approach degree, let $f(A, B) = \sum_{i=1}^{n} (\mu_A(x_i) - \mu_B(x_i))$. If $f(A, B) \geq 0$, then $\rho_q(A, B)$ is the above approach degree of $A$ to $B$, with proximity value $1 - d_q(A, B)$.
In the below approach degree, the case $f(A, B) < 0$ applies, and $\rho_q(A, B)$ is the below approach degree of $A$ to $B$. The above approach degree takes values in $[0, 1]$, and the below approach degree takes values in $[-1, 0]$. Use $\overline{zlz}$ to indicate the above approach degree, that is, the top strategic equilibrium matching degree; use $\underline{zlz}$ to indicate the below approach degree, that is, the below strategic equilibrium matching degree. The strategic equilibrium matching principle treats these two cases separately.
The Principle of Resource Gain Effect.
The principle of resource gain effect concerns the increased benefit generated by a structural change in resource allocation. It has two aspects: the first is how to maximize the increased benefit generated by a structural change in resource allocation; the second is how to minimize the input cost of that structural change while output remains stable. The team selection optimization problem corresponds to the first case, that is, with personnel input held stable, how to build a team that maximizes the increased benefit to the corporation. Therefore, it is reasonable to take the resource gain effect as one of the principles. The resource gain efficiency function generally needs to be determined comprehensively from the actual situation of the enterprise and a combination of factors. The gain effect is expressed as $R_{t+1} = \psi(R_t)$, where $R_t$ denotes the resources in period $t$, $R_{t+1}$ denotes the resources in period $t + 1$ after gains, and $\psi$ is the gain function. The gain function can be linear or nonlinear and may give different resource gains at different times [28]. This paper uses a linear function to describe $\psi$.
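Returning to the directional approach degree of the strategic balanced matching principle, the following sketch shows one way it could be computed. Several details are assumptions: the distance is taken as $d_2(A, B) = \sqrt{\sum_i w_i (\mu_A(x_i) - \mu_B(x_i))^2}$ with weights summing to 1, the below approach degree is taken as the negated closeness so that it falls in $[-1, 0]$, and the function name `approach_degree` is hypothetical.

```python
import math

def approach_degree(mu_a, mu_b, weights=None):
    """Signed approach degree of fuzzy set A to fuzzy set B (sketch).

    mu_a, mu_b: membership values in [0, 1] on the same strategic needs.
    weights: assumed non-negative and summing to 1 (equal by default).
    Returns a value in [0, 1] when A is at or above B overall, and a
    value in [-1, 0] when A is below B overall (assumed sign convention).
    """
    n = len(mu_a)
    if weights is None:
        weights = [1.0 / n] * n
    # Assumed weighted Euclidean distance (q = 2).
    d = math.sqrt(sum(w * (a - b) ** 2
                      for w, a, b in zip(weights, mu_a, mu_b)))
    # Direction indicator f(A, B) = sum of membership differences.
    f = sum(a - b for a, b in zip(mu_a, mu_b))
    closeness = 1.0 - d
    return closeness if f >= 0 else -closeness
```

For instance, a team contribution of `[0.8, 0.9]` against a demand of `[0.6, 0.7]` lies above the demand and yields a positive degree, while swapping the two arguments yields the same magnitude with a negative sign.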

Strategic Balanced Matching Principle and Associated Hypothesis of the Principle of Resource Gain Efficiency.
Assume that there is a certain correlation between the strategic balanced matching principle and the principle of resource gain efficiency in the enterprise team, that is, the finally chosen team members obtain the maximum resource gain when they achieve a good strategic equilibrium matching degree. The functional relationship between the two principles is $R_{t+1} = \psi(R_t, zlz)$. Determining the strategic balanced matching principle usually involves considering how both the above approach degree and the below approach degree affect the enterprise team.
Setting the value interval zlz, 1 or −1, zlz and using this interval as a basis for combination optimization.Because it is impossible to achieve a complete sense of the combination of matches, so there is R t+1 = ψ zlz, 1 or −1, zlz .We introduce the resource gain factor δ to indicate that when strategic balanced matching is in zlz, 1 or −1, zlz , resources will be δ% gain in the next period.There exists as follows: When the value of zlz and zlz are different, it expresses that the enterprise team finally chooses different strategic balanced matching intervals.At this point, the corresponding resource gain coefficient will also be different.We assume that the resource gain coefficient is the same in each iteration period.
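As a rough illustration of the gain hypothesis, the following Python sketch applies a $\delta\%$ gain to the resources only when the matching degree falls inside one of the two intervals. The function name, the argument names, and the choice to leave resources unchanged outside the intervals are assumptions made for this example; the exact formula of the paper is not reproduced here.

```python
def next_period_resources(r_t, zlz, upper_threshold, lower_threshold, delta_percent):
    """Return R_{t+1} from R_t under a linear gain (illustrative assumption).

    upper_threshold plays the role of the above approach degree (in [0, 1]);
    lower_threshold plays the role of the below approach degree (in [-1, 0]).
    """
    in_upper_interval = upper_threshold <= zlz <= 1.0
    in_lower_interval = -1.0 <= zlz <= lower_threshold
    if in_upper_interval or in_lower_interval:
        # Resources gain delta percent in the next period.
        return r_t * (1.0 + delta_percent / 100.0)
    # Assumption: no gain when the matching degree is outside both intervals.
    return r_t
```

For instance, calling `next_period_resources(100.0, 0.8, 0.6, -0.6, 5.0)` applies a 5% gain because 0.8 lies in the interval [0.6, 1].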

The Establishment of Team Selection Optimization Model under the Background of Big Data
5.1. Algorithm Design. There are not many adjusted parameters in the discrete particle swarm, and this simplicity gives the DPSO algorithm a better global search capability. However, the setting of parameters is still one of the significant factors affecting the solution performance of the algorithm.
The selection of reasonable parameters has a very important influence on the solution accuracy, the solution speed, and the convergence of the optimization problem. The parameters in the DPSO algorithm are not independent of each other, and the correlations and couplings between them are relatively strong. Therefore, setting the parameters of the algorithm is itself a complex optimization problem [14].
We use a simulation experiment to qualitatively analyze the main parameters of the particle swarm algorithm. The particle swarm optimization algorithm in (6) is used to solve for the global minimum of the test function. The value of one parameter is then changed within its allowable range, and we observe how the fitness value of the test function changes as that parameter varies. The specific test steps are as follows.
Step 1. Initialize the population of particles and randomly generate the initial position $x_i$ and speed $v_i$ of each particle. The speed in dimension $d$ is $v_{id} = \mathrm{rand} \times V_{max}$, where rand is a random number between −1 and 1.
Step 2. Set the parameters of the algorithm and of the Griewank test function according to the values given in Table 1, and change the value of the parameter under study from small to large at certain intervals within its permissible range. Each time the value is changed, the algorithm is executed once while the other parameter values remain unchanged.
Step 3. Calculate the fitness value of each particle, and obtain the individual historical optimum $p_{best}$ of each particle and the global optimum $g_{best}$.
Step 4. Compare the fitness value of each particle and update the individual optimum and the global optimum.
Step 5. Update the speed and position of each particle.
Step 6. Set the iteration count $t = t + 1$; repeat Steps 3-5 until the termination iteration condition is satisfied.
Step 7. Return to Step 2 and repeat Steps 2-6 until the termination loop condition is satisfied.

This paper conducts a qualitative analysis of the five main parameters, which are $m$, $C_1$, $C_2$, $\omega$, and $V_{max}$.
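The test steps above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it assumes the standard inertia-weight velocity update for continuous PSO (equation (6) is not reproduced in this section), and all default values (`dim`, `x_bound`, `iters`, and the parameter settings) are illustrative choices taken from the ranges discussed in the text rather than the values of Table 1.

```python
import numpy as np

def griewank(x):
    """Griewank test function; its global minimum is 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.shape[-1] + 1)
    return (1.0 + np.sum(x ** 2, axis=-1) / 4000.0
            - np.prod(np.cos(x / np.sqrt(i)), axis=-1))

def pso(fitness, dim=10, m=30, c1=2.0, c2=2.0, w=0.95, v_max=40.0,
        x_bound=600.0, iters=200, seed=0):
    """Minimize `fitness`; returns (best position, best value, history)."""
    rng = np.random.default_rng(seed)
    # Step 1: random positions; speed v_id = rand * V_max, rand in [-1, 1].
    x = rng.uniform(-x_bound, x_bound, size=(m, dim))
    v = rng.uniform(-1.0, 1.0, size=(m, dim)) * v_max
    # Step 3: initial fitness, individual bests, and global best.
    p_best = x.copy()
    p_val = fitness(x)
    g_idx = int(np.argmin(p_val))
    g_best, g_val = p_best[g_idx].copy(), p_val[g_idx]
    history = [g_val]
    for _ in range(iters):  # Step 6: iterate until the budget is used up.
        # Step 5: update speed (clipped to V_max) and position.
        r1 = rng.random((m, dim))
        r2 = rng.random((m, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, -x_bound, x_bound)
        # Steps 3-4: re-evaluate and update individual and global optima.
        val = fitness(x)
        improved = val < p_val
        p_best[improved] = x[improved]
        p_val[improved] = val[improved]
        g_idx = int(np.argmin(p_val))
        if p_val[g_idx] < g_val:
            g_best, g_val = p_best[g_idx].copy(), p_val[g_idx]
        history.append(g_val)
    return g_best, g_val, history
```

Steps 2 and 7 then correspond to an outer loop over one parameter, for example calling `pso(griewank, m=m)` for `m` in `(10, 30, 50)` while every other argument keeps its value, and comparing the returned `history` curves.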

The Effect of Population Size m on Algorithm Performance. The impact of the population size $m$ on the DPSO algorithm is mainly reflected in the convergence speed and the solution accuracy. We set $m$ to 10, 30, and 50 in turn, run the DPSO optimization on the Griewank test function for each population size, and observe how the optimal solution found for the test function changes as $m$ increases. The results are shown in Figure 9. As can be seen from the results, the larger the value of $m$, the slower the convergence rate and the higher the solution accuracy. For the Griewank function, once $m$ exceeds 50, its value has little effect on the accuracy of the algorithm. A larger $m$ means more particle evaluations each time DPSO runs, so the amount of calculation increases and the convergence speed decreases.
If $m$ is small, the algorithm easily falls into a local optimum. Therefore, a population size between 30 and 50 meets the requirements.

As Figure 11 shows, when the maximum speed $V_{max}$ for the Griewank function is between 30 and 50, the highest solution speed is achieved.

Influence of Inertia Weight ω on Algorithm Performance. The inertia weight is the coefficient in the speed update formula of the particle swarm algorithm that represents the particle's memory of its own speed. The larger $\omega$ is, the more strongly the particle maintains its own speed, the stronger its tendency to search new areas, and the stronger its global optimization ability. Conversely, the smaller $\omega$ is, the more easily the particles fall into a local optimum. This paper mainly tests the solution accuracy and convergence trend under different fixed $\omega$ values and under a linearly decreasing $\omega$. The linearly decreasing $\omega$ value is calculated as follows: Among them, $\omega_{max} = 1.4$ and $\omega_{min} = 0$. For each value, 20 tests were performed and the global optimal fitness values were averaged. The test results are shown in Figure 12. The average optimal fitness values of the function under different inertia weights are shown in Table 2.
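For reference, a commonly used linearly decreasing schedule is $\omega(t) = \omega_{max} - (\omega_{max} - \omega_{min})\, t / t_{max}$. The short Python sketch below implements this schedule; it is offered as an assumption about the form of the schedule, since the formula itself is not reproduced in this section, and the names `t_max`, `w_max`, and `w_min` are chosen for the example.

```python
def linear_inertia(t, t_max, w_max, w_min):
    """Inertia weight at iteration t, decreasing linearly from w_max to w_min.

    Assumes 0 <= t <= t_max and t_max > 0.
    """
    return w_max - (w_max - w_min) * t / t_max
```

Inside a particle swarm loop, `linear_inertia(t, t_max, w_max, w_min)` would replace the fixed weight in the speed update at iteration `t`, so that early iterations favor global exploration and later iterations favor local search.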
It can be concluded that, with a linearly decreasing $\omega$, the test function converges to the global extremum but requires more iterations, and the convergence is clearly slower than with a fixed $\omega$ value. As the iteration progresses, $\omega$ decreases linearly, the local search ability becomes stronger and stronger, and the particle population searches more and more locally. In Figure 12, the linear decrement of $\omega$ does not show obvious advantages, but this is because the fixed $\omega$ value taken in the experiment is an empirical value obtained after repeated trials. The optimal average fitness value of the Griewank function appears at $\omega = 0.95$.

Team Selection Optimization Model Establishment. There are many influencing factors in the selection of team members, such as project benefits, project work hours, and the ability level of the project team. The large amount of uncertainty contained in project work hours and in the ability level of the project team further increases the complexity of project portfolio decision-making. Therefore, a new method for dealing with uncertainty is needed to study project portfolio selection and personnel allocation. Let $t$ ($t = 1, \ldots, T$) denote the execution periods of a cooperation project, and let $N_i$ denote the number of candidate team members for each project $i$. Let $r_{t,i,k}$ denote the demand of project $i$ for resource $k$ in period $t$, $v_{t,i}$ the value of the return that can be obtained by realizing the project, $R_{t,k}$ the total constraint on resource $k$ in period $t$, and $f(v_t)$ the income of period $t$. Define $x_{t,i}$ as a 0-1 decision variable: when item $i$ is selected in period $t$, its value is 1; otherwise, it is 0.
Among them, (14) represents the maximization of the profit in period $t$, (15) is the resource constraint in period $t$, and (16) and (17) represent the constraints of strategic equilibrium matching and resource gain. When the team coalition decision-maker gives the upper and lower strategic equilibrium matching intervals and the resource gain coefficient, the model is simplified according to formula (18), where $A_{t-1}$ represents the set of project members selected in period $t - 1$ that are extended to period $t$, and $A_{t-2}$ represents the set of project members that are extended to the t = 2 period in the project members selected in d.
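To make the roles of the symbols concrete, the following Python sketch evaluates a candidate selection for a single period. It assumes a linear profit $\sum_i v_{t,i} x_{t,i}$ and linear resource constraints $\sum_i r_{t,i,k} x_{t,i} \le R_{t,k}$; these forms, and the function names, are assumptions for illustration only, because formulas (14)-(18) are not reproduced in this section. The matching and gain constraints are left out.

```python
import numpy as np

def period_profit(x_t, v_t):
    """Profit of period t for a 0-1 selection vector x_t (assumed linear)."""
    return float(np.dot(x_t, v_t))

def satisfies_resources(x_t, r_t, R_t):
    """Check the resource constraints of period t.

    r_t has shape (n_items, n_resources); R_t has shape (n_resources,).
    """
    used = np.asarray(x_t) @ np.asarray(r_t)  # total demand per resource k
    return bool(np.all(used <= np.asarray(R_t)))
```

A search procedure such as DPSO would call these two functions on each candidate vector, keeping only feasible candidates or penalizing infeasible ones.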

Example Analysis
In this paper, a matrix of order 100 is randomly generated as the resource gain matrix of a 100-person organization. Selecting a team of 70, 50, or 20 people from this organization is equivalent to finding the largest principal submatrix of the corresponding order. Then, a random matrix of order 50 is generated as the resource gain matrix of a 50-member organization, from which teams of 35, 25, and 15 people are selected, respectively. The above selections are calculated using the genetic algorithm, the ordinary particle swarm optimization algorithm, and the discrete particle swarm optimization algorithm. The relationship between the resource gain values of the three algorithms and the number of iterations is shown in Figures 13 and 14.
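The principal submatrix formulation can be illustrated with a small Python sketch. It assumes that the "largest" principal submatrix is the one whose entries have the largest sum, which is an interpretation made for this example; the exhaustive search shown here is only practical for tiny organizations and is not one of the three algorithms compared in the paper.

```python
import itertools
import numpy as np

def team_gain(gain_matrix, members):
    """Sum of the principal submatrix indexed by the chosen members."""
    idx = np.asarray(members)
    return float(gain_matrix[np.ix_(idx, idx)].sum())

def best_team_exhaustive(gain_matrix, m):
    """Try every m-member team (feasible only for small n)."""
    n = gain_matrix.shape[0]
    return max(itertools.combinations(range(n), m),
               key=lambda team: team_gain(gain_matrix, team))
```

For a 100-person organization the number of candidate teams is far too large for exhaustive search, which is why heuristic methods such as GA, PSO, and DPSO are used; `team_gain` could then serve as their fitness function.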
As Figures 13 and 14 show, the team's resource gain value increases with the number of iterations. The resource gain reaches its maximum after about 90 iterations of the GA algorithm, after about 80 iterations of the PSO algorithm, and after about 60 iterations of the DPSO algorithm, which indicates that the DPSO algorithm has a stronger convergence ability. By comparison, the DPSO algorithm has a better optimization effect than the GA and PSO algorithms.
To further illustrate the problem, this paper runs particle swarm optimization (PSO) and discrete particle swarm optimization (DPSO) 100 times each; the average gains obtained by the two algorithms are shown in Table 3.
From the above example, we can see that, for the problem of selecting m members from n members to establish a team, the discrete particle swarm algorithm is better than the traditional particle swarm algorithm when m < n/2, but this advantage is not obvious when m is close to n.

Conclusion
The selection of team members is an important way to increase team resource gains and one of the most important decisions of a corporation, and it has been a hot research topic in recent years. So far, however, most research has emphasized the qualitative analysis of member selection indices and the quantitative study of the optimal selection of team members, and few studies have introduced big data resources into team selection optimization.
The innovation of this paper is the analysis of the impact of big data on choosing engineering partners and selecting team members, which is of practical significance. In accordance with the principles of strategic equilibrium matching and resource gain for team members, we applied a genetic algorithm, a particle swarm optimization algorithm, and a discrete particle swarm optimization algorithm to the team selection optimization problem. A case study demonstrates the advantages of discrete particle swarm optimization in team member selection for the problem of selecting m members from n members to form a team when m < n/2. It follows that, in this case, the discrete particle swarm optimization algorithm is the preferred optimization method for team selection in the background of big data.
The study in this paper is far from complete. It is based on the hypothesis that a single team is selected from a given pool of members to achieve the optimum under the principles of strategic equilibrium matching and resource gain; in reality, however, a corporation needs to select more than one team from a given pool of members to optimize its overall objective. Thus, in the next study, we will discuss methods of selecting one or more teams in the background of big data in order to analyze the problem of team selection optimization thoroughly.

Figure 1: Team selection mode in the context of big data.

Figure 2: Frame composition of large data platform based on Hadoop.

Figure 3: Platform architecture diagram of big data of enterprise management information based on Hadoop.

Figure 4: Function design diagram of enterprise information management big data Hadoop platform.

Figure 6: The basic principle of particle swarm optimization.

Figure 8: Formation of team member selection process.

Figure 9: The effect of population size on algorithm performance.

Figure 10: The influence of learning factors on algorithm performance.

Figure 11: The effect of maximum speed on algorithm performance.

Particle swarm optimization (PSO) is an evolutionary computation technology based on swarm intelligence, proposed by Eberhart and Kennedy [17,18]. A swarm intelligence system can produce unpredictable group behavior by simulating local information; examples are the ant colony algorithm and particle swarm optimization. The former simulates the food gathering process of an ant colony and has been successfully applied to many discrete optimization problems.

3.2.1. PSO Algorithm. Particle swarm optimization (PSO) is a stochastic optimization algorithm based on swarm intelligence. It is inspired by the predation behavior of birds and imitates this behavior. Imagine the following scene: a flock of birds searches for food at random, and there is only one piece of food in the area. None of the birds knows where the food is, but each knows how far away it is, so the easiest way to find the food is to search the area around the bird nearest to it, as shown in Figure

Table 1: The setting of the parameters of the DPSO algorithm and test function.

5.1.2. The Influence of Learning Factors C 1 and C 2 on the Performance of the Algorithm. Reasonable values of $C_1$ and $C_2$ effectively balance the particle's tendency to move toward its individual best and toward the global optimum, which improves the accuracy of the solution. This paper tests three conditions: changing $C_1$ and $C_2$ at the same time, changing only $C_1$, and changing only $C_2$. The results of the experiments are shown in Figure 10. From the simulation results, it can be concluded that the test function is difficult to solve to a true global optimum when the values of $C_1$ and $C_2$ are too large or too small. When $C_1$ and $C_2$ are small, the particles cannot effectively follow the individual and global extremes and may even fall into a local optimum in areas far from the optimal solution. When $C_1$ and $C_2$ are large, the learning ability of the particles is strong and they quickly approach the individual and global extremes, but they also easily overshoot the global extreme, which reduces the accuracy of the solution. When the value of $C_1$ is less than 2, it has little effect on Griewank. When the values of $C_1$ and $C_2$ are between 1.5 and 2, better results are obtained.

5.1.3. The Effect of Maximum Speed V max on Algorithm Performance. When analyzing the maximum speed $V_{max}$, we set different values for $V_{max}$ and observe the convergence trend of the test function under DPSO optimization for each value. The result is presented in Figure 11.

Table 2: The average optimal fitness value of the function under different inertia weights.

Table 3: Comparison between PSO algorithm and DPSO algorithm.
Note: average value A represents the average value of the optimal value obtained through the PSO algorithm; average value B represents the average value of the optimal value obtained through the DPSO algorithm. 50-35 represents a team of 35 from a 50-person organization. The rest have similar meanings.