Understanding Service Providers’ Competency in Knowledge-Intensive Crowdsourcing Platforms: An LDA Approach



Introduction
Facilitated by highly developed information technologies, knowledge-intensive crowdsourcing (KIC) taps into creative and innovative fields (e.g., design), changing the traditional way in which business is conducted [1]. For example, in the design industry, the penetration of technology, especially the Internet and smartphone apps, has changed how design is practiced, produced, and traded. Given the critical role it plays in today's knowledge-based economy, KIC is considered one of the most promising domains of crowdsourcing [2,3]. The KIC platform operates as a two- or multisided market, meaning that each side of the market derives externalities from the participation of the other side, which is called the network effect [4,5]. Owing to this effect, the KIC platform experiences tremendous growth in its user base and becomes a large, complex network. In this network, a large number of open crowdsourcing service providers (SPs) constitute the backbone of the KIC platform. Moreover, from the resource-based view, the network effect allows the SPs to be converted into critical resources that create a continuous competitive edge for the platform [6] and bring certain benefits. As a vital connection between the platform and service requesters, the SPs help the KIC platform run a wide range of businesses and services catering to numerous end customers of different types and attain business success in their areas of expertise. In this process, they meet the service requesters' demand. High-quality and competent SPs, in particular, can perform knowledge-based tasks well and generate satisfying outcomes. For the SPs, delivering services with committed quality and performance to achieve sustainable business growth is a concrete demonstration of competency, which enhances service requesters' satisfaction with and trust in the platform [7].
In turn, service requesters who previously had a good shopping or interactive experience are more likely to engage with the platform on an ongoing basis. By delivering anticipated services to their customers on the platform, the SPs enable the platform to attract more customers, generate more traffic, and capture more market share, so that the platform can remain competitive in the market [8,9].
Despite the certain benefits the SPs have brought to the platform and service requesters, some potential risks in terms of service quality and platform health may arise from the crowdsourcing activities. For instance, some SPs may prove precarious, due to the fact that the supplier database consists of complex and previously unknown crowds [10].
There may also be potential hazards such as cheating, manipulating task outputs, or extracting sensitive information from crowdsourcing systems [10,11]. Even so, the KIC platform still weakens its input controls and simplifies its registration procedures to attract SPs. Additionally, the relationship between the platform and the SPs is not the classic principal-agent relationship (i.e., the platform does not hire these SPs), which offers SPs flexibility in their working time and schedules [5]. Accordingly, concerns about the uncertainties of the operation process, including SPs' availability, service awareness, financial and intellectual property risks, and privacy risks, are growing, which may, in turn, jeopardize the platform's reputation and affect its healthy operation [10,11]. Under these circumstances, it is imperative for the platform to take the necessary actions to identify those SPs who are not competent for knowledge-intensive tasks.
Given the above benefits and risks, it is the responsibility of the platform, as an indispensable manager and regulator of the KIC activities, to manage the SPs on the platform. To this end, conducting a competency analysis and understanding the SPs' competency is an effective way for the platform to fulfill this responsibility and to gain a clear view of the SPs' performance standards and expectations [12]. On the one hand, the competency analysis offers a way for the platform to construct a competency evaluation framework for the SPs [12,13] and to differentiate high performance from middle and low performance [14]. On the other hand, the competency analysis helps the SPs to achieve certain goals under the standardized framework developed by the platform. Therefore, we attempt to associate the competency analysis with the context of KIC.
In practice, we have noted that some KIC platforms have taken measures to facilitate the SPs' performance. According to our survey, some platforms in China release online leaderboards based on different criteria, such as bidding price and daily or total sales. Also, the SPs' reputation scores for service quality, speed, and attitude are visible on their homepages. However, considering only reputation, total sales volume, or platform score gives a biased picture. Other factors that may determine the SPs' competency are not taken into consideration, such as entrepreneurial experience [15], communication ability [16], and innovativeness [17,18]. This shows that practitioners have not paid enough attention to those aspects and that a comprehensive understanding of the SPs' competency is still lacking. Therefore, both the theoretical background and the practical applications motivate us to investigate the SPs' competency (i.e., the detailed components of competency that are required to be competent in the SPs' business in KIC). To this end, this paper aims to address the following question: What constitutes the SPs' competency in the context of KIC, and to which competency criteria should practitioners pay more attention when assessing the SPs' competency and performance?
To understand the SPs' competency, we aim to explore and identify competency criteria (i.e., the competency factors and dimensions). We analyze online interview records posted by several KIC platforms in China, which include successful SPs' opinions about the qualifications needed to obtain sustainable business growth and to conduct their careers well. We first crawl these online interview records and then extract and identify the competency dimensions using the Latent Dirichlet Allocation (LDA) model. We then construct the KSAT competency model for the KIC platform to evaluate and manage the SPs. Further, questionnaires are developed to collect experts' opinions, and the Best-Worst Method (BWM) is applied to determine the weights and priorities of the competency criteria in the proposed KSAT competency model. The main contributions of this research are summarized as follows:
(1) This study expands the reach of competency theory by introducing it to the KIC environment. To the best of our knowledge, this is the first time that competency theory has been applied in the field of KIC.
(2) This study presents a novel research framework enabling the KIC platform to transform online textual information into useful knowledge about SPs' competency. The combination of text mining and decision-making methods offers insights for researchers and practitioners to understand the hidden content behind the large collection of unstructured textual information on the KIC platform.
(3) The proposed KSAT competency model provides a rich set of indicators and variables that allow both researchers and practitioners to flexibly design, build, and formulate specific evaluation frameworks for different requirements and goals.
The remainder of this paper is organized as follows. The related works are presented in Section 2. The methods used in this research are detailed in Section 3, followed by the data analysis and results in Section 4. We then discuss the KSAT competency model and criteria importance in Section 5. Finally, the conclusion is drawn in Section 6.

Knowledge-Intensive Crowdsourcing. The collaboration landscape has changed remarkably over recent decades, as users can now shape the Web and the availability of information via highly developed information technology. Traditionally, collaborations were mostly concentrated within organizations' internal functional departments [19] and limited to messaging tools such as e-mail. Nowadays, however, it is possible to leverage the knowledge and intelligence of an immense number of people across geographic and organizational boundaries through crowdsourcing [20,21]. Since its appearance, crowdsourcing has attained much success and has become a widespread commercial phenomenon [22]. Meanwhile, given the critical role it plays in today's knowledge-based economy, KIC is considered one of the most promising domains of crowdsourcing [3]. KIC refers to crowdsourcing human intelligence- and expertise-related tasks, such as question answering, image annotation, product development, website design, logo design, and software development [23], which cannot be performed by computers, to a crowd of people through an open call [24,25]. In the past two decades, many KIC companies, such as Threadless, InnoCentive, and Amazon Mechanical Turk, as well as Chinese crowdsourcing platforms such as ZBJ.COM, epwk.com, and 680.com, have been established to support posting and performing various KIC tasks [23]. The KIC intermediaries are typical two- or multisided markets, where network effects attract an increasing number of participants from both the supply and demand sides to join the platform for valuable advantages [26]. Oosterman et al. [27] suggested that KIC has the advantage of low cost for service requesters; it therefore allows cost savings and efficient use of resources.
Additionally, by inviting a crowd of customers into new product development through a crowdsourcing practice, as in Dell IdeaStorm [28], innovative product or service ideas can be generated and then applied to the production process, thus making the products more attractive to markets and adding value to companies' business [19]. Further, crowdsourcing eliminates geographical limitations and offers SPs the chance to develop their careers and pursue valuable and interesting jobs [29]. Consequently, an increasing number of service requesters turn to the KIC platform for business ideas from the SPs, and many SPs adopt online KIC to gain knowledge and monetary benefits from transactions with service requesters [23,30,31].
Based on the resource-based view, the network effects can turn SPs into critical resources that bring sustained competitive advantages to the platform [6]. As an intermediary, the KIC platform has to better manage SPs for customer satisfaction, competitive advantages, and sustainable growth. The most direct measure is to control the output quality and enhance customer satisfaction. Existing research has proposed different quality control approaches to estimate the SPs' quality for a specific task. Vakharia and Lease [32] and Li et al. [33] reviewed different task-oriented quality control methods implemented in practice by crowdsourcing platforms and studied academically by researchers. The qualification test refers to a set of golden questions with known true answers that the SPs have to answer [34]. They are not allowed to perform the real tasks until they pass the test and achieve a threshold score. For example, Clickworker, Crowdsource, and MTurk provide prequalification systems to assess the skill level (e.g., language level) of potential SPs. The gold-injected method mixes golden tasks with real tasks, and workers do not know which tasks are gold ones during the task completion process [35]. For example, Crowd-Computing Systems and CrowdFlower enable service requesters to inject gold standard data, i.e., a collection of tasks with known correct answers, into their tasks to measure the SPs' performance automatically. Iterative computation methods iteratively compute and update the SPs' quality by leveraging all other SPs' answers for all tasks. The underlying concept is that the SPs who frequently submit reliable answers will be assigned high quality scores, and answers provided by the SPs with high scores will be selected as true answers [36]. Furthermore, many other iterative computation approaches have been developed and applied to compute and measure the SPs' quality, such as EM-based (expectation maximization) methods and graph-based methods [37].
However, these SPs' quality assessment methods are task-specific and have limited applications [38]. In addition, these approaches focused on the determination of a single label [39], e.g., the quality score of the SPs, or whether the SPs are allowed to perform a task.
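The iterative computation idea described above can be sketched in a few lines. The following is a minimal illustration only, assuming a simple quality-weighted majority vote; it is not the EM-based or graph-based methods of [36,37], and the worker names, task names, and labels are hypothetical.

```python
from collections import defaultdict

def iterative_quality(answers, n_iter=10):
    """answers maps (worker, task) -> label.
    Returns (quality score per worker, estimated true label per task)."""
    workers = sorted({w for w, _ in answers})
    tasks = sorted({t for _, t in answers})
    quality = {w: 1.0 for w in workers}  # start by trusting every worker equally
    truth = {}
    for _ in range(n_iter):
        # Step 1: estimate each task's true label by a quality-weighted vote.
        for t in tasks:
            votes = defaultdict(float)
            for (w, task), label in answers.items():
                if task == t:
                    votes[label] += quality[w]
            truth[t] = max(sorted(votes), key=votes.get)
        # Step 2: a worker's quality is the share of answers matching the estimate.
        for w in workers:
            own = [(t, lab) for (ww, t), lab in answers.items() if ww == w]
            quality[w] = sum(truth[t] == lab for t, lab in own) / len(own)
    return quality, truth

# Hypothetical toy data: sp3 disagrees with the other two SPs on t1 and t3.
answers = {
    ("sp1", "t1"): "A", ("sp2", "t1"): "A", ("sp3", "t1"): "B",
    ("sp1", "t2"): "B", ("sp2", "t2"): "B", ("sp3", "t2"): "B",
    ("sp1", "t3"): "A", ("sp2", "t3"): "A", ("sp3", "t3"): "B",
    ("sp1", "t4"): "C", ("sp2", "t4"): "C", ("sp3", "t4"): "C",
}
quality, truth = iterative_quality(answers)
print(quality, truth)
```

The output is a single quality score per SP, which illustrates the limitation noted above: such methods yield one label per worker for a given task set rather than a multidimensional view of competency.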
There also exists another stream of research that uses multicriteria decision-making (MCDM) methods, considering varied attributes to select the appropriate SPs and ensure the output quality for a task in the KIC context. Gong [3] proposed an integrated AHP (Analytic Hierarchy Process) and TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) approach and took attributes such as credit, skill test score, and the number of completed tasks into consideration to evaluate and select SPs in a network of crowdsourcing marketplaces. Zhang and Su [40] considered several criteria of the SPs, namely, interests, competence, reputation, and availability to participate, and offered a combined fuzzy DEMATEL (Decision Making Trial and Evaluation Laboratory) and TOPSIS approach to select the candidate SPs for a KIC task. The relationships among these criteria and their indicators were further explored in [41]. However, the MCDM methods mainly aggregate the SPs' attributes from the perspective of the platform operators or service requesters and consider only a single criterion or a small subset of the criteria, depending on their research goals [40].
Existing studies indicate that the management of SPs is of great significance for the KIC platform. Much attention has been paid to task-specific test design, the improvement of quality computation methods, and the combination of MCDM approaches to determine the weights of criteria. However, the competency analysis of the SPs and the investigation of the competency components are not included, even though it is vital for the SPs to be competent in their crowdsourcing business. As a result, we attempt to connect the competency analysis with KIC, and thus the competency analysis is introduced.

Competency Analysis.
The concept of the competency analysis was coined by McClelland and McBer and Company in the 1970s, and competency is defined as components of performance associated with "clusters of life outcomes" [42].
This definition views competency very broadly as any psychological or behavioral attributes associated with long-term success [43]. The competency movement also started in the 1970s. Gibbons [44] argued that the movement was mainly caused by the disconnection between education and the labor market. Professional organizations had to articulate performance standards and requirements and develop competency profiles with which candidates have to comply to be professional [8]. The concept of competency is multidimensional, and various conceptions have emerged. Now competency is generally conceptualized as "knowledge, skills, abilities, or other characteristics (KSAOs) that differentiate high from low performance" [14]. To date, the competency analysis has been widely applied in many facets of human resource management. Kurz and Bartram [45] and Bartram [46] introduced the Great Eight competency model as a generic competency model that can be applied across a variety of jobs and organizations. Building on the research of McClelland, Mirabile [14] proposed the KSAOs competency model that consists of a set of attributes possessed by the workers, typically indicated as knowledge, skills, attitudes, and personal traits required for effective work performance [47]. Based on the Great Eight, Krumm et al. [48] developed the KSAOs model for virtual teamwork, which contains 60 potentially relevant items, and compared the differences in KSAOs requirements between virtual and traditional teams. In an empirical study, Hertel et al. [49] found that a set of KSAOs (e.g., persistence, willingness to learn, creativity, independence, interpersonal trust, and intercultural skills) was related to tele-cooperation performance and indicated that creativity and independence significantly contributed to team performance. Cogliser et al. [50] indicated that computer self-efficacy was the main performance predictor of a virtual management organization.
Maurer and Lippstreu [51] conducted a survey to rate a varied set of KSAOs in terms of improvability, importance, and "needed at entry" facets. Prahalad and Hamel [52] pointed out that focusing on a collection of core competencies, i.e., the company's collective knowledge about how to coordinate diverse production skills and technologies, will help the corporation gain competitive advantages. Boyatzis [53] offered a "total" system approach that determines which characteristics of the managers enable them to be effective in various management jobs. Wu and Lee [54] developed the competencies of the global managers by using eight different IQs and proposed the fuzzy DEMATEL method to segment the required competencies into meaningful portions. Li et al. [7] proposed a multicriteria competency analysis framework for the crowdsourcing delivery personnel.
In general, these studies present a fact that individual or organizational performance is influenced by various competency items. Across different domains of surveys, however, models are rather heterogeneous in terms of which specific competencies are significant to performance. Although these studies offer valuable contributions and insights that help us to understand the competency of different roles, the differences in their findings highlight that their model structures are by no means universal and strongly depend on the characteristics of the specific context.

Methodology
To investigate the competency dimensions and understand the SPs' competency in KIC, this paper takes advantage of the experience-sharing information provided by the SPs and explores the SPs' competency based on their own understanding and perception. We first apply topic modelling techniques to the crawled raw data and then construct the three-level KSAT competency model based on the extracted and identified competency dimensions. Later, the BWM is used to explore the importance and weights of the competency criteria in the KSAT competency model. Our research framework is depicted in Figure 1. We detail our methods in this section.

Topic Modelling.
The growing development and accessibility of large electronic archives, along with increased computational facilities, have led to an interest in textual content analysis [55]. Topic modelling is an effective method for extracting implied themes from large-scale text based on word co-occurrence for each document in the corpus [56][57][58]. Moreover, topic modelling is a useful way to "let the text talk" because it is independent of the evaluator's personal opinions or experiences [59], and it has been studied and applied in various fields, such as recommendation systems [60], online health communities [61], and customer-generated content analysis [62]. The LDA is a well-known unsupervised machine learning technique for natural language processing [63] and is the simplest and most popular topic modelling algorithm [64,65]; it has the advantage of recognizing the hidden topics and mining the deep semantics of huge amounts of textual documents in an effective way. The basic idea of the LDA is that each document exhibits a mixture of latent topics wherein each topic is characterized by a distribution over the words, i.e., per-document topic distributions and per-topic word distributions [66,67]. The generative probabilistic model of the LDA is represented in Figure 2.
The LDA defines the following terms: (1) A word is the basic unit of discrete data, defined to be an item from a vocabulary indexed by 1,

Complexity
As shown in Figure 2, the words within documents $w_{d,n}$ are observable variables, while the other variables, including the topics $\varphi_k$, $k = 1, 2, \ldots, K$ (the distributions over words), the topic distribution per document $\theta_d$, and the per-word topic assignment $z_{d,n}$, are not known. These latter items represent unobservable variables (white circles in Figure 2) that should be estimated from the analysis of observable variables (shaded circles in Figure 2). Parameters $\alpha$ and $\beta$ are the hyperparameters for the prior distributions of $\theta_d$ and $\varphi_k$, respectively. The plate notations at the bottom of each rectangle indicate replication; i.e., the $K$ plate represents the topics, the $N$ plate represents the words within a document, and the $M$ plate represents the documents. The arrows represent conditional dependencies among components in the following way: the per-word topic assignment $z_{d,n}$ is dependent on the topic distribution per document $\theta_d$, and the observed word in each document $w_{d,n}$ is dependent on $z_{d,n}$ and all the topics $\varphi_k$. The conditional dependencies enable the definition of the joint distribution of observed and unobserved variables. As a result, the LDA can employ Bayesian learning to infer the latent variables by calculating their posterior distribution from the joint distribution. Learning the unobservable components allows us to capture the hidden semantic structure in the documents. More specifically, the main outputs yielded by the LDA are the topics $\varphi_k$, their weights in each document $\theta_d$, and the per-word weight within each topic $z_{d,n}$. That is to say, the outputs of the LDA consist of $K$ topics, wherein each topic is denoted as a combination of words with different probabilities of occurrence. However, a combination of words cannot deliver the precise meaning of each topic; it is better to label the topics.
In other words, human judgments and intervention are a usual way to label the topics on the basis of the semantic similarities of included words [68]. For more details about the LDA, one can refer to [66,67].

Best-Worst Method.
The BWM is a recently developed MCDM method [69]. It uses a structured way to conduct the pairwise comparisons, which has several major advantages [7]. (i) The decision-makers are required to identify the best and worst criteria (or alternatives) prior to conducting a pairwise comparison, which gives them a clear understanding of the range of evaluation and consequently results in more consistent and reliable pairwise comparisons [70]. (ii) By using two opposite references (best and worst) in a single optimization model, which is called the consider-the-opposite strategy, the BWM has been proven to be effective in mitigating possible anchoring bias arising during the pairwise comparison process [71,72]. (iii) The BWM better balances data and time efficiency and, at the same time, enables the decision-makers to check the consistency of the provided pairwise comparisons [71]. On the one hand, compared to other pairwise comparison-based methods using a single vector, such as the Swing and SMART family, the BWM overcomes their main weakness, namely, that the consistency of the pairwise comparisons cannot be checked, while maintaining the high data (and time) efficiency of such single-vector methods [71]. On the other hand, the number of pairwise comparisons needed in the BWM is smaller than that of full-matrix-based methods, such as AHP, which effectively enhances time and data efficiency. Although the number of pairwise comparisons under the full-matrix-based method is sufficient to check the consistency, decision-makers have to answer too many questions, which can result in confusion and inconsistency [71]. The method has been widely applied to many real-world problems, such as supply chain, manufacturing, logistics, the airline industry, supplier selection, and service quality evaluation. For a review of the applications, one could refer to Mi et al. [73].
In this study, we use the BWM due to its advantages and wide applications. Specifically, the steps to determine the weights of criteria using the BWM are as follows [69,70]:
(1) Determine a set of decision criteria $\{c_1, c_2, \ldots, c_n\}$ by experts or decision-makers.
(2) Identify the best (B) and the worst (W) criteria by experts or decision-makers.
(3) Determine the preference of the best over all the other criteria by experts or decision-makers using a number between 1 and 9 (where 1 is "equally important" and 9 is "extremely more important"). The result of the best-to-others comparisons is the vector $V_B = (a_{B1}, a_{B2}, \ldots, a_{Bj}, \ldots, a_{Bn})$, where $a_{Bj}$ shows the preference of criterion B over criterion $j$.
(4) Determine the preference of all the criteria over the worst by experts or decision-makers using the same scale (1 to 9). The result of the others-to-worst comparisons is the vector $V_W = (a_{1W}, a_{2W}, \ldots, a_{jW}, \ldots, a_{nW})^T$, where $a_{jW}$ indicates the preference of criterion $j$ over criterion W.
(5) Compute the optimal weights $(w_1^*, w_2^*, \ldots, w_n^*)$. The optimal weights are computed by minimizing the maximum of the absolute differences $\{|w_B - a_{Bj} w_j|, |w_j - a_{jW} w_W|\}$ for all $j$, which can be expressed by the following optimization problem:
$$\min \max_j \left\{ |w_B - a_{Bj} w_j|, \; |w_j - a_{jW} w_W| \right\} \quad \text{subject to} \quad \sum_{j=1}^{n} w_j = 1, \; w_j \ge 0, \text{ for all } j. \tag{2}$$
Equation (2) is converted into
$$\min \xi \quad \text{subject to} \quad |w_B - a_{Bj} w_j| \le \xi, \; |w_j - a_{jW} w_W| \le \xi, \text{ for all } j, \quad \sum_{j=1}^{n} w_j = 1, \; w_j \ge 0, \text{ for all } j. \tag{4}$$
Solving equation (4) yields $(w_1^*, w_2^*, \ldots, w_n^*)$ and $\xi^*$, with the former indicating the optimal weights of the criteria. $\xi^*$ is the value of the objective function in equation (4), indicating the consistency of the provided pairwise comparisons. The closer $\xi^*$ is to zero, the higher the level of consistency in the pairwise comparisons provided by the experts.
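The linear model, which minimizes $\xi$ subject to $|w_B - a_{Bj} w_j| \le \xi$ and $|w_j - a_{jW} w_W| \le \xi$ with weights that are nonnegative and sum to one, is a linear program once each absolute value is split into two inequalities. The sketch below solves it with SciPy's `linprog`. The function name `bwm_linear` and the three-criterion comparison vectors are illustrative assumptions, not the experts' data from this study.

```python
import numpy as np
from scipy.optimize import linprog

def bwm_linear(best_to_others, others_to_worst, best, worst):
    """Solve the linear BWM model; returns (weights, xi)."""
    n = len(best_to_others)
    diffs = []  # each row d encodes one expression d @ w that sits inside |.|
    for j in range(n):
        d = np.zeros(n)
        d[best] += 1.0
        d[j] -= best_to_others[j]       # w_B - a_Bj * w_j
        diffs.append(d)
        d = np.zeros(n)
        d[j] += 1.0
        d[worst] -= others_to_worst[j]  # w_j - a_jW * w_W
        diffs.append(d)
    D = np.array(diffs)
    minus_xi = -np.ones((len(D), 1))
    # |d @ w| <= xi  becomes  d @ w - xi <= 0  and  -d @ w - xi <= 0
    A_ub = np.vstack([np.hstack([D, minus_xi]), np.hstack([-D, minus_xi])])
    b_ub = np.zeros(len(A_ub))
    A_eq = np.append(np.ones(n), 0.0).reshape(1, -1)  # weights sum to 1
    c = np.append(np.zeros(n), 1.0)                   # objective: minimize xi
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[:n], res.x[n]

# Hypothetical, fully consistent comparisons: criterion 1 is best, criterion 2 is worst.
weights, xi = bwm_linear([2, 1, 8], [4, 8, 1], best=1, worst=2)
print(weights, xi)
```

Because these illustrative comparisons satisfy $a_{Bj} a_{jW} = a_{BW}$ for every $j$, the constraints can be met exactly, so $\xi^*$ is zero and the weights are proportional to (4, 8, 1).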
When the MCDM problem is a hierarchical criteria tree, then the results of equation (4) are called local weights. To determine the global weights of subcriteria in the last level of the tree, their local weights are multiplied by the weights of the category to which they belong. When we have a number of experts, we follow all the five steps for each expert individually. To aggregate the final results (global weights), we use the geometric mean.
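The two aggregation steps, multiplying local weights by the weight of their parent category and combining experts with the geometric mean, can be sketched as follows. The numeric weights and dimension names are hypothetical, and the final renormalization so that the aggregated weights sum to one is an assumption added here for convenience; the text above specifies only the geometric mean.

```python
import math

def global_weights(factor_weights, local_weights):
    """Multiply each subcriterion's local weight by its parent factor's weight."""
    return {
        (factor, dim): factor_weights[factor] * w
        for factor, dims in local_weights.items()
        for dim, w in dims.items()
    }

def geometric_mean(values):
    # Assumes strictly positive values; a zero weight would need special handling.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def aggregate_experts(per_expert, renormalize=True):
    """Combine several experts' global weights with the geometric mean."""
    combined = {k: geometric_mean([e[k] for e in per_expert]) for k in per_expert[0]}
    if renormalize:  # assumption: rescale so the aggregated weights sum to 1
        total = sum(combined.values())
        combined = {k: v / total for k, v in combined.items()}
    return combined

# Hypothetical local weights for one expert (two factors, two dimensions each).
factors = {"knowledge": 0.4, "skill": 0.6}
local = {
    "knowledge": {"dim_a": 0.5, "dim_b": 0.5},
    "skill": {"dim_c": 0.25, "dim_d": 0.75},
}
expert_1 = global_weights(factors, local)
final = aggregate_experts([expert_1, expert_1])
print(final)
```

With two identical experts the aggregate simply reproduces that expert's global weights, which is a quick sanity check on the implementation.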

Data Analysis and Results
In this section, we give details of the data collection and preprocessing, the data analysis, and the results of our model. Related procedures are conducted on a machine with an Intel Core i7 CPU and 16 GB of RAM. The raw data are crawled and analyzed in Python 3.6.5, with the analysis based on scikit-learn.

Data Collection and Preprocessing.
According to the report [74], the transaction size of the KIC market in China increased by 306.3 billion RMB in 2020 compared to 2019. Facilitated by the Chinese policy of "mass entrepreneurship and innovation" [75], crowdsourcing platforms in China that specialize in supporting KIC activities, such as ZBJ.COM, epwk.com, and 680.com, have also experienced booming growth. Given that SPs join these platforms at different times and with various business capabilities, these platforms provide them with opportunities for sharing and learning knowledge and experience in order to facilitate their business growth and career development. In particular, these platforms regularly organize interviews with successful SPs (i.e., top order winners, top income earners, or long-established SPs) to share experience regarding a series of questions, such as "how did you manage to get so many orders?"; "what did you do to get an order at such a high price?"; "what efforts did you make to be a long-established SP?"; and "what is your plan to further facilitate your business in the future?". Then, the interview records are posted on the platforms so that more SPs can access and learn from these successful experiences.

According to our survey of SPs on ZBJ.COM, they considered the experiences of the successful SPs to be highly informative, providing effective guidance on the issues they were facing at that time. We think that these online interview records are of great significance for the KIC platform to understand SPs' competency and derive competency dimensions that can improve SPs management. To undertake this study, we apply the following steps to collect the raw data, i.e., the interview records, as the corpora for this research.
(1) Selecting the data sources: to ensure the quality and quantity of the raw data, we select the three most popular KIC platforms, namely, ZBJ.COM, epwk.com, and 680.com, as the data sources.
(2) Writing the crawler program: given the large amount of content available on these platforms, accessing the data online programmatically is effective and efficient. Consequently, we write the crawler in Python.
(3) Crawling the raw data: running the program from Step 2, we crawl the online interview records on the platforms. Key information including the title, content, poster, post time, view times, comments, and responses is collected, and the final corpora contain 1760 records in total, covering January 2014 to July 2019.
The raw data collected from the KIC platforms are all unstructured and in Chinese. To effectively extract implied topics from these large-scale texts, we apply the following general text preprocessing steps to clean the unstructured text for topic modelling:
(1) Cutting each article into sentences and eliminating all numbers and letters, leaving only the Chinese characters: note that the Chinese text is processed in UTF-8 encoding.
(2) Defining the user dictionary and tokenizing each sentence into multiple space-separated words: in this step, we refer to a predefined user dictionary to know where to pause in a sentence. Unlike in English, there is no space between two Chinese characters or phrases. Given the different segmentation, the meaning of the sentence may be completely different. With a predefined dictionary, a complete Chinese sentence will be cut into several tokens.
(3) Removing the stopwords from each sentence: to better clean the data, in addition to using a popular Chinese stopword list, we construct our own stopword list of words that are domain-specific and appear frequently in our corpora but are useless for analysis (e.g., crowdsourcing, the platform, innovation design).
(4) Constructing the term-document matrix: after removing the stopwords, the term-document matrix (i.e., the distribution/frequency of terms (rows) within documents (columns)) used as the main input to the LDA is constructed.
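The first three preprocessing steps can be illustrated with a small pure-Python sketch. The segmenter actually used in this study is not specified here, so the forward-maximum-matching tokenizer below is only a stand-in assumption to show how a user dictionary guides segmentation; the dictionary, stopword list, and sample sentence are all hypothetical.

```python
import re

SENTENCE_SPLIT = re.compile(r"[。！？!?；;\n]+")
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]")

def split_sentences(article):
    # Step 1: cut into sentences, then drop digits, letters, and punctuation.
    cleaned = [NON_CHINESE.sub("", part) for part in SENTENCE_SPLIT.split(article)]
    return [s for s in cleaned if s]

def tokenize(sentence, dictionary, max_len=4):
    # Step 2 (stand-in): forward maximum matching against the user dictionary.
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

def preprocess(article, dictionary, stopwords):
    # Step 3: remove stopwords after tokenizing each sentence.
    tokens = []
    for sentence in split_sentences(article):
        tokens += [t for t in tokenize(sentence, dictionary) if t not in stopwords]
    return tokens

# Hypothetical inputs for illustration only.
user_dict = {"我们", "设计", "客户", "满意"}
stopwords = {"我们", "年", "做", "很"}
tokens = preprocess("我们2019年做logo设计。客户很满意！", user_dict, stopwords)
print(" ".join(tokens))
```

Joining the surviving tokens with spaces gives one string per document, which is the form that a count-based vectorizer can turn into the term-document matrix of step (4).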

Competency Dimensions Identification.
Prior to building a topic model from the experience-sharing articles, we need to specify the number of topics $K$ exogenously. To obtain a proper topic number, Aletras and Stevenson [76] and Ramage et al. [77] introduced cosine measures to capture the similarities of generated topics, where the lowest average cosine similarity denotes the best model and thus determines the appropriate topic number. Accordingly, we determine the optimal number of topics from a discrete range rather than a continuous range. We calculate the average cosine similarity while setting the number of topics $K$ over the set $\{4, 6, 8, 10, 12, 14, 16, 18, 20, 22\}$. Figure 3 presents a comparison over these topic numbers in terms of the average cosine similarity of topics. As shown in Figure 3, the average cosine similarity score is lowest (0.0375) when the number of topics is set to 18. Therefore, the appropriate number of topics is 18. We then set the topic number $K$ to 18 and obtain 18 clusters of words from the corpora (see the second column in Table 1). As discussed earlier, the topics are distributions over words; the top five keywords with the highest probability derived from the posterior distribution (i.e., $z_{d,n}$) are provided for each topic in Table 1. Given that the corpora are in Chinese, we translate the words into English for readers to better understand the topics and attach the original Chinese words.
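The selection rule described above (fit one model per candidate $K$, then keep the $K$ with the lowest average pairwise cosine similarity between topics) can be sketched as follows. This is an illustration under assumptions: the six-document toy corpus and the candidate set are hypothetical, the function names are ours, and the exact similarity computation used to produce Figure 3 may differ in detail.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def avg_topic_cosine(phi):
    """Mean cosine similarity over all distinct pairs of topic-word vectors."""
    unit = phi / np.linalg.norm(phi, axis=1, keepdims=True)
    sim = unit @ unit.T
    pairs = np.triu_indices(len(phi), k=1)  # upper triangle, diagonal excluded
    return float(sim[pairs].mean())

def choose_k(X, candidates, seed=0):
    """Fit one LDA model per candidate K and return the K with the lowest score."""
    scores = {}
    for k in candidates:
        lda = LatentDirichletAllocation(n_components=k, random_state=seed).fit(X)
        phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
        scores[k] = avg_topic_cosine(phi)
    return min(scores, key=scores.get), scores

# Hypothetical toy corpus; the study itself scans K over {4, 6, ..., 22}.
docs = [
    "design logo brand color",
    "logo color brand design logo",
    "price order customer service",
    "service customer order price order",
    "team experience startup founder",
    "founder startup team experience team",
]
X = CountVectorizer().fit_transform(docs)
best_k, scores = choose_k(X, candidates=[2, 3, 4])
print(best_k, scores)
```

A lower score means the topics share less probability mass over the same words, so the chosen model has the most mutually distinct topics among the candidates.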
However, a mere combination of words cannot characterize the underlying content of a topic. Automatic labeling of the topics is infeasible because topic extraction by the LDA is an unsupervised learning process. Instead, human judgment and intervention are required to check the coherence and meaningfulness of these topics and then label them [66,68]. Accordingly, the authors interviewed the managers and the SPs in ZBJ.COM and researchers in related fields and then validated and labeled the extracted topics by referring to Spencer's competency dictionary [13]. As in Kurz and Bartram [45], we refer to the label of each topic as a competency dimension, as indicated in the last column of Table 1. Mirabile [14] proposed the KSAOs competency model, which consists of a set of attributes possessed by workers, typically indicated as knowledge, skills, attitudes, and personal traits required for effective work performance. To frame our findings in theory and offer more significant insights into the LDA results, the eighteen competency dimensions are classified into four competency factors, namely, knowledge, skill, ability, and trait, according to the aforementioned interviews with managers, the SPs, and related researchers, on the basis of the meaning of topics and competency dimensions. As in Kurz and Bartram [45], we define the main criteria as competency factors and the subcriteria as competency dimensions. In particular, we summarize the two competency dimensions, demand understanding and reasonable suggestion, into one single dimension, task analysis (see Table 2). The model embraces a comprehensive and hierarchical set of competency criteria that offers the researchers and practitioners in KIC the flexibility to systematically build, verify, and change the SPs' selection and evaluation mechanisms to suit their requirements.

Weights of Competency Criteria.
Given the KIC platform's goal of a healthy and sustainable operation process, the proposed KSAT competency model can be further employed to evaluate the SPs' competency and to select and retain the competent SPs. Identifying the importance of each competency criterion plays a significant role in guiding both the SPs and the KIC platform to plan, design, and implement mechanisms and strategies for sustainable development and management. We develop an online questionnaire involving the aforementioned KSAT competency criteria shown in Table 2 and employ the BWM to determine the weights of competency criteria. In this step, we invite 15 experts and ask for their opinions. Of the 15 respondents involved in this research, 9 have over 5 years' working experience and are employed as managers in crowdsourcing companies, such as ZBJ.COM. The remaining 6 are researchers in the crowdsourcing field with an average of 7 years' research experience. To ensure that all the respondents have adequate information to conduct the comparisons, descriptions of the BWM and of the competency factors and dimensions are also provided.
Note to Table 1: The English words represent the translations of the original Chinese words in parentheses, and the decimals following "*" indicate the probability that each word belongs to that topic.
In this paper, the importance of the competency criteria for the KIC platform to evaluate the SPs' competency is examined and evaluated based on the four competency factors (i.e., knowledge, skill, ability, and trait) and twenty competency dimensions (eighteen in level 2 and two in level 3). Tables 3-5 present the local and global weights of the competency factors and dimensions in levels 1-3, respectively.
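The relation between local and global weights can be sketched under the usual hierarchical convention that a criterion's global weight is its local weight multiplied by the global weight of its parent. The function name, the criterion names, and all numbers below are hypothetical placeholders, not the values reported in Tables 3-5.

```python
def global_weights(factor_weights, local_weights):
    """Multiply each dimension's local weight by its parent factor's weight."""
    result = {}
    for factor, dimensions in local_weights.items():
        for dimension, local in dimensions.items():
            result[dimension] = factor_weights[factor] * local
    return result

# Hypothetical level-1 weights (they sum to 1 across factors).
factor_weights = {"factor_1": 0.6, "factor_2": 0.4}

# Hypothetical level-2 local weights (they sum to 1 within each factor).
local_weights = {
    "factor_1": {"dimension_1a": 0.7, "dimension_1b": 0.3},
    "factor_2": {"dimension_2a": 0.5, "dimension_2b": 0.5},
}

weights = global_weights(factor_weights, local_weights)
```

For a level-3 subdimension, the same multiplication would be applied once more, using the global weight of its level-2 parent. Under this convention, the global weights of all leaf criteria sum to 1.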

Discussion
In this study, we aim to investigate and understand the SPs' competency by recognizing detailed competency criteria (i.e., competency factors and competency dimensions) in KIC. We first explore the successful SPs' experience-sharing information using the LDA approach to identify natural and hidden competency dimensions. Then, we construct the KSAT competency model and determine the weights of competency criteria for the KIC platform evaluating the SPs' competency. Such information acts as a source on how to evaluate, manage, and encourage SPs.

Results Analysis Related to KSAT Competency Model.
Highly competent SPs are necessary for the KIC platform to meet fluctuating demands for numerical and functional flexibility [80]. In KIC markets, the SPs' competency acts as a decision factor that has a great influence on the quality of task outcomes, customer satisfaction and loyalty, and the platform's reputation and sustainable growth.
Table 2 (competency criteria and their definitions; indentation marks the hierarchy level).
Knowledge: The prior trial-and-error experience related to successful crowdsourcing business launches, development, and resolution of emerging problems [15,78], which can help SPs to be innovative, trigger new ideas, and seize opportunities [13,79].
  Entrepreneurial experience: The experience that increases with every firm launched and helps to acquire detailed knowledge of administrative procedures for registration, corporate tax declarations, and social security [15].
  Profession experience: The knowledge and experience in the same industry domains where firms launch [15], which help to perform tasks, manage, and run a business [79].
  Customers' industry background: The knowledge and experience related to customers' industry domains, which can help better understand customer requirements and output satisfying service products [13].
Skill: The domain-specific occupational expertise required to perform knowledge-intensive tasks [13,80].
  Task analysis: The expertise to deal effectively with customers' requirements and problems of service demands [81,82].
    Demand understanding: The ability seeking to understand customers' expressed needs and requirements [83-85].
    Reasonable suggestion: The ability seeking to develop superior suggestions and solutions to meet customers' demands [83-85].
  Modification and after-sales service: The service supports, such as free modification of designed service products and problem-solving, provided by service providers once the transaction takes place [86].
  Customer relationship management: The ability to build and maintain friendly, warm relationships with customers by managing their buying behaviors and feedback for continual sales and sustainable collaboration relationship [13,83,87].
  Branding: The ability to promote their own service brand image [86,88,89] and generate word-of-mouth among customers [16].
  Team management: The ability to have a detailed and accurate understanding of how the organization operates functionally via team member selection, team commitment building, and promotion [54,78,83] to improve teamwork and cooperation [13].
    Team composition: The members of the team are with functional heterogeneity and have different backgrounds and knowledge [83], which can complement the way for firms to obtain information benefits [78,90].
    Team environment: The ability to create a harmonious and positive team atmosphere [83].
Ability: Task performance or the effective outcomes achieved by crowdsourcing business process [7].
  Communication ability: The ability to keep customers and cooperators informed with useful information [13].
  Customer acceptance: The ability to influence customers and gain their approvals and trust [13].
  Innovation ability: The ability to be innovative in thinking and create novel ideas and solutions to problems [54].
  Online and offline coordination: The ability to coordinately operate online and offline business [91].
Trait: The physical characteristics and consistent responses to situations or information [13].
  Professional dedication: The willingness to put the team's needs before personal needs, and align own behavior with the needs, priorities, and goals of the business to meet business goals [13].
  Trustworthiness: The confident expectations of customers on SPs' performing a particular transaction, reflected by customer reviews, store images, and transaction performance [86,92].
  Competitive spirit: The willingness to confront challenges and make efforts to stand out from the mass peers [13].
  Achievement orientation: The concerns for working well or for achieving business goals and ambition [13].

This study aims to understand the SPs' competency in KIC and identifies a list of competency criteria extracted from the competent SPs' experience. The KSAT competency model is subsequently constructed based on competency theory, in which Mirabile [14] conceptualized competency as "knowledge, skills, abilities, or other characteristics" that differentiate high from low performance. Specifically, the KSAT competency model proposed in this work includes knowledge (entrepreneurial experience, profession experience, and customers' industry background), skill (task analysis, modification and after-sales service, customer relationship management, branding, and team management), ability (communication ability, customer acceptance, innovation ability, and online and offline coordination), and trait (professional dedication, trustworthiness, competitive spirit, and achievement orientation). Knowledge has been deemed to be a key factor for success in the empirical research of entrepreneurship and crowdsourcing [15,79]. In KIC, the SPs are mainly regarded as self-employed workers or small- and medium-sized enterprises (SMEs) and viewed as entrepreneurs [93,94]. Entrepreneurial experience is thought to be mandatory when starting up a new business as it can help focus on strategic issues to alleviate the liabilities of newness, such as establishing new business partnerships [15]. Also, entrepreneurs will be more sensitive to business opportunities, future technologies, and customer demand via a learning-by-doing process [15,95]. Furthermore, profession experience can boost experiential learning, enhance the development of operational knowledge, and ease the transfer process of prior knowledge before starting the new business [15]. Relying on entrepreneurial and professional experience, entrepreneurs can also save costs on routine development. In addition to its importance in entrepreneurship, knowledge is a determinant of good performance in crowdsourcing. According to [96], knowledge diversity of the individual crowds facilitates all types of contribution to open innovation projects, as having knowledge in diverse fields allows the contributors to understand the task requirements [97] or blend disparate solution elements in novel ways.
Being familiar with the unique industry background of the targeted customers will help the SPs better understand customer requirements, cut to the heart of the matter, and achieve consensus on the solution with customers [97]. Skill, defined as the domain-specific occupational expertise in this work, appears to be advantageous for both the KIC platform and the SPs. In the crowdsourcing context, skill is regarded as the basis for preselecting the proper SPs in auction systems as it demonstrates how capable the SPs are at doing particular tasks [40,41]. Rather than exploring and extracting dimensions of skill, the state of the art has mainly concentrated on indicators that measure the level of skill in a specific context or has directly assigned a numerical value to quantify it [41]. The KIC tasks are commonly complex and creative and cannot be done by computers [3]. A complete understanding of the task requirements and sound advice can help service requesters to better visualize the desired outcomes, thus reducing the perception gap between the service requesters and the SPs [97]. We group the two subdimensions (i.e., demand understanding and reasonable suggestion) under task analysis, which is viewed as a critical starting point for task success. Meanwhile, the SPs in KIC have to offer modification and after-sales service, because the task outcome is unlikely to be perfect and match the customers' expectations at the first attempt. Offering such post-sale services could reduce customers' risk perception of task failure and poor transaction experience. Accordingly, offering high-quality services is a powerful way to enhance customer satisfaction and thus retain the targeted customers [86]. Customer relationship management, a strong tool in marketing and business [87], is another competency dimension that helps the SPs maintain a long-term relationship with their service requesters. In addition, the SPs commonly own one or several virtual stores on the KIC platform.
Brand image of the stores plays a vitally important role in attracting customers as it can create a positive attitude towards the SPs' virtual stores that will, in turn, encourage the intention to repurchase and produce positive word-of-mouth [86]. Furthermore, team management is also mentioned by the SPs as a core competency dimension in conducting KIC business. In China, many SPs register on the KIC platform as a team so as to organically aggregate human resources and leverage each member's strengths. However, existing research rarely explores the team management competency of SPs in the environment of crowdsourcing.
Besides the domain-specific skill, ability, defined as the task performance or the effective outcomes achieved during the crowdsourcing business process, is also regarded as a main competency factor by the SPs. The KIC activities involve many intelligence-related tasks and intangible services for which everyone is likely to have a different understanding of the task requirements and outcomes. To mitigate such cognitive differences, communication is essential. Concretely, accurate and useful information about the understanding of customer demand and expected outcomes, problem-solving plans, and modification suggestions needs to be efficiently and actively conveyed by the SPs. While communication helps narrow cognitive gaps, innovation ability accounts for divergent thinking and creative ideas, differentiating services and products from others [98]. In innovation contests, the innovativeness of solutions is considered an important reference for the selection of the winner [17]. In addition, customer acceptance reflects the degree of customers' trust in the SPs and the probability that the customers expect to obtain high-quality outcomes from the SPs [40]. Gaining customers' trust and acceptance has always been considered one of the ultimate goals of marketing [92]. Additionally, the online SPs may also offer services offline as a complementary sales channel, which diverts the SPs' time and effort from their online business. Hence, operating online and offline crowdsourcing business in a coordinated way at the same time is a challenge for the SPs.
Besides, personal traits play a critically important role in determining the SPs' competency in the KIC activities. Previous research has widely investigated the significance of personal traits in exploring individual work performance or entrepreneur success [81]. Batey and Furnham [99] found that the "big five" personality traits account for up to 47% of the variation in divergent thinking. A meta-analysis by Feist [100] showed that the individuals who are open to new experiences, conscientious, hostile, confident, and emotionally impulsive are more likely to generate creative outcomes. Sebora et al. [81] found that the achievement orientation of the founders is positively related to the success of e-commerce entrepreneurial ventures as it helps the entrepreneurs overcome obstacles and compensates for other weaknesses. In crowdsourcing delivery, personal traits were considered as a main criterion in [7] for evaluating the deliverers' competence, and responsibility is one of the most important subcriteria when quantifying the competence score. These studies indicate that personal traits are a robust factor of high-quality outcomes and performance [81]. Specifically, in knowledge-intensive crowdsourcing, the SPs regard professional dedication, trustworthiness, competitive spirit, and achievement orientation as the competency dimensions.

Results Analysis Related to Competency Importance.
The BWM results show that the competency criteria have different importance when the KIC platform evaluates the SPs' competency. Table 5 indicates that "customer relationship management" and "communication ability" have the highest priorities of all the competency dimensions, while "team environment" is the least important. "Customer relationship management" emerged in the 1970s [101] as a useful tool for managing and optimizing sales-force automation within companies, enhancing customer satisfaction and loyalty [102,103], and consequently reaching and retaining long-term partnerships with customers [104]. In this sense, the SPs with high-level customer relationship management will encourage loyal end customers to remain on the platform, thus helping generate sustainable value and maintain long-term growth. This is consistent with the business model of Upwork (one of the largest online KIC marketplaces), which charges its fees on a sliding scale to encourage longer-term relations between the SPs and service requesters [105].
As to "communication ability," the KIC activities involve intensive and dynamic interaction among the end customers, the platform operators, and the SPs. Online chatting, instant messaging, and social media are the most popular ways for the SPs to interact with their customers [16]. With more information processed, communication about the task ideas may shift the interpretation or understanding of the task at hand [106] in a way that helps the SPs and the service requesters reach consensus about how tasks are performed and the form of outcomes, thus reducing the gap between the service requesters' perception and expectation of the service and enhancing their overall satisfaction. Foundational research emphasized the role of communication, in the form of dialogue, feedback, and other contextual factors, in the way a message is received and interpreted by the viewers, as well as in how they respond [106,107]. This indicates that the KIC platform may develop different types of communication systems to assist the SPs in interacting with customers in various manners. Also, a high level of communication ability that focuses on professional interactions, being honest and responsive, and respecting customer culture during interactions will impress the customers with professional knowledge and high-quality services, leading to a good customer relationship and reputation [89]. Therefore, it is suggested that the SPs focus on the language used, cultural awareness, and the promptness of their responses during the interaction process. That "team environment" is weighted as the least important competency dimension by the experts is in line with the actual situation: as a platform does not hire the SPs, it can hardly observe the interactions within their organizations. However, the KIC platform needs to pay attention to this dimension, as the SPs consider it essential to their competency and performance.
As shown in Table 3, "skill" is the most important of all the four competency factors. The main reason is that domain-specific skills are regarded as the key to a successful business [108,109]. Unlike short, low-complexity simple tasks [110], the KIC tasks are domain-specific, high in complexity, and not easy to decompose, which places high requirements on the participating SPs [111]. Gong [3] pointed out that the lack of domain-specific skill and expertise has limited the development of the KIC marketplaces, which reveals the dominating role of the SPs' skill in the KIC context. Further, an SP's skill demonstrates how capable he or she is at doing particular tasks in the domain, and the SPs with high skill are expected to contribute to high-quality outcomes [40,108]. The prerequisite for the SPs conducting a specific task is that they have the relevant and necessary skills to perform at the required quality [112].
This implies that the KIC platform could grade the SPs' skill levels, such that the resources can be strategically allocated and assigned. For example, the SPs with a high skill level could be assigned to perform more complex KIC tasks that require a comprehensive usage of different skills. The SPs, in turn, need to develop and enhance different types of skills, so as to perform more complex tasks and gain more income.
The "ability" factor ranks in second place, which means that, in absolute terms, it is still more important than "knowledge" and "trait." Among all the four competency factors, it is not surprising that "knowledge" and "trait" are weighted as the two least important. This is in line with the actual situation involving KIC activities in China, due to the fact that, to attract more SPs to participate in KIC activities, the input control of the KIC platform is relatively weak. For example, the steps for applicants to be SPs at ZBJ.COM (https://help.zbj.com/fw/detail?articleId=14762) are (1) registering by a telephone number and a password; (2) filling in some required information, such as self-introduction, location, e-mail address, and task types willing to perform; and (3) uploading photos of identification card. The steps of epwk.com (https://i.epwk.com/User/Basicinfo/index.html) and 680.com (http://help.680.com/view_4.html) are similar to those of ZBJ.COM.
In addition, the fact that "knowledge" is not ranked the lowest implies that entrepreneurial and business-related knowledge and experience are also regarded by the experts as essential competency dimensions in demonstrating the SPs' competency and performance.

Conclusion
In this study, by combining text mining and decision-making techniques, we conducted a comprehensive competency analysis of SPs in the environment of KIC. In this process, we leveraged online interview records posted by several KIC platforms in China, which include the successful SPs' opinions about the qualifications needed to obtain sustainable business growth and conduct their career well. By applying the topic modelling approach to these materials, we identified four competency factors and twenty competency dimensions in general and thus constructed the hierarchical KSAT competency model. Further, we employed the BWM to identify the weights and priorities of the competency criteria in the KSAT competency model. The relationships between the competency criteria are also discussed, leading to the following conclusions:
(1) The proposed KSAT competency model gives practitioners of the KIC platform a comprehensive vision of SPs' performance standards in the context of KIC. Further, it also provides the practitioners with the flexibility to choose different criteria according to different application scenarios and objectives. In KIC markets, the SPs, viewed as entrepreneurs, vary in backgrounds, skills, abilities, and contributions, which makes their management a great challenge for the KIC platform. Also, the SPs in the startup stages lack the benefits of continuous competition. To stabilize the market environment and advance the platform's interests, the platform managers have to develop better input control mechanisms, incentive mechanisms, and SPs' lifecycle management systems to retain and incubate the valuable SPs. Also, the KSAT competency model can be considered as a self-check or learning system rather than as a mere grading tool, helping the SPs realize their strengths and weaknesses and allowing them to improve in aspects where they are weaker, in a structured and targeted manner.
(2) In the operation system of KIC, the platform is the rule maker, and it is better for the SPs to focus on developing and enhancing those competency criteria that the decision-makers consider important so as to achieve a higher ranking in the platform's evaluation system. In particular, as "skill" and "ability" are viewed as the most important competency factors, it is necessary for the SPs to master not only excellent occupational expertise but also comprehensive business operation capabilities to get ahead in the fierce competition.
(3) The KIC platform also needs to pay attention to the competency criteria weighted as the least important ones, such as team environment, when developing evaluation mechanisms and management systems. Insufficient attention to these criteria may lead to unfair or biased evaluations, as the SPs view them as essential elements that affect their business success.
This research can be extended in several ways. First, although the LDA model is widely applied in text analysis, it still has some shortcomings. As an unsupervised learning algorithm, the LDA requires no human intervention during training but is inherently limited in fully understanding natural language. Future research may use supervised learning algorithms for identifying the SPs' competency components from online interview records. Second, further longitudinal exploration can be conducted to analyze the antecedent and consequential relationships among these competency criteria. Experiments are also needed to help crowdsourcing researchers establish causality by eliminating extraneous factors and endogeneity issues. Additionally, experimental approaches may enable further explorations into situations where various management mechanisms coexist, so that more insights can be derived from their interactions and from how they could jointly work to enhance performance and customer purchasing intentions. Third, this research is limited by the single source of materials, which may result in incomplete competency identification. It might be necessary to collect and use more data from different data sources in future research to obtain more generalized and significant findings. Finally, it may be worthwhile to extend our study to other industries, other text mining techniques, and other data in future research to obtain more significant analytical results on the SPs' competency and performance in KIC.
Data Availability
The data used in this paper consist of two parts: the online interview records of SPs posted on the KIC platforms and the BWM questionnaires that were sent to the selected experts. They are available upon request to the corresponding author at wx921@163.com.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.