Utilizing Structural Network Positions to Diversify People Recommendations on Twitter

Social recommender systems, such as “Who to follow” on Twitter, utilize approaches that recommend friends of a friend or interest-wise similar people. Such algorithmic approaches have been criticized for resulting in filter bubbles and echo chambers, calling for diversity-enhancing recommendation strategies. Consequently, this article proposes a social diversification strategy for recommending potentially relevant people based on three structural positions in egocentric networks: dormant ties, mentions of mentions, and community membership. In addition to describing our analytical approach, we report an experiment with 39 Twitter users who evaluated a total of 72 recommendations from each proposed structural network position. The users were able to identify relevant connections from all recommendation groups. Yet, perceived familiarity had a strong effect on perceptions of relevance and willingness to follow up on the recommendations. The proposed strategy contributes to the design of a people recommender system that exposes users to diverse recommendations and facilitates new social ties in online social networks. In addition, we advance user-centered evaluation methods by proposing measures for subjective perceptions of people recommendations.


Introduction
Social media and social networking services such as Twitter are widely used in professional cooperation within and across organizations, helping to gain new insights and share knowledge. The functionality of recommending new connections is essential for expanding the social network and introducing new professional ties. Such people recommender systems represent the areas of social computing and social matching [1], which are argued to require careful design of the algorithmic principles [2]. Thus, people recommenders aim at influencing followership by suggesting seemingly suitable others based on user modeling and predictive analytics. The majority of existing approaches tend to support homophily bias [3], a tendency to prefer others with characteristics similar to one's own, by focusing on similarities in user-created content [4]. Another commonly used principle is triadic closure [5] in followership networks [6], which focuses on friend-of-a-friend connections. Furthermore, the "Who to follow" feature on Twitter has been found to favor already popular users and promote unidirectional network connections [7]. A recently much-discussed concern is that network-based algorithms on social media can lead to echo chambers and perpetuate social polarization [8] because they are efficient in reproducing existing connections but limited in developing new ones. Therefore, the introduction of new social ties is likely to be based on similarity to, or close social vicinity of, the active user.
Consequently, an important goal has been set to increase diversity in the recommendations [9,10], potentially decreasing human and algorithmic biases [11]. Our work highlights this goal toward diversification and heterogeneity, especially in the professional networking context where diversity is seen as a key driver for fruitful collaboration [2]. Diversifying people recommendations can enable unexpected yet valuable social encounters [12], which require alternative recommendation strategies to identify relevant people in the vast and complex Twitter network. Traditional recommender systems research seeks to optimize algorithmic accuracy and effectiveness [13,14], creating algorithms that can reproduce actors' current behavior as accurately as possible [15] rather than aiming at increasing diversity. In turn, focusing on accuracy results in a lack of user-centered research addressing the intricacies of recommendation strategies regarding the desirable degree and types of diversity exposure. To this end, understanding the users' subjective perceptions of the relevance of given diversity-oriented people recommendations is crucial.
An ongoing merger of three nearby universities provided an opportune case study for exploring a new social matching strategy on Twitter. This merger raised a need to enable cross-sectoral collaboration between scholars and stakeholders within the new university community [16]. Prior research suggests that bridging polarized intellectual communities and increasing social awareness contributes to developing creativity and innovation capabilities [17]. The pool of Twitter users following one of the to-be-merged universities represents an implicit community of interest in research and innovation with various backgrounds, disciplines, and areas of life at a specific locality. To make this community explicit, we address the untapped potential for professional social matching by introducing new connections with a diversification strategy that subscribes to the principle of balancing between similarity and diversity [2,18]. Specifically, we vary degrees of diversity in the social network structures while at the same time seeking shared interests and topics by measuring the similarity of the produced content.
To apply and evaluate the recommendation strategy in practice, we collected tweets and followership data on more than 12,000 actors who follow the Twitter account of at least one of the three universities. To remedy isolated social groups on Twitter, we suggest reshaping the social network structures rather than exposing the users to more diverse content. In contrast to prior research, which typically analyzes only followership ties [19], we also use mention-based social networks since mentions are stronger interaction indicators between actors. Such an approach allows for identifying three topology-based structural positions in the active user's egocentric network [20]-Dormant ties, Mention-of-Mention, and Community membership. While previous research touched on three structural network positions [21], in this paper, we provide their extended definition and description of the analysis procedure and present empirical findings of an online user experiment on subjective perceptions of the produced recommendations.
To empirically study the proposed diversification strategy, we set the following research question: How do recommendations based on the proposed structural positions associate with the users' subjective perceptions of relevance and willingness to follow up? Unaware of the different recommendation groups, 39 voluntary Twitter users in the target community evaluated a total of 288 recommendations (72 from each proposed structural position and one baseline group). The analysis shows that the proposed structural positions can help introduce diversity exposure in different ways: reminding users about forgotten ties, motivating them to connect with new people, and helping them enter latent communities. Thus, the paper contributes to interdisciplinary research on social and people recommender systems by proposing a nonconventional perspective for diversifying the pool of people recommendations and, prospectively, making online social networks more heterogeneous.

Related Work
We first outline existing conceptualizations of diversity and similarity within the context of interpersonal relationships and social matching. Next, we outline the existing people recommendation approaches and diversity-enhancing mechanisms on Twitter. Finally, we review research on user-centered evaluation of recommender systems.

Optimizing for Diversity or Similarity.
Concepts of similarity and diversity are two essential polarities in social participation. Driven by the natural tendency of humans to prefer similar others [22], homogeneity is preferable when establishing trustworthy and coherent relationships [23]. At the same time, diversity is vital for productive and innovative collaboration [24]. Prior research has studied the perceived diversity of social relationships [25] and explored how diversity dimensions (e.g., cognitive, physiological, and demographic differences) are addressed in Human-Computer Interaction research [26]. User-centric recommender systems research is interested in diversity as a design goal to overcome algorithmic biases [11,27] and drawbacks of personalization in information filtering [28,29]. The common conceptual aspect across prior literature is that diversity is seen as the opposite of similarity [30] and has been defined, for instance, as average dissimilarity [31], distributional inequality [32], and nonredundancy [33]. Therefore, diversity can be interpreted as a perceived difference or measurable distance between all recommendations presented to the user.
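To make the notion of "measurable distance between all recommendations" concrete, the average-dissimilarity definition mentioned above can be written compactly. This is only a minimal sketch: the recommendation list R = {r_1, ..., r_n} and the pairwise distance d are illustrative symbols assumed here, not notation taken from the cited works.

```latex
\mathrm{Div}(R) = \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} d(r_i, r_j)
```

Under this reading, a list whose members are all close to one another scores near zero, and the score grows as the recommended people become more mutually distant.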
While both similarity and diversity can be substantial, optimizing for either of them has been criticized [10]. For instance, social recommendations built on the principle of similarity might strengthen existing communities but can also lead to social polarization and echo chambers [34], hampering information flow, innovation, and creativity [35]. Conversely, extreme diversity among community members can negatively affect, for example, knowledge sharing and decision-making [36], resulting in conflicts, especially in the case of surface-level social and cultural differences (e.g., demographic qualities). Thus, researchers have investigated how to overcome or decrease the impact of these adverse effects. For instance, it has been found that actors should share common ground in terms of background qualities, values, or goals to establish fruitful relationships [37]. At the same time, professional roles, capabilities, and skills should vary [38]. Rajagopal et al. [39] suggest that matching people based on dissimilarities of attitude or opinion toward the topic of interest results in better learning experiences compared to similarity-maximizing recommendation approaches. Geared toward diversity-enhancing approaches in the social matching of scholars, researchers have proposed recommending not only very similar others but also somewhat similar and different people to extend the social circles [13].
In this study, we focus on the so-called diversity exposure [10] that refers to "the content that the audience actually selects, as opposed to all the content that is available." Diversity exposure is associated with studies on detecting expertise and opinions online to extend personal autonomy (individuals' choices) and overcome echo chambers [40] for more informed rather than polarized opinions. We contribute to the research on diversity exposure by introducing a strategy that exposes Twitter users to the diversity in their social networks. Driven by the idea that fruitful relationships benefit from both shared interest and diversity, we conduct an experiment controlling for similarity and focusing on the effects of social diversity. While the concepts of diversity and similarity are well studied regarding cognitive qualities, personality, and demographics, their manifestation on social networks remains understudied. Considering that social structure encapsulates various human biases, our approach aims to decrease their impact by enhancing diversity in the composition of individuals' social networks.

Approaches to People Recommendations.
Epistemologically, Twitter-based people recommendation approaches utilize user modeling based on data retrieved from basic features of the platform: "follow," "tweet," "mention," and "retweet" [19]. Accordingly, content-based approaches focus on analyzing textual content, such as tweets and retweets, while network-based approaches examine followership and mention relationships. Table 1 provides an overview of existing approaches for recommending people on Twitter. The most conventional approach identifies similarities in users' topics of interest content-wise and shared audience in social networks. In addition to similarity-based approaches [41], the number of followers and followees in a user profile can be used for producing recommendations based on the "popularity" dimension [42]. Recommendations can also be based on users' activities such as tweeting, mentioning, and retweeting [43]. For example, depending on the social matching scenario, the most popular and active users might be prioritized or omitted from the list of recommendations.
Since traditional recommendation approaches have been criticized for fostering human and algorithmic biases [45], recommender systems research has recently explored different approaches for diversification. In the context of people recommendations, these approaches fall into two categories. The first relates to the diversity of features: the most conventional approach, which focuses on deriving multiple user features, that is, explicit or implicit characteristics such as interests, social network, affiliation, and others. For example, Yuan et al. [46] proposed extracting contextual features such as mobility and activity for people recommendations on Twitter. Guimarães et al. [47] proposed an extension of users' features by simultaneously utilizing content-based, collaboration-based, and user-based information, thus increasing user modeling accuracy and the effectiveness of people recommendations. The second category relates to the diversity of analytical procedures, that is, utilizing hybrid analysis techniques for filtering the recommendation pool. A representative example of this approach complements content similarity with sentiment analysis to create emotion-based recommendations for matching Twitter users with shared topics and similar [48] or different [49] emotional attitudes toward them. Jacovi et al. [50] argued for mining person and content interest relationships to complement existing approaches based on similarity and familiarity. According to the authors, merely being similar or familiar with a person does not imply directional interest, yet such interest is essential for establishing new ties. We also contribute to the diversity of analytical procedures by combining different analyses for social tie identification accompanied by retrieving the topic similarity from the content of users' tweets.
In addition, we consider contextuality through boundary specification for the recommendation pool: geographically bounded shared interests serve as a common ground across members with inherent internal diversity of expertise sectors.
From the perspective of social networks, people recommendation approaches are limited to the analysis of followership networks and typically utilize triadic closure principles [19]. Smith et al. [51] argued that the nature and topology of networks on Twitter are underutilized, and conventional filtering or recommendation algorithms are trapping users in homogeneous content and social connections. Following the call for diversifying social network structures, Sanz-Cruzado and Castells [52] proposed recommending weak ties derived from dynamic interactive networks (e.g., based on retweets or mentions) of Twitter users. However, their evaluation study is based purely on comparing generated recommendations for forming connections in real life. Importantly, they do not ask users about their perceptions of the recommendations. Although we firmly subscribe to the overall goal of Sanz-Cruzado and Castells' work, we approach the network-based recommendation mechanisms and the evaluation procedure differently. We propose utilizing both followership networks to identify weak ties and mention-based networks to reveal interaction-driven weak and tacit connections. Aiming at user-centered evaluations beyond accuracy [15], we also measure subjective perceptions of recommendations from different structural network positions.

User-Centric Evaluation of Recommender Systems.
Recommender systems traditionally utilize system-centric evaluation methods and rarely assess the quality of recommendations with user-centric experiments [15,53]. System-centric methods algorithmically simulate the accuracy by comparing the estimated opinions regarding the value of recommendations with pre-built ground truth datasets [54]. User-centric evaluation collects opinions and observes behavior during the interaction with the recommender system [55]. Prior research suggests that system- and user-centric measures could lead to contradictory results [56]: recommendations that the system estimates to be relevant may not be perceived the same way by the user. Therefore, the need for operationalizing subjective measures to evaluate recommendations' quality has been raised [13].
Nevertheless, the research on defining and applying subjective evaluation measures in practice remains scarce. Existing research primarily focuses on assessing the system's objective aspects, such as interaction effort and efficacy [56]. The measures that aim to reveal users' attitudes regarding recommendations are mainly driven by the idea of evaluating trust toward the system and its functional effectiveness. For instance, measures such as perceived accuracy [55] and familiarity [57] were proposed assuming that recommendations that best match the user's interests and are perceived as familiar increase the trust toward the system and imply efficiency. The subjective measures of novelty and diversity are driven by the goal of revealing the users' satisfaction [58,59]: novel and diverse recommendations can increase the subjectively perceived usefulness.
In summary, existing evaluation measures have been proven suitable for item recommenders (suggestions on products and content). However, as objects of recommendations, people represent more complex quality criteria that can affect decision-making regarding evaluating their value. Considering social matching scenarios for professional social networking, the subjective perceptions of recommendation relevance can be influenced by a particular need for partnering. This calls for context-specific operationalization of evaluation metrics [2]. Besides, to the best of our knowledge, there are no established measures for subjective perceptions of people recommendations. This paper proposes evaluating the relevance of recommendations from two perspectives: the value of recommended people for professional activities and the usefulness of their topics. In addition, we operationalize measures for evaluating low- and high-cost follow-up activities.

Exploring and Defining Structural Network Positions
Our overall matching strategy is to introduce people who share similar interests based on the tweets' content (e.g., shared scientific interests) but have only an indirect or inactive connection in the social network. Thus, by controlling content similarity, we can compare subjective perceptions of recommendations based on the different structural positions defined as follows:
(i) Dormant tie (Dorm): reintroducing an existing followee with whom the user did not have any explicit interactions; as a recommendation mechanism, this could remind about possibly ignored or forgotten ties [60];
(ii) Mentions-of-mentions (MoM): a friend-of-a-friend type of connection [61] in the mention network; in contrast to typical followership networks, the mention network is based on more explicit interactions;
(iii) Community membership (Com): a user identified to belong to the same community cluster [62]; this could introduce new people in a computationally identified network cluster with no explicit followership and mention-based ties;
(iv) The rest of the population (Rest) as a baseline condition: all the other users who follow at least one of the institutional accounts, considered as the most random source of recommendations.
In the following, we describe the procedure for exploring and defining the three structural positions as a potential recommendation strategy, including details on data collection, data processing, and analysis methods.

Data Source, Cleaning, and Preprocessing.
We used the official Twitter API to collect followers of the to-be-merged universities and their recent tweets (see Figure 1, step 1). The raw Twitter data were stored in a MongoDB database (https://www.mongodb.com/), whose flexible data model allows development without a predefined data schema. The system is implemented in Python, and we use the PyMongo package (https://pypi.org/project/pymongo/) to communicate with the MongoDB database. We collected tweets and followership data using a separate module for each task, "GetTweets" and "GetFollowers," respectively. The preprocessing phase handles the tweet text cleaning task (see Figure 1, step 2) in three stages: (1) converting letters to lowercase, (2) removing the English stop words, and (3) removing the nonletter characters and URLs. Since we aimed to generate person-to-person recommendations, preprocessing also consisted of manually filtering out the organizational accounts (e.g., Twitter profiles of local companies).

Table 1: Existing approaches for recommending people on Twitter.
| Approach | Type of recommended users |
| --- | --- |
| Similarity [41] | Users who share interests (tweet content) or audience (network) |
| Triadic closure [6,19] | If A & B both follow C, recommend A to B (or vice versa); if A follows both B & C, recommend B to C; if A follows B and B follows C, recommend C to A |
| Popularity [42,43] | Users who have many followers or followees |
| Activity [43] | Users who frequently tweet, mention, retweet, etc. |
| Reciprocity [44] | One's followers |
| Activity & followership [44] | Users whose tweets are retweeted by followees of the target user |

Next, we generated models of target users (see Figure 1, step 3) by collecting the statistical information regarding followers of universities' accounts, including the total number of the user's tweets, the number of languages in use, and the number of tweets in each language. An index is created for each user in the database. The text corpus from all the cleaned tweets of a user is collected as the corpus profile ("corp_profile"). We selected users who follow at least one of the selected four university-related Twitter accounts in Tampere, Finland. For each follower, we collected their 500 most recent tweets.
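The three text-cleaning stages described in this section can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the regular expressions, and the tiny stop-word list are assumptions made for the example. The sketch also strips URLs before other nonletter characters and removes stop words last, so that URL fragments and punctuation-attached stop words do not survive as tokens.

```python
import re

# Tiny illustrative stop-word list (assumption); a real pipeline would use
# a full English stop-word list.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "is", "this", "of", "to", "in", "on", "for"}
)
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Return the cleaned tokens of one tweet."""
    text = text.lower()                  # stage 1: lowercase
    text = URL_RE.sub(" ", text)         # stage 3a: drop URLs
    text = NON_LETTER_RE.sub(" ", text)  # stage 3b: drop nonletter characters
    # stage 2: drop English stop words
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def build_corpus_profile(tweets):
    """Concatenate the cleaned tokens of all tweets of one user."""
    profile = []
    for tweet in tweets:
        profile.extend(clean_tweet(tweet))
    return profile
```

For example, `clean_tweet("Check THIS out: https://t.co/abc #AI is the future!")` yields the tokens `check`, `out`, `ai`, and `future`.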
The collected dataset consisted of 12,809 distinct followers and 3,523,397 tweets. The content analysis could only be done within one language corpus because of a lack of analysis procedures supporting the local language. Therefore, we excluded Twitter profiles that contain fewer than 50 English tweets (out of the most recent 500) from the analysis and the pool of potential recommendations. It is noteworthy that English is actively and proficiently used by most of the users in the dataset. The tweet language distribution is as follows: 58.70% are Finnish, 31.06% are English, and 10.24% are other languages. As a result, the final dataset comprises 4,474 users and 933,785 English tweets.
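The language-based exclusion rule can be illustrated with a short sketch. It assumes that each user model is a dictionary carrying per-language tweet counts under the key "tweet_lan_stats", as in the user model example of Figure 1; the function name and the sample records are hypothetical.

```python
MIN_ENGLISH_TWEETS = 50  # threshold stated in the text


def has_enough_english(user, min_english=MIN_ENGLISH_TWEETS):
    """True if the user has at least `min_english` English tweets."""
    return user.get("tweet_lan_stats", {}).get("en", 0) >= min_english


# Hypothetical user models for illustration only.
users = [
    {"screen_name": "alpha", "tweet_lan_stats": {"en": 136, "fi": 364}},
    {"screen_name": "beta", "tweet_lan_stats": {"en": 12, "fi": 488}},
    {"screen_name": "gamma", "tweet_lan_stats": {"fi": 500}},
]
eligible = [u["screen_name"] for u in users if has_enough_english(u)]
```

Here only the first hypothetical user remains in the analysis pool.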

Social Network and Content Analyses.
The data analysis (see Figure 1, step 4) comprises building mention-based and egocentric followership networks for detecting structural network positions and content analysis to obtain cosine distances for measuring the content similarity. To identify structural network positions (see Figure 1, step 5), we utilized the NetworkX Python package (https://networkx.github.io). NetworkX allows modeling, manipulating, and analyzing the structure of networks by specifying nodes and edges between them. In our use case, the nodes represent distinct actors in the Twitter network, and the edges illustrate the relationships between them. We created a directed mention network from the collected dataset and used it to identify the MoM and Com groups. The Mentions-of-mentions group follows the triadic closure principle: if user A directly mentions users B and C, then C would be recommended to B and vice versa. For the Community membership, we applied the Louvain Modularity algorithm [63] directly to the mention-based network to detect groups of strongly related users without an explicit connection. The Dorm structural position is identified by combining the mention network with the followership network, which is populated directly from the Twitter API. An edge connecting two nodes in the followership network represents an existing followership link between two users on the platform.
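The MoM and Dorm positions can be illustrated with plain Python sets. This is a hedged sketch rather than the NetworkX-based implementation used in the study: edges are given as (source, target) tuples, the function names are hypothetical, and two details are assumptions that the text does not spell out, namely that MoM candidates with a direct mention or followership tie to the ego are excluded, and that an "explicit interaction" means a mention in either direction. Community detection with the Louvain algorithm is not reproduced here.

```python
from collections import defaultdict


def mention_of_mention(mentions, follows, ego):
    """Users co-mentioned with `ego` by a third user, without a direct tie."""
    mentioned_by = defaultdict(set)  # author -> users the author mentions
    for author, target in mentions:
        mentioned_by[author].add(target)

    # Direct ties of the ego: mentions in either direction plus followees.
    direct = {t for a, t in mentions if a == ego}
    direct |= {a for a, t in mentions if t == ego}
    direct |= {t for a, t in follows if a == ego}

    candidates = set()
    for author, targets in mentioned_by.items():
        if author != ego and ego in targets:
            candidates |= targets  # everyone this author also mentions
    return candidates - direct - {ego}


def dormant_ties(mentions, follows, ego):
    """Followees of `ego` with no mention between them in either direction."""
    followees = {t for a, t in follows if a == ego}
    interacted = {t for a, t in mentions if a == ego}
    interacted |= {a for a, t in mentions if t == ego}
    return followees - interacted


# Hypothetical toy data: user "a" mentions b, c, and d; b mentions d.
mentions = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("x", "y")]
follows = [("b", "e"), ("b", "d"), ("b", "f")]
mom = mention_of_mention(mentions, follows, "b")
dorm = dormant_ties(mentions, follows, "b")
```

In the toy data, c is a MoM candidate for b (both are mentioned by a), while d is excluded because b already mentions and follows d; e and f are dormant ties because b follows them without any mention.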
For the experiment stage, we required users to have at least three potential recommendations for each structural network position and applied the filter accordingly (see Figure 1, step 6).
[Figure 1, step 3 (Generate target users model): example user document with the fields "_id", "name", "screen_name", "id", "tweet_lan_stats", "twet_num", "twet_lan_num", and "corp_profile".]

We run the content-based similarity analysis utilizing the unsupervised topic modeling technique Latent Dirichlet Allocation (LDA) [64]. Each topic in the LDA model is a multinomial probability distribution over words. Given a document, in our case a user's corpus profile, the LDA model can calculate the probability of each topic in that document. Thus, a vector of LDA topic representation can be generated for each document, where the corpus comprises documents t and a given number of topics Z. In our case, one document consists of the set of tweets of one user. LDA defines each topic z over a set of n-grams w. Accordingly, the process can be formulated as follows:

$$P(w \mid t) = \sum_{z=1}^{Z} P(w \mid z)\, P(z \mid t)$$

Next, we use cosine distance to measure the similarity between two users, as each of them has their own LDA topic vector. The lower the cosine distance value, the higher the similarity. The cosine distance measures the dissimilarity between two nonzero vectors. Given two vectors u and v, the cosine distance between them is calculated as follows:

$$d_{\cos}(u, v) = 1 - \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$

Therefore, content analysis (LDA + cosine distance) allows sorting the recommendations within each group of structural positions from the most to the least similar in relation to the target user (see Figure 1, step 7). Next, for each follower of the target university accounts, the top three content-wise similar users were identified within each structural position group (see Figure 1 and Table 2). The numbers vary between users, depending on the number of followees and activity on Twitter (see examples in Figure 2).
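The similarity ranking described in this section can be sketched in a few lines. The example assumes that LDA topic vectors are already available as plain lists of probabilities; the function names, the candidate names, and the three-topic vectors are hypothetical and serve only to show how candidates within one structural position group could be sorted by cosine distance.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def top_similar(target_vec, candidates, k=3):
    """Names of the k candidates closest to the target (smallest distance)."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_distance(target_vec, item[1]),
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical topic vectors over three topics.
target = [0.7, 0.2, 0.1]
group = {
    "user_a": [0.7, 0.2, 0.1],
    "user_b": [0.1, 0.2, 0.7],
    "user_c": [0.6, 0.3, 0.1],
    "user_d": [0.0, 0.1, 0.9],
}
top_three = top_similar(target, group)
```

With these toy vectors, the ranking is user_a, user_c, user_b, which mirrors the rule that a lower cosine distance means a higher similarity.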
We aimed to provide a minimum of four and a maximum of twelve recommendations for each participant (1-3 from each group), which introduced the requirement for eligible respondents to have at least three other Twitter users in each structural position. Of the 4,474 users, 574 met this requirement. Evaluating one set (4 recommendations) was mandatory for each participant, and the other two sets of four recommendations were voluntary. By providing up to three sets of recommendations for each participant, we wanted to achieve a higher number of evaluations for each structural network position. Some respondents evaluated all three sets of recommendations (12 in total), some only one or two sets (4 or 8 in total). This procedure resulted in subjective perceptions of 72 recommendations from each structural network position, which allowed comparing them statistically.
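The set structure can be made concrete with a short sketch. It assumes, purely for illustration, that set i contains the i-th most similar user from each of the four groups and that the order within a set is shuffled (the paper reports randomizing the order within each set); how the sets were actually composed is not stated in the text, and all names below are hypothetical.

```python
import random

GROUPS = ("Dorm", "MoM", "Com", "Rest")


def build_recommendation_sets(ranked_by_group, max_sets=3, seed=None):
    """Build up to `max_sets` sets holding one recommendation per group."""
    rng = random.Random(seed)
    n_sets = min(max_sets, *(len(ranked_by_group[g]) for g in GROUPS))
    sets = []
    for i in range(n_sets):
        one_set = [(group, ranked_by_group[group][i]) for group in GROUPS]
        rng.shuffle(one_set)  # hide the group order from the respondent
        sets.append(one_set)
    return sets


# Hypothetical ranked candidates (most similar first) for one participant.
ranked = {
    "Dorm": ["d1", "d2", "d3"],
    "MoM": ["m1", "m2", "m3"],
    "Com": ["c1", "c2", "c3"],
    "Rest": ["r1", "r2", "r3"],
}
rec_sets = build_recommendation_sets(ranked, seed=7)
```

Each of the three resulting sets holds four recommendations, one per structural position, so a participant who completes all sets evaluates twelve people.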

Procedure.
The evaluation of the proposed structural network positions was carried out with two online surveys deployed on Google Forms: (1) a background questionnaire querying demographics and participation consent; (2) a survey with a personalized list of four other Twitter users (see Figure 3) and a set of questions to evaluate the recommendations. We chose to use Google Forms because it allows scripting-based automation to generate personalized surveys with tailored lists of recommendations.
The respondents were given no information about the types of structural network positions or why these individuals were particularly recommended to them. The order of recommendations was randomized within each set. The evaluation survey measured several subjective constructs, including perceived familiarity and perceived relevance of the recommendations and one's willingness to follow up on them. No existing subjective measurements for perceived relevance could be found in the literature, particularly for people recommendations. Therefore, the statements were operationalized based on the authors' personal experiences and insights on academic collaboration and user experience evaluation. Initially, over 20 candidate items were iteratively

Recruitment and Respondents.
We subscribed to the importance of research integrity and followed the policies provided by the National Ethical Committee in Finland. Accordingly, a study does not require an ethical review if it includes informed consent and does not involve any of the following: underage subjects, exposure to strong stimuli, potential long-term mental distress, or intervention with the physical integrity of participants. The study was identified as low-risk and, hence, did not require an ethical review. The participants were provided with a consent form that included a link to a detailed ethics disclaimer explaining the integrity and data management principles. We invited eligible participants over e-mail. The targeted participants' contact information was publicly available on their Twitter profiles, and anyone could access it. The invitation consisted of a short description of the study, including links to the Background and Consent survey and a detailed ethics disclaimer. As an incentive for participation, we organized a raffle of Amazon vouchers.

Along with the background information, we queried the respondents' typical behavior and attitudes to professional social networking with 7-point Likert statements (see Figure 4). On average, the respondents frequently network with other people, maintain their networks, and are typically careful in choosing with whom to network. From a professional perspective, their occupation typically requires intensive collaboration. Being successful in their work depends on established social ties, and they use Twitter to support their professional networks. This implies that social networking is an essential element in their professional lives. The majority also indicated that they mostly interact with like-minded people at work.

Survey Data Analysis.
The collected responses were imported to SPSS for statistical analysis, addressing two objectives. First, we tested if the proposed structural positions are perceived to be different from each other and could thus serve as alternative analytical mechanisms. The evaluation has a thrice-repeated within-subjects design with four categorical data points per respondent. As the collected data are ordinal, we utilized a nonparametric Friedman test with Bonferroni-corrected pairwise comparisons to measure statistical differences. The input data for the Friedman test were in a rank format, where rank represents the frequency of each Likert scale value per evaluation statement. The second objective was to identify correlations among experimental variables. We were particularly interested in revealing whether perceived familiarity or attitude toward social and professional networking correlates with perceived relevance and willingness to follow up on the recommendation. We utilized a nonparametric bivariate Spearman correlation test as the collected data are ordinal (Likert scale).

[Figure: evaluation survey introduction. "Survey about potential connections on Twitter. Dear participant, Below you can see a personalized list of potential valuable connections. First, take a look at all the profiles. When you are ready to give your opinion on each of them, please proceed to the next section. Feel free to make notes while getting familiar with their profiles. At the end of the survey, we will ask you which one of them was the most interesting to you."]
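To illustrate the two nonparametric tests named in the survey data analysis, the following sketch computes the Friedman chi-square statistic and Spearman's rank correlation in plain Python. It is not the SPSS procedure used in the study: the function names and toy ratings are hypothetical, no p-values or Bonferroni-corrected pairwise comparisons are computed, and the Friedman statistic is shown without the tie correction that statistical packages typically also apply.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for n respondents (rows) by k conditions (columns)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    total = sum(r * r for r in rank_sums)
    return 12 * total / (n * k * (k + 1)) - 3 * n * (k + 1)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined if either input is constant


# Hypothetical Likert ratings: three respondents, four conditions each.
ratings = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
q = friedman_statistic(ratings)
rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 5, 7, 9])
```

In this toy case every respondent orders the four conditions identically, which gives the largest possible Friedman statistic for three respondents and four conditions, and the two rating lists are perfectly monotone, so Spearman's rho equals one.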

Findings
We first provide results on objective measurements of similarity and respondents' perceived familiarity with recommended Twitter users, followed by subjective perceptions regarding the relevance of the recommendations. Next, we describe the respondents' readiness to engage in follow-up activities with the recommended people. Finally, we report bivariate correlation tests on associations between various variables, which provide additional insights and future research directions. In all the subsections, we first report statistical results and continue to present related qualitative findings.

Objective Measures of Similarity.
The respondents were unaware of the cosine distances (similarity measures) to avoid biased evaluations of recommendations based on the structural network positions. We aimed to pick recommendations with cosine distances as equal as possible (i.e., the smallest possible variance in terms of content similarity). However, the measures are personal for each respondent and depend on the size of the recommendation pool. Since the Com and Rest recommendation pools are larger, there is a higher probability of having potential recommendations with a smaller cosine distance (see Figure 5(a)). The recommendation pools of Dorm and MoM are considerably smaller, and therefore, on average, the distance is higher. The scatterplot of respondents' scores given to recommendations over the cosine distance values demonstrates a somewhat random distribution, indicating no dependencies between them (see Figure 5(b)). The correlation test reported in Section 5.5 further supports this observation. Therefore, we argue that the slight variance in content similarity does not prevent comparing different structural position groups.
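As an illustration of the similarity measure discussed above, the sketch below computes a cosine distance between two sparse term-count profiles. The profile representation (a dictionary of term counts), the example terms, and the handling of empty profiles are assumptions for this example and are not taken from the study's pipeline.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-count vectors (dicts)."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty profile is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical term counts for a respondent and three candidate profiles.
ego_tags = {"hci": 3, "cscw": 1}
same_topics = {"hci": 6, "cscw": 2}    # same proportions, so distance is near 0
partial_overlap = {"hci": 1, "ai": 1}  # shares only one term
unrelated = {"football": 2}            # no shared terms, so distance is 1

for name, profile in [("same", same_topics), ("partial", partial_overlap), ("unrelated", unrelated)]:
    print(name, round(cosine_distance(ego_tags, profile), 3))
```

Because the measure depends only on the direction of the count vectors, a candidate who uses the same terms in the same proportions is at distance zero regardless of how much they tweet.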

Differences in Perceived Familiarity across the Structural Positions.
The descriptive statistics confirm that the most familiar recommendations belong to the Dorm group (see Figure 6). Of the recommendations from the Dorm structural network position, 44% fall into the category of either familiar or very familiar, 18% are somewhat familiar, and the remaining 38% are either unfamiliar or very unfamiliar. The other three structural groups primarily consist of unfamiliar people, with few outliers. In the MoM group, 71% of the recommended Twitter users were regarded as unfamiliar, 14% as familiar, and 15% as somewhat familiar. The recommendations from the Com and Rest groups have similar proportions of unfamiliar people: 94% and 91%, respectively.
The Friedman test indicated a statistically significant variation in respondents' ratings of perceived familiarity across the structural positions. The pairwise comparison identified substantial differences in the evaluations of the Dorm group versus the other groups.
As expected, in the open-ended questions, many respondents stated that they already follow many of the recommended Twitter profiles that belong to the Dorm group. Although the respondents might be aware of the recommended person, the analysis of the social network structures revealed a lack of explicit interactions between them. The feedback in the open-ended questions also supported the cases of users being unfamiliar with their followees. A relatively large number of unfamiliar followees in the Dorm group could imply that followership is indeed a weak indicator of an actual social relationship and familiarity. As the act of following is typically a low-cost action, it might even be hard for users to keep track of and maintain their connections, especially when the number exceeds a thousand.

While the Twitter user interface allows seeing followership relationships, it is more challenging for users to discover connections based on mentions or, especially, mentions-of-mentions. Only a few respondents recognized that they have a bridging tie with the recommendations from the MoM group. For instance, one noticed that they have a shared professional connection with the recommended person: (MoM) "The person and her tweets are really interesting for me. She is perhaps the only one of the groups I find likely to contact and discuss future research collaboration. [...] Profile appears approachable, and she has been apparently already collaborating with some people I know." (R1, Principal Research Scientist; 2,566 followers, 3,120 followees)
Being unaware of the different structural network positions, the respondents were positively surprised by receiving many unfamiliar and diverse recommendations. In what follows, the findings demonstrate that being familiar with a person seems to increase the perception of relevance and the willingness to follow up on recommendations.

Differences in Perceived Relevance across the Structural Positions.
The respondents' subjective perceptions of recommendations provided additional confirmation of the distinct nature of the three proposed structural positions.
There is an apparent prevalence of positive attitudes toward recommendations from the Dorm and MoM groups in the evaluations of both content relevance and professional relevance (see Figure 7). The Dorm group is perceived as the most favorable, while in the evaluations of content relevance, opinions regarding recommendations from the Com and Rest groups are split in half. In the evaluations of professional relevance, negative scores prevail.

Com, and Dorm and Rest. Interestingly, the difference between MoM and Com is strongly significant only in evaluating whether the topics of the recommended person are of interest to the participant (Statement 1 in Figure 7).
As for the qualitative feedback, when rationalizing the relevance of recommendations, the respondents often address the importance of having similar topics of interest. There is a clear positive tone in the qualitative feedback regarding recommendations from the Dorm and MoM groups. As addressed earlier, familiarity plays a significant role in evaluating relevance, and respondents often start their rationalization by explicating an existing connection with the person, if there is any. In the following example, the followership relationship between the respondent and the recommended person started after a face-to-face encounter at a conference: (Dorm) "Lively, energetic, knows a lot about a host of topics, loves traveling. She is somebody I met at a conference a couple of years ago, and we have been in touch on social media as well." (R3, Senior Lecturer; 295 followers, 584 followees)
In the next example, the respondent highlights that the recommended person is unfamiliar yet addresses the relevance of topics and the benefit of making a professional connection with a person from another university: (MoM) "Seems active, topics relevant to me. I did not know him probably because he is in a "distant" university; it is always good to know new people from other universities." (R13, University Researcher; 953 followers, 1,010 followees)
When evaluating recommendations from the Com and Rest groups, the respondents seem to consider a variety of dimensions. For instance, the activeness of users in publishing tweets and their self-representation also play an important role in choosing whom to follow or with whom to interact. None of the respondents identified being socially connected with recommendations of the Com group, yet one respondent noticed the size of the community they share with the recommended person.
To sum up, the data imply that the perceptions of relevance vary across different structural positions, thus supporting the objectively observed differences between the proposed network-based matching mechanisms. Yet, relevance seems to be a weak motivator for respondents to follow up and interact with recommended Twitter users, as discussed next.

Willingness to Follow-Up on Recommendations across the Structural Positions.
The assessments of follow-up activities (see Figure 8) illustrate the weakest difference between structural positions, meaning that respondents are less open toward social interactions regardless of the level of recommendation relevance. Passively exploring tweets (statement 7) of recommended people from the Dorm and MoM groups is positively perceived, while other activities brought up primarily negative attitudes. The Friedman test demonstrated statistically significant differences in the evaluation of recommendations from the three structural positions within all variables. The pairwise comparisons reveal apparent differences between the Dorm and Com, Dorm and Rest, and MoM and Rest structural network positions regarding the intention to continue reading the recommended person's tweets (statement 7). There is a statistically significant difference between the Dorm and Com, as well as the Dorm and Rest, groups regarding the attitude toward mentioning recommended people. Other activities do not show highly significant differences.
The respondents also addressed challenges in estimating the relevance of received recommendations and their intention for follow-up activities in the open-ended responses. For instance, one respondent mentioned that decision-making on the interestingness of a recommendation might be affected by the overall sympathy toward a Twitter user, making it challenging to draw a line between personal and professional interests. According to one respondent's reasoning, as Twitter is designed mainly for distributing knowledge, it was challenging to envision social interactions with a recommended person beyond the features that the platform offers. Even though a few respondents were somewhat positive about recommendations from the baseline group (Rest), being unfamiliar with the recommended person seemed to play a role in determining the willingness to follow up: "Quite general Twitter profile. Professional but also other content as well, such as news. I liked how she tweeted about the academia/academic world, although we do not work in the same field of study. I would follow her if I knew her somehow other than through Twitter. She seems nice and relatively active." (R12, Researcher; 448 followers, 1,260 followees)
In summary, while respondents expressed readiness to engage in low-cost interactions, such as exploring the recommended profiles or starting to follow them, they were hesitant to consider initiating interaction beyond the Twitter platform so soon after seeing the recommendation. This is particularly the case if the actions require face-to-face interactions or direct contact. This is understandable given the short time frame for exploring the costs and benefits of potential social interaction and the limited view of the recommended person's profile. Besides, as one of the respondents admitted, Twitter is perceived as a platform for passive social behavior to broadcast and consume content. The users are accustomed to Twitter not providing features to extend the interactions to other channels.

Correlations across the Evaluation Variables.
In addition to looking at how the evaluations differ across predefined structural positions, we explored various statistical associations between the variables to identify future research questions.
The Spearman test revealed several statistically significant positive correlations (see Table 3). In particular, perceived familiarity positively correlates with all the other variables on subjective evaluations, especially with the perception of professional interest (statement 4) and follow-up activities such as an intention to mention the recommended Twitter user (statement 10). The respondents' background variables and social networking attitudes, such as frequency of socialization and activeness of maintaining existing ties, demonstrate a relatively strong correlation, particularly with the willingness to follow up on recommendations (statements 8-12). In addition, the test indicates that the scores of high-cost follow-up activities (statements 10-12) increase with higher ratings of Twitter use for professional networking. The test results also imply no dependencies between objective measures of similarity and subjective perceptions of recommendations, which was the intended outcome. Overall, while correlation tests do not establish causal relations between the variables, the test hints at interesting statistical associations that should be investigated in more detail, for instance, to cover not only correlations between the recommendation evaluations but also personality and attitude-related aspects.
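For readers who want to see what the rank correlation measures, the following is a minimal sketch of Spearman's coefficient computed as a Pearson correlation over tie-averaged ranks. It uses plain Python instead of SPSS, reports no p-value, and the variable names and Likert scores are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    return [
        sum(u < v for u in values) + (sum(u == v for u in values) + 1) / 2
        for v in values
    ]


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (spread_x * spread_y)


# Made-up Likert ratings from five evaluations (not data from the study).
familiarity = [1, 2, 2, 5, 7]
mention_intent = [2, 3, 3, 6, 7]
rho = spearman_rho(familiarity, mention_intent)
print(f"Spearman rho = {rho:.2f}")
```

Averaging the ranks of tied values matters for Likert data, where many respondents give identical scores.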

Discussion
Diversity has been a central concept in the design of information systems and social technologies, particularly in CSCW and HCI research, which explores its different forms (e.g., cognitive, physiological, demographic) for more inclusive and accessible technologies.
This paper extended the discourse around diversity by focusing on the importance of structural diversity in social networks. We proposed identifying structural network positions in a multidimensional space of social networks, allowing the exposure of Twitter users to a variety of potential connections that they would otherwise likely miss.

In answer to our research question, the findings illustrate that recommendations from the different structural positions are perceived differently. Thus, both the objective measurements and subjective perceptions indicate the distinct nature of the proposed three structural positions. The respondents' relatively positive evaluations of relevance suggest that the proposed recommendation strategy is a meaningful approach for diversifying people recommendations.
Furthermore, the fact that the respondents could identify relevant others from all groups implies that various internal and external factors might influence the subjective perceptions. It provides evidence that identifying recommendations within a latent community of interest (academic institutions at a specific locality in this case) is a promising approach to boundary specification. Thus, the proposed structural positions can help introduce diversity exposure in different ways, prospectively suggesting connections that Twitter users would otherwise overlook.

Contributions and
At the same time, the findings indicate a significant role of familiarity in the subjective evaluation of recommendations. As expected, most recommended Twitter users were unfamiliar to the respondents. However, there were some familiar people, even in the "community membership" and the "rest" groups. This could be explained by the empirical context of the experiment, where boundary specification was based on the followership of the selected institutional Twitter accounts bound to a locality. We assume that this would have a strengthening effect on perceived familiarity. Research in social psychology has also revealed that homophily bias increases the perception of familiarity [65]: the stronger the perceived similarity, the more preferable and familiar the person would seem to be.
A relatively large number of unfamiliar followees in the Dorm group could imply that followership is indeed a weak indicator of actual social relationships. Thus, even though there is an established followership link, it is worthwhile to remind users about people belonging to the Dorm structural network position, which could result in more explicit social interaction. The presence of a relatively high number of outliers in the familiarity evaluation can also be explained by some respondents having a large number of followers and followees, not all of whom they can realistically remember. It has been shown that cognitive and temporal limitations prevent people from maintaining the number and quality of their relationships [66,67]. Besides, interactions on social media might create a false sense of connection [68] and do not match with offline relationships or predict the degree of familiarity between the actors.
In addition, this study contributes to user-centric evaluation methods in the context of people recommender systems. Prior research on evaluating recommender systems is largely built on the assumption that the more accurate the algorithms, the better the user experience [15]. While this approach is useful in evaluating item recommendations (e.g., products or multimedia content), recommending people involves a different notion of recommendation quality. The operationalized measures of subjective perceptions presented in this work can be utilized in future research on evaluating the relevance and familiarity of social recommendations. However, measuring the potential follow-up activities beyond the intention to start following the recommended person worked poorly: the findings illustrate that respondents are generally not interested in speculating about high-cost follow-up actions (e.g., face-to-face meetings).
Thus, it is questionable to utilize such a measure as an indicator of recommendation quality, at least in the context of controlled experiments. That said, we acknowledge an inherent challenge in measuring the relevance of people recommendations. As the benefits of more heterogeneous social networks only surface over time, measuring the immediate impression of the relevance of a recommendation will likely not reflect its long-term value as a connection.
6.2. Practical Implications.
Existing recommendation mechanisms shape the choices people make, influencing not only the diversity of interests and opinions but also social structures [69]. Mindful of the threat that such a high agency could strengthen structural issues like polarization and echo chambers, we believe that the recommendation strategy proposed in this article is worth pursuing in practical applications. Notably, the proposed structural network positions could contribute to systems design that can lead to more diverse exposure in individuals' social networks. The analysis procedures could be transferred to many other social media platforms; after all, the proposed content analysis is not limited to hashtags, and the notions of followership and mentions are common in other services as well.
When applying the proposed strategy and the structural network positions in a real-life people recommender system, the restrictions on eligibility criteria for the users, driven by the experiment setup, can naturally be disregarded. There might be scenarios in which Twitter users do not have enough recommendations from the Dorm type of structural position. However, our findings demonstrated that the MoM and Com groups result in numerous options even for users with a small number of followers and followees. The effectiveness of the proposed strategy can be strengthened by increasing transparency regarding the recommendation logic in the actual system and explicating the potential value of recommendations from each group. This, in turn, might also improve the willingness to follow up on them.

Limitations and Future Work
Experimental Setup and Generalizability. The conducted experiment naturally comes with limitations that can affect the validity and reliability of the findings. First, although the presented diversity-enhancing recommendation strategy seems promising, we could not yet compare it with other strategies in this pioneering study. Thus, the assessment of the merit of the recommendation strategy remains quite preliminary based on this experiment. In future work, a comparison with conventional recommendation algorithms would show how the strategy performs in relation to the currently used standards.
In addition, we only tried three calibrations within the proposed strategy, with a sample size limited by practicality and data availability. Regarding generalizability, the respondents mostly represent the same geographical area and cultural background due to the selected focus of introducing users who feel some affinity to the to-be-merged universities. The sample of participants for the experiment might also be considered biased, as the respondents' numbers of followees and followers are higher than those of an average Twitter user. Large-scale studies and comparisons against different baseline recommendations are required to prove the effectiveness of the overall matching strategy and the structural positions as recommendation mechanisms. Nevertheless, as the paper lays the groundwork for a new diversification strategy, it is essential to show that such an alternative strategy is sensible from the users' viewpoint and technically feasible before comparing it with others. We call for follow-up research to also compare the effectiveness of current and other alternative algorithmic approaches.
Network Analysis. Our recommendation approach utilizes social network analysis, which has been critiqued due to several issues [70]. First, modeling networks is limited in terms of deriving personal roles and interpersonal experiences. It is an oversimplification to reduce multifaceted and dynamic social relationships into network structures with simple node and edge features. Second, network-based analysis can hardly reveal a full and truthful picture of real-world relationships. Nevertheless, our study demonstrates that it is possible to analyze Twitter social networks in new ways that can advance the social matching of individuals.
Content Analysis. Due to the lack of accurate content analysis procedures for the local language, the content analysis was limited to English tweets. In addition, we did not distinguish between personally created tweets and retweeted tweets. These factors might have decreased the accuracy of representing an individual's topics of interest, which, in turn, could affect the subjective perceptions of the recommendations.
This study also does not consider the segmentation of the participants according to the types of Twitter use or personality traits [71]. However, the results of the correlation tests imply that such factors can be relevant. This opens avenues for investigating various personality-related and other background variables that might affect the perceived relevance of recommendations and readiness to follow up on them.

Conclusion
Despite the extensive prior research on people recommendations on Twitter, the social network structure perspective has been generally underutilized. To address this gap, we proposed a new recommendation strategy for a Twitter-based implicit community of interest, prospectively producing more heterogeneous people recommendations and positively diversifying the users' social networks. The novelty of the proposed strategy lies in combining mention-based and followership-based networks to identify different types of structural network positions: dormant (followership with no explicit interactions), mention-of-mention (a friend-of-friend connection in the mention network), and community-based (users belong to a shared community with no explicit followership and interactions). The findings illustrate that the proposed structural positions are indeed distinct from each other and that the respondents could find relevant users from all groups. However, the willingness to follow up on recommendations is relatively low and primarily driven by perceived familiarity. We call for more design-oriented research to identify solutions that could increase the probability of follow-up actions. We conclude that a more comprehensive analysis of social networks and more human-centric methods to evaluate recommendations are necessary to improve the benefits and effectiveness of people recommendations on Twitter and other social networking services.
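The three structural positions summarized above can be illustrated with a small sketch that assigns a candidate to a group from followership, mention, and community data. This is only one possible reading of the definitions: the data structures, the undirected treatment of edges, the order in which the rules are checked, and all names are assumptions for illustration and do not reproduce the study's implementation.

```python
def linked(net, a, b):
    """True if an edge exists between a and b in either direction."""
    return b in net.get(a, set()) or a in net.get(b, set())


def neighbors(net, a):
    """All users adjacent to a, ignoring edge direction."""
    out = set(net.get(a, set()))
    out.update(user for user, targets in net.items() if a in targets)
    return out


def structural_position(ego, cand, follows, mentions, community):
    """Assign a candidate to Dorm, MoM, Com, or Rest (assumed rule order)."""
    if linked(mentions, ego, cand):
        return None  # explicit interaction already exists; not a candidate here
    if linked(follows, ego, cand):
        return "Dorm"  # followership without explicit interaction
    if any(linked(mentions, mid, cand) for mid in neighbors(mentions, ego) - {cand}):
        return "MoM"  # friend-of-friend in the mention network
    if community.get(ego) is not None and community.get(ego) == community.get(cand):
        return "Com"  # shared community, no followership or interaction
    return "Rest"  # baseline group


# Hypothetical toy networks: user -> set of users they follow or mention.
follows = {"ego": {"ann"}}
mentions = {"ego": {"bob"}, "bob": {"cat"}}
community = {"ego": 1, "ann": 1, "dan": 1, "eve": 2}

for user in ["ann", "bob", "cat", "dan", "eve"]:
    print(user, structural_position("ego", user, follows, mentions, community))
```

Checking the mention link first keeps users with existing interactions out of all groups, and checking followership before mentions-of-mentions means a followed but never-mentioned user counts as dormant even when a bridging mention exists.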
Data Availability
The data generated or analyzed during this study are included in this article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.