Development of the Short-Form Yin Deficiency Scale Using Three Item Reduction Approaches

Background Yin deficiency (YD) is a pathological condition characterized by emaciation, afternoon fever, dry mouth, and night sweats. The incidence of YD is 23.3%. A 27-item Yin Deficiency Scale (YDS) was developed to estimate the clinical severity of YD. This study aimed to develop three short-form YDS versions to reduce the burden of response time, using three item-reduction approaches: Rasch, equidiscriminatory item-total correlation (EITC), and factor-based analyses. Methods Two datasets were analyzed from previous studies (169 outpatients from May to June 2009 and 237 healthy college students from January to April 2016). The optimal response category was examined using Rasch analysis. Items with higher item-total correlations were determined using the EITC. Using a factor-based approach, the items were reduced, while maintaining the original YDS construct. Reliability was estimated using the person separation index (PSI) and Cronbach's α values. The predictive accuracy was examined using the area under the curve (AUC). Finally, the relationship between YD and dysfunctional breathing (DB) was examined using factor scores from the YDS and the Korean version of the Nijmegen Questionnaire (KNQ). Results We developed two 14-item YDS versions using the Rasch and EITC approaches, and a 16-item YDS version using a factor-based approach. Rasch analysis suggested an optimal response category of five points. The PSI of Rasch and Cronbach's α of the EITC and factor-based versions were 2.19, 0.855, and 0.827. The AUCs of the three short-form YDS were 0.812, 0.811, and 0.818. The sensitivity of the EITC-YDS was 0.632, which was lower than its specificity of 0.875. The fatigue-related scores of the factor-based YDS were fairly correlated with the factor scores of the KNQ estimating the DB (r = 0.349–0.499). Conclusion The 14-item Rasch- and 16-item factor-based YDS may replace the original YDS during YD's primary screening, epidemiological surveys, and health checkups.


Introduction
Yin deficiency (YD) is a pathological condition with diverse symptoms and signs, including emaciation, fatigue, pain, and weakness, especially in the lower limbs; afternoon or night coughs; dry mouth; night sweating; and frequent urination [1]. A previous study reported that the incidence of YD was 23.3% in the elderly group [2]. YD is induced by insufficient yin fluid, including intra- and extracellular fluid, lymphatics, blood, and synovial fluid, and thus, the diminished moisturizing function may secondarily result in heat- or dryness-related symptoms and signs [3]. Some studies have reported that YD is a subtype of the pathological patterns of climacteric syndrome [4], tuberculosis [5], diabetes mellitus [6], and psychiatric disorders [7]. Increased YD was associated with the survival rate in late-stage cancer [8]. Considering the incidence of YD and its broad physiological and pathological spectrum, a questionnaire that can initially screen for the presence or absence of YD will be helpful for clinical trials, epidemiological surveys, and health checkups.
Park et al. developed the 27-item Yin Deficiency Scale (YDS) [9]. The YDS consists of eight factors with a Cronbach's α of 0.885 [9]. Based on receiver operating characteristic (ROC) curve analysis, the predictive accuracy of the YDS estimated by the area under the curve (AUC) was 0.875, and its cutoff value was determined as ten points. Since its development, the 27-item YDS has been widely used to evaluate the clinical severity of YD. YD scores estimated by the YDS were associated with an aggravation of the quality of life [9]. Increased YDS scores were associated with decreased blueness of the face and tongue tip [10,11]. Regarding vocal quality, increased YDS scores were associated with decreased modulation of the fundamental frequency [12]. The YDS scores for the young population with dysfunctional breathing (DB) were higher than those without DB [13]. Although the YDS has been broadly utilized to estimate the clinical severity of YD, its 27 items may demand considerable response time and the ability to complete them [14]. In particular, patients with difficulty with handwriting or cognition may be affected by the length of a questionnaire [14]. Therefore, this study aimed to reduce the number of items in the original YDS using three item-reduction methods: Rasch, equidiscriminatory item-total correlation (EITC), and factor analyses.
Rasch analysis is based on item response theory, in which each item response in the questionnaire is taken as an outcome of the independent interaction between the respondents' abilities and item difficulty [15]. To overcome the limitations of classical test theory, Rasch analysis includes an examination of item hierarchy, fitting error, and differential item functioning (DIF) by sex and age [14,16]. EITC is a modified version of item-total correlation (ITC) [17]. In the EITC, items are discriminated through three percentile points (25%, 50%, and 75%) of the total scores, and correlations between the dichotomous scores of the items and the total scores of the YDS may be calculated within the three percentile categories [18,19]. The third approach is factor-based item reduction. The Korean version of the Nijmegen Questionnaire (KNQ), which assesses DB-related symptoms, comprises four etiological factors [20]. If the number of YDS items is reduced while maintaining the construct of factors and reliability levels, it will be helpful to understand the relationship between the etiology of YD and DB by examining the correlations between the short-form YDS and the factors of the KNQ.
In summary, Rasch analysis minimizes bias due to item hierarchy and DIF by sex and age, whereas EITC and factor analyses reduce item numbers while maintaining the reliability and construct validity of the original questionnaire. By comparing the advantages of these three item-reduction approaches, researchers and clinicians may be able to relieve the burden of time or handwriting of respondents through a short-form questionnaire that suits their purposes. Finally, we calculated the AUC of the three short-form YDS versions using receiver operating characteristic (ROC) curve analysis and compared their predictive accuracies with the 27-item YDS.

2.1. Data Sources. Two datasets were used in the study. One dataset was previously used to develop the original version of YDS [1]. In the previous study, 169 outpatients (39 men aged 42.1 ± 14.7 years; 130 women aged 43.5 ± 14.9 years) from 12 Korean medical clinics completed the 27-item YDS from May to June 2009 [1]. Twelve Korean medical doctors with clinical experience, blinded to the YDS scores, determined the presence or absence of the YD pattern for each outpatient. Another dataset was collected from 237 college students (130 men aged 21.4 ± 1.9 years; 107 women aged 21.4 ± 3.0 years) who had no impediments to daily life caused by psychological or respiratory problems from January to April 2016, and they were asked to complete the KNQ, together with the 27-item YDS [13]. In the two datasets, the YDS items were rated on a 7-point Likert scale: 1 = disagree very strongly; 2 = disagree strongly; 3 = disagree; 4 = neither agree nor disagree; 5 = agree; 6 = agree strongly; 7 = agree very strongly. The items of the KNQ are rated on a 5-point Likert scale: 0 = never; 1 = rarely; 2 = sometimes; 3 = often; 4 = very often. The second dataset did not include information on the clinicians' determination of YD and was used only to examine the relationship between the factor scores of the short-form YDS and the KNQ. The study protocol was approved by the Institutional Review Board of Kyung Hee University Oriental Medical Hospital at Gangdong (approval number: KHNMCOH 2021-02-001).

Rasch Analysis.
Rasch analysis used the partial credit model because the 27-item YDS was answered using polytomous responses on a 7-point Likert scale [15]. The first step in Rasch analysis was to evaluate the appropriateness of the response category. Category probability curves and the ordering of the response categories were examined. If the peak of one curve overlapped with another peak, the response category was excessive, and one of the two response categories was removed [15]. Along with examining category probability curves, the ordering of response categories was examined using step calibration values. Even when the peaks of the probability curves were well separated, a disordered step calibration value (a calibration value that decreased where all others increased) indicated an excessive response category, which then needed to be fused with the adjacent category. Therefore, the examination of the optimal response category was repeated until the separation of probability curves and the ordering of calibration values were satisfied. The second step in Rasch analysis was to examine the DIF. DIF analysis is a measurement of bias and refers to the difference in the probability of providing a certain response between groups [16]. In most cases, age and sex differences result in DIF [16,21]. Therefore, differences in the item responses of the YDS between the sexes and between the older and younger age groups were examined. Rasch modeling assumes that items are weighted according to their difficulty along a linear logistic function, and the mean square (MnSq) levels, chi-square statistics divided by the degrees of freedom, are calculated to examine whether the difficulty of each item fits the linear function [14]. Therefore, the third step was to evaluate the MnSq levels of infit and outfit for each item [18]. Through the evaluation of MnSq levels, misfitting items were deleted, and iterations of fit evaluations were conducted until all items were free of fitting errors [18]. Finally, the unidimensionality and reliability of the Rasch YDS were examined [22,23].
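To make the link between step calibrations and category probability curves concrete, the following minimal Python sketch evaluates the standard partial credit model probabilities for a single item. It is an illustration only, not the Winsteps procedure used in this study: the threshold values are hypothetical, categories are indexed from 0 rather than 1, and the helper names are invented for this example.

```python
import math

def pcm_category_probabilities(theta, thresholds):
    """Return P(X = 0..m) for one item under the partial credit model.

    theta: person measure in logits.
    thresholds: step calibrations delta_1..delta_m in logits (hypothetical values).
    """
    # Cumulative sums of (theta - delta_k); the empty sum for category 0 is 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def modal_categories(thresholds, lo=-6.0, hi=6.0, steps=241):
    """Categories that are the most probable response somewhere on a logit grid."""
    modal = set()
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        probs = pcm_category_probabilities(theta, thresholds)
        modal.add(probs.index(max(probs)))
    return sorted(modal)

ordered = [-2.0, -0.5, 0.5, 2.0]     # hypothetical ascending step calibrations
disordered = [-2.0, 0.5, -0.5, 2.0]  # hypothetical: steps 2 and 3 reversed

print(modal_categories(ordered))     # every category has its own peak
print(modal_categories(disordered))  # category 2 is never the most probable
```

With the disordered thresholds, one category never becomes the most probable response at any person measure, which is the situation that motivates fusing it with an adjacent category.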
2.3. EITC. EITC, a modified ITC, was used to reduce items in the questionnaire [24]. ITC focuses on the correlations between each item's scores and the questionnaire's total scores. The EITC set three cutoff points according to the three percentile levels of the total scores (25%, 50%, and 75%) and transformed the total scores into dichotomous values [18,19]. For example, total scores below and above the cutoff point of 25% were transformed into scores of 0 and 1, respectively.
Similarly, other dichotomous total scores were determined according to the 50% and 75% cutoff points. Thereafter, the EITC was calculated as the correlation between the three sets of dichotomized total scores and the questionnaire item scores. Three sets of tables according to the three percentile categories were rearranged in descending order according to the EITC score. Four or five items with the top-ranked EITC were extracted from the 25% percentile category. The same number of items with the top-ranked EITC as those in the 25% percentile category were extracted from the 50% and 75% percentile categories.
If the same item was on the top list for both the 25% and 50% categories, it was dropped from the list of the 50% group, and the next-ranked item from that group was substituted into the 50% list. Likewise, an item in the 75% list was dropped, and the next-ranked item was substituted, if it was in both the 50% and 75% ranks [18,19]. As it was reported that Cronbach's α > 0.800 is preferable [25], we calculated the minimal number of items needed to guarantee a Cronbach's α of 0.800 using the Spearman-Brown prophecy formula [26]. If the total number of items needed to satisfy this Cronbach's α level is a multiple of three, all top-ranked items may be extracted from the three percentile groups. However, if the total number is not a multiple of three, a multiple of three items exceeding the minimal number suggested by the Spearman-Brown prophecy formula was first extracted from the three percentile groups, and the items with the lowest EITC were removed until the minimum number of items satisfying a Cronbach's α of 0.800 was reached.
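The Spearman-Brown step can be illustrated with a short Python sketch. It assumes that the inputs are the 27 items and the Cronbach's α of 0.885 reported for the original YDS, with a target of 0.800; whether these were the exact inputs used is an assumption, and the function names are invented for this example.

```python
def spearman_brown_length_factor(current_reliability, target_reliability):
    """Factor by which test length must change to reach the target reliability."""
    return (target_reliability * (1.0 - current_reliability)) / (
        current_reliability * (1.0 - target_reliability)
    )

def predicted_reliability(current_reliability, length_factor):
    """Spearman-Brown predicted reliability after scaling test length."""
    return (length_factor * current_reliability) / (
        1.0 + (length_factor - 1.0) * current_reliability
    )

# Assumed inputs: 27 items, alpha = 0.885 (original YDS), target alpha = 0.800.
factor = spearman_brown_length_factor(0.885, 0.800)
items_needed = 27 * factor
print(round(items_needed, 2))                             # about 14.03 items
print(round(predicted_reliability(0.885, 14 / 27), 4))    # about 0.7996
```

Under these assumed inputs the formula gives roughly 14 items, and a 14-item form is predicted to have a reliability that rounds to 0.800.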

Factor Analysis.
The items of the original YDS were previously determined using the contribution scores to YD by 50 Korean medical clinicians who were asked to rate forty-three items on a 7-point Likert scale (1 = no contribution to YD; 7 = greatest contribution to YD) using the Delphi method [27]. Through two iterations of feedback, 30 items with a contribution score over 4.00 were extracted, and the following study finally determined the 27-item YDS satisfying reliability and construct validity [1]. Table 1 lists the final 27 items and mean contribution scores for YD estimated by clinicians [27], and Supplementary Table S1 lists eight factors of the YDS extracted from the 27 items using principal component analysis (PCA) [1]. As shown in Supplementary Table S1, eight factors were associated with the symptoms, lesions, and subtypes of YD. For example, cough, fever, pain, and fatigue were named according to the symptoms of YD, whereas urine and skin factors were named according to the lesions affected by YD. Kidney-liver deficiency is one of the most frequently observed subtypes of YD in clinical cases.
As mentioned earlier, we speculated that a factor-based approach may help examine the relationship between the symptoms, lesions, and subtypes of YD and the clinical severity of the disease. A factor-based approach was implemented using the four-step item-reduction procedure proposed by Smith et al. [28]. This procedure had the advantage of minimizing the loss of reliability while maintaining the construct of factors. In Step 1, Cronbach's α values of all factors were examined, along with whether the values may increase when an item is removed from each factor. In Step 2, we examined whether the decrease in Cronbach's α values may be minimized when removing an item. In Step 3, we examined whether face or content validity was maintained after the items were removed or retained in Steps 1 and 2. If face or content validity collapses, Steps 1 and 2 are revisited, and some items may be retained despite their low reliability. In Step 4, a principal component analysis was conducted to examine whether there were remarkable changes in the construct for the reduced items. However, we omitted Step 3 from the development of the short-form YDS because all items of the original YDS satisfied content validity via experts' contribution scores for the YD. In other words, it was inappropriate to add dropped items to maintain or increase the validity level. Therefore, we conducted factor-based item reduction, where Cronbach's α values of the eight factors of the original YDS were examined, and the items contributing to an increase in the overall Cronbach's α value (Step 1) or contributing to a minimal decrease in the value (Step 2) were removed until two items remained within each factor. Thereafter, a PCA was conducted on the short-form YDS to examine whether there were any changes to the construct of the original YDS (Step 4). Finally, the overall Cronbach's α coefficient of the short-form YDS was compared to that of the 27-item YDS.
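Steps 1 and 2 can be sketched in Python as a repeated "alpha if item deleted" search within one factor. This is a simplified illustration rather than the SPSS procedure used in the study: the item names and scores below are made up, and removing the item whose deletion leaves the highest α is used as a single rule covering both the "increase" and the "minimal decrease" cases.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows):
    """Alpha of the remaining items after dropping each item in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(r) if j != drop] for r in rows])
        for drop in range(k)
    ]

def reduce_factor(rows, item_names, keep=2):
    """Drop items one at a time, always keeping alpha as high as possible."""
    names = list(item_names)
    data = [list(r) for r in rows]
    while len(names) > keep:
        alphas = alpha_if_item_deleted(data)
        drop = alphas.index(max(alphas))  # largest gain or smallest loss
        names.pop(drop)
        data = [[v for j, v in enumerate(r) if j != drop] for r in data]
    return names, cronbach_alpha(data)

# Made-up scores for one hypothetical four-item factor (six respondents).
rows = [
    [1, 1, 2, 4],
    [2, 2, 2, 1],
    [3, 3, 4, 3],
    [4, 4, 4, 2],
    [5, 5, 5, 5],
    [6, 6, 7, 1],
]
kept, alpha = reduce_factor(rows, ["a", "b", "c", "d"])
print(kept, round(alpha, 3))
```

In this toy factor the noisy item "d" is removed first because dropping it raises α, and the loop then stops once two items remain, mirroring the stopping rule described above.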

Statistical Analysis
2.5.1. Rasch Analysis. In examining the response category, the ordering of the item responses was acceptable when the count for each response category exceeded 10, the average measure and step calibration showed an ascending order, and the outfit level of each category was lower than 2.0 [29]. If there was any violation among the item response categories, the category was unified with an adjacent category, and examination of the ordering of all categories was repeated until the violations were corrected. Together with the numerical examination, the overlap of a category curve peak with other curves was examined, and the fusion of two adjacent categories was repeated until all the peaks of the category curves were well separated [30]. DIF was assessed in both sex and age groups. The median value of the participants' age was 41.0 years; thus, those >41 years were assigned to the older group and those <41 years to the younger group. Differences in logits between men and women and between the older and younger groups were examined using the chi-square test [16]. Infit and outfit were assessed, and items with MnSq values of infit or outfit <0.70 or ≥1.40 were removed sequentially [18]. From the item response perspective, unidimensionality denotes that, within the short-form YDS, the second factor may comprise only one item, which helps avoid scoring unrelated dimensions within the reduced items [22]. Unidimensionality was determined when the unexplained variance in the first contrast extracted from the PCA was <2.0 [22]. Person separation and reliability indexes were calculated to examine the reliability level of the reduced items, where a separation index ≥2.0 or a reliability index ≥0.8 was considered a high reliability level [31].

EITC Analysis.
The Spearman-Brown prophecy test was used to determine which item numbers could predict a Cronbach's α coefficient >0.800, as this value is considered a preferable level of reliability [25,26]. After determining the minimal item numbers, each item and the total score were transformed into dichotomous variables according to the three percentile levels. EITCs were calculated using Spearman's rho correlations, and the top-ranked items were rearranged in descending order of EITCs among the three percentile categories. As mentioned, one or two items with lower EITC levels were removed from the item pool with the top-ranked EITC, according to the total number of items calculated using the Spearman-Brown prophecy formula.
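The ranking and de-duplication logic of the EITC can be sketched as follows. This is a rough illustration under several assumptions: only the total score is dichotomized, a Pearson (point-biserial) correlation stands in for the Spearman's rho used in the study, the percentile cut-offs come from Python's `statistics.quantiles`, and the item names and scores are made up.

```python
from statistics import mean, quantiles

def pearson(x, y):
    """Pearson correlation (stand-in for the Spearman's rho used in the study)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def eitc_table(items, totals):
    """Rank items by correlation with the total score dichotomized at 25/50/75%."""
    cuts = quantiles(totals, n=4, method="inclusive")  # three percentile cut-offs
    table = {}
    for pct, cut in zip((25, 50, 75), cuts):
        high = [1 if t > cut else 0 for t in totals]  # 0 = below, 1 = above
        ranked = sorted(
            ((name, pearson(scores, high)) for name, scores in items.items()),
            key=lambda pair: pair[1],
            reverse=True,
        )
        table[pct] = ranked
    return table

def select_items(table, per_category):
    """Take top-ranked items per percentile list, skipping items already chosen."""
    chosen = []
    for pct in (25, 50, 75):
        picked = 0
        for name, _ in table[pct]:
            if picked < per_category and name not in chosen:
                chosen.append(name)
                picked += 1
    return chosen

# Made-up responses of eight respondents to four hypothetical items.
items = {
    "Q1": [1, 1, 2, 2, 3, 3, 4, 4],
    "Q2": [1, 2, 1, 3, 2, 4, 3, 4],
    "Q3": [4, 1, 3, 2, 1, 3, 2, 4],
    "Q4": [1, 1, 1, 2, 2, 3, 4, 4],
}
totals = [sum(values) for values in zip(*items.values())]
table = eitc_table(items, totals)
chosen = select_items(table, per_category=1)
print(chosen)
```

Because an item already selected in a lower percentile list is skipped in the next list, the sketch reproduces the substitution rule in which the next-ranked item takes the place of a duplicate.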

Conduction of PCA.
For the items within the eight factors, we examined which items resulted in a slight increase or minimal decrease in Cronbach's α of each factor [21]. Sequential removal of items was maintained until only two items remained for each factor. Since the "kidney-liver deficiency" factor of the 27-item YDS comprised only two items, the item removal procedure was not conducted for this factor. Through item reduction using the factor-maintaining method, 16 items, including two items in each of the eight factors, were determined for the short-form YDS. A PCA was conducted for the factor-based YDS comprising 16 items to examine whether there were any changes in the construct of the eight factors in the short-form YDS compared with that of the 27-item YDS. Only factors with eigenvalues greater than 1.0 were retained in PCA using the Kaiser criterion. Along with construct changes, we examined whether there were any changes in the overall Cronbach's α level for the 16-item YDS compared to that of the 27-item YDS.

ROC Curve Analysis.
After examining the reliability, construct validity, and dimensionality of the three short-form YDS using Rasch, EITC, and factor-based approaches, ROC curve analyses were conducted to compare their predictive accuracy for YD. In the three ROC curve analyses, the total scores of the short-form YDS served as test variables, and the presence or absence of YD, as determined by 12 clinicians in the previous study, served as the gold standard [1]. The predictive accuracy levels of the three short-form YDS were independently calculated using AUC. It is generally accepted that AUC values >0.9, 0.7-0.9, and 0.5-0.7 indicate high, moderate, and low accuracies, respectively [32]. An optimum cutoff point corresponded to the maximal Youden index (Youden index = sensitivity + specificity − 1) [32].
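The cutoff selection by the maximal Youden index can be sketched in Python as below. The scores and labels are made-up toy data, the decision rule "score at or above the cutoff means YD" is an assumption about direction, and the AUC is computed with the rank-based (Mann-Whitney) definition rather than by the SPSS routine used in the study.

```python
def youden(sensitivity, specificity):
    """Youden index = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

def roc_points(scores, labels):
    """(cutoff, sensitivity, specificity) for the rule 'score >= cutoff is positive'."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        points.append((cutoff, tp / positives, tn / negatives))
    return points

def best_cutoff(scores, labels):
    """Cutoff with the maximal Youden index (first one wins on ties)."""
    return max(roc_points(scores, labels), key=lambda p: youden(p[1], p[2]))

def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up total scores and clinician labels (1 = YD present, 0 = absent).
scores = [3, 5, 6, 8, 9, 10, 11, 12, 14]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1]

cutoff, sens, spec = best_cutoff(scores, labels)
print(cutoff, sens, spec, round(auc(scores, labels), 3))
```

As a check against the reported values, a sensitivity of 0.632 and a specificity of 0.875 give a Youden index of 0.632 + 0.875 − 1 = 0.507.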

Correlation Analysis.
A previous study reported that the 16-item KNQ consisted of four factors: neuropsychological, respiratory, neurogastrointestinal, and neuromuscular [20]. Correlations between the total and factor scores of the KNQ and factor-based YDS were examined using Pearson's correlation coefficient. A correlation coefficient ≥0.8 was considered a "very strong correlation," that of 0.6-0.7 was considered a "moderate correlation," that of 0.3-0.5 was considered a "fair correlation," and that of 0.1-0.2 was considered a "poor correlation" [33]. Correlation and factor analyses, reliability tests, and ROC curve analyses were performed using SPSS version 21 (SPSS Inc., Chicago, IL, USA), while Rasch analysis, including category probability, DIF, fitting error, unidimensionality, and person reliability index, was performed using Winsteps 4.8. Statistical significance was set at P < 0.05.

Rasch Analysis.
The category characteristics of the 27 items are summarized in Table 2. The 7-point category responses were arranged in ascending order, and all outfit MnSq levels were below 2.0. The counts for the seven categories exceeded 10. However, the step calibration value for category 4 was lower than that for category 3, indicating that these two categories were disordered. Figure 1(a) shows the probability curve of Question 1 (Q1: night cough), according to a 7-point Likert scale, where the peak of category 3 was fused with that of category 4. This and the step calibration results indicated that categories 3 and 4 were disordered. Furthermore, the peak of category 5 sank under that of category 6, although the step calibration was slightly increased (0.04). This disordering between categories 3 and 4 and between categories 5 and 6 for Q1 "night cough" was found equally in the probability curves of the other 26 question items. Categories 3 and 4 were fused first because the degree of disordering of categories 3 and 4 was greater than that of categories 5 and 6. The fusion of categories 3 and 4 reduced seven categories to six (Table 2). The probability curve and step calibration analyses were repeated for the six categories. Consequently, the disordering between categories 3 and 4 was corrected. However, categories 4 and 5 among the six categories were still disordered, which corresponded to the disordering of categories 5 and 6 among the seven categories (Figure 1(b)). Because disorder still existed among the six categories, categories 4 and 5 were unified, and a third round of step calibration and probability curve analysis was conducted. Finally, the five categories showed an ascending order of step calibration throughout all categories, and all peaks of the probability curve were well separated (Figure 1(c)). This indicated that five response categories (disagree very strongly, disagree strongly, disagree, agree strongly, and agree very strongly) were suitable for respondents' answers to the YDS. Table 3 presents the DIF results based on sex and age. The logit values for "afternoon fever (Q4)," "night fever (Q7)," "morning fatigue (Q15)," "susceptibility of heat and cold (Q16)," "night hot soles (Q22)," and "sweating during sleep (Q23)" in the older group were higher than those in the younger group, while the logit values for "persistent cough (Q2)," "residual urine (Q13)," "difficulty in containing the urine (Q14)," and "bone steaming (Q21)" in the younger group were higher than those in the older group. Regarding sex differences, the logit value only for "dark yellow urine (Q27)" in women was higher than in men. Therefore, 11 items showing DIF by age or sex were removed, and the remaining 16 were analyzed using Rasch analysis.
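The two category fusions reported earlier in this subsection (categories 3 and 4 of the seven-point scale, then categories 4 and 5 of the resulting six-point scale) amount to a simple recoding of the raw responses. The Python sketch below composes the two steps; the mapping names and the sample responses are illustrative assumptions, not part of the original analysis.

```python
# Step 1: fuse categories 3 and 4 of the 7-point scale (7 -> 6 categories).
SEVEN_TO_SIX = {1: 1, 2: 2, 3: 3, 4: 3, 5: 4, 6: 5, 7: 6}
# Step 2: fuse categories 4 and 5 of the 6-point scale (6 -> 5 categories).
SIX_TO_FIVE = {1: 1, 2: 2, 3: 3, 4: 4, 5: 4, 6: 5}

# Compose both steps into one 7-point to 5-point lookup table.
SEVEN_TO_FIVE = {raw: SIX_TO_FIVE[mid] for raw, mid in SEVEN_TO_SIX.items()}

def recode(responses):
    """Recode raw 7-point responses onto the 5-category scale."""
    return [SEVEN_TO_FIVE[r] for r in responses]

print(SEVEN_TO_FIVE)
print(recode([1, 4, 5, 6, 7]))  # hypothetical responses of one participant
```

Composing the two mappings keeps each fusion step visible, so the recoding can be checked against the order in which the categories were collapsed.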
Table 4 lists the fit levels of the sixteen items without DIF. In the first analysis, "afternoon cough (Q3)" and "wake due to night urination (Q11)" showed fitting errors [18]. Therefore, the second analysis was conducted after removing the two items from the item pool. As a result, the remaining 14 items were free of fitting error, with infit and outfit values ranging from 0.70 to 1.39, and additional fitting analysis was not considered [18]. The raw or overall scores denoting the frequency of responses ranged from 341 points for "night cough (Q1)" to 650 points for "fatigue (Q17)". In the reliability test, the person separation index was 2.19, and the person reliability was 0.83, indicating that the 14 items by Rasch analysis showed a high level of reliability [31]. Table 5 lists the dimensionality results of the 14 items. Unexplained variance in the first contrast was 1.994 (<2.0), implying that the 14 items by Rasch analysis were unidimensional [22]. According to the category response, DIF, fitting error, and dimensionality analyses by Rasch analysis, the 14-item YDS rated on five categories was finally determined.

EITC Analysis.
Table 6 lists the EITC results by three percentile points (25%, 50%, and 75%). In the Spearman-Brown prophecy analysis, 14 items were suggested as the minimal number for guaranteeing a Cronbach's α of 0.800. Therefore, five items with top-ranked EITC values were extracted from each of the three percentile categories. Among the 15 items, "hair loss (Q25)" with the lowest EITC value (r = 0.368) was removed because the purpose of item shortening by EITC was to reduce as many items as possible while retaining a Cronbach's α of 0.800. Finally, 14 items were determined as the short-form YDS by EITC. Cronbach's α for the 14-item YDS by EITC was 0.855.

Factor-Based Analysis.
As mentioned earlier, items that contributed to a slight increase or minimal decrease in Cronbach's α values within a factor were removed item by item until two items remained within each factor. PCA was then conducted to examine the changes in the construct and Cronbach's α of the factors. Table 7 lists the factor loadings of the 16 items and the Cronbach's α values of the factors. The eight factors in the 27-item YDS were reduced to five in the 16-item YDS. The "cough" and "fever" factors of the 27-item YDS were still preserved in the 16-item YDS, while the "pain-weakness" and "fatigue" factors of the 27-item YDS were unified into one in the 16-item YDS. Similarly, the "urine factor" and "skin-hair factor" were unified into one. The Cronbach's α values of the eight factors, each consisting of two items, ranged from 0.282 to 0.818 (Supplementary Table S1). However, the Cronbach's α values for the five factors in the short-form YDS were higher at the lower end, ranging from 0.492 to 0.818. The total percentage of variance in the 16-item YDS by factor-based reduction was 61.61%, and the overall Cronbach's α of the 16 items was 0.828.

Table footnote on response categories: For the 7 points, C1 (category 1), disagree very strongly; C2 (category 2), disagree strongly; C3 (category 3), disagree; C4 (category 4), neither disagree nor agree; C5 (category 5), agree; C6 (category 6), agree strongly; and C7 (category 7), agree very strongly. In the 6 points, C3 (disagree) and C4 (neither disagree nor agree) of the 7 points were unified to C3 (disagree). In the 5 points, C4 (agree) and C5 (agree strongly) of the 6 points were unified to C4 (agree strongly).

ROC Curve Analysis.
Table 8 lists the ROC curve analyses of the three short-form YDS versions by Rasch, EITC, and factor analyses. Supplementary Figure S1 shows maximal Youden points on the ROC curves of the three short-form YDS versions. The previous study reported that the sensitivity, specificity, AUC, and cutoff point of the 27-item YDS were 78.7%, 84.8%, 0.885, and 10 points, respectively [1]. The AUC reflects how well the test distinguishes between YD and non-YD groups [32]. The AUC serves as a single measure summarizing the discriminative ability of a test across the full range of cutoffs, independently of the prevalence of the disease or pathological pattern [34]. In this study, the AUC levels of the 14-item Rasch, 14-item EITC, and 16-item factor-based YDS were 0.812, 0.811, and 0.818, respectively, indicating that the three short-form YDS had moderate accuracy for determining YD.
In examining sensitivity and specificity levels using the maximal Youden index, Rasch and factor-based models revealed similar sensitivity and specificity levels ranging from 0.737 to 0.789. However, for the EITC model, the sensitivity level (0.632) at the maximal Youden index (0.507) was lower than the specificity level (0.875), while the Youden index with similar sensitivity and specificity levels (0.719 and 0.723, respectively) was 0.443, which was lower than the maximal Youden value of 0.507. Figure 2 shows which items overlapped or were separated in the three short-form YDS versions. For example, "frequent urination (Q12)" and "dry and cracked heel (Q18)" were included only in the Rasch model, while "dry mouth (Q5)," "weakness of the lower limbs (Q8)," "night itch (Q19)," and "rough skin (Q26)" were shared by all three short-form YDS versions.

Evidence-Based Complementary and Alternative Medicine

Correlation Analysis.
In the examination of the incidence of YD among 237 college students, 51 students showed a total score on the 27-item YDS over 10 points, and the incidence of YD was 21.5%. Table 9 lists Pearson's correlations between the total and factor scores of the KNQ and the three short-form YDS versions. The total KNQ scores were positively correlated with the three short-form YDS by Rasch (r = 0.564), EITC (r = 0.498), and factor analysis (r = 0.517). The four factor scores of the KNQ also showed fairly positive correlations with the total scores of the three short-form YDS versions (r = 0.352-0.489). Among the five factor scores of the factor-based version, "fever," "cough," "sweating-feet," and "urine-hair" had poor or fairly positive correlations with the total and the factor scores of the KNQ (r = 0.128-0.499).

Discussion
In this study, we developed three short-form YDS versions using the Rasch-, EITC-, and factor-based approaches. The main finding of this study was that the reliability and predictive accuracy of the three short-form YDS versions were comparable to those of the original 27-item YDS. This indicates that the 14-item Rasch and EITC YDS and the 16-item factor-based YDS can be utilized to estimate the severity of YD or determine the presence or absence of YD in clinical cases. However, our results also suggest that caution should be exercised when prioritizing the short-form YDS according to the characteristics of each approach, clinical situation, and study purpose.
Regarding the brevity of the three short-form YDS versions, the item reduction ratio of the Rasch and EITC approaches (13/27, both) was higher than that of the factor-based approach (11/27). Therefore, the short-form YDS by the Rasch and EITC approaches may be prioritized because the two questionnaires may shorten the completion time compared to the short-form YDS by a factor-based approach. In examining the reliability of the three short-form YDS versions, reliability levels estimated by Cronbach's α were preferable or higher. Interestingly, the final Cronbach's α of the EITC YDS was 0.855, which was higher than the value initially predicted by the Spearman-Brown prophecy formula (0.800). Although it was possible to remove further items with lower EITC until Cronbach's α reached 0.800, we did not conduct additional item reduction by EITC because it might lower the predictive accuracy of the EITC YDS. Therefore, according to the Spearman-Brown prophecy formula, we determined the item number of the EITC YDS to be 14. Factor-based approaches are known to reduce the number of items while maintaining factor constructs [28]. By reducing items using the factor approach, an undesirable decrease in reliability within each factor was minimized because the items contributing to a slight increase or minimal decrease in intrafactor reliability were removed first from the factor. The final Cronbach's α for the factor-based YDS was 0.827, indicating a preferable level of reliability [25]. In the Rasch approach, a higher person separation index denotes higher sensitivity in distinguishing between high and low respondents [14,31]. This study's person separation index was 2.19, indicating a high reliability level [31]. In summary, the three approaches to item reduction may not have significantly decreased the reliability of the original YDS.
In examining the predictive accuracy of the three short-form YDS versions, the AUC values of Rasch, EITC, and factor-based approaches were 0.812, 0.811, and 0.818, respectively. These values were considered as having "moderate accuracy" [32], and were similar to the AUC of 0.875 for the original YDS. Therefore, predictive accuracy equivalent to the original YDS may be expected when utilizing the 14-item Rasch and EITC YDS versions and the 16-item factor-based YDS version. However, it should be noted that among the three short-form YDS versions, the EITC YDS showed lower sensitivity (0.632) than specificity (0.875) at the maximal Youden point. One possibility is that the total scores of the 14 items in the EITC YDS had a nonparametric distribution, which may have formed a jagged contour of the ROC curve. On a jagged contour, the increases or decreases in sensitivity and specificity tend to become irregular as the Youden index increases [35]. This means that for the short-form YDS determined by the EITC, lower sensitivity, or a higher false-negative rate, may have been a barrier to the determination of YD using ROC curve analysis. Therefore, considering reliability, predictive accuracy, sensitivity, and specificity simultaneously, this study suggests using the short-form YDS versions derived from the Rasch and factor-based approaches rather than the EITC YDS.
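To make the Youden index (sensitivity + specificity − 1) and the AUC concrete, the following Python sketch shows one way to compute them from total scores and a binary reference diagnosis. It is only an illustration under stated assumptions: the function names and the toy data are invented, a score at or above the cut-off is assumed to mean "YD present", and the AUC is computed from pairwise comparisons rather than with the software used in this study.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is classified as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_youden_cutoff(scores, labels):
    """Return (cutoff, youden, sensitivity, specificity) at the maximal Youden index."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        youden = sens + spec - 1
        # Keep the first (lowest) cut-off if several share the maximum.
        if best is None or youden > best[1]:
            best = (cutoff, youden, sens, spec)
    return best


def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; these are not YDS scores.
    toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
    toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
    print(best_youden_cutoff(toy_scores, toy_labels))
    print(auc(toy_scores, toy_labels))
```

In the toy data, the maximal Youden index is reached at a cut-off where specificity is perfect but one true case is missed. This is the same kind of imbalance between sensitivity and specificity described for the EITC YDS.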
Although both Rasch and factor-based versions showed satisfactory reliability and predictive accuracy, it should be emphasized that the short-form YDS by the Rasch approach had a few advantages over the YDS by the factor-based approach. Rasch analysis clarified the response category of the 27-item YDS by modifying the 7-point response scale of the original version of the YDS to five points. The five-point response category of the short-form YDS had fewer points than that of the short-form Phlegm Pattern Questionnaire, where 6-point categories were finally determined using Rasch analysis [36]. After modifying the response category, 11 items with DIF regarding sex and age distribution and two items with infit or outfit errors were removed from the 27 items of the YDS, and finally, 14 items were determined. Among the fit indices, the outfit index is more sensitive to unexpected responses in items far from the person measure, whereas the infit index is more sensitive to unexpected responses in items close to the person measure [37]. Therefore, the short-form YDS from Rasch analysis may be broadly used in clinical cases, such as health checkups and epidemiological surveys, to minimize bias due to sex, aging, or unexpected responses.
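The difference between the outfit and infit indices can be sketched in a few lines of Python. This is a generic illustration of the standard mean-square definitions, not the Rasch software used in this study; the function name is hypothetical, and the model-expected scores and model variances are assumed to be supplied by a fitted Rasch model.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (outfit, infit) mean squares for one item.

    observed: responses of each person to the item
    expected: model-expected score of each person on the item (assumed given)
    variance: model variance of each response (assumed given, all > 0)
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals, so responses
    # with small model variance (items far from the person) weigh heavily.
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted, so responses with large model variance
    # (items close to the person) dominate.
    infit = sum(squared_residuals) / sum(variance)
    return outfit, infit


if __name__ == "__main__":
    # Toy dichotomous example: the second person was expected to score 0.9
    # but scored 0, an unexpected response far from the item's difficulty.
    print(fit_mean_squares([1, 0], [0.5, 0.9], [0.25, 0.09]))
```

In the toy example the off-target surprise inflates the outfit value more than the infit value, which is why both indices are usually inspected together.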
In addition to the advantages of the Rasch approach, the advantages of the factor-based approach must also be described. This study showed "weak" positive correlations between the "cough" factor scores of the 16-item YDS and the "neuropsychological," "respiratory," and "neurogastrointestinal" factors of the KNQ. This suggests that the etiology of cough in YD may not be closely related to the etiology of dysfunctional breathing. Rather, the "fatigue" factor scores of the YDS had "strong" positive correlations with the scores of the "neuropsychological" and "neurogastrointestinal" factors. This result may not guarantee the causality of YD-related fatigue with neuropsychological or neurogastrointestinal symptoms [13,20]. Correlations between the factor scores of the factor-based YDS and the KNQ suggest that fatigue due to YD needs to be monitored and treated more intensively than other etiological or symptomatic factors of YD in patients with dysfunctional breathing. Therefore, the factor-based YDS may be used exclusively to examine YD's etiological, regional, and symptomatic characteristics in diverse diseases and syndromes.
This study had some limitations. Item reduction by DIF in Rasch analysis is affected by sample characteristics, including environmental or racial differences. Therefore, another item reduction of the YDS by Rasch analysis is needed in other samples to examine the similarity or dissimilarity of the 14-item YDS by Rasch analysis in this study. It should also be mentioned that the dataset used for item reduction in the original YDS was collected from outpatients who visited Korean medical clinics, whereas the dataset used to examine the correlation between YD and dysfunctional breathing was collected from a healthy young population. Therefore, it is necessary to examine the correlation between YD and dysfunctional breathing in the patient group. In the first dataset, there were more women (130 outpatients) than men (39 outpatients), which may have affected the results of the Rasch analysis. Further studies are needed to overcome these limitations regarding sample characteristics, the use of a healthy population, and the unequal numbers of women and men.

Conclusions
This study aimed to develop three short-form YDS versions using Rasch, EITC, and factor-based approaches. Two datasets from previous studies (169 outpatients and 237 healthy college students) were analyzed. As a result, two types of the 14-item YDS were determined by Rasch and EITC analyses. A factor-based analysis suggested a 16-item YDS consisting of five factors. The Rasch analysis suggested a 5-point response category to correct for the disordering of responses. The three item-reduction methods showed moderate predictive accuracy in the ROC curve analysis. However, the sensitivity of the EITC method was lower than that of the other item-reduction methods. Factor scores of the short-form YDS were either weakly or strongly correlated with those of the KNQ. In conclusion, the 14-item Rasch YDS may be utilized to estimate YD's clinical severity or screen for YD in health checkups, primary care, or epidemiological surveys. In contrast, the 16-item factor-based YDS may be utilized to examine the relationship between the etiological factors of YD and other diseases.

Table 1: The 27-item Yin Deficiency Scale.

P values in bold indicate significant differences in logit values between sexes or higher and lower age groups.

summation of five top-ranked items among three percentile categories. All Spearman correlations had P

Figure 2: Items overlapped or separated by the three item-reduction methods. EITC, equidiscriminatory item-total correlation.

Table 3: Differential item functioning results by sex and age.

Table 4: Item difficulty and fitting levels of the short-form Yin Deficiency Scale using Rasch analysis.

Table 5: Dimensionality results of the 14-item Yin Deficiency Scale by Rasch analysis.

Table 6: Equidiscriminatory item-total correlation results by three percentile points.

Table 7: Factor loadings of sixteen items and Cronbach's α values of the factors. Bold letters indicate maximal factor loadings among the five factors. Factor 1, fever factor; factor 2, cough factor; factor 3, fatigue-pain-weakness factor; factor 4, sweating-feet factor; factor 5, urine-hair factor. Final overall Cronbach's α of the 16 items was 0.828.

Table 8: ROC curve analyses of the three short-form YDS by Rasch, EITC, and factor-based approaches. YDS, Yin Deficiency Scale; EITC, equidiscriminatory item-total correlation. The values in bold indicate maximum Youden indexes (sensitivity + specificity − 1) obtained using ROC curve analyses and cut-off points corresponding to the Youden indexes.

Table 9: Correlations between the total and factor scores of the KNQ and the three short-form YDS versions. KNQ, the Korean version of the Nijmegen Questionnaire; YDS, Yin Deficiency Scale; EITC, equidiscriminatory item-total correlation.