Introduction
In Colombia, non-heterosexual groups are increasingly visible, and people's physical and mental health is affected when they perceive themselves to be stigmatised and discriminated against. Understanding the attitudes of other groups towards these people therefore requires valid and reliable measuring instruments that quantify different types of prejudice.1,2
The Homophobia Scale (HS-7) is a seven-item, Likert-type questionnaire for quantifying one prejudice, i.e. a negative attitude towards homosexual people.3 The HS-7 is one of the instruments available for quantifying people's attitudes towards homosexuality, and one of its main attributes is the small number of items.3 Despite its small size, it shows good psychometric performance, with high internal consistency and adequate types of validity.4,5 The fact that the HS-7 is so quick to complete may explain why it is used so frequently in research involving higher education students around the world.6-8
The performance of the HS-7 in medical students in Colombia has been presented in three previous articles. In the first study published, 199 medical students from the first to fifth semesters of a university in Bogotá, Colombia, participated and the scale was reported to have a high internal consistency (α = 0.78 and Ω = 0.79), adequate convergent validity (r = 0.84 with the scale for attitude towards lesbians and gay men [ATLG]), acceptable discriminant validity (r = -0.06 with the General Well-Being Index [WHO-5]), poor nomological validity (r = 0.19 with the short Francis scale for religiosity [Francis-5]) and one single domain or factor accounting for 44.7% of the variance.9
In the second study, 124 students participated, in this case from the sixth to the tenth semester at the same university in Bogotá, and the investigators found adequate internal consistency (α = 0.81 and Ω = 0.82), high convergent validity (r = 0.82 with ATLG), optimal discriminant validity (r = -0.03 with WHO-5), poor nomological validity (r = 0.19 with Francis-5; with no significant differences in the scores between men and women, when higher scores were expected from the men) and one single factor was retained that accounted for 49.2% of the total variance observed.10
Lastly, in the third study, 366 students from the first to the ninth semesters of a university in Bucaramanga participated; the findings were acceptable internal consistency (α = 0.78 and Ω = 0.79), good convergent validity (r = 0.82 with ATLG), optimal discriminant validity (r = 0.03 with WHO-5), inconsistent nomological validity (r = 0.16 with Francis-5, lower than expected, and with significant differences between men and women, higher amongst males, as is usual with most prejudices) and one single domain that accounted for 43.8% of the variance was retained. Also in that study, an additional validation test was performed, based on item-response theory: differential item functioning (DIF) was reported by gender, and no significant differences were found between males and females in any of the seven items of the scale.11
Factor analysis is usually related to the construct validity of a scale.12 However, it should be borne in mind that all known and calculated forms of validity contribute to construct validity, that is, to the practical and objective utility of a theoretical concept.12,13 The investigations previously reviewed showed that the variance explained by the factor solution was close to the desired 50% in only one of the analyses. It was also found that items 2, 4 and 6 showed poor individual performance, with low corrected Pearson's correlations and low communalities.9-11
Given that factor analysis is the most appropriate strategy for the review and refinement of measurement scales, an analysis was performed in this study to observe the performance of the HS-7 and to refine the scale after removing items with poor performance in the previous studies.12 It was assumed that the validation of measuring instruments is a continuous process that requires constant revision and adaptation with the adequate use of different statistical tests.12,13 In addition, in this secondary analysis a confirmatory factor analysis (CFA), which was omitted in the preceding articles, was carried out to support the interpretation of the findings.
The objective of this analysis is to review the psychometric functioning and to refine the content of the HS-7 in medical students at two universities in Colombia.
Material and methods
We conducted an observational, analytical validation study within the context of a larger study that explored the psychometric performance of various scales in medical students. For this study, the standards for health research in Colombia were followed; an institutional ethics committee reviewed and approved the research project and the consent of the research participants was obtained once they had been informed about the objectives and the respect for privacy and confidentiality of the data provided.14
A total of 667 first to tenth semester medical students from two universities, one in Bogotá and another in Bucaramanga, participated in this study. The participating population was aged from 18 to 34 (mean, 20.9 ± 2.7) years. With regard to gender, 60.6% were women. This group represents 96.8% of the samples participating in the three studies mentioned above, as we excluded 22 (3.2%) participants who did not complete the Zung Self-Rating Anxiety Scale-Short Form (ZSAS-SF), which was included in the present analysis and showed better internal consistency than WHO-5, used in the other analyses for discriminant validity.9-11 Participants completed the questionnaire in the classroom in the presence of a research assistant, who presented the study objectives, requested voluntary participation and gave instructions on how to complete the research questionnaire. The questionnaire did not ask for the person's name, with the aim being that completing it anonymously would encourage them to answer as honestly as possible. It asked only for basic demographic information and included the ZSAS-SF,15 the scale for attitude towards gay men (ATG),16 and the HS-7.3
The ZSAS-SF is a brief self-administered questionnaire that consists of five questions that investigate symptoms such as nervousness, fear for no apparent reason, muscular pains, easy fatigue and feeling dizzy in the period covering the last 30 days. The scale provides four response options ranging from never to always. Each response is rated 1-4, which gives a range of possible total scores of 5-20; a higher score indicates more and greater anxiety symptoms that may be of clinical importance. This scale has been used in different research projects in Colombia and shows high internal consistency.15
The ATG is a ten-item scale that explores attitudes towards homosexual men in relation to different topics such as adoption, marriage, work and other general impressions. The instrument uses a Likert-type (polytomous) response pattern with five response options from "strongly disagree" to "strongly agree". Each response is given a score of 0-4, for a possible total of 0-40. The higher the score, the worse the attitude towards gay men, with more extreme prejudice or homophobia.16 The Spanish version of the scale shows good psychometric performance.9,17,18
Cronbach's alpha (α)19 and McDonald's omega (Ω)20 were calculated as reliability indicators. To determine the convergent and discriminant validity, Pearson's correlation (r) was calculated.21 For convergent validity, the total scores on the ATG and HS-4 were correlated, and for the discriminant validity, the total scores on the ZSAS-SF and HS-4.
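As an illustration of the reliability and correlation coefficients just described, the following is a minimal sketch in Python (standard library only). It is not the SPSS procedure used in the study; the function names and the small response matrix are hypothetical, and McDonald's omega is left out because it requires the loadings of a fitted factor model.

```python
import math
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson_r(x, y):
    """Pearson's product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical responses (5 respondents x 4 items), NOT data from the study.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
totals = [sum(row) for row in responses]
```

The same `pearson_r` function would be applied to pairs of total scores (for example, one scale's totals against another's) to examine convergent or discriminant validity.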
For the estimation or verification of the nomological validity of the HS-4, the means ± standard deviation of the male and female scores were compared using the Student's t test (significantly higher scores were expected from men than from women).
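The comparison of two group means with Student's t test can be sketched as follows, assuming equal variances (pooled estimate). The function name and the two short score lists are hypothetical and are not study data; the p value is not computed here because the t distribution is not part of the Python standard library.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Student's t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical total scores, for illustration only.
men = [12, 9, 14, 10, 8, 11]
women = [9, 8, 11, 7, 10, 9]
t_stat, df = pooled_t(men, women)
```

A positive t statistic here corresponds to a higher mean in the first group, which is the direction expected for men in the nomological hypothesis.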
Lastly, in order to corroborate the dimensionality of HS-4 and HS-7, a factor analysis was carried out by the maximum likelihood method, the communalities were observed and the Kaiser-Meyer-Olkin (KMO) coefficient22 and Bartlett's test of sphericity of the sample were calculated.23 For the KMO coefficient, a value >0.600 was expected and for the Bartlett's test, probability <0.05.24
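For readers unfamiliar with Bartlett's test of sphericity, the sketch below computes its statistic from a correlation matrix using the usual formula, χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where n is the sample size and p the number of items. This is an assumption-laden illustration, not the SPSS routine used in the study: the function names and the example matrix are hypothetical, and the KMO coefficient is not covered.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(corr, n):
    """Bartlett's chi-square statistic and degrees of freedom for a correlation matrix."""
    p = len(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    return statistic, p * (p - 1) // 2

# Hypothetical 3-item correlation matrix; n = 667 matches the sample size in the text.
example_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
chi2_value, chi2_df = bartlett_sphericity(example_corr, 667)
```

A large statistic relative to its degrees of freedom indicates that the correlation matrix departs from an identity matrix, which is the precondition for factor analysis that the test is meant to check.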
CFA was performed to confirm the factor structure previously determined in the exploratory factor analysis (EFA). In order to evaluate the fit of the models in HS-7 and HS-4, the χ² test was determined with the respective degrees of freedom (df) and probability value (p), root mean square error of approximation (RMSEA), with a 90% confidence interval (90% CI) as is customary, the comparative fit index (CFI) and the Tucker-Lewis index (TLI). For χ² the probability value was expected to be >0.05; for RMSEA, <0.06, and for CFI and TLI, values >0.89. Most of this analysis was performed with the SPSS 16.0 statistical package,25 while the CFA was completed with the Mplus 7.21 software.26
Results
The HS-7 showed α = 0.793 and Ω = 0.796, and a main factor that explained 45.2% of the total variance. The CFA showed χ² = 139.756; df = 13; p < 0.01; RMSEA = 0.121; 90% CI, 0.103-0.139; CFI = 0.953; TLI = 0.923.
Given that these findings indicated the removal of three items, the performance of a four-item version (HS-4) was tested. The HS-4 showed α = 0.770 and Ω = 0.775, with one single factor accounting for 59.7% of the total variance (χ² = 3.622; df = 1; p = 0.057; RMSEA = 0.063; 90% CI, 0.000-0.130; CFI = 0.998; TLI = 0.991). The details on communalities and coefficients of the items in the factor analysis are shown in Table 1.
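The RMSEA point estimates reported above can be checked approximately from the chi-square value, its degrees of freedom and the sample size, using the common formula RMSEA = sqrt(max(χ² - df, 0) / (df (N - 1))). The sketch below is an illustration under that assumption; some programs divide by N rather than N - 1, which gives the same values to three decimals here. The function name is hypothetical, and the inputs are the figures reported in the text with N = 667.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of the root mean square error of approximation."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Chi-square, df and sample size as reported in the text.
hs7_rmsea = rmsea(139.756, 13, 667)
hs4_rmsea = rmsea(3.622, 1, 667)

print(round(hs7_rmsea, 3))  # 0.121
print(round(hs4_rmsea, 3))  # 0.063
```

Both values agree with the reported estimates; judged against the <0.06 criterion stated in the methods, the HS-7 model is clearly above the threshold and the HS-4 model lies just above it.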
For the nomological validity of HS-4, we compared the mean ± standard deviation of the male and female scores (10.1 ± 3.7 vs. 9.3 ± 3.6). The difference was statistically significant (Levene's test for homogeneity of variance, F = 0.004, p = 0.949; t = 2.499, df = 665, p = 0.013, two-tailed).
The convergent validity of HS-4 with ATG (α = 0.821) showed high correlation (r = 0.778, p < 0.001). The discriminant validity with ZSAS-SF (α = 0.789) showed a very poor correlation (r = -0.047; p = 0.223).
Discussion
This research shows that HS-7 and HS-4 are scales with high reliability and adequate validity. However, the shorter version accounts for a higher percentage of the total variance, with better indicators in the CFA.
Instruments need to be validated to improve the measurements of constructs, both in the clinical context and in research work.12,27,28 The tendency at the moment is to continually revise the scales already available, with careful evaluation of the performance of individual items and removal of those with poor indicators, in order to reduce the number of items on the scales without losing reliability, validity and practical utility.12,13,27,29,30
There are a number of advantages to short-form instruments in clinical and epidemiological studies. First, from the psychometric perspective, they preserve the essential or structural aspects of the construct, which usually converge in the factor or dimension that explains the highest percentage of the total variance.24,31 Second, shorter versions reduce the possibility of overestimating reliability and internal consistency, because that type of coefficient is sensitive to the number of items: the greater the number of items, the greater the internal consistency, even with a significant reduction in the intercorrelations among items, which to a large extent indicate whether the different items quantify the same construct.32,33 Third and last, short forms are more practical. A measuring instrument should be practical both to administer and to score and interpret. Short scales reduce the time required for completion, with less chance of respondents developing fatigue or boredom, as can occur with long questionnaires, thus providing further assurance of the validity and reliability of the measurement.27
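The sensitivity of internal consistency to scale length can be illustrated with the standardised form of Cronbach's alpha, α = k r̄ / (1 + (k - 1) r̄), where k is the number of items and r̄ the mean inter-item correlation. The sketch below is only an illustration: the function name is hypothetical and the correlation values are invented, not taken from the study.

```python
def standardized_alpha(k, mean_r):
    """Standardised Cronbach's alpha from item count and mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Same mean inter-item correlation, increasing number of items.
for k in (4, 7, 20):
    print(k, round(standardized_alpha(k, 0.30), 3))

# A longer scale with WEAKER inter-item correlations can still show a higher alpha.
short_scale = standardized_alpha(4, 0.30)   # 4 items, mean r = 0.30
long_scale = standardized_alpha(20, 0.20)   # 20 items, mean r = 0.20
```

With these invented values the 20-item scale reaches a higher alpha than the 4-item scale despite its lower mean inter-item correlation, which is the overestimation risk described in the preceding paragraph.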
Having instruments such as the HS-4 with good psychometric performance in medical students is necessary, given the high frequency of sexual prejudice in this group of people.34 It will allow research to be carried out to help identify the scale of the problem, so that the necessary appropriate measures can be taken to reduce the negative impact of sexual prejudices in the medical profession during the training process and while practising.35
We are able to conclude that in medical students from two cities in Colombia, the HS-4 showed high internal consistency, good convergent validity, adequate discriminant validity, excellent nomological validity and one dimension that explains more than 50% of the total variance, with better indicators in the CFA fit than the HS-7. Further research is needed to show the psychometric performance of the HS-4 and confirm these initial observations, which must be considered as preliminary.