Introduction
Chronic heart failure (CHF) affects over 5.8 million people in the USA and over 23 million worldwide[1]. In Latin America, the estimated prevalence of heart failure (HF) is 1% (95% CI, 0.1 to 2.7%), with an incidence of 199 cases per 100,000 person-years[2]. Hospital readmission rates of patients with HF are 33%, 28%, 31%, and 35% at 3, 6, 12, and 24 to 60 months of follow-up, respectively (median duration of hospitalization of 7.0 days), with a 1-year mortality rate of 24.5% (95% CI, 19.4 to 30.0%) and in-hospital mortality of 11.7% (95% CI, 10.4 to 13.0%)[2]. Although mortality in CHF has decreased, the estimated survival rate is 50% at five years and 10% at ten years after diagnosis[3].
The burden of CHF includes several limitations on patients' daily life activities, and the disease affects health-related quality of life (HRQoL) more severely than other chronic diseases[4]. Several studies have shown that the HRQoL of patients with HF is worse than that of the general population or of patients with other chronic diseases[4,5]. Furthermore, the decline in quality of life of HF patients is not temporary, but progressive over time[6]. Nevertheless, measuring HRQoL in HF remains a challenge: despite the existence of several generic and disease-specific instruments for assessing HRQoL, no consensus has been reached on which instrument is most suitable[7].
The Minnesota Living with Heart Failure Questionnaire (MLHF-Q) is a disease-specific instrument consisting of 21 items that address a wide range of HRQoL aspects, and it is the most frequently used instrument internationally. Since 1987, the MLHF-Q has been translated into more than 30 languages, including Spanish[8-13], and it has been used as an outcome measure in multiple clinical trials, showing the best psychometric properties in terms of validity, reliability, and sensitivity to change[14-17].
Even though Spanish is spoken by 95% of Latin America's population, Brazil, where Portuguese is spoken, is the only country in the region where the MLHF-Q has been validated[18], and no data are available in Colombia on the reliability and validity of the MLHF-Q. Therefore, we aimed to evaluate the internal consistency and construct validity of the MLHF-Q in patients with CHF in Colombia.
Methods
Study population
A cross-sectional study was conducted between February and October 2015 in the Heart Failure and Heart Transplant Clinic of the Cardiovascular Foundation in Floridablanca, Santander, Colombia. We included patients if they (i) were 18 years old or older and (ii) had an HF diagnosis confirmed by a cardiologist. Patients with mental disorders or communication limitations were excluded. All patients gave written informed consent, and the Research Ethics Committee of the institution approved the research protocol.
In calculating our sample size, care was taken to comply with the criterion of 10 patients per analyzed item, which is considered adequate for factorial analysis[19]. The sample was selected in a non-probabilistic way: all patients were invited consecutively to participate by a previously trained nurse, who conducted the interviews at medical control appointments.
Clinical screening
HRQoL was measured with the MLHF-Q[11], a disease-specific self-report instrument for CHF patients. The questionnaire comprises 21 items graded by the patient on a 6-point Likert-type scale ranging from 0 (no impairment) to 5 (very much impairment). The MLHF-Q yields three scores: a physical dimension (8 items), an emotional dimension (5 items), and an overall HRQoL score (21 items). The remaining eight items, which do not assess a single construct or dimension of HRQoL, measure social and economic impairments of patients with HF and contribute to the overall score. The total score ranges from 0 to 105 points, the physical dimension from 0 to 40, the emotional dimension from 0 to 25, and the separate items on socio-economic impairments from 0 to 40. Higher scores on the MLHF-Q indicate more impaired HRQoL. The MLHF-Q has a global internal consistency, measured by Cronbach's alpha, of 0.94 (95% CI, 0.91 to 0.95) and a general intraclass correlation coefficient of 0.84, characteristics that make it suitable for use[17].
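The scoring rules described above can be sketched as follows. This is an illustrative sketch, not the authors' scoring code: the item numbers assigned to the physical and emotional subscales follow the commonly cited original MLHF-Q scoring and are an assumption here, since this text does not list them, and the function name `score_mlhfq` is hypothetical.

```python
# Assumed item-to-subscale assignment (commonly cited original scoring;
# not listed in this text).
PHYSICAL_ITEMS = (2, 3, 4, 5, 6, 7, 12, 13)   # 8 items, subscale range 0-40
EMOTIONAL_ITEMS = (17, 18, 19, 20, 21)        # 5 items, subscale range 0-25


def score_mlhfq(responses):
    """Sum MLHF-Q ratings into subscale and total scores.

    responses: dict mapping item number (1..21) to a rating from 0 to 5.
    """
    if set(responses) != set(range(1, 22)):
        raise ValueError("responses must contain exactly items 1 to 21")
    if any(rating not in range(6) for rating in responses.values()):
        raise ValueError("each rating must be an integer from 0 to 5")

    physical = sum(responses[i] for i in PHYSICAL_ITEMS)
    emotional = sum(responses[i] for i in EMOTIONAL_ITEMS)
    total = sum(responses.values())
    return {
        "physical": physical,
        "emotional": emotional,
        # The eight remaining single items (socio-economic impairments).
        "other": total - physical - emotional,
        "total": total,
    }
```

A patient answering 5 on every item reaches the maximum of each range given above (40, 25, 40, and 105).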
Statistical analysis
Continuous variables are reported as median and quartiles (Q) unless stated otherwise, and categorical variables are presented as percentages. Internal consistency was evaluated through Cronbach's alpha coefficient[20]. The Kaiser-Meyer-Olkin (KMO) index and Bartlett's test of sphericity were estimated to establish the pertinence of factorial analysis. A KMO ≥ 0.7 was considered acceptable[21,22].
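For reference, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements that textbook formula in plain Python; it is not the Stata routine used in the study, and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from a list of respondents' item scores.

    rows: list of equal-length lists, one list of item scores per respondent.
    """
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in rows):
        raise ValueError("all respondents must answer the same items")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of the summed score across respondents.
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

When all items move together perfectly across respondents, the function returns 1.0; weaker agreement between items lowers the value.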
To evaluate the construct validity of the questionnaire, two different approaches were used. First, the structure of the model originally proposed by Rector and Cohn[8] was examined by means of confirmatory factorial principal component analysis (PCA). The dimensional structure was identified through varimax orthogonal rotation, and factor loadings ≥ 0.4 were considered acceptable[13,23]. Second, a polytomous Rasch rating scale model was used to assess each specific questionnaire dimension according to the factorial structure proposed in the literature[24]. The first step was to evaluate the functioning of the rating scale categories; a clearly progressive level of difficulty across item categories was expected as a criterion of adequate functioning. We also examined the standardized (ZSTD) fit statistics of persons, for which values within ±3 were expected.
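To make the rating scale model concrete, the sketch below computes the category probabilities for one person and one item under the standard Andrich formulation, in which all items share the same set of thresholds. The parameter names (`theta` for person measure, `delta` for item difficulty, `thresholds` for the shared step parameters) and the example values are illustrative assumptions, not estimates from this study, which were obtained with Winsteps.

```python
import math


def rsm_category_probabilities(theta, delta, thresholds):
    """Category probabilities under the Rasch rating scale model.

    theta: person measure in logits.
    delta: item difficulty in logits.
    thresholds: step parameters tau_1..tau_M shared by all items, so the
        item has M + 1 response categories (0..M).
    """
    # Log-numerator of category k is the sum over j <= k of
    # (theta - delta - tau_j); category 0 has an empty sum, i.e. 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta - tau))

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical 6-category item (ratings 0..5) with ordered thresholds.
probabilities = rsm_category_probabilities(0.5, 0.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
```

With ordered thresholds, each category is the most probable one over some interval of the trait, which is the "progressive level of difficulty" the analysis checks for.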
To evaluate dimensionality, which is a fundamental requirement for construct validity, we applied the following criteria: (i) the information-weighted mean square statistic (infit) and the outlier-sensitive mean square statistic (outfit), for which values between 0.7 and 1.3 indicate a good fit; and (ii) PCA of the residuals[25]. Unidimensionality was considered violated if, besides the first factor, other factors had eigenvalues >3. Local dependency was assessed through the item residual correlations, where values >0.5 may indicate that the response to one item is determined by another. To detect differential item functioning (DIF), which occurs when groups within the sample with the same level of the trait respond differently to an individual item, we compared item difficulty by sex and age group (≤65 vs. >65 years). A statistically significant Welch's t test (p<0.05) together with a difficulty difference ≥0.5 logits was considered evidence of uniform DIF.
Finally, the discriminative capacity of the questionnaire was assessed by its ability to differentiate among subgroups of patients with different levels of CHF severity, under the following hypothesis: women, older patients, patients in a higher New York Heart Association (NYHA) functional class, and those with a Left Ventricular Ejection Fraction (LVEF) under 45% would have higher MLHF-Q scores. Subgroups were compared using the Mann-Whitney U test. All statistical tests were two-sided, and a p-value <0.05 was considered significant. Data were analyzed using Stata Statistical Software, version 14, and Winsteps 3.80.0.
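As a rough illustration of the fit criterion, the sketch below computes infit and outfit mean squares for a single item from observed responses, model-expected scores, and model variances, using the usual Rasch definitions. The inputs are hypothetical; in the study these quantities were produced by Winsteps, and the function names are assumptions made for this example.

```python
def fit_mean_squares(observed, expected, variance):
    """Infit and outfit mean squares for one item.

    observed: responses of each person to the item.
    expected: model-expected score of each person on the item.
    variance: model variance of each person's response to the item.
    """
    if not observed or not (len(observed) == len(expected) == len(variance)):
        raise ValueError("inputs must be non-empty and of equal length")

    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals, so
    # unexpected responses far from a person's level weigh heavily.
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted, i.e. squared residuals summed and
    # divided by the summed model variances.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def within_fit_range(mean_square, low=0.7, high=1.3):
    """Apply the 0.7 to 1.3 acceptance range used in this study."""
    return low <= mean_square <= high
```

Both statistics have an expected value of 1 when the data fit the model; values above the range suggest noise, and values below it suggest responses that are more predictable than the model expects.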
Results
Characteristics of the study population
The proportion of missing data was 0%. During the recruitment period, two hundred CHF patients fulfilled the selection criteria, agreed to participate, and completed the questionnaire. The median age of participants was 64 years (Q1=53; Q3=73), 63.0% were men, 79.5% had an LVEF ≤ 45%, and 24.0% of subjects were in NYHA functional class III-IV. Sociodemographic and clinical characteristics of the study population are shown in Table 1.
Psychometric analysis
Internal reliability
Cronbach's alpha coefficients ranged from 0.73 (social dimension) to 0.91 (physical dimension and total score) in the MLHF-Q, indicating a satisfactory level of internal consistency. Descriptive analysis and internal consistency of the MLHF-Q are shown in Table 2 and Supplementary Material Table S1.
Construct validity
The KMO statistic was 0.90, indicating sampling adequacy (Supplementary Material Table S2), and Bartlett's test of sphericity was statistically significant (chi2(210)=2126.20; p<0.001), suggesting that the data were appropriate for factorial analysis[22]. All items in the first factor were associated with signs and symptoms of HF; this factor was identified as the physical dimension. The second factor included four of the five items from the original questionnaire that relate to the patient's psychological response to the disease; this factor was recognized as the emotional dimension. Finally, three items in the third factor were correlated with the patient's social relationships, so this factor was named the social dimension. The confirmatory factorial PCA with three factors explained 54.03% of the total variation in the study population, of which 30.6% was explained by the first factor, 15.8% by the second, and 7.6% by the third. The eigenvalue was 6.43 for the physical, 3.31 for the emotional, and 1.59 for the social dimension (Supplementary Material Table S3).
Table 3 shows the factorial analysis. Five of the 21 items had factor loadings between 0.4 and 0.6, four between 0.6 and 0.7, six between 0.7 and 0.8, and four >0.8; two items (14 and 16) did not load adequately (loading <0.4). Table 3 also distinguishes the physical, emotional, and social dimensions. Items 1, 8, 9, and 10 (social) and item 20 (emotional) were reclassified into the physical dimension. Conversely, the emotional dimension is preserved almost entirely, except for item 20. Finally, item 8 could belong to both the physical and the social dimension.
Source: authors. a: Single items used in the construction of the overall score (social dimension); b: Physical dimension; c: Emotional dimension.
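The reported figures can be cross-checked with simple arithmetic. The sketch below assumes that the PCA was run on the correlation matrix of the 21 items (not stated explicitly in this text), in which case each component's share of the variance is its eigenvalue divided by the number of items; it also recomputes the degrees of freedom of Bartlett's test, p(p-1)/2. The variable and function names are illustrative.

```python
N_ITEMS = 21
# Eigenvalues as reported in the text for the three retained factors.
EIGENVALUES = {"physical": 6.43, "emotional": 3.31, "social": 1.59}


def percent_variance(eigenvalue, n_items=N_ITEMS):
    """Share of total variance for one component of a correlation-matrix PCA.

    With standardized items the total variance equals the number of items.
    """
    return 100.0 * eigenvalue / n_items


def bartlett_degrees_of_freedom(n_items=N_ITEMS):
    """Degrees of freedom of Bartlett's test of sphericity: p(p-1)/2."""
    return n_items * (n_items - 1) // 2


shares = {name: round(percent_variance(value), 1) for name, value in EIGENVALUES.items()}
# shares -> {'physical': 30.6, 'emotional': 15.8, 'social': 7.6}
```

The three shares reproduce the reported 30.6%, 15.8%, and 7.6%, and the degrees of freedom come out at 210 as reported. Summing the rounded eigenvalues gives about 53.95% rather than the reported 54.03%; the small gap is presumably due to rounding of the published eigenvalues.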
Regarding the Rasch analysis of the total score, the average measures of the MLHF-Q rating scale were ordered, progressing from -0.84 logits for rating scale category zero (no impairment) to 0.27 logits for rating scale category five (very much impairment). Disordered thresholds (response categories not working logically) were corrected by combining adjacent categories (Supplementary Material Figure S1); the result was a 3-point scale that met the criteria for rating scale functioning. Eight persons had a ZSTD exceeding the expected value and were excluded from the analysis.
The person separation was 2.56 and the person reliability 0.87; for items, these statistics were 5.00 and 0.96, respectively. Items 15, 16, 20, 14, and 10 showed fit statistics (outfit, infit) outside the range established for the analysis; the statistics are shown in Table 4. In the PCA of residuals, the 21 items and 192 persons explained 42.0% of the raw variance. The first contrast had an eigenvalue of 3.3, with residual correlations higher than 0.50. Items 1, 15, 16, and 17 had a difference in difficulty >0.5 logits between age groups, with Welch p-values under 0.05. There was no evidence of uniform DIF by sex. After removing the five misfitting items, the overall fit of the data improved, with 47.7% of raw variance explained and only two items (6 and 11) keeping fit statistics above the range.
Source: authors. a: 5 levels of severity (higher values indicate higher severity). b: Expected range 0.7 - 1.3. MLHF-Q=Minnesota Living with Heart Failure Questionnaire; SE=Standard error; MNSQ=Mean square fit statistic; Infit=Inlier-sensitive fit; Outfit=Outlier-sensitive fit.
In the analysis of the physical, emotional, and social dimensions, no disordering of the rating scale was observed, so all analyses were made with the original MLHF-Q coding. The items of the social dimension explained 44.8% of the variance and had an eigenvalue of 1.7 in the first contrast; the residuals did not show any important correlation. Item 8 (working to earn a living difficult) of the social dimension had a fit value slightly below the expected range. The eight items of the physical dimension had an eigenvalue of 2 in the first contrast and explained 57.4% of the raw variance; two of the items were above and one item was below the expected range, as shown in Table 4.
In the emotional dimension, one item presented a severe misfit (Table 4); the variance explained by these items was 54.0%, with an eigenvalue of 1.7 in the first contrast, and a correlation of -0.51 between items 19 and 20 was found. After eliminating item 20 and analyzing the remaining four items, all statistics were within the expected values (Supplementary Material Table S4) and the variance explained improved (Supplementary Material Table S5). No uniform DIF by sex or age group was detected in any dimension. Wright maps are presented for each dimension evaluated (Supplementary Material Figure S2).
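The separation and reliability values reported above are linked by the standard Rasch relation R = G^2 / (1 + G^2), where G is the separation index. The short sketch below applies that relation to the reported separations as a consistency check; the function name is hypothetical.

```python
def reliability_from_separation(separation):
    """Convert a Rasch separation index G into reliability: G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


person_reliability = round(reliability_from_separation(2.56), 2)  # 0.87
item_reliability = round(reliability_from_separation(5.00), 2)    # 0.96
```

Both results agree with the reliabilities reported in the text (0.87 for persons and 0.96 for items).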
Contrast validity
Discriminative capacity of the MLHF-Q subscales and of the overall score was observed for age and NYHA functional class (p<0.05). Worse HRQoL was observed among women than among men in the emotional dimension (p=0.047). Although higher HRQoL impairment was evident in patients with LVEF <45% compared with those with LVEF >45%, the difference was not statistically significant (Supplementary Material Table S6).
Discussion
To the best of our knowledge, the present work is the first study that has assessed the psychometric properties of the MLHF-Q in a Spanish-speaking population of Latin America. We have evaluated the internal consistency, construct validity through the two methods (PCA and Rasch analysis), and the discriminative capacity of the MLHF-Q in outpatients with CHF in Colombia.
Interpretation of findings
The reliability of the subscales and of the overall MLHF-Q, measured by Cronbach's alpha, was acceptable to excellent, with coefficients similar to those in other populations: Australia, France, Hungary, Yugoslavia (physical dimension α=0.91); Hungary, Poland, Sweden (emotional dimension α=0.80); Israel, Italy (social dimension α=0.73); and Denmark, Spain, Yugoslavia (total score α=0.91)[23,26].
Regarding construct validity, we found that the three factors explain 47.7% and 54.03% of the overall score in the Rasch analysis and the PCA, respectively, similar to results previously reported by another study[27], while in other studies the variance explained by these three factors has been higher (64.1 to 72%)[9,10,19,28]. We also found the following similarities with other authors: Heo, et al. [27] showed that items 1 and 9 loaded on the physical dimension and that items 14 and 16 had loadings <0.4; Ho, et al. [19] showed that item 1 loaded on the physical dimension; and Moon, et al. [28] found that items 1, 9, and 10 loaded on the physical dimension.
Item 1 (swelling in your ankles, legs) belongs to the social dimension (the separate items) in the original version; however, it has been reported that up to two thirds of patients admitted with acute HF present signs of hypervolemia such as jugular venous distension and peripheral edema, typical pathophysiological manifestations of HF[29-30], which supports its correlation with the physical dimension. On the other hand, determining the most plausible dimension for item 10 (sexual activities difficult) is complicated, because sexual activity in HF patients has a multifactorial explanation (psychological, emotional, physical, and medical)[31]. Possible explanations for the differences in factor structure, variance explained, and eigenvalues with respect to other authors include sample size, culture, and demographic and clinical characteristics, among others[10,17].
The MLHF-Q is interpreted by its total score, which results from summing the scores of all 21 items. This assumes that the total score is unidimensional. However, the Rasch analysis of the total score did not find evidence of unidimensional functioning. Moreover, it demonstrated misfit of five items (10, 14, 15, 16, 20), thereby confirming the existence of some problematic items in the composition of the total score. Elimination of items has been reported as a solution[10,27], and exclusion of the problematic items in our study improved the general fit to the Rasch model.
Similar findings have been reported by Munyombwe, et al. [10], who found that several items (7, 8, 10, 14, 16) presented misfit. Bilbao, et al. [24] also reported two misfitting items (1 and 10). Considering that misfitting items have been identified in a third factor representing the social dimension, as also shown in the current study, several authors have suggested adding a third factor to the total score[9,10,19,23,28]. However, it remains a challenge to reach a consensus on which of the different proposed social factors is the most appropriate and has the best psychometric properties; future studies should examine this further using confirmatory techniques.
Regarding response categories, we found difficulties in distinguishing between the response options very little (1) and little (2), and between much (4) and very much (5). This pattern was also reported by Munyombwe, et al. [10], who suggest that it could be explained by the sample size or by an excess of response categories.
According to the person-item maps for the subscales, some patients are located at the bottom of the emotional and physical subscale maps, denoting floor effects. This finding is consistent with Munyombwe, et al. [10], and suggests that those subscales need more items to cover all levels of the underlying trait. Nevertheless, some studies have reported either a floor or a ceiling effect in the analysis of the total score of the MLHF-Q[10].
In relation to other variables that measure different stages of disease severity, our results are consistent with the a priori hypothesis. The MLHF-Q scores clearly discriminate between different NYHA functional classes and age groups. This has also been observed in observational studies[9,11,13] as well as in clinical trials, which are the ideal setting for assessing sensitivity to change[32-34].
Strengths and limitations
The strengths of our study include an adequate sample size, as also shown by the KMO statistic, and a complete analysis of structural validity using both factorial PCA and Rasch analysis. The present study has, however, some important limitations. First, it was conducted in a single HF center; accordingly, the results cannot be considered a representative description of the HRQoL of patients in all of Colombia's HF clinics. Second, the MLHF-Q is a self-administered questionnaire, but in our study it was administered by a nurse because a high percentage of our population had a low educational level, which could have affected the measurement of HRQoL.
Conclusions
In conclusion, we have assessed the internal consistency, construct validity, and discriminative capacity of the MLHF-Q in patients with CHF from Colombia. We have confirmed the three-factor structure of the MLHF-Q, as in previous studies, and a satisfactory level of internal consistency. Additionally, these results suggest that the questionnaire adequately reflects the severity of the disease. However, further studies are required in the Colombian population to validate these findings and to evaluate the sensitivity to change of the MLHF-Q in longitudinal designs.