Time trade-off (TTO) is one of the main methods used to elicit health-state utilities for calculating quality-adjusted life-years (QALYs) in health technology assessments. However, there are major differences in the results of TTO valuations 1, as well as in the implementation methods 2. Differences between TTO valuations in different populations might reflect different preferences, but might also result from small methodological changes.
Anchoring is a cognitive bias that arises when numerical estimates are affected by irrelevant information at hand 3. Even obviously random data unrelated to a particular question may lead people to focus on information that is consistent with the anchor 4. The bias does not seem to disappear when the subject is aware of its existence 5. Anchoring has been found in a wide range of laboratory and real-life situations 6-8, including health valuation 9,10. It may also be related to other cognitive biases 11,12.
A recent study using a web survey found that the starting point of a TTO procedure acts as an anchor for subsequent valuations 13. However, some issues with this result remain to be addressed. First, the gold standard for TTO is the face-to-face interview. In addition, working with a heterogeneous population makes it difficult to isolate the anchoring effect. For instance, people of different ages may value health states differently. Also, the ten-year horizon of the standard TTO protocol has a different meaning for a young adult in their 20s than for an older person in their 60s. Since data are not normally distributed, isolating the anchoring effect by means of econometric regressions leaves room for discussion. The size of the anchoring effect may be affected by the lack of personal involvement, and heterogeneity in the general population makes the effect difficult to isolate. In this paper we therefore set out to determine whether the starting point in a face-to-face TTO iteration procedure with a homogeneous, highly educated population also induces anchoring in final health-state utilities.
Specifically, the TTO method seeks to find how many years in perfect health are equivalent (that is, the respondent is indifferent) to a year in a given health state A. This is achieved by asking whether a person would prefer to spend the rest of their life (for example, 10 years) in health state A and then die, or to spend 10 years in perfect health and then die. If the person chooses 10 years in perfect health, the question is asked again with a different number of years in perfect health, until an equivalent number is obtained. By way of example, suppose a person reports being indifferent between spending 10 years in health state A and then dying, and spending 2 years in perfect health and then dying. In that case, a year in the health state under study is equivalent to 0.2 years in perfect health. Theoretically speaking, the procedure could start at 10 years and go down from there, start at zero and go up from there, or start at any other number and go up or down depending on the response. The assumption of procedural invariance means that the result should be the same regardless of the starting point 14; however, if there is anchoring, the starting point would affect the result, which is the object of study of this article.
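As a minimal sketch of the arithmetic in the example above, the utility of a state valued as better than death can be computed as the years in perfect health at the indifference point divided by the years offered in the health state. The function and argument names are illustrative assumptions, not part of any TTO protocol.

```python
def tto_utility(years_perfect_health, years_in_state):
    """Utility of a better-than-death state from a TTO indifference point.

    years_perfect_health: years in perfect health judged equivalent
    years_in_state: years offered in the health state being valued
    """
    if years_in_state <= 0:
        raise ValueError("years_in_state must be positive")
    if not 0 <= years_perfect_health <= years_in_state:
        raise ValueError("indifference point must lie within the horizon")
    return years_perfect_health / years_in_state


# The worked example from the text: 2 years in perfect health
# judged equivalent to 10 years in health state A.
print(tto_utility(2, 10))
```

The range check mirrors the restriction to states valued as better than death; worse-than-death valuations would need a different formula.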
METHODS
Population. Participants were 147 final-year economics students (111 males, 36 females) aged 18 to 25, taking the research methodology course at Universidad Nacional de Colombia in 2015. Participation was voluntary and they could withdraw from the experiment at any time, no questions asked. Only one person declined to participate. The subjects were informed that the study was looking for determinants of preferences for health states and that it did not represent any hazard or breach of confidentiality for them. All participants signed an informed consent form. No compensation was paid for participating.
The participants valued five EQ-5D health states by TTO. All valuations used a 40-year time horizon, so that the horizon was closer to the participants' life expectancy. The subjects were randomly allocated to two groups. For the first group, the first question in the iteration procedure asked them to compare 40 years in perfect health to 40 years in the valued health state, and the number of years in perfect health then decreased in 4-year steps. For the second group, the first question compared 20 years in perfect health to 40 years in the valued health state, and the number then decreased or increased in 4-year steps depending on the answer.
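The two iteration procedures can be sketched as follows. This is only an illustration under stated assumptions: the stopping rule (stop at indifference or at the first switch of preference, taking the midpoint of the two bracketing offers) is not specified in the text, and the respondent is modelled as a hypothetical person with a fixed indifference point and no anchoring.

```python
def tto_iterate(start, horizon, step, answer):
    """Return the utility reached by a simple up/down TTO iteration.

    answer(x) models the respondent (hypothetical): it returns True if they
    prefer x years in perfect health, False if they prefer `horizon` years
    in the health state, and None if they are indifferent.
    """
    x = start
    first = answer(x)
    if first is None:
        return x / horizon
    # Preferring perfect health means the offer can be lowered: go down.
    direction = -step if first else step
    while 0 <= x + direction <= horizon:
        nxt = x + direction
        reply = answer(nxt)
        if reply is None:
            return nxt / horizon
        if reply != first:
            # Assumed rule: take the midpoint of the bracketing offers.
            return (x + nxt) / 2 / horizon
        x = nxt
    return x / horizon  # reached the end of the scale without a switch


def respondent(true_years):
    """A respondent with a fixed indifference point and no anchoring."""
    return lambda x: None if x == true_years else x > true_years


# Same hypothetical respondent, the two starting points used in the study.
from_40 = tto_iterate(start=40, horizon=40, step=4, answer=respondent(10))
from_20 = tto_iterate(start=20, horizon=40, step=4, answer=respondent(10))
print(from_40, from_20)
```

For this idealized respondent both starting points lead to the same utility, which is what procedural invariance predicts; anchoring would appear as a systematic gap between the two groups.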
Only states valued as better than death were considered. When a subject valued a state as worse than death, that observation was excluded from the sample for that state, since valuing worse-than-death states involves a process different from the one under study.
Health states were described using the EQ-5D-3L system, which uses five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) at three levels (1=no problems, 2=moderate problems, and 3=severe problems). For a clearer exposition, each health state shall be referred to hereinafter by means of a letter: health states were 32211 (L), 22323 (Y), 21221 (X), 11121 (M), 33333 (S).
Interviewers (10 people) were trained for the task beforehand and were not aware of the hypothesis being tested. They were randomly assigned to one of the anchors. Participants, in turn, were randomly assigned to the interviewers. Six of the interviewers used the 40-year anchor and four used the 20-year anchor.
Anchoring should appear in the form of higher valuations for the group starting at 40 than for the group starting at 20. Normality of the distributions was tested with the Shapiro-Wilk test, while differences between groups were tested with the Kolmogorov-Smirnov (KS), Wilcoxon-Mann-Whitney and Kruskal-Wallis tests. Ordered logistic regressions and box and whisker plots were also used to confirm the difference between the distributions.
In order to measure the size of the anchoring, Jacowitz and Kahneman's Anchoring Index 15 was used (difference between medians divided by the difference between anchors).
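The index as defined above can be sketched in a few lines. The responses below are made-up numbers for illustration only, not study data, and the function name is an assumption; the medians and the anchors must be expressed in the same unit (here, years in perfect health).

```python
from statistics import median


def anchoring_index(high_group, low_group, high_anchor, low_anchor):
    """Difference between group medians divided by difference between anchors."""
    if high_anchor == low_anchor:
        raise ValueError("anchors must differ")
    return (median(high_group) - median(low_group)) / (high_anchor - low_anchor)


# Hypothetical indifference points (years in perfect health), not study data.
started_at_40 = [32, 28, 36, 30]
started_at_20 = [24, 28, 26, 30]
print(anchoring_index(started_at_40, started_at_20, 40, 20))
```

An index of 0 would mean the anchors had no effect on the median response, and an index of 1 would mean the medians moved as far apart as the anchors themselves.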
RESULTS
Descriptive statistics of the variables are reported in Table 1. The number of observations was calculated after excluding worse-than-death states.
For each health state, the mean valuation of the anchor-40 group was higher than for the anchor-20 group in the same state. The Shapiro-Wilk test showed that no variable followed a normal distribution. The KS test was applied to verify equal distributions (Table 2).
Table 2 Non-parametric tests for normality and equal distributions

* Significant at 1 %; ** Significant at 5 %
The first line of the KS test shows that all states but M have lower values for the anchor-20 group than for the anchor-40 group and that this result is significant at 1 %. The second line shows the probability of a value in the anchor-20 group being higher than in the other group, which does not happen in any case. The third line shows the combined test and its p-value; all states but M have a different distribution for both groups, with lower values for the anchor-20 group. The results with KS are supported by the Wilcoxon-Mann-Whitney test, as well as by the Kruskal-Wallis test (not reported), an ordered logistic regression (not reported), and box and whisker plots (Figure 1). The anchoring effect, measured by the Anchoring Index, is low compared to different estimations in the laboratory and in business 15,16.
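The two one-sided statistics and the combined statistic described above can be sketched in plain Python. This is an illustration rather than the software used in the study: the utilities are made-up numbers, the function names are assumptions, and the one-sided p-value uses the large-sample approximation exp(-2nmD²/(n+m)), which is rough for small groups.

```python
import math


def ecdf(sample, x):
    """Empirical cumulative distribution of `sample` evaluated at x."""
    return sum(v <= x for v in sample) / len(sample)


def ks_two_sample(group_a, group_b):
    """Return (D+, D-, D) for two samples.

    D+ is large when group_a tends to have smaller values than group_b,
    D- is large when group_a tends to have larger values, D is the maximum.
    """
    points = sorted(set(group_a) | set(group_b))
    diffs = [ecdf(group_a, x) - ecdf(group_b, x) for x in points]
    d_plus = max(0.0, max(diffs))
    d_minus = max(0.0, -min(diffs))
    return d_plus, d_minus, max(d_plus, d_minus)


def one_sided_p(d, n, m):
    """Asymptotic one-sided p-value for a KS statistic d with sample sizes n, m."""
    return math.exp(-2.0 * n * m / (n + m) * d * d)


# Hypothetical utilities, not study data.
anchor_20 = [0.2, 0.3, 0.4, 0.5]
anchor_40 = [0.6, 0.7, 0.8, 0.9]
d_plus, d_minus, d = ks_two_sample(anchor_20, anchor_40)
p_lower = one_sided_p(d_plus, len(anchor_20), len(anchor_40))
print(d_plus, d_minus, d, p_lower)
```

In this toy example every anchor-20 value lies below every anchor-40 value, so the statistic for "anchor-20 is lower" reaches its maximum while the statistic for the opposite direction is zero.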
DISCUSSION
The results show that the starting point may act as an anchor in TTO health state valuations even during a face-to-face interview with educated people, which suggests that the effect does not stem from subjects misunderstanding the procedure. The result for health state M shows that anchoring might not be present in some health states, perhaps those close to perfect health. The Anchoring Index is low compared to other situations (e.g. business), but the effect is large enough to be policy relevant, as the estimated utility using the 40-year anchor may be 15 % to 188 % higher than that estimated for the 20-year anchor. In addition, since not all states are equally affected, the effect should not be disregarded in health technology assessments because it may have a different impact on each branch of a decision tree. These results coincide with the only previous study on anchoring in TTO 13.
The experiment was applied to a student population, which is younger than the general population. Therefore, a 40-year time horizon was applied instead of the usual 10-year time horizon. Nevertheless, the point is that anchoring was observed in the young adult population and that it should be considered in surveys of the general population or of patients. The fact that this is a highly educated population should not affect the results because anchoring susceptibility does not seem to be related to demographic and cognitive measures 7.
For this study, the Ping-Pong scheme used in British TTO EQ-5D studies (changing the year of comparison up and down in the questions) was not considered, since it is taken as equivalent to the consider-the-opposite method 17 to avoid anchoring. However, anchoring was found in the results obtained by Augestad using the Ping-Pong scheme 13. This study only addressed one specific scheme as the objective was isolating the relevant variables. The effect of different schemes should be the subject of future studies.
The distributions are not normal, hence the importance of using non-parametric tests, as regression-based results would have been less conclusive.
The conclusion of this exercise is that anchoring is present in TTO even when face-to-face interaction increases the subject's attention and understanding of the task.