CADTH Health Technology Review

Model Validation Tool to Assist in the Conduct of Economic Evaluations

Methods and Guidelines

Authors: Doug Coyle, Alex Haines, Karen Lee

For further information on how to use this tool or specifics about each item, refer to The Development of a Model Validation Tool to Assist in the Conduct of Economic Evaluations.

Validation of the Conceptual Model

Decision Problem

The decision problem relates to the interventions to be compared, the population(s) in which they are compared, the perspective of the evaluation, the costs and outcomes to be considered, and the time horizon of the evaluation.

If any item is not present, the model may not reflect the decision problem and therefore its conclusions may not be valid.

Table 1: Validation of Decision Problem (Items 1 to 5)

[Record Yes, No, or NA for each item.]

1. The model reflects the stated population to which the decision problem applies.

2. The model can examine key subgroups within the population of interest.

3. The model assesses all comparators currently used to treat the stated population.

4. The model incorporates costs that are consistent with the specified perspective of the analysis.

5. The model assesses all outcomes deemed important by clinicians and patients.

Model Specification

Model specification relates to the choice of model type, the health states that are modelled, and, when applicable, the choice of cycle length. When building a model, it is important to consider what an appropriate, and ideally optimal, specification would be. This can be informed by 2 distinct processes: a review of existing models in the area and a formal consideration of the disease process and clinical pathway. This is required to help ensure the external validity of the model.

Table 2: Validation of Model Specification (Items 6 and 7)

[Record Yes, No, or NA for each item.]

6. The structure of the model (i.e., the disease process and clinical pathway) has been validated by clinical experts.

7. The model follows previous models in this clinical area, or justification has been provided for why the model structure differs from previous models.

[Select NA if there are no previous models in this clinical area.]

Modelling of Clinical Effectiveness

One component of the conceptual model that will have a large impact on the results of the related analysis is the process by which clinical effectiveness is incorporated within the model. It is important to consider issues such as the quality and consistency of the evidence, the assumed duration of the effect, double counting of benefit, and appropriate consideration of uncertainty.

Note that the following items do not consider the quality or robustness of the evidence itself. Refer to the CADTH guidelines for guidance on evidence appraisal.

How Was Clinical Evidence Modelled?

The following items apply to state transition models. Some of the items in this section may not be applicable for decision trees or discrete event simulations. In these cases, select NA. For more complicated models, further validation steps will likely be required.
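For a concrete sense of what items 8 and 14 below ask the model to expose, here is a minimal sketch of a cohort state transition model. The states, transition probabilities, cycle length, and variable names are hypothetical, invented for illustration; they are not taken from the tool or from any submitted model.

```python
import numpy as np

# Hypothetical 3-state cohort model: Stable, Progressed, Dead.
# Transition probabilities are illustrative only; each row sums to 1.
P = np.array([
    [0.85, 0.10, 0.05],  # from Stable
    [0.00, 0.80, 0.20],  # from Progressed
    [0.00, 0.00, 1.00],  # Dead is absorbing
])

n_cycles = 40                       # e.g., 40 one-year cycles
trace = np.zeros((n_cycles + 1, 3))
trace[0] = [1.0, 0.0, 0.0]          # whole cohort starts in Stable

for t in range(n_cycles):
    trace[t + 1] = trace[t] @ P

# Item 8: time spent in each health state can be extracted.
time_in_state = trace.sum(axis=0)   # expected cycles in each state
# Item 14: life-years are the time spent in the alive states.
life_years = time_in_state[0] + time_in_state[1]
print("Cycles per state:", np.round(time_in_state, 2))
print(f"Undiscounted life-years: {life_years:.2f}")
```

A submitted model that cannot produce the equivalent of this trace, for each technology, will be difficult to validate against items 8, 9, and 14.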

Table 3: Validation of Clinical Effectiveness (Items 8 to 25)

[Record Yes, No, or NA for each item.]

8. Time spent in each health state, for each technology assessed, can be extracted from the model.

[Select NA if the model does not utilize health states (e.g., a decision tree) and skip item 9.]

9. The model output matches the evidence provided to support time spent in health states among technologies.

10. If clinical events are modelled (e.g., hospitalizations, exacerbations, strokes, hip fractures), the number of events for each technology can be extracted from the model.

[Select NA if clinical events are not relevant to the decision problem and skip item 11.]

11. Model output matches the evidence provided to support the number of clinical events across technologies.

12. The impact of adverse events on health outcomes and costs can be extracted from the model.

[Select NA if adverse events are not relevant to the decision problem and skip item 13.]

13. The model output matches the evidence provided for adverse event type and frequency.

14. Life-years are reported as a result within the model.

15. The impact the evaluated technologies have on mortality is clear.

[Select NA if there are no differences in mortality and skip items 16, 17, and 18.]

16. If differences in mortality are noted in item 15, select the reasons for differing mortality in the model (more than 1 reason can be selected):

a. The duration of time spent in health states leads to a higher mortality risk.

b. There is a difference in the frequency of fatal clinical events.

c. There is a difference in the frequency of fatal adverse events.

d. A direct impact on the risk of death, not stated previously, has been modelled (e.g., direct modelling of overall survival from the trial).

17. If there are mortality differences between technologies, it can be determined from the model which of the reasons from item 16 has the largest impact on incremental life-years.

18. Model output matches the evidence regarding mortality rates for the different technologies.

19. It is clear that the model does not utilize technology-specific utilities.

20. Based on the results from the submitted model, it can be determined which of the following has the largest impact on cost-effectiveness conclusions: time spent in health states, the number of clinical events occurring, adverse events, or mortality.

21. The model distinguishes data that are based on extrapolation methods (e.g., using parametric survival analysis).

[Select NA if no extrapolation is required.]

22. The model time horizon can be adjusted to cover just the period for which clinical data are available.

[Select NA if the time horizon does not extend beyond the period for which clinical data are available.]

23. If the model incorporates both direct and indirect effects, it is clear how double counting has been avoided (e.g., a direct effect applies to mortality through applying a hazard ratio to overall survival, while an indirect effect is applied to the probability of an event or transition that is associated with a mortality risk).

[Select NA if only direct or indirect effects are included.]

24. The modelled relationship between surrogate outcomes and final outcomes (quality of life and mortality) matches the evidence presented.

[Select NA if no surrogate outcomes are used.]

25. The model allows flexibility to explore waning of treatment effects, OR evidence and rationale are provided that suggest treatment effects are permanent and enduring.

[Select NA if no extrapolation of treatment effect is required.]

Computer Model Validation and Verification

The process of model verification can be separated into 2 distinct processes, akin to those adopted in software verification: assessment of model behaviour (black box testing), followed by scrutinization of the coding of the model (white box testing). It is important that these processes are conducted in this order.

To enable black box and white box testing, there are essential features of the model that need to be in place. The following section relates to these essential features.

Model Transparency

Table 4: Validation of Model Transparency (Items 26 to 28)

[Record Yes, No, or NA for each item.]

26. The deterministic result and the results from single Monte Carlo simulations can be accessed within the model.

27. A clear trace can be identified that links all input parameters to final outcomes (i.e., only input parameters are hard coded).

28. Macros are exclusively related to first- or second-order simulation and model navigation (applicable only to models built in Microsoft Excel).

Assessment of Model Behaviour (Black Box Verification)

The first process involves ascertaining whether changing the inputs of the model leads to results that meet the general expectations of the reviewer. This is referred to as black box testing because it does not require the reviewer to know the inner workings of the model. If, during black box testing, the model fails to provide results that are explainable, the reviewer could conduct detailed white box testing (refer to Table 6) to determine why the results are not as expected.

The following items include a range of possible black box tests.
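Several of these tests can be scripted when the model's engine can be called programmatically. The sketch below assumes a hypothetical run_model() function standing in for the submitted model; its parameter names and closed-form outputs are invented for illustration and are not part of the tool.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Params:
    effect_a: float = 0.7    # illustrative effectiveness, technology A
    effect_b: float = 0.7    # illustrative effectiveness, technology B
    utility: float = 0.8     # utility while alive
    discount: float = 0.015  # annual discount rate

def run_model(p: Params) -> dict:
    # Stand-in for the submitted model: a trivial closed form that
    # returns life-years and QALYs for each technology.
    ly = (10 * p.effect_a, 10 * p.effect_b)
    df = 1 / (1 + p.discount)  # crude single-period discount factor
    return {"ly": ly, "qaly": (ly[0] * p.utility * df, ly[1] * p.utility * df)}

base = Params()

# Item 29: equal effectiveness should yield equal QALYs.
r = run_model(replace(base, effect_a=0.7, effect_b=0.7))
assert abs(r["qaly"][0] - r["qaly"][1]) < 1e-9

# Item 38: utilities of 1 (with no discounting) should make QALYs = life-years.
r = run_model(replace(base, utility=1.0, discount=0.0))
assert r["qaly"] == r["ly"]

# Item 39: setting the discount rate to 0% should not lower QALYs.
assert run_model(replace(base, discount=0.0))["qaly"][0] >= run_model(base)["qaly"][0]
print("All black box checks passed.")
```

The value of scripting these checks is that they can be rerun after any correction to the model, so a fix for one issue does not silently introduce another.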

Table 5: Validation of Assessment of Model Behaviour (Items 29 to 46)

[Record Yes, No, or NA for each item.]

29. You can set the effectiveness of the different technologies such that QALY estimates are equal.

30. When you set effectiveness values to be extremely in favour of or against 1 technology, this leads to substantially greater or lower QALY estimates for that technology.

31. When you set effectiveness values for 1 technology to be slightly improved or reduced, this leads to correspondingly greater or lower QALY estimates.

32. When you increase the mortality risk for each health state or event, this leads to lower QALYs and life-years for all technologies.

33. When you reduce the mortality risk for each health state or event, this leads to greater QALYs and life-years for all technologies.

34. When you increase the baseline risks of events, this leads to lower QALYs for all technologies.

35. When you reduce the baseline risks of events, this leads to higher QALYs for all technologies.

36. When you set mortality to zero (i.e., patients do not enter the death state), life-years are identical across technologies.

37. When you increase the cost of a technology, the only output affected is the total lifetime costs of strategies that include that technology; there is no effect on QALYs or life-years.

38. When you set all utilities to 1 and all disutilities to zero, the estimated QALYs are equivalent to life-years.

39. For evaluations with a time horizon greater than 1 year, when you set the discount rate to 0%, the costs and QALYs for all interventions increase.

[Select NA if the time horizon is 1 year or less because discounting is only relevant for models with longer time horizons.]

40. For evaluations with a time horizon greater than 1 year, when you increase the discount rate, the costs and QALYs for all interventions decrease.

[Select NA if the time horizon is 1 year or less because discounting is only relevant for models with longer time horizons.]

41. When you reduce the time horizon of the evaluation (the period over which costs and QALYs are estimated), this leads to lower estimated costs and QALYs for all interventions.

42. It is possible to switch the inputs for 2 technologies and obtain the same results with the technologies' outputs exchanged; that is, by changing only the inputs (e.g., effectiveness, costs, utilities), the model structure for any decision alternative can be used to model any other decision alternative.

43. You can calculate the correlation between the costs and QALYs for different technologies across the Monte Carlo simulation replications (see the sketch after this table).

44. Based on the results of the Monte Carlo simulation, there is a strong correlation between the estimates of costs (i.e., the estimated costs from each replication) for the different technologies.

45. Based on the results of the Monte Carlo simulation, there is a strong correlation between the estimates of QALYs (i.e., the estimated QALYs from each replication) for the different technologies.

46. The results of the deterministic analysis are broadly in line with the results of the probabilistic analysis or, where they differ, justification is provided for the difference.
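If the replication-level output needed for items 43 to 45 can be exported, the correlation check itself is a one-liner. A minimal sketch, with simulated replication totals standing in for exported model output:

```python
import numpy as np

# Simulated per-replication cost totals standing in for exported output;
# both technologies share the same underlying parameter draws.
rng = np.random.default_rng(1)
shared = rng.normal(50_000, 8_000, size=5_000)
costs_a = shared + rng.normal(0, 1_000, size=5_000)
costs_b = shared + rng.normal(5_000, 1_000, size=5_000)

# Items 43 and 44: costs for different technologies should usually be
# strongly positively correlated because they share parameter draws.
corr = np.corrcoef(costs_a, costs_b)[0, 1]
print(f"Correlation of costs across replications: {corr:.2f}")
```

A weak or negative correlation is not proof of an error, but it suggests the technologies' arms may not be sampling common parameters consistently, which warrants white box follow-up.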

Scrutinization of Model Coding (White Box Verification)

The purpose of the second process is to establish whether the links between inputs and outputs are appropriate. This involves checking the detailed model calculations (white box testing). White box testing requires scrutinizing the formulas in a spreadsheet that link input parameters and outcomes, and it can identify the root of issues raised by black box testing. Black box testing cannot establish whether the model is providing correct results; this can only be ascertained through white box testing. Thus, white box testing should not be limited to areas of concern raised by black box testing but should appraise the coding of the model with respect to all parameters that are determined to be important (either a priori or as a result of black box testing).
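For models built in Microsoft Excel, part of this trace can be mechanized by reading formulas rather than computed values. A rough sketch using openpyxl follows; the file name, sheet name, and cell reference are placeholders, and the pattern used to spot cell references is deliberately approximate.

```python
import re
from openpyxl import load_workbook

# data_only=False makes .value return the formula string for formula cells.
wb = load_workbook("model.xlsx", data_only=False)  # placeholder file name
ws = wb["Results"]                                 # placeholder sheet name

# Approximate pattern for cell and range references (e.g., B2, Inputs!B2:B5).
REF = re.compile(r"(?:\w+!)?\$?[A-Z]{1,3}\$?\d+(?::\$?[A-Z]{1,3}\$?\d+)?")

def precedents(cell_ref: str) -> list[str]:
    """List the references that appear in a cell's formula (item 47)."""
    value = ws[cell_ref].value
    if not isinstance(value, str) or not value.startswith("="):
        return []  # a hard-coded value: the end of a backward trace
    return REF.findall(value)

print(precedents("B10"))  # placeholder result cell
```

Applying precedents() repeatedly to each reference it returns walks a result cell back to its hard-coded inputs, which is the backward trace item 47 asks for; Excel's built-in Trace Precedents feature performs the same step interactively.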

Table 6: Validation of Model Coding (Items 47 and 48)

[Record Yes, No, or NA for each item.]

47. You can work backward from the results of the model to the location where inputs are entered.

[For example, if you take the total costs associated with an intervention, can you work back from this value to determine how it was estimated and which inputs were used to derive it?]

48. You can work forward from the location where inputs are entered to the results of a single Monte Carlo simulation.

[For example, if you take a random input into the model (e.g., technology cost), can you trace how this input influences costs and/or QALYs in the model?]

General Issues of Concern

The following relates to common issues of concern, expressed by the health economists consulted for this work, regarding models built in Microsoft Excel. These issues are of direct relevance to white box testing. The functions listed below should be avoided because they limit model transparency and can affect the reliability of the model analysis; often, they are used to override user-provided inputs.
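The presence of the functions flagged in item 49, and of hidden sheets (item 50), can be screened mechanically before manual review. A minimal sketch with openpyxl; the file name is a placeholder.

```python
import re
from openpyxl import load_workbook

FLAGGED = ("IFERROR", "IFNA", "ISERROR", "ISERR", "ISNA",
           "CHOOSE", "INDIRECT", "OFFSET", "INDEX")

wb = load_workbook("model.xlsx", data_only=False)  # placeholder file name
for ws in wb.worksheets:
    if ws.sheet_state != "visible":  # item 50: flag hidden sheets
        print(f"hidden sheet: {ws.title}")
    for row in ws.iter_rows():
        for cell in row:
            if isinstance(cell.value, str) and cell.value.startswith("="):
                for fn in FLAGGED:
                    if re.search(rf"\b{fn}\(", cell.value):
                        print(f"{ws.title}!{cell.coordinate}: {fn}")
```

A handful of hits may be defensible with justification; widespread use of these functions is what undermines the trace between inputs and outputs described in Table 4.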

Table 7: General Issues of Concern (Items 49 to 53)

[Record Yes, No, or NA for each item.]

49. The use of the following functions limits model transparency, is inefficient, and is not required:

  • IFERROR, IFNA, ISERROR, ISERR, or ISNA

  • CHOOSE, INDIRECT, OFFSET, and INDEX

The model makes no or limited use of these functions.

50. The model has no hidden sheets, rows, or columns.

51. The model is free of user-created formulas embedded within VBA macros.

52. Parameters are not reset to default values after macros (e.g., for a Monte Carlo simulation) are run.

53. All input parameters that influence model results are provided in a transparent manner, preferably in a single worksheet.