Our report published today evaluates a new primary care service for older people with complex care needs that began operating in early 2015. Compared to our earlier report Transforming health care in nursing homes, the results are mixed, with positive qualitative findings but little quantitative evidence that the service has improved outcomes.
This is similar to many evaluations of new services whose findings, because they are inconclusive, make little immediate impact. But we need to be careful that such results are not overlooked. The immediate impact of an evaluation is often not in proportion to its rigour or the time taken to undertake it. Yet it is the longer and more thorough studies that play an important part in future work synthesising the available evidence.
Not easy to predict
The evaluation scenario is typical: a new service innovation is implemented and we want to investigate its value and impact. And this is not a controlled environment – it is the unpredictable and sometimes messy world of practical health care.
An effective way of carrying out such evaluations is through mixed methods: balancing qualitative and quantitative approaches, with the latter adopting a matched case-control design.
In our study, the ‘cases’ were the people registered with the new service, whereas each ‘control’ was an individual who did not experience the new service but was similar to one of the cases at the time of registration. The theory behind this approach is that what then happens to the control group reflects what would have happened to the cases had the new service not existed, and so by comparing outcomes for the two groups we can judge its impact.
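As a rough illustration of the matching idea (this is a minimal sketch, not the study's actual method; the covariates, weights and data below are entirely hypothetical), one simple approach is a greedy one-to-one nearest-neighbour match, pairing each case with the most similar as-yet-unused control:

```python
# Greedy 1:1 nearest-neighbour matching sketch (illustrative only; the
# covariate names, weights and values are hypothetical, not from the study).

def match_controls(cases, pool, weights):
    """Pair each case with its most similar unused control.

    cases, pool: lists of dicts of numeric covariates.
    weights: covariate name -> weight, scaling each covariate's contribution.
    Returns a list of (case, control) pairs.
    """
    def distance(a, b):
        # Weighted squared distance over the chosen covariates.
        return sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items())

    available = list(pool)
    pairs = []
    for case in cases:
        best = min(available, key=lambda c: distance(case, c))
        available.remove(best)  # each control is used at most once
        pairs.append((case, best))
    return pairs

# Hypothetical example: match on age and prior hospital admissions.
cases = [{"age": 82, "prior_admissions": 3},
         {"age": 75, "prior_admissions": 1}]
pool = [{"age": 80, "prior_admissions": 3},
        {"age": 90, "prior_admissions": 0},
        {"age": 76, "prior_admissions": 1}]
pairs = match_controls(cases, pool, {"age": 1.0, "prior_admissions": 4.0})
```

Real matching algorithms are considerably more sophisticated than this greedy sketch (for example, optimising over all pairings at once, or matching on a predicted risk score rather than raw covariates), but the principle is the same: find, for each case, a control who looked alike at the point of registration.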
However, this does create some analytical challenges:
- We need access to individual patient-level data, yet such data don’t always exist, or may not be obtainable by the people wanting to do the evaluation. (As alternatives, matching studies can be done on aggregated data by matching, for example, GP practices or care homes, but how well this works will depend on the study.)
- We need a control group. For our study we used a local control group, as many eligible people did not register. In some ways this is an advantage: any factors due to the external local environment would be the same, and we know the controls have not been experiencing a similar initiative. The alternative would be to find controls from outside the area, which creates problems around which areas to choose and what data are available.
- We need criteria for matching controls. These can only be derived from the data available. The hope is that we can match on characteristics that predict future use of health services – regardless of how different the people might be in other ways. However, it is not essential to match on all variables since they can be adjusted for in the analysis. It may also be feasible to match on values of predicted future risk, rather than specific variables.
- Matching is not as straightforward as it sounds. The data processing and analysis required should not be underestimated, and some algorithms are very sophisticated.
- The groups of people need to be large enough and/or followed up for long enough for the study to be able to detect real changes that may be happening. Suppose I had a coin that was weighted in some way in favour of heads. If I tossed it 10 times and heads came up on seven occasions, I wouldn’t think anything of it. But if I tossed it 100 times and heads appeared on 70 occasions, then I would be more suspicious. The proportion of heads is the same in both cases, but because the total number of tosses the second time is sufficiently large, I am able to detect the bias. This is closely linked to:
- The challenge of choosing appropriate measures of patient outcomes. This is a problem we came up against in this study. Once it became clear that recruitment to the new service was an issue, any influence the service may have had on hospital attendance was too small for us to detect. Many small-scale studies that have aimed to measure impact on hospital attendance have faced the same situation. We might overcome the problem of insufficient numbers by choosing different measures, where changes over the shorter term, or with fewer patients, might be easier to detect.
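The coin-toss intuition above can be checked directly with a binomial tail probability: the chance of seeing at least that many heads from a genuinely fair coin. A minimal sketch (illustrative only, not part of the study's analysis):

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k heads
    in n tosses of a coin with heads probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Same 70% proportion of heads, very different evidence of bias:
p_small = binom_tail(10, 7)    # 7 heads in 10 tosses: ~0.17, unremarkable
p_large = binom_tail(100, 70)  # 70 heads in 100 tosses: far below 0.001
```

A fair coin gives 7 or more heads out of 10 about 17% of the time, so that result proves nothing; 70 or more out of 100 is so unlikely under fairness that it is strong evidence of bias. The same logic is why a study needs enough patients, followed for long enough, before a real effect becomes detectable.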
Formal case matching methodologies can be analytically challenging, but they enable rigorous evaluation of real-world interventions. Inconclusive results may not have the same short-term impact as clear positive outcomes, but in the longer term it is their contribution to the evidence base that counts.
Sherlaw-Johnson C (2018) "The challenges of case-control studies and the value of inconclusive results", Nuffield Trust comment. https://www.nuffieldtrust.org.uk/news-item/the-challenges-of-case-control-studies-and-the-value-of-inconclusive-results