BMJ 2013;347:f6651

Can joint replacement reduce cardiovascular risk?

Mohammad Ehsanul Karim, PhD candidate

The role of propensity score matching and landmark analysis in interpreting observational studies of treatments

Randomised trials are usually the best way to evaluate treatments, but observational designs can also provide useful insights into the effects of a particular treatment, as long as researchers use the appropriate statistical tools to help overcome the limitations of these studies. The paper by Ravi and colleagues (doi:10.1136/bmj.f6187),1 which reported an association between total joint arthroplasty and a reduced risk of serious cardiovascular events over a median follow-up of seven years, is a good illustration of two of these techniques: propensity score matching2 and landmark analysis.3

Patients are not randomised in observational comparative effectiveness studies. Lack of randomisation may contribute to differences in baseline characteristics of treated and untreated patients, including important prognostic factors such as age, sex, and comorbidities. Techniques such as propensity score matching make it possible to iron out these differences, so meaningful comparisons between treated and untreated groups can be made. The score, which is derived using multivariable logistic regression, defines each patient’s “propensity” for treatment (in this case total joint replacement) on the basis of their measured characteristics, so researchers can compare similar patients who did and did not have treatment.
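The two steps described above, estimating a score and then pairing similar patients, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Ravi and colleagues: the patients, the two covariates, the function names fit_propensity and match_pairs, the greedy 1:1 matching rule, and the caliper of 0.1 on the probability scale are all hypothetical choices made for the example.

```python
import math

def fit_propensity(rows, treated, steps=2000, rate=0.1):
    """Fit a logistic regression by gradient ascent; return one score per patient."""
    n, k = len(rows), len(rows[0])
    # Standardise each covariate so a single learning rate suits all of them.
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
           for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    weights, bias = [0.0] * k, 0.0

    def score(x):
        linear = bias + sum(w * v for w, v in zip(weights, x))
        return 1.0 / (1.0 + math.exp(-linear))

    for _ in range(steps):
        # Residual = observed treatment (0/1) minus predicted probability.
        errors = [t - score(x) for x, t in zip(z, treated)]
        bias += rate * sum(errors) / n
        for j in range(k):
            weights[j] += rate * sum(e * x[j] for e, x in zip(errors, z)) / n
    return [score(x) for x in z]

def match_pairs(scores, treated, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    The caliper is applied on the probability scale (an assumption made
    for simplicity; other scales are possible).
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in [i for i, t in enumerate(treated) if t == 1]:
        if not controls:
            break
        nearest = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[nearest] - scores[i]) <= caliper:
            pairs.append((i, nearest))
            controls.remove(nearest)  # each control is used at most once
    return pairs

# Hypothetical patients: (age in years, number of comorbidities).
covariates = [(72, 3), (68, 2), (75, 4), (60, 1), (66, 2), (58, 0), (70, 3), (55, 1)]
had_surgery = [1, 1, 1, 0, 1, 0, 0, 0]

scores = fit_propensity(covariates, had_surgery)
pairs = match_pairs(scores, had_surgery)  # list of (treated index, control index)
```

Treated patients with no untreated patient inside the caliper are left unmatched and drop out of the comparison, which is the price paid for comparing only similar patients. In practice a statistical package would normally be used for both steps.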

Propensity score matching is meant to mimic the balanced groups obtained by randomisation and so allow an unbiased estimate of the true effect of treatment. Propensity scores are derived from characteristics that researchers can see and record, but they inevitably exclude unobserved patient characteristics that might still affect the outcome. Ravi and colleagues mentioned physical activity levels and use of cardioprotective drug treatments as two examples of unrecorded factors that might confound the association between joint arthroplasty and fewer subsequent cardiovascular events.1 They carried out a sensitivity analysis to gauge the likely extent of this problem,4 and they concluded that an unobserved confounder would have to meet strong conditions to affect their results.

Ravi and colleagues also used landmark analysis, a technique designed to overcome a problem associated with comparative observational studies that have a “time to event” outcome. Commonly known as immortal time bias, this problem tends to exaggerate the benefits of a treatment (such as arthroplasty) because patients in a cohort are classified as untreated if they develop the outcome (a cardiovascular event) before they have the treatment.5 Patients who go on to be treated must, by definition, have remained free of the outcome until treatment, so early events are attributed only to the untreated group, which makes that group look worse than it should.
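A small worked example may make the mechanism concrete. The cohort below is entirely hypothetical and is built so that treatment has no effect at all: event times are identical in patients who were and were not offered treatment. The names cohort, risk, and received_treatment are illustrative only.

```python
FOLLOW_UP_YEARS = 10

# Hypothetical cohort: (scheduled treatment year or None, event year or None).
# Event times are the same in both halves, so treatment has no effect by construction.
cohort = [
    (3, 1), (3, 2), (3, 5), (3, None),              # offered treatment at year 3
    (None, 1), (None, 2), (None, 5), (None, None),  # never offered treatment
]

def risk(patients):
    """Proportion of patients with an event during follow-up."""
    events = sum(1 for _, event in patients
                 if event is not None and event <= FOLLOW_UP_YEARS)
    return events / len(patients)

def received_treatment(patient):
    """A patient is treated only if still event-free at the scheduled year."""
    scheduled, event = patient
    return scheduled is not None and (event is None or event > scheduled)

# Naive analysis: classify by treatment actually received during follow-up.
naive_treated = [p for p in cohort if received_treatment(p)]
naive_untreated = [p for p in cohort if not received_treatment(p)]

# Reference comparison: classify by what was planned at cohort entry.
offered = [p for p in cohort if p[0] is not None]
not_offered = [p for p in cohort if p[0] is None]

print(risk(naive_treated), risk(naive_untreated))  # 1 of 2 versus 5 of 6
print(risk(offered), risk(not_offered))            # 3 of 4 in both groups
```

The naive comparison suggests a benefit (1 event in 2 treated patients against 5 in 6 untreated patients) even though none exists, because the two patients who had an event before their scheduled treatment are counted as untreated.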

Landmark analysis is a relatively straightforward technique that can be used to combat immortal time bias. It emerged in oncology research and has steadily gained popularity in other areas.6 Before starting a study, researchers choose a landmark date or time point that will be considered as the start of follow-up. Ravi and colleagues chose a date three years after each patient’s entry into the cohort. They first excluded all patients from their analyses who had a cardiovascular event (an outcome) before this date. They then classified as treated all patients who had an arthroplasty before the landmark date and classified as untreated all those who did not. Finally, they counted only cardiovascular events (outcomes) that occurred after the landmark date. These elements combined help to eliminate immortal time bias, but, as with all statistical manipulations, landmark analysis has its limitations.
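The three steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the patient records, the field names treated_at and event_at, and the function name landmark_analysis are hypothetical, and an event falling exactly on the landmark is treated as an exclusion, which is an assumption about how ties are handled.

```python
def landmark_analysis(patients, landmark):
    """Classify patients at the landmark and count only later events.

    Each patient is a dict with 'treated_at' and 'event_at', in years from
    cohort entry; None means it did not happen during follow-up.
    """
    groups = {"treated": {"n": 0, "events": 0},
              "untreated": {"n": 0, "events": 0}}
    for p in patients:
        treated_at, event_at = p["treated_at"], p["event_at"]
        # Step 1: exclude patients whose event came before the landmark
        # (ties are excluded too; this is an assumption of the sketch).
        if event_at is not None and event_at <= landmark:
            continue
        # Step 2: exposure is fixed by treatment status at the landmark.
        is_treated = treated_at is not None and treated_at < landmark
        group = "treated" if is_treated else "untreated"
        groups[group]["n"] += 1
        # Step 3: any remaining event necessarily occurred after the landmark.
        if event_at is not None:
            groups[group]["events"] += 1
    return groups

# Hypothetical patients, times in years from cohort entry.
patients = [
    {"treated_at": 1,    "event_at": None},  # treated early, no event
    {"treated_at": 2,    "event_at": 6},     # treated early, later event
    {"treated_at": None, "event_at": 2},     # event before landmark: excluded
    {"treated_at": 4,    "event_at": 7},     # treated after landmark: untreated
    {"treated_at": None, "event_at": None},  # never treated, no event
    {"treated_at": None, "event_at": 5},     # never treated, later event
]

result = landmark_analysis(patients, landmark=3)
print(result)
```

With a three year landmark, the fourth patient counts as untreated despite having surgery in year 4, because group membership is frozen at the landmark. A real analysis would go on to fit a time to event model to the two groups rather than simply count events.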

The technique classifies patients as untreated even if they are treated immediately after the landmark date. Furthermore, excluding patients who have a cardiovascular event before the date may cause loss of statistical power. The choice of date therefore matters: an early landmark leaves more patients who are treated later classified as untreated, whereas a late landmark excludes more patients with early events. Ideally, the date should be clinically relevant.3 When a different date is chosen, results and conclusions may change, which complicates interpretation.7 Ravi and colleagues ran their analyses again to check the sensitivity of their results to changes in the landmark date, and they found that the changes had no substantial impact on their findings.1

Landmark analysis is easy to execute and relatively easy to understand, so it is appealing to medical researchers. But we must remember that this method, as with other sophisticated techniques,8 cannot completely eliminate confounding and bias in observational evaluations of treatments, even when combined with propensity score matching. Patients are still selected for treatment on the basis of many factors and it is rarely (if ever) possible to account for all of them. In that respect, the current study is persuasive but not conclusive.1 The authors used the right techniques to minimise the well known limitations of observational analyses, but the techniques themselves have equally well established limitations. Comparative observational studies, such as this one,1 should not be used to establish causality, and we should not, therefore, assume that joint replacement can definitely prevent serious cardiovascular events. Such studies can, however, lay a good foundation for future longitudinal research to explore causality.
