The Journal of Arthroplasty, ISSN: 0883-5403, Vol: 37, Issue: 2, Page: 267-273

Logistic Regression and Machine Learning Models Cannot Discriminate Between Satisfied and Dissatisfied Total Knee Arthroplasty Patients

Joseph S. Munn; Brent A. Lanting; Steven J. MacDonald; Lyndsay E. Somerville; Jacquelyn D. Marsh; Dianne M. Bryant; Bert M. Chesworth
Knee

Background

Approximately 20% of total knee arthroplasty (TKA) patients are found to be dissatisfied or unsure of their satisfaction at 1-year post-surgery. This study attempted to predict 1-year post-surgery dissatisfied/unsure TKA patients with pre-surgery and surgical variables using logistic regression and machine learning methods.

Methods

A retrospective analysis was completed of patients who underwent primary TKA for osteoarthritis between 2012 and 2016 at a single institution. Patients were split into satisfied and dissatisfied/unsure groups. Potential predictor variables included demographic information, whether the patella was resurfaced, whether the posterior cruciate ligament was sacrificed, and subscales from the Knee Society Knee Scoring System, the Knee Society Clinical Rating System, the Western Ontario and McMaster Universities Osteoarthritis Index, and the 12-Item Short Form Health Survey version 2. Logistic regression and 6 different machine learning methods were used to create prediction models. Model performance was evaluated using discrimination (area under the receiver operating characteristic curve [AUC]) and calibration (Brier score, Cox intercept, and Cox slope) metrics.
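
The discrimination and calibration metrics named above can be illustrated with a minimal sketch. It is not the authors' code. The function names and the toy data are hypothetical, and the Cox intercept and slope are assumed here to come from a joint logistic recalibration fit, logit(P(y = 1)) = a + b * logit(p_hat), where a = 0 and b = 1 indicate perfect calibration.

```python
import math

def auc(y_true, p_hat):
    """Chance that a random positive case is scored above a random negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, p_hat):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)

def cox_calibration(y_true, p_hat, max_iter=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(p_hat) by Newton-Raphson; return (a, b).

    Assumption: intercept and slope are estimated jointly.
    """
    x = [math.log(p / (1 - p)) for p in p_hat]
    a, b = 0.0, 1.0  # start at the perfect-calibration point
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical toy data (1 = dissatisfied/unsure), NOT taken from the study.
# Observed event rates equal the predicted probability within each group.
y_toy = [1, 0, 0, 0, 0,  1, 1, 0, 0,  1, 1, 1, 1, 0]
p_toy = [0.2] * 5 + [0.5] * 4 + [0.8] * 5

print("AUC:", round(auc(y_toy, p_toy), 3))
print("Brier score:", round(brier_score(y_toy, p_toy), 3))
print("Cox intercept, slope:", cox_calibration(y_toy, p_toy))
```

A slope below 1 would suggest predictions that are too extreme, and a slope above 1 would suggest predictions that are not extreme enough.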

Results

There were 1432 eligible patients included in the analysis, of whom 313 were considered to be dissatisfied/unsure. When evaluating discrimination, the logistic regression (AUC = 0.736) and extreme gradient boosted tree (AUC = 0.713) models performed best. When evaluating calibration, the logistic regression (Brier score = 0.141, Cox intercept = 0.241, and Cox slope = 1.31) and gradient boosted tree (Brier score = 0.149, Cox intercept = 0.054, and Cox slope = 1.158) models performed best.
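
As a rough aid for reading the reported Brier scores, the short calculation below derives the dissatisfied/unsure proportion from the counts above and the Brier score of a non-informative model that predicts that proportion for every patient. This is only a sketch: it assumes the evaluation sample has the same outcome proportion as the full cohort, which the abstract does not state, and the variable names are hypothetical.

```python
n_total = 1432         # eligible patients reported in the abstract
n_dissatisfied = 313   # dissatisfied/unsure patients reported in the abstract

# Proportion of dissatisfied/unsure patients in the cohort.
prevalence = n_dissatisfied / n_total

# A constant prediction p scores p*(1-p)^2 + (1-p)*p^2 = p*(1-p) on data with event rate p.
baseline_brier = prevalence * (1 - prevalence)

print(f"Prevalence: {prevalence:.3f}")
print(f"Brier score of constant-prevalence prediction: {baseline_brier:.3f}")
```

Lower Brier scores are better, so under this assumption the value printed here is the reference against which the reported 0.141 and 0.149 can be compared.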

Conclusion

The models developed in this study do not perform well enough as discriminatory tools to be used in a clinical setting. Further work needs to be done to improve the performance of pre-surgery TKA dissatisfaction prediction models.
