The Journal of Thoracic and Cardiovascular Surgery
Volume 134, Issue 5, Pages 1128-1135.e3, November 2007

Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006: A systematic review and suggestions for improvement

  • Peter C. Austin, PhD

Affiliations: Institute for Clinical Evaluative Sciences, the Department of Public Health Sciences, University of Toronto, and the Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada.

Address for reprints: Peter C. Austin, PhD, Institute for Clinical Evaluative Sciences, G1 06, 2075 Bayview Ave, Toronto, Ontario M4N 3M5, Canada.

Received 16 April 2007; accepted 31 July 2007.

Objective

I conducted a systematic review of the use of propensity score matching in the cardiovascular surgery literature. I examined the adequacy of reporting and whether appropriate statistical methods were used.

Methods

I examined 60 articles published in the Annals of Thoracic Surgery, European Journal of Cardio-thoracic Surgery, Journal of Cardiovascular Surgery, and the Journal of Thoracic and Cardiovascular Surgery between January 1, 2004, and December 31, 2006.

Results

Thirty-one of the 60 studies did not provide adequate information on how the propensity score–matched pairs were formed. Eleven (18%) of the studies did not report whether matching on the propensity score balanced baseline characteristics between treated and untreated subjects in the matched sample. No studies used appropriate methods to compare baseline characteristics between treated and untreated subjects in the propensity score–matched sample. Eight (13%) of the 60 studies explicitly used statistical methods appropriate for the analysis of matched data when estimating the effect of treatment on the outcomes. Two studies used appropriate methods for some outcomes but not for all. Thirty-nine (65%) studies explicitly used statistical methods that were inappropriate for matched-pairs data when estimating the effect of treatment on outcomes. Eleven studies did not report the statistical tests that were used to assess the statistical significance of the treatment effect.
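
To illustrate what a balance check in a matched sample can look like, the following minimal sketch computes a standardized difference in means for one continuous baseline covariate. This abstract does not name the balance diagnostics considered appropriate, so the choice of the standardized difference is an assumption made here for illustration; the function name and the age values are hypothetical.

```python
import math
import statistics


def standardized_difference(treated, untreated):
    """Absolute standardized difference in means between two groups.

    Illustrative assumption: the denominator is the square root of the
    average of the two sample variances.
    """
    mean_t = statistics.mean(treated)
    mean_u = statistics.mean(untreated)
    var_t = statistics.variance(treated)    # sample variance (n - 1)
    var_u = statistics.variance(untreated)
    pooled_sd = math.sqrt((var_t + var_u) / 2)
    return abs(mean_t - mean_u) / pooled_sd


# Hypothetical ages in a small propensity score-matched sample.
treated_age = [62, 65, 70, 58, 66]
untreated_age = [61, 66, 69, 59, 67]

d = standardized_difference(treated_age, untreated_age)
print(f"standardized difference for age: {d:.3f}")
```

Unlike a significance test, this quantity does not depend on the size of the matched sample, which is one reason it is often used to describe balance.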

Conclusions

Analysis of propensity score–matched samples tended to be poor in the cardiovascular surgery literature. Most statistical analyses ignored the matched nature of the sample. I provide suggestions for improving the reporting and analysis of studies that use propensity score matching.
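
As a hedged illustration of an analysis that respects matching, the sketch below applies McNemar's test to a binary outcome recorded within matched pairs. The abstract does not specify which tests were judged appropriate; McNemar's test is used here only as a standard example of a paired method for binary data. The function name and the pair counts are hypothetical.

```python
import math


def mcnemar_test(pairs):
    """McNemar's test (no continuity correction) for matched binary outcomes.

    `pairs` is a list of (treated_outcome, untreated_outcome) tuples,
    each outcome coded 0 or 1. Only discordant pairs contribute.
    """
    b = sum(1 for t, u in pairs if t == 1 and u == 0)
    c = sum(1 for t, u in pairs if t == 0 and u == 1)
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical matched sample of 100 pairs: 15 pairs where only the treated
# subject had the event, 5 where only the untreated subject did.
pairs = [(1, 0)] * 15 + [(0, 1)] * 5 + [(1, 1)] * 10 + [(0, 0)] * 70

statistic, p_value = mcnemar_test(pairs)
print(f"McNemar statistic = {statistic:.2f}, p = {p_value:.4f}")
```

The test uses only the discordant pairs, so it accounts for the pairing that an unpaired chi-square test on the same 200 subjects would ignore.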

CTSNet classification: 2

 

The Institute for Clinical Evaluative Sciences (ICES) is supported in part by a grant from the Ontario Ministry of Health and Long-Term Care. The opinions, results, and conclusions are those of the author, and no endorsement by the Ministry of Health and Long-Term Care or by the Institute for Clinical Evaluative Sciences is intended or should be inferred. Dr Austin is supported in part by a New Investigator award from the Canadian Institutes of Health Research (CIHR).

PII: S0022-5223(07)01243-3

doi:10.1016/j.jtcvs.2007.07.021
