Volume 137, Issue 6, Pages 1572-1573 (June 2009)

Reference values: No need for confusion

Nancy A. Obuchowski, PhD

Refers to article:
Remediastinoscopy in restaging of lung cancer after induction therapy
Alessandro Marra, Ludger Hillejan, Sylvia Fechner, Georgios Stamatis
The Journal of Thoracic and Cardiovascular Surgery
April 2008 (Vol. 135, Issue 4, Pages 843-849)
Remediastinoscopy: A statistical reinterpretation
Eric Lim, Michael Dusmet
The Journal of Thoracic and Cardiovascular Surgery
January 2009 (Vol. 137, Issue 1, Pages 254-255)
Referent values and equipoise: Editors' notes
Thomas W. Rice, Eugene H. Blackstone
The Journal of Thoracic and Cardiovascular Surgery
January 2009 (Vol. 137, Issue 1, Pages 256-257)

To the Editor:

I would like to comment on the discussion among Lim and Dusmet [1], Marra and colleagues [2], and Rice and Blackstone [3]. Several points have caused confusion, and I hope to clarify some of them.

Sensitivity and specificity are measures of a test's inherent diagnostic performance. Sensitivity is the proportion of patients who test positive among patients with the disease; specificity is the proportion of patients who test negative among patients without the disease. Another common measure of diagnostic performance is the receiver operating characteristic (ROC) curve [4]. An ROC curve illustrates a test's sensitivity and specificity for different criteria for defining positive and negative test results. For highly accurate tests, there is a point on the ROC curve that one can choose if high specificity is desired; the price, however, is low sensitivity. Similarly, one can choose very high sensitivity but at a price of low specificity. Lim and Dusmet's [1] comment that “sensitivity truly starts at 50%” is incorrect; a test with low sensitivity (ie, <0.5) can have diagnostic value if the specificity is high.
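The trade-off along an ROC curve can be sketched with a few lines of code. The scores and thresholds below are invented purely for illustration and do not come from any of the cited studies; a patient is assumed to be called test-positive when the score is at or above the threshold.

```python
def sens_spec(diseased_scores, healthy_scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called positive."""
    true_pos = sum(s >= threshold for s in diseased_scores)
    true_neg = sum(s < threshold for s in healthy_scores)
    return true_pos / len(diseased_scores), true_neg / len(healthy_scores)


# Hypothetical test scores (assumed data, not taken from the letter).
diseased = [0.9, 0.8, 0.7, 0.4, 0.3]
healthy = [0.6, 0.35, 0.2, 0.1, 0.05]

# Each positivity criterion (threshold) gives one point on the ROC curve.
roc_points = {t: sens_spec(diseased, healthy, t) for t in (0.25, 0.38, 0.65)}

for t, (sens, spec) in roc_points.items():
    print(f"threshold={t:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

In this toy example, raising the threshold moves the test from high sensitivity with modest specificity toward high specificity with modest sensitivity, which is the choice described above.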

Sensitivity and specificity are the basic measures of a test's ability, but they do not describe how well the test will perform for a particular patient population. In managing patients, physicians focus on what the test results tell them about their patient. They want to know the probability their patient has the disease after a positive test result (positive predictive value [PPV]) and the probability their patient does not have the disease after a negative test result (negative predictive value [NPV]). Predictive values depend not only on the sensitivity and specificity of the test but also on the probability of disease in similar patients (ie, prevalence of disease). In fact, when predictive values are reported in the literature, a subscript indicating the prevalence rate is often used. For example, remediastinoscopy may have an NPV of 0.85 in a sample with a prevalence rate of 0.32, which we write as NPV_{0.32} = 0.85. In a different population with a different prevalence rate, the NPV will change, for example, NPV_{0.05} = 0.98 or NPV_{0.50} = 0.72. Much of the controversy in these authors' correspondence is due to confusion between sensitivity and PPV, and between specificity and NPV. Sensitivity and specificity describe the test's inherent diagnostic abilities irrespective of the prevalence rate. PPV and NPV, on the other hand, tell us the likelihood of disease after the test is performed in a particular patient population with a particular prevalence rate. In determining the role of remediastinoscopy in restaging lung cancer, it seems that PPV and NPV are the important metrics and should be the focus of the discussion.
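The dependence of predictive values on prevalence follows from Bayes' rule and can be sketched as below. The sensitivity of 0.61 and specificity of 1.0 are assumed values, chosen only because they approximately reproduce the NPVs quoted above; they are not stated in this letter.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' rule for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed test characteristics (illustrative, not taken from the letter).
SENSITIVITY, SPECIFICITY = 0.61, 1.0

# Same test, three different prevalence rates.
npv_by_prevalence = {
    p: predictive_values(SENSITIVITY, SPECIFICITY, p)[1] for p in (0.05, 0.32, 0.50)
}

for p, npv in npv_by_prevalence.items():
    print(f"prevalence={p:.2f}  NPV={npv:.3f}")
```

Under these assumptions the NPV falls as prevalence rises, even though sensitivity and specificity are held fixed.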

Lim and Dusmet [1] and Marra and colleagues [2] point out correctly that specificity is important for ruling in disease and sensitivity is important for ruling out disease. These relationships are due to the roles of these metrics in estimating PPVs and NPVs. A high specificity causes the PPV to increase, and a high sensitivity causes the NPV to increase, assuming, of course, that the prevalence of disease is held constant. As we have illustrated, predictive values are highly influenced by the prevalence of disease. Similarly, the measure of “accuracy” that Marra and colleagues report is also dependent on the prevalence of disease in the sample, and thus could be reported more appropriately as overall accuracy_{0.32} = 0.88.
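Overall accuracy is a prevalence-weighted average of sensitivity and specificity, which is why it shifts with the case mix. A minimal sketch follows, again assuming an illustrative sensitivity of 0.61 and specificity of 1.0 (values not stated in this letter) and ignoring uninterpretable results.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Proportion of all patients classified correctly at a given prevalence."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


# Assumed test characteristics (illustrative, not taken from the letter).
SENSITIVITY, SPECIFICITY = 0.61, 1.0

accuracy_by_prevalence = {
    p: overall_accuracy(SENSITIVITY, SPECIFICITY, p) for p in (0.05, 0.32, 0.50)
}

for p, acc in accuracy_by_prevalence.items():
    print(f"prevalence={p:.2f}  overall accuracy={acc:.3f}")
```

With these assumed inputs, accuracy at a prevalence of 0.32 comes out close to the 0.88 quoted above, and it is higher at low prevalence and lower at high prevalence.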

There are several other issues in these correspondences that need clarification. First, neither Marra and colleagues [2] nor Lim and Dusmet [1] report a confidence interval (CI) for specificity. A reasonable 95% CI for specificity based on these data is 0.96 to 1.0 [5]. CIs for both sensitivity and specificity should be routinely reported. Contrary to Marra and colleagues' description of the meaning of a CI, it is not “the likelihood that another sample will provide the same result.” Rather, a CI describes a range of plausible values for the metric of interest, here specificity. Statistically speaking, we expect that 95% of CIs will contain the real, but unknown, true value of the metric (ie, specificity); 5% of CIs will not contain the true value. Statisticians use the data from a single sample to estimate the unknown value of the metric; 95% of the time the CI they construct contains the true value, although we do not know which value in the interval it is or which CIs contain the true value and which do not.
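When every disease-free patient tests negative, the observed specificity is 1.0 and the usual normal-approximation interval collapses. The zero-numerator approach of Hanley and Lippman-Hand gives a usable lower bound instead. The sketch below assumes a hypothetical 75 disease-free patients with no false positives; that count is chosen for illustration and is not stated in this letter.

```python
def exact_lower_bound(n_disease_free, confidence=0.95):
    """One-sided lower bound for specificity when 0 of n disease-free patients test positive.

    Solves specificity ** n = 1 - confidence for specificity.
    """
    alpha = 1 - confidence
    return alpha ** (1 / n_disease_free)


def rule_of_three_lower_bound(n_disease_free):
    """Approximate 95% lower bound: 1 - 3/n (the "rule of three")."""
    return 1 - 3 / n_disease_free


N_DISEASE_FREE = 75  # hypothetical sample size (assumption)

exact = exact_lower_bound(N_DISEASE_FREE)
approx = rule_of_three_lower_bound(N_DISEASE_FREE)
print(f"exact lower bound: {exact:.3f}")
print(f"rule-of-three lower bound: {approx:.3f}")
```

Both versions give a lower bound near 0.96 for this assumed sample size, with 1.0 as the upper limit.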

Second, it is important to consider the effects of patient and disease characteristics in estimating sensitivity and specificity. For example, lesion size is a critical determinant of sensitivity, as are patients' comorbidities. Some of the differences between estimates of sensitivity and specificity reported in the literature for remediastinoscopy could be due to these patient differences.

Third, when a diagnostic test does not yield a result, that is, the result is “uninterpretable” [6], it is critical that the frequency of this occurrence be reported. Marra and colleagues [2] reported a 2% frequency for remediastinoscopy. They also included these cases in the denominator of their estimate of overall accuracy; this gives the reader an honest estimate of the test's performance.

Last, I think Drs Rice and Blackstone's [3] statement that screening tests usually have good specificity, whereas a test used to work up patients needs good sensitivity, is too narrow and does not describe many scenarios. In screening for breast cancer, for example, physicians look for tests with good sensitivity even if the false-positive rate is a bit high. Computer-aided detection systems are often used to improve sensitivity, usually at a cost of even higher recall rates. Without reasonable sensitivity, many screening programs cannot be cost-effective. Further workup of these patients demands higher specificity to prevent unnecessary invasive testing. The consequences of test errors and the prevalence of disease must be weighed to find the best test for each application.

References 

1. Lim E, Dusmet M. Remediastinoscopy: a statistical reinterpretation. J Thorac Cardiovasc Surg. 2009;137:254–255; author reply 255–256.

2. Marra A, Hillejan L, Fechner S, Stamatis G. Remediastinoscopy in restaging of lung cancer after induction therapy. J Thorac Cardiovasc Surg. 2008;135:843–849.

3. Rice TW, Blackstone EH. Referent values and equipoise: editors' notes. J Thorac Cardiovasc Surg. 2009;137:256–257.

4. Zhou XH, Obuchowski NA, McClish DK. Statistical Methods in Diagnostic Medicine. New York: Wiley and Sons; 2002.

5. Hanley JA, Lippman-Hand A. If nothing goes wrong, is everything all right? Interpreting zero numerators. JAMA. 1983;249:1743–1745.

6. Begg CB, Greenes RA, Iglewicz B. The influence of uninterpretability on the assessment of diagnostic tests. J Chronic Dis. 1986;39:575–584.

Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio

PII: S0022-5223(09)00356-0

doi:10.1016/j.jtcvs.2009.02.031
