Statistical Validation Based on Parametric Receiver Operating Characteristic Analysis of Continuous Classification Data

Zou KH, Warfield SK, Fielding JR, Tempany CM, William W, Kaus MR, Jolesz FA, Kikinis R. Statistical Validation Based on Parametric Receiver Operating Characteristic Analysis of Continuous Classification Data. Acad Radiol. 2003;10(12):1359–68.

Abstract

RATIONALE AND OBJECTIVES: The accuracy of diagnostic tests and imaging segmentation is important in clinical practice because it has a direct impact on therapeutic planning. Statistical validation of classification accuracy was conducted based on parametric receiver operating characteristic analysis and illustrated with three radiologic examples.

MATERIALS AND METHODS: Two parametric models were developed for diagnostic or imaging data. Example 1: A semi-automated fractional segmentation algorithm was applied to magnetic resonance imaging of nine cases of brain tumors. The tumor and background pixel data were assumed to have bi-beta distributions. Fractional segmentation was validated against an estimated composite pixel-wise gold standard based on multi-reader manual segmentations. Example 2: Ureteral stone size was measured by spiral computed tomography in 100 cases, and its predictive value for the two treatment options received was assessed, with stone size modeled as bi-normal after a nonlinear transformation. Example 3: One hundred eighty cases had prostate-specific antigen levels measured in a prospective clinical trial. Radical prostatectomy was performed in all cases to provide a binary gold standard of local and advanced cancer stages. Prostate-specific antigen level was transformed and modeled by bi-normal distributions. In all examples, areas under the receiver operating characteristic curves were computed.

RESULTS: The areas under the receiver operating characteristic curves were: Example 1: Fractional segmentation of magnetic resonance imaging of brain tumors: meningiomas (0.924-0.984); astrocytomas (0.786-0.986); and other low-grade gliomas (0.896-0.983). Example 2: Ureteral stone size for treatment planning (0.813). Example 3: Prostate-specific antigen for staging prostate cancer (0.768).

CONCLUSION: All clinical examples yielded fair to excellent accuracy. The validation metric, the area under the receiver operating characteristic curve, may be generalized to evaluate the performance of several continuous classifiers related to imaging.
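As a rough illustration of the bi-normal model named in the abstract, the sketch below computes the area under the curve from the standard closed form AUC = Φ((μ1 − μ0) / sqrt(σ0² + σ1²)), where the scores of the two gold-standard groups are assumed to be normally distributed after any transformation. The function name `binormal_auc` and the sample values are hypothetical and are not taken from the paper; the paper's own estimation procedure may differ.

```python
import math
from statistics import NormalDist, mean, stdev


def binormal_auc(negatives, positives):
    """Area under the ROC curve assuming both groups are normally distributed.

    Uses the closed form Phi((mu1 - mu0) / sqrt(s0^2 + s1^2)), with means and
    standard deviations estimated from the samples. Scores are assumed to be
    already transformed so that the normality assumption is reasonable.
    """
    mu0, mu1 = mean(negatives), mean(positives)
    s0, s1 = stdev(negatives), stdev(positives)
    # math.hypot(s0, s1) is sqrt(s0**2 + s1**2)
    return NormalDist().cdf((mu1 - mu0) / math.hypot(s0, s1))


# Hypothetical transformed scores, for illustration only (not study data).
group_negative = [0.0, 1.0, 2.0]
group_positive = [2.0, 3.0, 4.0]
auc = binormal_auc(group_negative, group_positive)
print(f"binormal AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to two identical score distributions, and values approaching 1 indicate well-separated groups, which is how the reported areas can be read.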
Last updated on 02/24/2023