Publications by Year: 2020

2020

Zwanenburg A, Vallières M, Abdalah MA, Aerts HJWL, Andrearczyk V, Apte A, Ashrafinia S, Bakas S, Beukinga RJ, Boellaard R, Bogowicz M, Boldrini L, Buvat I, Cook GJR, Davatzikos C, Depeursinge A, Desseroit MC, Dinapoli N, Dinh CV, Echegaray S, Naqa IE, Fedorov AY, Gatta R, Gillies RJ, Goh V, Götz M, Guckenberger M, Ha SM, Hatt M, Isensee F, Lambin P, Leger S, Leijenaar RTH, Lenkowicz J, Lippert F, Losnegård A, Maier-Hein KH, Morin O, Müller H, Napel S, Nioche C, Orlhac F, Pati S, Pfaehler EAG, Rahmim A, Rao AUK, Scherer J, Siddique MM, Sijtsema NM, Fernandez JS, Spezi E, Steenbakkers RJHM, Tanadini-Lang S, Thorwarth D, Troost EGC, Upadhaya T, Valentini V, van Dijk L V, van Griethuysen J, van Velden FHP, Whybra P, Richter C, Löck S. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020;295(2):328–38.
Background Radiomic features may quantify characteristics present in medical imaging. However, the lack of standardized definitions and validated reference values have hampered clinical use. Purpose To standardize a set of 174 radiomic features. Materials and Methods Radiomic features were assessed in three phases. In phase I, 487 features were derived from the basic set of 174 features. Twenty-five research teams with unique radiomics software implementations computed feature values directly from a digital phantom, without any additional image processing. In phase II, 15 teams computed values for 1347 derived features using a CT image of a patient with lung cancer and predefined image processing configurations. In both phases, consensus among the teams on the validity of tentative reference values was measured through the frequency of the modal value and classified as follows: less than three matches, weak; three to five matches, moderate; six to nine matches, strong; 10 or more matches, very strong. In the final phase (phase III), a public data set of multimodality images (CT, fluorine 18 fluorodeoxyglucose PET, and T1-weighted MRI) from 51 patients with soft-tissue sarcoma was used to prospectively assess reproducibility of standardized features. Results Consensus on reference values was initially weak for 232 of 302 features (76.8%) at phase I and 703 of 1075 features (65.4%) at phase II. At the final iteration, weak consensus remained for only two of 487 features (0.4%) at phase I and 19 of 1347 features (1.4%) at phase II. Strong or better consensus was achieved for 463 of 487 features (95.1%) at phase I and 1220 of 1347 features (90.6%) at phase II. Overall, 169 of 174 features were standardized in the first two phases. In the final validation phase (phase III), most of the 169 standardized features could be excellently reproduced (166 with CT; 164 with PET; and 164 with MRI). Conclusion A set of 169 radiomics features was standardized, which enabled verification and calibration of different radiomics software. © RSNA, 2020 See also the editorial by Kuhl and Truhn in this issue.
Fedorov A, Hancock M, Clunie D, Brochhausen M, Bona J, Kirby J, Freymann J, Pieper S, Aerts HJWL, Kikinis R, Prior F. DICOM Re-encoding of Volumetrically Annotated Lung Imaging Database Consortium (LIDC) Nodules. Med Phys. 2020;47(11):5953–65.
PURPOSE: The dataset contains annotations for lung nodules collected by the Lung Imaging Data Consortium and Image Database Resource Initiative (LIDC) stored as standard DICOM objects. The annotations accompany a collection of computed tomography (CT) scans for over 1000 subjects annotated by multiple expert readers, and correspond to "nodules >= 3 mm", defined as any lesion considered to be a nodule with greatest in-plane dimension in the range 3-30 mm regardless of presumed histology. The present dataset aims to simplify reuse of the data with the readily available tools, and is targeted towards researchers interested in the analysis of lung CT images. ACQUISITION AND VALIDATION METHODS: Open source tools were utilized to parse the project-specific XML representation of LIDC-IDRI annotations and save the result as standard DICOM objects. Validation procedures focused on establishing compliance of the resulting objects with the standard, consistency of the data between the DICOM and project-specific representation, and evaluating interoperability with the existing tools. DATA FORMAT AND USAGE NOTES: The dataset utilizes DICOM Segmentation objects for storing annotations of the lung nodules, and DICOM Structured Reporting objects for communicating qualitative evaluations (nine attributes) and quantitative measurements (three attributes) associated with the nodules. The total of 875 subjects contain 6859 nodule annotations. Clustering of the neighboring annotations resulted in 2651 distinct nodules. The data are available in TCIA at https://doi.org/10.7937/TCIA.2018.h7umfurq. POTENTIAL APPLICATIONS: The standardized dataset maintains the content of the original contribution of the LIDC-IDRI consortium, and should be helpful in developing automated tools for characterization of lung lesions and image phenotyping. In addition to those properties, the representation of the present dataset makes it more FAIR (Findable, Accessible, Interoperable, Reusable) for the research community, and enables its integration with other standardized data collections.
Miskin N, Unadkat P, Carlton ME, Golby AJ, Young GS, Huang RY. Frequency and Evolution of New Postoperative Enhancement on 3 Tesla Intraoperative and Early Postoperative Magnetic Resonance Imaging. Neurosurgery. 2020;87(2):238–46.
BACKGROUND: Intraoperative magnetic resonance imaging (IO-MRI) provides real-time assessment of extent of resection of brain tumor. Development of new enhancement during IO-MRI can confound interpretation of residual enhancing tumor, although the incidence of this finding is unknown. OBJECTIVE: To determine the frequency of new enhancement during brain tumor resection on intraoperative 3 Tesla (3T) MRI. To optimize the postoperative imaging window after brain tumor resection using 1.5 and 3T MRI. METHODS: We retrospectively evaluated 64 IO-MRI performed for patients with enhancing brain lesions referred for biopsy or resection as well as a subset with an early postoperative MRI (EP-MRI) within 72 h of surgery (N = 42), and a subset with a late postoperative MRI (LP-MRI) performed between 120 h and 8 wk postsurgery (N = 34). Three radiologists assessed for new enhancement on IO-MRI, and change in enhancement on available EP-MRI and LP-MRI. Consensus was determined by majority response. Inter-rater agreement was assessed using percentage agreement. RESULTS: A total of 10 out of 64 (16%) of the IO-MRI demonstrated new enhancement. Seven of 10 patients with available EP-MRI demonstrated decreased/resolved enhancement. One out of 42 (2%) of the EP-MRI demonstrated new enhancement, which decreased on LP-MRI. Agreement was 74% for the assessment of new enhancement on IO-MRI and 81% for the assessment of new enhancement on the EP-MRI. CONCLUSION: New enhancement occurs in intraoperative 3T MRI in 16% of patients after brain tumor resection, which decreases or resolves on subsequent MRI within 72 h of surgery. Our findings indicate the opportunity for further study to optimize the postoperative imaging window.
Smith BJ, Buatti JM, Bauer C, Ulrich EJ, Ahmadvand P, Budzevich MM, Gillies RJ, Goldgof D, Grkovski M, Hamarneh G, Kinahan PE, Muzi JP, Muzi M, Laymon CM, Mountz JM, Nehmeh S, Oborski MJ, Zhao B, Sunderland JJ, Beichel RR. Multisite Technical and Clinical Performance Evaluation of Quantitative Imaging Biomarkers from 3D FDG PET Segmentations of Head and Neck Cancer Images. Tomography. 2020;6(2):65–76.
Quantitative imaging biomarkers (QIBs) provide medical image-derived intensity, texture, shape, and size features that may help characterize cancerous tumors and predict clinical outcomes. Successful clinical translation of QIBs depends on the robustness of their measurements. Biomarkers derived from positron emission tomography images are prone to measurement errors owing to differences in image processing factors such as the tumor segmentation method used to define volumes of interest over which to calculate QIBs. We illustrate a new Bayesian statistical approach to characterize the robustness of QIBs to different processing factors. Study data consist of 22 QIBs measured on 47 head and neck tumors in 10 positron emission tomography/computed tomography scans segmented manually and with semiautomated methods used by 7 institutional members of the NCI Quantitative Imaging Network. QIB performance is estimated and compared across institutions with respect to measurement errors and power to recover statistical associations with clinical outcomes. Analysis findings summarize the performance impact of different segmentation methods used by Quantitative Imaging Network members. Robustness of some advanced biomarkers was found to be similar to conventional markers, such as maximum standardized uptake value. Such similarities support current pursuits to better characterize disease and predict outcomes by developing QIBs that use more imaging information and are robust to different processing factors. Nevertheless, to ensure reproducibility of QIB measurements and measures of association with clinical outcomes, errors owing to segmentation methods need to be reduced.
Li MD, Chang K, Bearce B, Chang CY, Huang AJ, Campbell P, Brown JM, Singh P, Hoebel K V, Erdoğmuş D, Ioannidis S, Palmer WE, Chiang MF, Kalpathy-Cramer J. Siamese Neural Networks for Continuous Disease Severity Evaluation and Change Detection in Medical Imaging. NPJ Digit Med. 2020;3(1):48.
Using medical images to evaluate disease severity and change over time is a routine and important task in clinical decision making. Grading systems are often used, but are unreliable as domain experts disagree on disease severity category thresholds. These discrete categories also do not reflect the underlying continuous spectrum of disease severity. To address these issues, we developed a convolutional Siamese neural network approach to evaluate disease severity at single time points and change between longitudinal patient visits on a continuous spectrum. We demonstrate this in two medical imaging domains: retinopathy of prematurity (ROP) in retinal photographs and osteoarthritis in knee radiographs. Our patient cohorts consist of 4861 images from 870 patients in the Imaging and Informatics in Retinopathy of Prematurity (i-ROP) cohort study and 10,012 images from 3021 patients in the Multicenter Osteoarthritis Study (MOST), both of which feature longitudinal imaging data. Multiple expert clinician raters ranked 100 retinal images and 100 knee radiographs from excluded test sets for severity of ROP and osteoarthritis, respectively. The Siamese neural network output for each image in comparison to a pool of normal reference images correlates with disease severity rank (ρ = 0.87 for ROP and ρ = 0.89 for osteoarthritis), both within and between the clinical grading categories. Thus, this output can represent the continuous spectrum of disease severity at any single time point. The difference in these outputs can be used to show change over time. Alternatively, paired images from the same patient at two time points can be directly compared using the Siamese neural network, resulting in an additional continuous measure of change between images. Importantly, our approach does not require manual localization of the pathology of interest and requires only a binary label for training (same versus different). The location of disease and site of change detected by the algorithm can be visualized using an occlusion sensitivity map-based approach. For a longitudinal binary change detection task, our Siamese neural networks achieve test set receiving operator characteristic area under the curves (AUCs) of up to 0.90 in evaluating ROP or knee osteoarthritis change, depending on the change detection strategy. The overall performance on this binary task is similar compared to a conventional convolutional deep-neural network trained for multi-class classification. Our results demonstrate that convolutional Siamese neural networks can be a powerful tool for evaluating the continuous spectrum of disease severity and change in medical imaging.
Xu T, Nenning KH, Schwartz E, Hong SJ, Vogelstein JT, Goulas A, Fair DA, Schroeder CE, Margulies DS, Smallwood J, Milham MP, Langs G. Cross-Species Functional Alignment Reveals Evolutionary Hierarchy Within the Connectome. Neuroimage. 2020;223:117346.
Evolution provides an important window into how cortical organization shapes function and vice versa. The complex mosaic of changes in brain morphology and functional organization that have shaped the mammalian cortex during evolution, complicates attempts to chart cortical differences across species. It limits our ability to fully appreciate how evolution has shaped our brain, especially in systems associated with unique human cognitive capabilities that lack anatomical homologues in other species. Here, we develop a function-based method for cross-species alignment that enables the quantification of homologous regions between humans and rhesus macaques, even when their location is decoupled from anatomical landmarks. Critically, we find cross-species similarity in functional organization reflects a gradient of evolutionary change that decreases from unimodal systems and culminates with the most pronounced changes in posterior regions of the default mode network (angular gyrus, posterior cingulate and middle temporal cortices). Our findings suggest that the establishment of the default mode network, as the apex of a cognitive hierarchy, has changed in a complex manner during human evolution - even within subnetworks.
Gao Y, Xiao X, Han B, Li G, Ning X, Wang D, Cai W, Kikinis R, Berkovsky S, Di Ieva A, Zhang L, Ji N, Liu S. Deep Learning Methodology for Differentiating Glioma Recurrence From Radiation Necrosis Using Multimodal Magnetic Resonance Imaging: Algorithm Development and Validation. JMIR Med Inform. 2020;8(11):e19805.
BACKGROUND: The radiological differential diagnosis between tumor recurrence and radiation-induced necrosis (ie, pseudoprogression) is of paramount importance in the management of glioma patients. OBJECTIVE: This research aims to develop a deep learning methodology for automated differentiation of tumor recurrence from radiation necrosis based on routine magnetic resonance imaging (MRI) scans. METHODS: In this retrospective study, 146 patients who underwent radiation therapy after glioma resection and presented with suspected recurrent lesions at the follow-up MRI examination were selected for analysis. Routine MRI scans were acquired from each patient, including T1, T2, and gadolinium-contrast-enhanced T1 sequences. Of those cases, 96 (65.8%) were confirmed as glioma recurrence on postsurgical pathological examination, while 50 (34.2%) were diagnosed as necrosis. A light-weighted deep neural network (DNN) (ie, efficient radionecrosis neural network [ERN-Net]) was proposed to learn radiological features of gliomas and necrosis from MRI scans. Sensitivity, specificity, accuracy, and area under the curve (AUC) were used to evaluate performance of the model in both image-wise and subject-wise classifications. Preoperative diagnostic performance of the model was also compared to that of the state-of-the-art DNN models and five experienced neurosurgeons. RESULTS: DNN models based on multimodal MRI outperformed single-modal models. ERN-Net achieved the highest AUC in both image-wise (0.915) and subject-wise (0.958) classification tasks. The evaluated DNN models achieved an average sensitivity of 0.947 (SD 0.033), specificity of 0.817 (SD 0.075), and accuracy of 0.903 (SD 0.026), which were significantly better than the tested neurosurgeons (P=.02 in sensitivity and P<.001 in specificity and accuracy). CONCLUSIONS: Deep learning offers a useful computational tool for the differential diagnosis between recurrent gliomas and necrosis. The proposed ERN-Net model, a simple and effective DNN model, achieved excellent performance on routine MRI scans and showed a high clinical applicability.
Hoebel K V, Patel JB, Beers AL, Chang K, Singh P, Brown JM, Pinho MC, Batchelor TT, Gerstner ER, Rosen BR, Kalpathy-Cramer J. Radiomics Repeatability Pitfalls in a Scan-Rescan MRI Study of Glioblastoma. Radiol Artif Intell. 2020;3(1):e190199.
Purpose: To determine the influence of preprocessing on the repeatability and redundancy of radiomics features extracted using a popular open-source radiomics software package in a scan-rescan glioblastoma MRI study. Materials and Methods: In this study, a secondary analysis of T2-weighted fluid-attenuated inversion recovery (FLAIR) and T1-weighted postcontrast images from 48 patients (mean age, 56 years [range, 22-77 years]) diagnosed with glioblastoma were included from two prospective studies (ClinicalTrials.gov NCT00662506 [2009-2011] and NCT00756106 [2008-2011]). All patients underwent two baseline scans 2-6 days apart using identical imaging protocols on 3-T MRI systems. No treatment occurred between scan and rescan, and tumors were essentially unchanged visually. Radiomic features were extracted by using PyRadiomics https://pyradiomics.readthedocs.io/ under varying conditions, including normalization strategies and intensity quantization. Subsequently, intraclass correlation coefficients were determined between feature values of the scan and rescan. Results: Shape features showed a higher repeatability than intensity (adjusted < .001) and texture features (adjusted < .001) for both T2-weighted FLAIR and T1-weighted postcontrast images. Normalization improved the overlap between the region of interest intensity histograms of scan and rescan (adjusted < .001 for both T2-weighted FLAIR and T1-weighted postcontrast images), except in scans where brain extraction fails. As such, normalization significantly improves the repeatability of intensity features from T2-weighted FLAIR scans (adjusted = .003 [ score normalization] and adjusted = .002 [histogram matching]). The use of a relative intensity binning strategy as opposed to default absolute intensity binning reduces correlation between gray-level co-occurrence matrix features after normalization. Conclusion: Both normalization and intensity quantization have an effect on the level of repeatability and redundancy of features, emphasizing the importance of both accurate reporting of methodology in radiomics articles and understanding the limitations of choices made in pipeline design. © RSNA, 2020See also the commentary by Tiwari and Verma in this issue.
Drakopoulos F, Tsolakis C, Angelopoulos A, Liu Y, Yao C, Kavazidi KR, Foroglou N, Fedorov A, Frisken S, Kikinis R, Golby A, Chrisochoides N. Adaptive Physics-Based Non-Rigid Registration for Immersive Image-Guided Neuronavigation Systems. Front Digit Health. 2020;2:613608.
Objective: In image-guided neurosurgery, co-registered preoperative anatomical, functional, and diffusion tensor imaging can be used to facilitate a safe resection of brain tumors in eloquent areas of the brain. However, the brain deforms during surgery, particularly in the presence of tumor resection. Non-Rigid Registration (NRR) of the preoperative image data can be used to create a registered image that captures the deformation in the intraoperative image while maintaining the quality of the preoperative image. Using clinical data, this paper reports the results of a comparison of the accuracy and performance among several non-rigid registration methods for handling brain deformation. A new adaptive method that automatically removes mesh elements in the area of the resected tumor, thereby handling deformation in the presence of resection is presented. To improve the user experience, we also present a new way of using mixed reality with ultrasound, MRI, and CT. Materials and methods: This study focuses on 30 glioma surgeries performed at two different hospitals, many of which involved the resection of significant tumor volumes. An Adaptive Physics-Based Non-Rigid Registration method (A-PBNRR) registers preoperative and intraoperative MRI for each patient. The results are compared with three other readily available registration methods: a rigid registration implemented in 3D Slicer v4.4.0; a B-Spline non-rigid registration implemented in 3D Slicer v4.4.0; and PBNRR implemented in ITKv4.7.0, upon which A-PBNRR was based. Three measures were employed to facilitate a comprehensive evaluation of the registration accuracy: (i) visual assessment, (ii) a Hausdorff Distance-based metric, and (iii) a landmark-based approach using anatomical points identified by a neurosurgeon. Results: The A-PBNRR using multi-tissue mesh adaptation improved the accuracy of deformable registration by more than five times compared to rigid and traditional physics based non-rigid registration, and four times compared to B-Spline interpolation methods which are part of ITK and 3D Slicer. Performance analysis showed that A-PBNRR could be applied, on average, in <2 min, achieving desirable speed for use in a clinical setting. Conclusions: The A-PBNRR method performed significantly better than other readily available registration methods at modeling deformation in the presence of resection. Both the registration accuracy and performance proved sufficient to be of clinical value in the operating room. A-PBNRR, coupled with the mixed reality system, presents a powerful and affordable solution compared to current neuronavigation systems.