PURPOSE: Prostate cancer is the second most common cancer diagnosed in men. The rate is disproportionately high among men in sub-Saharan Africa where, unlike in North America and Western Europe, screening for prostate cancer has historically not been routine. As awareness of prostate health increases, more patients in this region are being referred for trans-rectal ultrasound (TRUS)-guided prostate biopsy, a diagnostic procedure that requires a strong understanding of prostate zonal anatomy. To aid instruction in this procedure, prostate biopsy training programs need to be implemented. Unfortunately, current TRUS-guided biopsy training tools are not easily reproduced in these West African countries. To address this challenge, we are developing an affordable, open-source training simulator for TRUS-guided prostate biopsy, for use in Senegal. In this paper, we present the implementation of the training simulator’s virtual interface, highlighting the generation and evaluation of the critical training component: zonal anatomy overlaid on TRUS.
METHODS: For the simulator’s dataset, we registered TRUS and MRI volumes together to obtain the zonal segmentation from the MRI volumes. After generating ten pairings of TRUS overlaid with zonal segmentation, we designed and implemented a virtual TRUS training system, developed in open-source software. The objective of our simulator is to teach trainees to accurately identify the prostate’s anatomical zones in TRUS. To confirm the system’s usability for training zonal identification, we conducted a two-part survey on the quality of the zonal overlays with 7 urology experts. In the first part, they assessed the zonal overlay for visual correctness by rating 10 images from one patient’s TRUS with registered overlay on a 5-point Likert scale. For the second part, they labelled 10 plain TRUS volumes with zonal anatomy and the labels were compared to the labels of our overlay.
RESULTS: On average, experts rated the zonal overlay’s visual accuracy at 4 out of 5. Furthermore, 7 out of 7 experts labelled the peripheral, anterior, and transitional zones in the same regions we overlaid them, and 5 out of 7 labelled the central zone in the same region we overlaid it.
CONCLUSION: We created the prototype of a TRUS imaging simulator in open-source software. A vital training component, zonal overlay, was generated using publicly accessible data and validated by expert urologists for prostate zone identification, confirming the concept.
The corticospinal tract is the most intensively investigated tract of the human motor system in stroke rehabilitation research. Diffusion tensor imaging gives insights into its microstructure, and transcranial magnetic stimulation assesses its excitability. Previous data on the interrelationship between both measures are contradictory. Correlative or predictive models which associate them with motor outcome are incomplete. Free water correction has been developed to enhance diffusion tensor imaging by eliminating partial volume effects from extracellular water, which could improve the capture of stroke-related microstructural alterations and thereby also improve structure-function relationships in clinical cohorts. In the present cross-sectional study, data from 18 chronic stroke patients and 17 healthy controls, taken from a previous study on cortico-cerebellar motor tracts, were re-analysed. The data included diffusion tensor imaging data quantifying corticospinal tract microstructure with and without free water correction, transcranial magnetic stimulation data assessing recruitment curve properties of motor evoked potentials, and detailed clinical data. Linear regression modelling was used to interrelate corticospinal tract microstructure, recruitment curve properties and clinical scores. The main finding of the present study was that free water correction substantially strengthens structure-function associations in stroke patients. Specifically, our data evidenced a significant association between fractional anisotropy of the ipsilesional corticospinal tract and its excitability (p = 0.001, adj. R² = 0.54), with free water correction explaining an additional 20% of recruitment curve variability. For clinical scores, only free water correction led to the reliable detection of significant correlations between ipsilesional corticospinal tract fractional anisotropy and residual grip force (p = 0.001, adj. R² = 0.70) and pinch force (p < 0.001, adj. R² = 0.72).
Finally, multimodal models can be improved by free water correction as well. This study evidences that corticospinal tract microstructure directly relates to its excitability in stroke patients. It also shows that unexplained variance in motor outcome is considerably reduced by free water correction arguing that it might serve as a powerful tool to improve existing models of structure-function associations and potentially also outcome prediction after stroke.
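As a rough illustration of the adjusted R² statistic reported for these regression models, the following Python sketch fits a one-predictor least-squares line and computes the adjusted coefficient of determination. The values and the names `fa_values` and `rc_slopes` are invented for illustration and are not data from the study; the study's actual models may include additional covariates.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R^2, penalising R^2 for the number of predictors."""
    intercept, slope = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    n = len(y)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical values: tract fractional anisotropy vs. recruitment curve slope.
fa_values = [0.38, 0.42, 0.45, 0.50, 0.53, 0.57]
rc_slopes = [0.9, 1.4, 1.3, 2.1, 2.0, 2.6]
print(f"adjusted R^2 = {adjusted_r2(fa_values, rc_slopes):.2f}")
```

Comparing this value between models fitted on uncorrected and free-water-corrected fractional anisotropy is one way to express how much additional variance the correction explains.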
The retinogeniculate visual pathway (RGVP) conveys visual information from the retina to the lateral geniculate nucleus. The RGVP has four subdivisions, including two decussating and two nondecussating pathways that cannot be identified on conventional structural magnetic resonance imaging (MRI). Diffusion MRI tractography has the potential to trace these subdivisions and is increasingly used to study the RGVP. However, it is not yet known which fiber tracking strategy is most suitable for RGVP reconstruction. In this study, four tractography methods are compared, including constrained spherical deconvolution (CSD) based probabilistic (iFOD1) and deterministic (SD-Stream) methods, and multi-fiber (UKF-2T) and single-fiber (UKF-1T) unscented Kalman filter (UKF) methods. Experiments use diffusion MRI data from 57 subjects in the Human Connectome Project. The RGVP is identified using regions of interest created by two clinical experts. Quantitative anatomical measurements and expert anatomical judgment are used to assess the advantages and limitations of the four tractography methods. Overall, we conclude that UKF-2T and iFOD1 produce the best RGVP reconstruction results. The iFOD1 method can better quantitatively estimate the percentage of decussating fibers, while the UKF-2T method produces reconstructed RGVPs that are judged to better correspond to the known anatomy and have the highest spatial overlap across subjects. We also find that it is challenging for current tractography methods to both accurately track RGVP fibers that correspond to known anatomy and produce an approximately correct percentage of decussating fibers. We suggest that future algorithm development for RGVP tractography should take both of these points into consideration.
BACKGROUND: While previous studies have implicated white matter (WM) as a core pathology of Obsessive-Compulsive Disorder (OCD), the underlying neurobiological processes remain elusive. This study utilizes free-water imaging derived from diffusion MRI to identify cellular and extracellular WM abnormalities in patients with OCD compared to controls (Cs). Next, we investigate the association between diffusion measures and clinical variables in patients. METHODS: We collected diffusion-weighted MRI and clinical data from 83 patients with OCD (56 females/27 males, age = 37.7 ± 10.6) and 52 Cs (27 females/25 males, age = 32.8 ± 11.5). Fractional anisotropy (FA), fractional anisotropy of cellular tissue (FAT), and extracellular free-water (FW) maps were extracted and compared between patients and Cs using tract-based spatial statistics and voxel-wise comparison in FSL's Randomise. Next, we correlated these WM measures with clinical variables (age-of-onset and symptom severity) and compared them between patients with and without comorbidities and patients with and without psychiatric medication. RESULTS: Patients with OCD demonstrated lower FA (43.4% of the WM skeleton), lower FAT (31% of the WM skeleton), and higher FW (22.5% of the WM skeleton) compared to Cs. We did not observe significant correlations between diffusion measures and clinical variables. Comorbidities and medication status did not influence diffusion measures. CONCLUSIONS: Our findings of widespread FA, FAT, and FW abnormalities suggest that OCD is associated with both microstructural cellular and extracellular abnormalities beyond the cortico-striato-thalamo-cortical circuits. Future multimodal longitudinal studies are needed to better understand the influence of essential clinical variables across the illness trajectory.
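For readers less familiar with the FA measure compared above, a minimal Python sketch of the standard fractional anisotropy formula follows. The eigenvalues are made-up examples, not values from the study. In free-water imaging, FAT is generally understood as the same formula applied to the tissue tensor that remains after an isotropic free-water compartment has been modelled out; that fitting step is not shown here.

```python
from math import sqrt


def fractional_anisotropy(l1, l2, l3):
    """Standard FA computed from the three diffusion tensor eigenvalues."""
    md = (l1 + l2 + l3) / 3.0  # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0:
        return 0.0  # no diffusion signal, define FA as zero
    return sqrt(1.5 * num / den)


# Hypothetical eigenvalues in units of 10^-3 mm^2/s.
print(fractional_anisotropy(1.7, 0.3, 0.3))  # strongly anisotropic voxel
print(fractional_anisotropy(1.0, 1.0, 1.0))  # isotropic voxel
```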
Diffusion encoding along multiple spatial directions per signal acquisition can be described in terms of a b-tensor. The benefit of tensor-valued diffusion encoding is that it unlocks the 'shape of the b-tensor' as a new encoding dimension. By modulating the b-tensor shape, we can control the sensitivity to microscopic diffusion anisotropy which can be used as a contrast mechanism; a feature that is inaccessible by conventional diffusion encoding. Since imaging methods based on tensor-valued diffusion encoding are finding an increasing number of applications we are prompted to highlight the challenge of designing the optimal gradient waveforms for any given application. In this review, we first establish the basic design objectives in creating field gradient waveforms for tensor-valued diffusion MRI. We also survey additional design considerations related to limitations imposed by hardware and physiology, potential confounding effects that cannot be captured by the b-tensor, and artifacts related to the diffusion encoding waveform. Throughout, we discuss the expected compromises and tradeoffs with an aim to establish a more complete understanding of gradient waveform design and its impact on accurate measurements and interpretations of data.
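To make the b-tensor concrete, the sketch below numerically evaluates the commonly used definition B = ∫ q(t) q(t)ᵀ dt, with q(t) = γ ∫₀ᵗ g(t') dt', for a sampled gradient waveform. It is a simplified illustration under stated assumptions: the waveform is the effective gradient (sign already flipped after any refocusing pulse), integration uses a plain rectangle rule, and the timing and amplitude values are invented. The check against the closed-form b-value of a pulsed-gradient pair only confirms the discretisation.

```python
GAMMA = 2.675e8  # proton gyromagnetic ratio in rad/s/T (rounded)


def b_tensor(waveform, dt):
    """Return the 3x3 b-tensor (s/m^2) of an effective gradient waveform.

    waveform: list of (gx, gy, gz) samples in T/m, spaced dt seconds apart.
    """
    q = [0.0, 0.0, 0.0]
    b = [[0.0] * 3 for _ in range(3)]
    for g in waveform:
        for i in range(3):
            q[i] += GAMMA * g[i] * dt  # running integral of the gradient
        for i in range(3):
            for j in range(3):
                b[i][j] += q[i] * q[j] * dt  # accumulate outer product q q^T
    return b


def trace(m):
    return m[0][0] + m[1][1] + m[2][2]


# Hypothetical linear encoding: two 10 ms lobes of 40 mT/m, onsets 30 ms apart.
dt = 1e-5
g_amp = 0.04
lobe = 1000  # samples per lobe (10 ms)
gap = 2000   # samples between lobes (20 ms)
waveform = [(g_amp, 0.0, 0.0)] * lobe + [(0.0, 0.0, 0.0)] * gap + [(-g_amp, 0.0, 0.0)] * lobe

B = b_tensor(waveform, dt)
print("b-value (s/mm^2):", trace(B) / 1e6)
```

For this linear waveform only one diagonal element of B is non-zero. Waveforms with components along several axes would populate more of the tensor, which is what gives access to the b-tensor shape as an encoding dimension.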
Segmentation of brain tissue types from diffusion MRI (dMRI) is an important task, required for quantification of brain microstructure and for improving tractography. Current dMRI segmentation is mostly based on anatomical MRI (e.g., T1- and T2-weighted) segmentation that is registered to the dMRI space. However, such inter-modality registration is challenging due to more image distortions and lower image resolution in dMRI as compared with anatomical MRI. In this study, we present a deep learning method for diffusion MRI segmentation, which we refer to as DDSeg. Our proposed method learns tissue segmentation from high-quality imaging data from the Human Connectome Project (HCP), where registration of anatomical MRI to dMRI is more precise. The method is then able to predict a tissue segmentation directly from new dMRI data, including data collected with different acquisition protocols, without requiring anatomical data and inter-modality registration. We train a convolutional neural network (CNN) to learn a tissue segmentation model using a novel augmented target loss function designed to improve accuracy in regions of tissue boundary. To further improve accuracy, our method adds diffusion kurtosis imaging (DKI) parameters that characterize non-Gaussian water molecule diffusion to the conventional diffusion tensor imaging parameters. The DKI parameters are calculated from the recently proposed mean-kurtosis-curve method that corrects implausible DKI parameter values and provides additional features that discriminate between tissue types. We demonstrate high tissue segmentation accuracy on HCP data, and also when applying the HCP-trained model on dMRI data from other acquisitions with lower resolution and fewer gradient directions.
PURPOSE: To introduce, develop, and evaluate a novel denoising technique for diffusion MRI that leverages nonlinear redundancy in the data to boost the SNR while preserving signal information. METHODS: We exploit nonlinear redundancy of the dMRI data by means of kernel principal component analysis (KPCA), a nonlinear generalization of PCA to reproducing kernel Hilbert spaces. By mapping the signal to a high-dimensional space, a higher level of redundant information is exploited, thereby enabling better denoising than linear PCA. We implement KPCA with a Gaussian kernel, with parameters automatically selected from knowledge of the noise statistics, and validate it on realistic Monte Carlo simulations as well as with in vivo human brain submillimeter and low-resolution dMRI data. We also demonstrate KPCA denoising on multi-coil dMRI data. RESULTS: SNR improvements of up to 2.7 were obtained in real in vivo datasets denoised with KPCA, in comparison to SNR gains of up to 1.8 using a linear PCA denoising technique called Marchenko-Pastur PCA (MPPCA). Against gold-standard reference datasets created from averaged data, KPCA achieved a lower normalized root mean squared error than MPPCA. Statistical analysis of residuals shows that anatomical information is preserved and only noise is removed. Improvements in the estimation of diffusion model parameters such as fractional anisotropy, mean diffusivity, and fiber orientation distribution functions were also demonstrated. CONCLUSION: Nonlinear redundancy of the dMRI signal can be exploited with KPCA, which allows greater noise reduction and SNR improvement than the MPPCA method, without loss of signal information.
PURPOSE: We aimed to develop a predictive model of disease severity for cirrhosis using MRI-derived radiomic features of the liver and spleen and to compare it with the existing disease severity metrics of MELD score and clinical decompensation. The MELD score is derived solely from blood parameters, and it has not yet been investigated whether image-based features can reflect severity and thereby complement the calculated score. METHODS: This was a retrospective study of eligible patients with cirrhosis ([Formula: see text]) who underwent a contrast-enhanced MR protocol for hepatocellular carcinoma (HCC) screening at a tertiary academic center from 2015 to 2018. Radiomic feature analyses were used to train four prediction models for assessing the patient's condition at time of scan: MELD score, MELD score [Formula: see text] 9 (median score of the cohort), MELD score [Formula: see text] 15 (the inflection between the risk and benefit of transplant), and clinical decompensation. Liver and spleen segmentations were used for feature extraction, followed by cross-validated random forest classification. RESULTS: Radiomic features of the liver and spleen were most predictive of clinical decompensation (AUC 0.84), which the MELD score could predict with an AUC of 0.78. Using liver or spleen features alone gave slightly lower discrimination ability (AUC of 0.82 for liver and AUC of 0.78 for spleen features only), although this difference was not statistically significant in our cohort. When radiomic prediction models were trained to predict continuous MELD scores, there was poor correlation. When stratifying risk by splitting our cohort at the median MELD 9 or at MELD 15, our models achieved AUCs of 0.78 or 0.66, respectively.
CONCLUSIONS: We demonstrated that MRI-based radiomic features of the liver and spleen have the potential to predict the severity of liver cirrhosis, using decompensation or MELD status as imperfect surrogate measures for disease severity.
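The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive case (for example, a decompensated patient) receives a higher model score than a randomly chosen negative case. The Python sketch below computes AUC through that pairwise definition. It does not reproduce the study's radiomics or random forest pipeline, and the labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for four patients (1 = decompensated).
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in cohort size, which is acceptable for small clinical cohorts; a rank-based formulation would be preferable for large datasets.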
Diffusion kurtosis imaging (DKI) is a diffusion MRI approach that enables the measurement of brain microstructural properties, reflecting molecular restrictions and tissue heterogeneity. DKI parameters such as mean kurtosis (MK) provide additional subtle information to that provided by popular diffusion tensor imaging (DTI) parameters, and thus have been considered useful to detect white matter abnormalities, especially in populations that are not expected to show severe brain pathologies. However, DKI parameters often yield artifactual output values that are outside of the biologically plausible range, which diminishes sensitivity to true microstructural changes. Recently we proposed the mean-kurtosis-curve (MK-Curve) method to correct voxels with implausible DKI parameters, and demonstrated its improved performance against other approaches that correct artifacts in DKI. In this work, we aimed to evaluate the utility of the MK-Curve method to improve the identification of white matter abnormalities in group comparisons. To do so, we compared group differences, with and without the MK-Curve correction, between 115 individuals at clinical high risk for psychosis (CHR) and 93 healthy controls (HCs). We also compared the correlation of the corrected and uncorrected DKI parameters with clinical characteristics. Following the MK-Curve correction, the group differences had larger effect sizes and higher statistical significance (i.e., lower p-values), demonstrating increased sensitivity to detect group differences, in particular in MK. Furthermore, the MK-Curve-corrected DKI parameters displayed stronger correlations with clinical variables in CHR individuals, demonstrating the clinical relevance of the corrected parameters.
Overall, following the MK-Curve correction our analyses found widespread lower MK in CHR that overlapped with lower fractional anisotropy (FA), and both measures were significantly correlated with a decline in functioning and with more severe symptoms. These observations further characterize white matter alterations in the CHR stage, demonstrating that MK and FA abnormalities are widespread and mostly overlap. The improvement in group differences and stronger correlation with clinical variables suggest that applying MK-Curve would be beneficial for the detection and characterization of subtle group differences in other experiments as well.
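The abstract reports larger effect sizes after correction without naming the effect size measure. As one common choice, the Python sketch below computes Cohen's d with a pooled standard deviation; treating Cohen's d as the measure is an assumption, and the mean kurtosis values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical mean kurtosis values for controls and a clinical group.
mk_controls = [1.02, 0.98, 1.05, 1.00, 0.97]
mk_patients = [0.95, 0.93, 1.01, 0.96, 0.90]
print(f"Cohen's d = {cohens_d(mk_controls, mk_patients):.2f}")
```

Computing this once on uncorrected and once on corrected parameter maps gives a direct comparison of the sensitivity gain described above.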
Translating deep learning research from theory into clinical practice has unique challenges, specifically in the field of neuroimaging. In this paper, we present DeepNeuro, a Python-based deep learning framework that puts deep neural networks for neuroimaging into practical usage with a minimum of friction during implementation. We show how this framework can be used to design deep learning pipelines that can load and preprocess data, design and train various neural network architectures, and evaluate and visualize the results of trained networks on evaluation data. We present a way of reproducibly packaging data pre- and postprocessing functions common in the neuroimaging community, which facilitates consistent performance of networks across variable users, institutions, and scanners. We show how deep learning pipelines created with DeepNeuro can be concisely packaged into shareable Docker and Singularity containers with user-friendly command-line interfaces.
BACKGROUND: Extracellular free water within cerebral white matter tissue has been shown to increase with age and pathology, yet the cognitive consequences of free water in typical aging prior to the development of neurodegenerative disease remain unclear. Understanding the contribution of free water to cognitive function in older adults may provide important insight into the neural mechanisms of the cognitive aging process. METHODS: A diffusion-weighted MRI measure of extracellular free water as well as a commonly used diffusion MRI metric (fractional anisotropy, FA) along nine bilateral white matter pathways were examined for their relationship with cognitive function assessed by the NIH Toolbox Cognitive Battery in 47 older adults (mean age = 74.4 years, SD = 5.4 years, range = 65-85 years). Probabilistic tractography at the 99th percentile level of probability (Tracts Constrained by Underlying Anatomy; TRACULA) was utilized to produce the pathways on which microstructural characteristics were overlaid and examined for their contribution to cognitive function independent of age, education, and gender. RESULTS: When examining the 99th percentile probability core white matter pathway derived from TRACULA, poorer fluid cognitive ability was related to higher mean free water values across the angular and cingulum bundles of the cingulate gyrus, as well as the corticospinal tract and the superior longitudinal fasciculus. There was no relationship between cognition and mean FA or free water-adjusted FA across the 99th percentile core white matter pathway. Crystallized cognitive ability was not associated with any of the diffusion measures. When examining the relationships between the cognitive domains comprising the NIH Toolbox Fluid Cognition index and these white matter pathways, mean free water demonstrated strong hemispheric and functional specificity for cognitive performance, whereas mean FA was not related to age or cognition across the 99th percentile pathway.
CONCLUSIONS: Extracellular free water within white matter appears to increase with normal aging, and higher values are associated with significantly lower fluid but not crystallized cognitive functions. When using TRACULA to estimate the core of a white matter pathway, a higher degree of free water appears to be highly specific to the pathways associated with memory, working memory, and speeded decision-making performance, whereas no such relationship existed with FA. These data suggest that free water may play an important role in the cognitive aging process, and may serve as a stronger and more specific indicator of early cognitive decline than traditional diffusion MRI measures, such as FA.
PURPOSE: To optimize diffusion-relaxation MRI with tensor-valued diffusion encoding for precise estimation of compartment-specific fractions, diffusivities, and T2 values within a two-compartment model of white matter, and to explore the approach in vivo. METHODS: Sampling protocols featuring different b-values (b), b-tensor shapes (b_Δ), and echo times (TE) were optimized using Cramér-Rao lower bounds (CRLB). Whole-brain data were acquired in children, adults, and elderly with white matter lesions. Compartment fractions, diffusivities, and T2 values were estimated in a model featuring two microstructural compartments represented by a "stick" and a "zeppelin." RESULTS: Precise parameter estimates were enabled by sampling protocols featuring seven or more "shells" with unique b/b_Δ/TE combinations. Acquisition times were approximately 15 minutes. In white matter of adults, the "stick" compartment had a fraction of approximately 0.5 and, compared with the "zeppelin" compartment, featured lower isotropic diffusivities (0.6 vs. 1.3 μm²/ms) but higher T2 values (85 vs. 65 ms). Children featured lower "stick" fractions (0.4). White matter lesions exhibited high "zeppelin" isotropic diffusivities (1.7 μm²/ms) and T2 values (150 ms). CONCLUSIONS: Diffusion-relaxation MRI with tensor-valued diffusion encoding expands the set of microstructure parameters that can be precisely estimated and therefore increases their specificity to biological quantities.
The corticospinal tract (CST) is one of the most well studied tracts in human neuroanatomy. Its clinical significance can be demonstrated in many notable traumatic conditions and diseases such as stroke, spinal cord injury (SCI) or amyotrophic lateral sclerosis (ALS). With the advent of diffusion MRI and tractography the computational representation of the human CST in a 3D model became available. However, the representation of the entire CST and, specifically, the hand motor area has remained elusive. In this paper we propose a novel method, using manually drawn ROIs based on robustly identifiable neuroanatomic structures to delineate the entire CST and isolate its hand motor representation as well as to estimate their variability and generate a database of their volume, length and biophysical parameters. Using 37 healthy human subjects we performed a qualitative and quantitative analysis of the CST and the hand-related motor fiber tracts (HMFTs). Finally, we have created variability heat maps from 37 subjects for both the aforementioned tracts, which could be utilized as a reference for future studies with clinical focus to explore neuropathology in both trauma and disease states.
Objective: In image-guided neurosurgery, co-registered preoperative anatomical, functional, and diffusion tensor imaging can be used to facilitate a safe resection of brain tumors in eloquent areas of the brain. However, the brain deforms during surgery, particularly in the presence of tumor resection. Non-Rigid Registration (NRR) of the preoperative image data can be used to create a registered image that captures the deformation in the intraoperative image while maintaining the quality of the preoperative image. Using clinical data, this paper reports the results of a comparison of the accuracy and performance among several non-rigid registration methods for handling brain deformation. We present a new adaptive method that automatically removes mesh elements in the area of the resected tumor, thereby handling deformation in the presence of resection. To improve the user experience, we also present a new way of using mixed reality with ultrasound, MRI, and CT. Materials and methods: This study focuses on 30 glioma surgeries performed at two different hospitals, many of which involved the resection of significant tumor volumes. An Adaptive Physics-Based Non-Rigid Registration method (A-PBNRR) registers preoperative and intraoperative MRI for each patient. The results are compared with three other readily available registration methods: a rigid registration implemented in 3D Slicer v4.4.0; a B-Spline non-rigid registration implemented in 3D Slicer v4.4.0; and PBNRR implemented in ITK v4.7.0, upon which A-PBNRR was based. Three measures were employed to facilitate a comprehensive evaluation of the registration accuracy: (i) visual assessment, (ii) a Hausdorff Distance-based metric, and (iii) a landmark-based approach using anatomical points identified by a neurosurgeon.
Results: The A-PBNRR using multi-tissue mesh adaptation improved the accuracy of deformable registration by more than five times compared to rigid and traditional physics based non-rigid registration, and four times compared to B-Spline interpolation methods which are part of ITK and 3D Slicer. Performance analysis showed that A-PBNRR could be applied, on average, in <2 min, achieving desirable speed for use in a clinical setting. Conclusions: The A-PBNRR method performed significantly better than other readily available registration methods at modeling deformation in the presence of resection. Both the registration accuracy and performance proved sufficient to be of clinical value in the operating room. A-PBNRR, coupled with the mixed reality system, presents a powerful and affordable solution compared to current neuronavigation systems.
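One of the accuracy measures named above is a Hausdorff Distance-based metric. The abstract does not specify the exact variant used, so the Python sketch below shows only the classical symmetric Hausdorff distance between two point sets as a reference definition. The point coordinates are invented; in practice the sets would be, for example, edge points extracted from the registered preoperative and the intraoperative images.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical 3D point sets in millimetres.
registered = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
intraoperative = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(hausdorff(registered, intraoperative))
```

Because the classical definition is driven by the single worst-matched point, robust variants (such as percentile-based versions) are often preferred when outliers are expected; a smaller value indicates better alignment.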
BACKGROUND: The radiological differential diagnosis between tumor recurrence and radiation-induced necrosis (ie, pseudoprogression) is of paramount importance in the management of glioma patients. OBJECTIVE: This research aims to develop a deep learning methodology for automated differentiation of tumor recurrence from radiation necrosis based on routine magnetic resonance imaging (MRI) scans. METHODS: In this retrospective study, 146 patients who underwent radiation therapy after glioma resection and presented with suspected recurrent lesions at the follow-up MRI examination were selected for analysis. Routine MRI scans were acquired from each patient, including T1, T2, and gadolinium-contrast-enhanced T1 sequences. Of those cases, 96 (65.8%) were confirmed as glioma recurrence on postsurgical pathological examination, while 50 (34.2%) were diagnosed as necrosis. A lightweight deep neural network (DNN) (ie, efficient radionecrosis neural network [ERN-Net]) was proposed to learn radiological features of gliomas and necrosis from MRI scans. Sensitivity, specificity, accuracy, and area under the curve (AUC) were used to evaluate performance of the model in both image-wise and subject-wise classifications. Preoperative diagnostic performance of the model was also compared to that of state-of-the-art DNN models and five experienced neurosurgeons. RESULTS: DNN models based on multimodal MRI outperformed single-modal models. ERN-Net achieved the highest AUC in both image-wise (0.915) and subject-wise (0.958) classification tasks. The evaluated DNN models achieved an average sensitivity of 0.947 (SD 0.033), specificity of 0.817 (SD 0.075), and accuracy of 0.903 (SD 0.026), which were significantly better than those of the tested neurosurgeons (P=.02 in sensitivity and P<.001 in specificity and accuracy). CONCLUSIONS: Deep learning offers a useful computational tool for the differential diagnosis between recurrent gliomas and necrosis.
The proposed ERN-Net model, a simple and effective DNN model, achieved excellent performance on routine MRI scans and showed a high clinical applicability.
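The distinction between image-wise and subject-wise evaluation can be illustrated with a short Python sketch. It assumes, purely for illustration, that a subject-level prediction is obtained by averaging the per-image probabilities of recurrence and thresholding at 0.5; the abstract does not state how ERN-Net aggregates images, and all identifiers and numbers below are hypothetical.

```python
from statistics import mean


def subject_predictions(image_probs, threshold=0.5):
    """Average per-image probabilities for each subject, then threshold."""
    return {sid: int(mean(probs) >= threshold) for sid, probs in image_probs.items()}


def confusion_metrics(truth, predicted):
    """Sensitivity, specificity, and accuracy for binary labels keyed by subject."""
    tp = sum(1 for s in truth if truth[s] == 1 and predicted[s] == 1)
    tn = sum(1 for s in truth if truth[s] == 0 and predicted[s] == 0)
    fp = sum(1 for s in truth if truth[s] == 0 and predicted[s] == 1)
    fn = sum(1 for s in truth if truth[s] == 1 and predicted[s] == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


# Hypothetical per-image recurrence probabilities (1 = recurrence, 0 = necrosis).
image_probs = {"s1": [0.9, 0.7], "s2": [0.2, 0.4], "s3": [0.6, 0.3], "s4": [0.1, 0.2]}
truth = {"s1": 1, "s2": 0, "s3": 1, "s4": 0}

metrics = confusion_metrics(truth, subject_predictions(image_probs))
print(metrics)
```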
We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This is a significant challenge in learning such models for image classification. To take advantage of the rich information present in the radiology reports, we develop a neural network model that is trained on both images and free-text to assess pulmonary edema severity from chest radiographs at inference time. Our experimental results suggest that the joint image-text representation learning improves the performance of pulmonary edema assessment compared to a supervised model trained on images only. We also show the use of the text for explaining the image classification by the joint model. To the best of our knowledge, our approach is the first to leverage free-text radiology reports for improving the image model performance in this application. Our code is available at: https://github.com/RayRuizhiLiao/joint_chestxray.
Using medical images to evaluate disease severity and change over time is a routine and important task in clinical decision making. Grading systems are often used, but are unreliable as domain experts disagree on disease severity category thresholds. These discrete categories also do not reflect the underlying continuous spectrum of disease severity. To address these issues, we developed a convolutional Siamese neural network approach to evaluate disease severity at single time points and change between longitudinal patient visits on a continuous spectrum. We demonstrate this in two medical imaging domains: retinopathy of prematurity (ROP) in retinal photographs and osteoarthritis in knee radiographs. Our patient cohorts consist of 4861 images from 870 patients in the Imaging and Informatics in Retinopathy of Prematurity (i-ROP) cohort study and 10,012 images from 3021 patients in the Multicenter Osteoarthritis Study (MOST), both of which feature longitudinal imaging data. Multiple expert clinician raters ranked 100 retinal images and 100 knee radiographs from excluded test sets for severity of ROP and osteoarthritis, respectively. The Siamese neural network output for each image in comparison to a pool of normal reference images correlates with disease severity rank (ρ = 0.87 for ROP and ρ = 0.89 for osteoarthritis), both within and between the clinical grading categories. Thus, this output can represent the continuous spectrum of disease severity at any single time point. The difference in these outputs can be used to show change over time. Alternatively, paired images from the same patient at two time points can be directly compared using the Siamese neural network, resulting in an additional continuous measure of change between images. Importantly, our approach does not require manual localization of the pathology of interest and requires only a binary label for training (same versus different). 
The location of disease and site of change detected by the algorithm can be visualized using an occlusion sensitivity map-based approach. For a longitudinal binary change detection task, our Siamese neural networks achieve test set receiver operating characteristic areas under the curve (AUCs) of up to 0.90 in evaluating ROP or knee osteoarthritis change, depending on the change detection strategy. The overall performance on this binary task is similar to that of a conventional convolutional deep neural network trained for multi-class classification. Our results demonstrate that convolutional Siamese neural networks can be a powerful tool for evaluating the continuous spectrum of disease severity and change in medical imaging.
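The idea of scoring an image against a pool of normal reference images can be sketched in Python without a trained network. Here each image is represented by a hand-written embedding vector standing in for the output of one Siamese branch, and severity is taken as the median Euclidean distance to the reference pool. The use of the median, the embeddings, and the expert ranks are all assumptions made for illustration; a rank correlation (Spearman's ρ, no ties assumed) then compares the scores with the expert ranking.

```python
from math import dist
from statistics import median


def severity_score(embedding, reference_pool):
    """Median Euclidean distance from one embedding to a pool of normal references."""
    return median(dist(embedding, ref) for ref in reference_pool)


def spearman_rho(a, b):
    """Spearman rank correlation for two equal-length sequences without ties."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        result = [0] * len(values)
        for rank, idx in enumerate(order, start=1):
            result[idx] = rank
        return result

    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical 2D embeddings; a real network would produce higher-dimensional ones.
normal_pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
test_embeddings = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
expert_ranks = [1, 2, 3]  # 1 = least severe

scores = [severity_score(e, normal_pool) for e in test_embeddings]
print(scores, spearman_rho(scores, expert_ranks))

# Change between two visits of the same patient as a difference in scores.
change = severity_score((1.0, 0.0), normal_pool) - severity_score((0.0, 0.0), normal_pool)
```

Taking the difference of two single-time-point scores is one of the two change measures described above; the other, feeding both images of a pair directly to the network, requires the trained model and is not shown.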