Abstract
Neuroimaging studies typically adopt a common feature space for all data, which may obscure aspects of neuroanatomy only observable in subsets of a population, e.g. cortical folding patterns unique to individuals or shared by close relatives. Here, we propose to model individual variability using a distinctive keypoint signature: a set of unique, localized patterns, detected automatically in each image by a generic saliency operator. The similarity of an image pair is then quantified by the proportion of keypoints they share using a novel Jaccard-like measure of set overlap. Experiments demonstrate the keypoint method to be highly efficient and accurate, using a set of 7536 T1-weighted MRIs pooled from four public neuroimaging repositories, including twins, non-twin siblings, and 3334 unique subjects. All same-subject image pairs are identified by a similarity threshold despite confounds including aging and neurodegenerative disease progression. Outliers reveal previously unknown data labeling inconsistencies, demonstrating the usefulness of the keypoint signature as a computational tool for curating large neuroimage datasets.