In Bioinformatics (Oxford, England)
MOTIVATION : Data-independent acquisition mass spectrometry allows for more comprehensive peptide detection and relative quantification than standard data-dependent approaches. Although it is less prone to missing values, these still occur. Current approaches for handling so-called missingness have limitations. We hypothesized that non-random missingness is a useful biological measure, and we demonstrate the importance of analysing missingness for proteomic discovery within a longitudinal study of disease activity.
RESULTS : The magnitude of missingness did not correlate with mean peptide concentration. The magnitude of missingness for each protein strongly correlated between collection time points (baseline, 3 months, 6 months; R = 0.95-0.97, CI = 0.94, 0.97) indicating little time-dependent effect. This allowed for the identification of proteins with outlier levels of missingness that differentiate between patient groups characterized by different patterns of disease activity. The association of these proteins with disease activity was confirmed by machine learning techniques.
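The kind of calculation described in the results can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes intensities are held in a NumPy array with proteins as rows and samples as columns, with missing values encoded as NaN. The function names, the toy data, and the `threshold` cut-off for flagging outlier proteins are all hypothetical choices made for illustration.

```python
import numpy as np


def missingness_per_protein(intensities):
    """Fraction of samples with a missing (NaN) value for each protein.

    intensities: 2-D array, rows = proteins, columns = samples (assumed layout).
    """
    return np.isnan(intensities).mean(axis=1)


def pearson_r(x, y):
    """Pearson correlation between two per-protein missingness vectors."""
    return float(np.corrcoef(x, y)[0, 1])


def outlier_proteins(missing_a, missing_b, threshold=0.3):
    """Row indices of proteins whose missingness differs between two
    patient groups by more than `threshold` (a hypothetical cut-off)."""
    return np.flatnonzero(np.abs(missing_a - missing_b) > threshold)


# Toy data: 3 proteins x 4 samples at two collection time points.
nan = np.nan
baseline = np.array([[1.0, nan, 2.0, 3.0],
                     [nan, nan, nan, 1.0],
                     [1.0, 2.0, 3.0, 4.0]])
month_3 = np.array([[nan, 1.0, 2.0, 3.0],
                    [nan, nan, 1.0, nan],
                    [1.0, 2.0, 3.0, 4.0]])

# Correlation of per-protein missingness between the two time points.
r_between_timepoints = pearson_r(missingness_per_protein(baseline),
                                 missingness_per_protein(month_3))
```

A high correlation between time points suggests that missingness is a stable property of each protein rather than a time-dependent effect, which is what makes group-level differences in missingness interpretable.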
CONCLUSION : Our novel approach complements analyses on complete observations and other missing value strategies in biomarker prediction of disease activity.
SUPPLEMENTARY INFORMATION : Supplementary figures and tables are available at Bioinformatics online.
McGurk Kathryn A, Dagliati Arianna, Chiasserini Davide, Lee Dave, Plant Darren, Baricevic-Jones Ivona, Kelsall Janet, Eineman Rachael, Reed Rachel, Geary Bethany, Unwin Richard D, Nicolaou Anna, Keavney Bernard D, Barton Anne, Whetton Anthony D, Geifman Nophar