Oral Presentation 31st Annual Lorne Proteomics Symposium 2026

A proteomics workflow for rare disease diagnosis: Recommendations for the processing and analysis of proteomics data from paediatric PBMCs for use in diagnosing rare genetic disease (133428)

Julia R Broadbent (1,2), Nikeisha J Caruana (2), Tanavi Sharma (2), Daniella H Hock (1,2,3), Liana N Semcesen (2), Michael P Menden (2,4), David A Stroud (1,2,3)
  1. Murdoch Children’s Research Institute, Melbourne, VIC, Australia
  2. Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, Australia
  3. Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Melbourne, VIC, Australia
  4. Computational Health Center, Helmholtz Munich, Neuherberg, Germany

Background: Rare genetic diseases affect around 1 in 10 Australians; however, current clinical pathways result in a diagnosis for only approximately 50% of patients. Many are left with unresolved variants of uncertain significance (VUSs): genomic changes that could be reclassified as pathogenic, and so yield a diagnosis, given supporting functional evidence. Mass spectrometry-based quantitative proteomics is a promising answer to the growing need for high-throughput functional assays that can resolve VUSs, particularly missense variants, and contribute to rare disease diagnoses. As a gene- and disease-agnostic test, proteomics can provide functional evidence for diagnostic hypotheses by comparing patient protein expression levels against controls. However, the utility and adoption of rare disease proteomics are currently limited by a lack of standardisation and by practical constraints on control group sizes.

Aim: We sought to create an automated, customisable and reproducible workflow for analysing mass spectrometry-based quantitative proteomics data in rare disease diagnosis. The workflow uses peripheral blood mononuclear cell (PBMC) samples, a clinically accessible, easily isolated, and informative tissue in which more than 50% of all known monogenic disease genes are expressed.

Methods/Results: We first searched the literature to identify the optimal tools for each element of the proteomics workflow: peptide identification and protein inference, protein quantification and handling of missing data, contaminant removal, normalisation, batch effect correction, and differential expression analysis. Leveraging a novel dataset of paediatric PBMC samples from 394 control individuals, we compared different methods on a high-quality subset of the data (n = 42 samples, 84 replicates). We demonstrate the advantages of a DIA-NN–limpa pipeline, which harnesses DIA-NN’s command line interoperability and the recently published limpa R package’s approach to handling missing data. The pipeline produces complete protein matrices without imputing values or discarding sparsely detected proteins, yielding more informative data and more robust results in downstream analyses. When benchmarked against our group’s previous methods, this workflow produced more statistically significant results without any loss of accuracy, improving the workflow’s potential to inform diagnostic investigations in challenging patient cases.
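To make concrete the idea of comparing a patient's protein levels against a control cohort, the following is a minimal Python sketch, not the DIA-NN–limpa pipeline described above. It assumes log2 protein intensities are already available and computes a simple per-protein z-score for one patient relative to controls. The function name, protein labels, intensity values and the minimum of three controls are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def patient_z_scores(controls, patient, min_controls=3):
    """Score each patient protein against the control distribution.

    controls: dict mapping protein ID -> list of log2 intensities (one per control)
    patient:  dict mapping protein ID -> log2 intensity for the patient
    Returns a dict mapping protein ID -> z-score. Proteins with too few
    control observations, or with zero control variance, are skipped.
    """
    scores = {}
    for protein, value in patient.items():
        ctrl = controls.get(protein, [])
        if len(ctrl) < min_controls:
            continue  # too few controls for a meaningful spread estimate
        sd = stdev(ctrl)
        if sd == 0:
            continue  # avoid division by zero
        scores[protein] = (value - mean(ctrl)) / sd
    return scores


# Hypothetical example values (log2 intensities), for illustration only.
controls = {
    "PROT_A": [20.0, 20.5, 19.5, 20.2, 19.8],
    "PROT_B": [15.0, 15.1, 14.9, 15.2, 14.8],
}
patient = {"PROT_A": 20.1, "PROT_B": 12.0}

scores = patient_z_scores(controls, patient)
for protein, z in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{protein}: z = {z:.2f}")
```

In this toy example the strongly negative score for PROT_B would flag a marked reduction relative to controls. A plain z-score of this kind ignores missing values, batch effects and variance moderation, which are the aspects the workflow above is designed to address.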

Conclusion: We present a standardised workflow for processing mass spectrometry-based label-free DIA proteomics data from paediatric PBMC samples and for analysing these data in a rare disease diagnostic context.

  1. Australian Government. (2022). What we’re doing about rare diseases. https://www.health.gov.au/topics/chronic-conditions/what-were-doing-about-chronic-conditions/what-were-doing-about-rare-diseases
  2. Demichev, V., Messner, C. B., Vernardis, S. I., Lilley, K. S., & Ralser, M. (2020). DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nature Methods, 17(1), 41-44. https://doi.org/10.1038/s41592-019-0638-x
  3. Hock, D. H., Caruana, N. J., Semcesen, L. N., Lake, N. J., Formosa, L. E., Amarasekera, S. S. C., Stait, T., Tregoning, S., Frajman, L. E., Bournazos, A. M., Robinson, D. R. L., Ball, M., Reljic, B., Ryder, B., Wallis, M. J., Vasudevan, A., Beck, C., Peters, H., Lee, J.,…Stroud, D. A. (2025). Untargeted proteomics enables ultra-rapid variant prioritisation in mitochondrial and other rare diseases. Genome Medicine, 17(1), 58. https://doi.org/10.1186/s13073-025-01467-z
  4. Hoq, M., Karlaftis, V., Mathews, S., Burgess, J., Donath, S. M., Carlin, J., Monagle, P., & Ignjatovic, V. (2019). A prospective, cross-sectional study to establish age-specific reference intervals for neonates and children in the setting of clinical biochemistry, immunology and haematology: the HAPPI Kids study protocol. BMJ Open, 9(4), e025897. https://doi.org/10.1136/bmjopen-2018-025897
  5. Li, M., Cobbold, S. A., & Smyth, G. K. (2025). Quantification and differential analysis of mass spectrometry proteomics data with probabilistic recovery of information from missing values. bioRxiv, 2025.04.28.651125. https://doi.org/10.1101/2025.04.28.651125
  6. Li, M., & Smyth, G. K. (2023). Neither random nor censored: estimating intensity-dependent probabilities for missing values in label-free proteomics. Bioinformatics, 39(5). https://doi.org/10.1093/bioinformatics/btad200
  7. Lunke, S., Bouffler, S. E., Patel, C. V., Sandaradura, S. A., Wilson, M., Pinner, J., Hunter, M. F., Barnett, C. P., Wallis, M., Kamien, B., Tan, T. Y., Freckmann, M. L., Chong, B., Phelan, D., Francis, D., Kassahn, K. S., Ha, T., Gao, S., Arts, P.,…Stark, Z. (2023). Integrated multi-omics for rapid rare disease diagnosis on a national scale. Nature Medicine, 29(7), 1681-1691. https://doi.org/10.1038/s41591-023-02401-9
  8. Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. H., Baple, E., Tucci, A., Brittain, H., de Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D.,…McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nature Genetics, 51(11), 1560-1565. https://doi.org/10.1038/s41588-019-0528-2