Background: Resistance to targeted molecular therapies, whether primary or acquired, remains a major obstacle to effective cancer treatment. The molecular determinants of treatment resistance are still incompletely understood, and robust resistance drivers must be identified to improve therapeutic outcomes. ProCan and the Wellcome Sanger Institute recently released the world’s largest pan-cancer proteomic dataset to date, comprising 949 cancer cell lines treated with 625 anti-cancer agents. Although single-protein markers have been identified, scalable methods for detecting synergistic protein interactions that drive drug susceptibility remain limited. Machine learning models such as random forests may offer a robust framework for capturing non-linear protein interactions across diverse cancer types.
Results: We present synerOmics, a scalable framework that identifies putative synergistic protein interactions by using parent–child co-occurrences in random forest regression trees to reduce the interaction search space. We validated the approach on simulated data and two independent cancer proteomic datasets, identifying both pan-cancer and breast cancer-specific markers associated with drug susceptibility. Synergistic interactions that replicated consistently across datasets were enriched for endoplasmic reticulum stress pathways. Notably, shared targets included established sensitivity markers for tyrosine kinase inhibitors (TKIs), and the analysis revealed a resistance-associated network centred on ATL3, which showed both prognostic and predictive relevance.
Conclusions: Applying synerOmics to the largest cancer cell line proteomic dataset to date, we identified ATL3 as a strong candidate biomarker for lapatinib resistance in breast cancer, with better predictive performance than ERBB2, the current gold-standard sensitivity biomarker.
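
The parent–child co-occurrence idea mentioned in the Results can be illustrated with a minimal sketch. This is not the synerOmics implementation, which the abstract does not describe; it only assumes that each tree is given as three arrays in the layout scikit-learn exposes through `tree_.children_left`, `tree_.children_right` and `tree_.feature` (child index -1 for "no child", negative feature index at leaves). The function names, the `min_count` threshold and the two toy trees are hypothetical.

```python
from collections import Counter

LEAF = -1  # scikit-learn marks a missing child with -1


def parent_child_pairs(children_left, children_right, feature):
    """Count unordered (parent feature, child feature) pairs in one tree."""
    pairs = Counter()
    for node, parent_feat in enumerate(feature):
        for child in (children_left[node], children_right[node]):
            if child == LEAF:
                continue  # current node is a leaf, nothing below it
            child_feat = feature[child]
            if child_feat < 0 or child_feat == parent_feat:
                continue  # child is a leaf, or the same feature splits twice
            pairs[tuple(sorted((parent_feat, child_feat)))] += 1
    return pairs


def candidate_interactions(trees, min_count=2):
    """Keep feature pairs that co-occur as parent and child at least min_count times."""
    total = Counter()
    for children_left, children_right, feature in trees:
        total.update(parent_child_pairs(children_left, children_right, feature))
    return {pair: n for pair, n in total.items() if n >= min_count}


# Toy tree A: the root splits on feature 0, its left child splits on feature 1.
tree_a = ([1, 3, -1, -1, -1], [2, 4, -1, -1, -1], [0, 1, -2, -2, -2])
# Toy tree B: the root splits on feature 1, its children split on features 0 and 2.
tree_b = (
    [1, 3, 5, -1, -1, -1, -1],
    [2, 4, 6, -1, -1, -1, -1],
    [1, 0, 2, -2, -2, -2, -2],
)

candidates = candidate_interactions([tree_a, tree_b])
print(candidates)
```

With a fitted scikit-learn `RandomForestRegressor`, the same three arrays could be read from each estimator in `estimators_`. Only the surviving pairs would then be tested for synergy, rather than all p(p-1)/2 protein pairs.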