Simple and Scalable Algorithms for Cluster-Aware Precision Medicine

Buch, Amanda M.; Liston, Conor; Grosenick, Logan

arXiv:2211.16553 (cs)

[Submitted on 29 Nov 2022 (v1), last revised 17 May 2023 (this version, v3)]

Abstract: AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered structure of biomedical data interact to pose a joint challenge in the high-dimensional, limited-observation regime of precision medicine. To overcome both issues simultaneously, we propose a simple and scalable approach to joint clustering and embedding that combines standard embedding methods with a convex clustering penalty in a modular way. This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods, which we show with straightforward implementations of hierarchically clustered principal component analysis (PCA), locally linear embedding (LLE), and canonical correlation analysis (CCA). Through both numerical experiments and real-world examples, we demonstrate that our approach outperforms traditional and contemporary clustering methods on highly underdetermined problems (e.g., with just tens of observations) as well as on large-sample datasets. Importantly, our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings. Thus, our approach improves significantly on existing methods for identifying patient subgroups in multiomics and neuroimaging data, enabling scalable and interpretable biomarkers for precision medicine.
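To make the phrase "convex clustering penalty" concrete, here is a minimal sketch of the textbook convex clustering objective, evaluated on a small embedding that has already been computed. It is not the authors' implementation: the abstract does not give their formulation, so the uniform pair weights, the function names (convex_clustering_objective, centroid), and the toy numbers are all assumptions made for illustration.

```python
import math
from itertools import combinations


def convex_clustering_objective(Z, U, gamma):
    """Return 0.5 * sum_i ||z_i - u_i||^2 + gamma * sum_{i<j} ||u_i - u_j||.

    Z: embedded observations (e.g., PCA scores), one sequence of floats per row.
    U: candidate centroids, one per observation, same shape as Z.
    gamma: nonnegative penalty strength. Uniform pair weights are an
           assumption of this sketch, not something stated in the abstract.
    """
    # Data-fit term: how far each centroid sits from its observation.
    fit = 0.5 * sum(math.dist(z, u) ** 2 for z, u in zip(Z, U))
    # Fusion term: pairwise centroid distances, each pair counted once.
    fusion = sum(
        math.dist(U[i], U[j]) for i, j in combinations(range(len(U)), 2)
    )
    return fit + gamma * fusion


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


# Hypothetical 2-D embedding with two visible groups (made-up numbers).
Z = [(0.0, 0.1), (0.2, -0.1), (5.0, 5.1), (5.2, 4.9)]

candidates = {
    "unfused": list(Z),                                        # every point alone
    "two_clusters": [centroid(Z[:2])] * 2 + [centroid(Z[2:])] * 2,
    "one_cluster": [centroid(Z)] * len(Z),                     # everything fused
}

for gamma in (0.0, 0.01, 10.0):
    scores = {
        name: round(convex_clustering_objective(Z, U, gamma), 3)
        for name, U in candidates.items()
    }
    print(f"gamma={gamma}: {scores}")
```

With gamma at zero the unfused configuration has zero cost, while a large gamma favors fully fused centroids. In the standard convex clustering setting, centroids fuse progressively as gamma increases; whether the paper derives its dendrograms this way is not stated on this page.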

Comments:	15 pages, 3 figures, 3 tables
Subjects:	Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)
Cite as:	arXiv:2211.16553 [cs.LG]
	(or arXiv:2211.16553v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.16553

Submission history

From: Amanda M. Buch Ph.D.
[v1] Tue, 29 Nov 2022 19:27:26 UTC (2,147 KB)
[v2] Tue, 31 Jan 2023 02:21:28 UTC (1,722 KB)
[v3] Wed, 17 May 2023 22:49:42 UTC (862 KB)
