Modelling Cellular Perturbations with the Sparse Additive Mechanism Shift Variational Autoencoder

Bereket, Michael; Karaletsos, Theofanis

arXiv:2311.02794 (stat)

[Submitted on 5 Nov 2023 (v1), last revised 16 Jan 2024 (this version, v2)]

Abstract: Generative models of observations under interventions have been a vibrant topic of interest across machine learning and the sciences in recent years. For example, in drug discovery, there is a need to model the effects of diverse interventions on cells in order to characterize unknown biological mechanisms of action. We propose the Sparse Additive Mechanism Shift Variational Autoencoder, SAMS-VAE, to combine compositionality, disentanglement, and interpretability for perturbation models. SAMS-VAE models the latent state of a perturbed sample as the sum of a local latent variable capturing sample-specific variation and sparse global variables of latent intervention effects. Crucially, SAMS-VAE sparsifies these global latent variables for individual perturbations to identify disentangled, perturbation-specific latent subspaces that are flexibly composable. We evaluate SAMS-VAE both quantitatively and qualitatively on a range of tasks using two popular single-cell sequencing datasets. In order to measure perturbation-specific model properties, we also introduce a framework for evaluation of perturbation models based on average treatment effects with links to posterior predictive checks. SAMS-VAE outperforms comparable models in terms of generalization across in-distribution and out-of-distribution tasks, including a combinatorial reasoning task under resource paucity, and yields interpretable latent structures which correlate strongly with known biological mechanisms. Our results suggest SAMS-VAE is an interesting addition to the modeling toolkit for machine learning-driven scientific discovery.
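
The additive latent structure described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name `compose_latent`, the array shapes, and the choice to obtain sparsity by multiplying a binary mask elementwise with a dense effect vector are all assumptions made for illustration. The abstract states only that the latent state is the sum of a sample-specific local variable and sparse global perturbation effects.

```python
import numpy as np

def compose_latent(z_basal, dosages, masks, embeddings):
    """Basal latent state plus the sum of sparse perturbation shifts.

    All argument names and shapes are hypothetical, chosen for illustration:
    z_basal:    (n_samples, n_latent) sample-specific (local) latent variables
    dosages:    (n_samples, n_perts)  0/1 indicators of applied perturbations
    masks:      (n_perts, n_latent)   0/1 sparsity masks (global)
    embeddings: (n_perts, n_latent)   dense perturbation effects (global)
    """
    # Masking zeroes out the latent dimensions a perturbation does not affect.
    sparse_effects = masks * embeddings
    # Each sample adds the sparse effects of the perturbations it received.
    return z_basal + dosages @ sparse_effects

rng = np.random.default_rng(0)
n_samples, n_perts, n_latent = 4, 3, 5

z_basal = rng.normal(size=(n_samples, n_latent))
masks = (rng.random((n_perts, n_latent)) < 0.3).astype(float)
embeddings = rng.normal(size=(n_perts, n_latent))

# Rows: control, perturbation 0 only, perturbation 1 only, both 0 and 1.
dosages = np.array(
    [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0],
     [1, 1, 0]],
    dtype=float,
)

z = compose_latent(z_basal, dosages, masks, embeddings)
shifts = z - z_basal  # latent shift attributable to perturbations
```

Because the shifts add, the control sample keeps its basal state, and the shift of the doubly perturbed sample equals the sum of the two single-perturbation shifts. This is the sense in which such latent effects are composable.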

Comments:	Presented at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (Post-NeurIPS fixes: cosmetic fixes, updated references, added simulation to appendix)
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Cite as:	arXiv:2311.02794 [stat.ML]
	(or arXiv:2311.02794v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2311.02794

Submission history

From: Theofanis Karaletsos
[v1] Sun, 5 Nov 2023 23:37:31 UTC (15,750 KB)
[v2] Tue, 16 Jan 2024 01:18:50 UTC (15,792 KB)
