Code Repository for "A Pooled Cell Painting CRISPR Screening Platform Enables de novo Inference of Gene Function by Self-supervised Deep Learning"
Create the environment, install dependencies, and activate the environment
curl -fsSL https://pixi.sh/install.sh | bash
pixi install
# activating the environment for running shell scripts
pixi shell
# for running jupyter notebooks
pixi run jupyter lab

The datasets are publicly available in Amazon S3 at the following URLs:
- 124 gene CRISPR KO Morphology dataset
- 300 gene CRISPR KO Mechanism of Action (MoA) dataset
- 1640 gene CRISPR KO Druggable Genome dataset
A tutorial for using the POSH barcode sequencing data to assign barcodes to CellPainting images can be found here
Scripts for training the CP-DINO model can be found here
Scripts for inference using the CP-DINO model can be found here
Notebooks for analysis and evaluation of CP-DINO embeddings are available for the following datasets:
- Morphology 124 Dataset
- MoA 300 Dataset
- Druggable Genome 1640 Dataset
Commands for training the CP-DINO model on the CRISPR KO datasets (the number in each script name is the dataset's gene count)
Note: you may need to make the .sh scripts executable first, e.g. chmod 700 ./scripts/training/cpdino*
./scripts/training/cpdino_300_fp32_100ep.sh
./scripts/training/cpdino_1640_fp32_100ep.sh
./scripts/training/cpdino_1640_with_pS6_fp32_100ep.sh

Pretrained model weights for CP-DINO can be found here.
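If several of the training scripts listed above need to run back to back, a small wrapper can handle the executable bit and stop at the first failure. The following is a minimal sketch, not part of the repository: `run_scripts` is a hypothetical helper name, and it assumes the scripts are launched from the repository root inside `pixi shell`.

```shell
#!/usr/bin/env bash
# Sketch: run the given scripts in order, stopping at the first failure.
# run_scripts is a hypothetical helper, not something the repository provides.
run_scripts() {
    local script
    for script in "$@"; do
        if [ ! -f "$script" ]; then
            echo "missing: $script" >&2
            return 1
        fi
        chmod u+x "$script"        # same intent as the chmod note above
        echo "running: $script"
        "$script" || return 1      # propagate the first failure
    done
}

# Example (from the repository root, inside `pixi shell`):
# run_scripts ./scripts/training/cpdino_300_fp32_100ep.sh \
#             ./scripts/training/cpdino_1640_fp32_100ep.sh
```

Stopping on the first failure avoids starting a later long training run when an earlier one has already failed.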
Commands for inference using the CP-DINO model on the 124, 300 and 1640 gene CRISPR KO datasets
./scripts/inference/cpposh_124_inference.sh
./scripts/inference/cpposh_300_inference.sh
./scripts/inference/cpposh_1640_inference.sh
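The inference scripts above follow the naming pattern `cpposh_<gene count>_inference.sh`, so the right script can be selected from the dataset size. The following is a minimal sketch under that assumption: `inference_script_for` is a hypothetical helper, not part of the repository, and it only accepts the three sizes listed here.

```shell
#!/usr/bin/env bash
# Sketch: map a dataset's gene count to its inference script path.
# inference_script_for is a hypothetical helper, not something the repository provides.
inference_script_for() {
    case "$1" in
        124|300|1640)
            echo "./scripts/inference/cpposh_${1}_inference.sh" ;;
        *)
            echo "unknown dataset size: $1 (expected 124, 300 or 1640)" >&2
            return 1 ;;
    esac
}

# Example (from the repository root, inside `pixi shell`):
# "$(inference_script_for 300)"
```

Rejecting unknown sizes up front gives a clearer error than a "file not found" from the shell.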