# BioNeMo Framework Tutorials

The best way to get started with the BioNeMo Framework is with the tutorials. Below are example walkthroughs containing code snippets that you can run from within the container.

Tutorials are presented as Jupyter notebooks (`.ipynb`), which may contain code snippets in Python, Bash, YAML, and other formats. Follow the instructions in these files, make the appropriate code changes, and execute the cells in the container.

It is convenient to first launch the BioNeMo Framework container and then copy the tutorial files into it, either by drag-and-drop in the JupyterLab interface or by mounting the files when launching the container (`docker run -v ...`).
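As a minimal sketch, the mount-at-launch option might look like the following. The image tag and the local `tutorials/` directory are assumptions for illustration; substitute the BioNeMo Framework image and path you actually use:

```shell
# Launch the BioNeMo Framework container with a local tutorials/ directory
# bind-mounted into the container's workspace (paths and tag are examples):
docker run --rm -it --gpus all \
    -p 8888:8888 \
    -v "$(pwd)/tutorials":/workspace/bionemo/tutorials \
    nvcr.io/nvidia/clara/bionemo-framework:latest
```

With the directory mounted this way, edits made on the host are immediately visible inside the running container, so you can iterate on a notebook without re-copying files.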

| Topic | Title |
| --- | --- |
| Model Pre-Training | Launching a MegaMolBART model pre-training with the ZINC-15 dataset |
| Custom Datasets | Setting up the ZINC-15 dataset used for training MolMIM |
| Model Pre-Training | Launching a MolMIM model pre-training with the ZINC-15 dataset, both from scratch and starting from an existing checkpoint |
| Model Pre-Training | ESM-1nv: Data preprocessing and model pre-training using BioNeMo with curated data from UniRef50 and UniRef90 |
| Model Pre-Training | ESM-2nv: Data Preprocessing and Model Training |
| Model Pre-Training | Pretraining a Geneformer model for representing single-cell RNA-seq data |
| Geneformer Benchmarking | Benchmarking pre-trained Geneformer models against a baseline with cell type classification |
| Model Training | Launching an EquiDock model pre-training with the DIPS or DB5 datasets |
| Inference | Performing Inference with MegaMolBART for Generative Chemistry and Predictive Modeling with RAPIDS |
| Inference | Zero-Shot Protein Design Using ESM-2nv |
| Inference | Performing Inference with ESM-2nv and Predictive Modeling with RAPIDS |
| Inference | MolMIM Inferencing for Generative Chemistry and Downstream Prediction |
| Inference | Performing Property-guided Molecular Optimization with MolMIM, which internally involves inference |
| Inference | Performing inference and cell clustering on CELLxGENE data with a pretrained Geneformer model |
| Inference | Performing inference on OAS sequences with ESM-2nv |
| Model Finetuning | Overview of Finetuning pre-trained models in BioNeMo |
| Model Finetuning | Fine-Tune ESM-2nv on FLIP Data for Sequence-Level Classification, Regression, Token-Level Classification, and with LoRA Adapters |
| Model Pre-Training and Finetuning | Pretrain from Scratch, Continue Training from an Existing Checkpoint, and Fine-tune ESM-2nv on Custom Data |
| Encoder Finetuning | Encoder Fine-tuning in BioNeMo: MegaMolBART |
| Downstream Tasks | Training a Retrosynthesis Model using the USPTO50 Dataset |
| Downstream Tasks | Fine-tuning MegaMolBART for Solubility Prediction |
| Custom Datasets | Adding the OAS Dataset: Downloading and Preprocessing |
| Custom Datasets | Adding the OAS Dataset: Modifying the Dataset Class |
| Custom DataLoaders | Creating a Custom Dataloader |
| Inference | Creating and Visualizing Embeddings with DNABERT |
| Model Finetuning | Pretrain, Fine-tune, and Perform Inference with DNABERT for Splice Site Prediction |