Release Notes
0.4.0
New Models
ESM-1nv for protein sequence representations, pre-trained weights available
ProtT5nv for protein sequence representation and sequence-to-sequence tasks, pre-trained weights available
New Features
Pre-training for all models, including automated data processing and full configuration for training
Fine-tuning of MegaMolBART, ESM-1nv, and ProtT5nv with encoder frozen or trainable
Downstream task example applications: secondary structure prediction for ESM-1nv and ProtT5nv; physicochemical property (physchem) prediction (lipophilicity, FreeSolv, ESOL) and retrosynthesis prediction for MegaMolBART
In-loop validation that evaluates downstream-task performance during training: physchem prediction (MegaMolBART) and secondary structure prediction (ESM-1nv and ProtT5nv)
Pipeline parallelism is supported as a beta feature and has not been fully tested
Example notebooks for pre-training, fine-tuning, and downstream tasks
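The in-loop validation feature listed above can be pictured as a training loop that pauses at a fixed interval to score the model on a downstream task. The following is a minimal sketch of that pattern only. The function `train_with_validation`, its parameters, and the toy metric are hypothetical names chosen for illustration and are not part of the BioNeMo API.

```python
from typing import Callable, List, Tuple


def train_with_validation(
    num_steps: int,
    val_every: int,
    train_step: Callable[[int], None],
    validate: Callable[[], float],
) -> List[Tuple[int, float]]:
    """Run training steps and record a downstream metric every `val_every` steps.

    All names here are hypothetical; this only illustrates the control flow.
    """
    history: List[Tuple[int, float]] = []
    for step in range(1, num_steps + 1):
        train_step(step)
        if step % val_every == 0:
            # Evaluate on the downstream task without leaving the training loop.
            history.append((step, validate()))
    return history


# Toy stand-ins: "training" increments a counter, "validation" reports it.
state = {"updates": 0}


def toy_train_step(step: int) -> None:
    state["updates"] += 1


def toy_validate() -> float:
    return float(state["updates"])


history = train_with_validation(6, 2, toy_train_step, toy_validate)
print(history)  # [(2, 2.0), (4, 4.0), (6, 6.0)]
```

In a real setup the validation callable would compute a task metric such as secondary structure accuracy. The toy version returns the update count so the interval logic is easy to check.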
New APIs
BioNeMoDataModule - Encapsulates dataset instantiation in BioNeMo models so that many different datasets can be used with the same model
EncoderFineTuning - Base class to facilitate implementation of downstream tasks built on embeddings of other models
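The two design ideas behind these APIs are that a data module owns dataset construction, and that a base class pairs a pre-trained encoder with a task-specific head. They can be sketched in plain Python as below. This is an illustration of the pattern under stated assumptions, not the BioNeMo implementation. Every class and method name (`SketchDataModule`, `SketchEncoderFineTuning`, `build_head`, `trainable_parameter_groups`) is hypothetical, and the "parameters" are plain lists of floats rather than tensors.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Tuple

Example = Tuple[str, int]  # (sequence, label)


class SketchDataModule(ABC):
    """Hypothetical stand-in: owns dataset construction for a model."""

    @abstractmethod
    def train_dataset(self) -> Sequence[Example]:
        ...

    @abstractmethod
    def val_dataset(self) -> Sequence[Example]:
        ...


class ToySequenceData(SketchDataModule):
    """One concrete dataset; another subclass could read files instead."""

    def train_dataset(self) -> Sequence[Example]:
        return [("MKV", 0), ("GGA", 1)]

    def val_dataset(self) -> Sequence[Example]:
        return [("MKT", 0)]


class SketchEncoderFineTuning(ABC):
    """Hypothetical base class: pre-trained encoder plus a task head."""

    def __init__(self, data: SketchDataModule, freeze_encoder: bool = True):
        self.data = data
        self.freeze_encoder = freeze_encoder
        # Parameter groups are plain float lists, for illustration only.
        self.params: Dict[str, List[float]] = {
            "encoder": [0.1, 0.2],
            "head": self.build_head(),
        }

    @abstractmethod
    def build_head(self) -> List[float]:
        """Subclasses supply the task-specific parameters."""

    def trainable_parameter_groups(self) -> List[str]:
        # A frozen encoder is left out of the groups handed to the optimizer.
        if self.freeze_encoder:
            return ["head"]
        return ["encoder", "head"]


class ToySecondaryStructureTask(SketchEncoderFineTuning):
    def build_head(self) -> List[float]:
        return [0.0, 0.0, 0.0]


frozen = ToySecondaryStructureTask(ToySequenceData(), freeze_encoder=True)
unfrozen = ToySecondaryStructureTask(ToySequenceData(), freeze_encoder=False)
print(frozen.trainable_parameter_groups())    # ['head']
print(unfrozen.trainable_parameter_groups())  # ['encoder', 'head']
```

Because the task class only receives a data module, swapping `ToySequenceData` for another subclass changes the dataset without touching the model code, which is the separation the `BioNeMoDataModule` entry describes.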