MoleculeNet Physical Chemistry Datasets - Lipophilicity, FreeSolv, ESOL

Link: MoleculeNet - Lipophilicity, FreeSolv, ESOL

Data Collection Method by dataset:

  • Hybrid: Human & Automatic/Sensors

Labeling Method by dataset:

  • Hybrid: Human & Automated

Properties (Quantity, Dataset Descriptions, Sensor(s)):

MoleculeNet Physical Chemistry is an aggregation of public molecular datasets. The physical chemistry portion of MoleculeNet that we used for evaluation consists of ESOL (1128 compounds), FreeSolv (642 compounds), and Lipophilicity (4200 compounds).

Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, Vijay Pande, MoleculeNet: A Benchmark for Molecular Machine Learning, arXiv preprint arXiv:1703.00564, 2017.

From the MoleculeNet documentation:

  • ESOL is made up of water solubility data (log solubility in moles per litre) for common organic small molecules.

  • FreeSolv is made up of experimental and calculated hydration free energy of small molecules in water.

  • Lipophilicity is composed of experimental results of octanol/water distribution coefficient (logD at pH 7.4).
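
The ESOL and Lipophilicity labels are both logarithmic quantities, so it can help to see how they map back to linear units. The following is a minimal sketch, not part of MoleculeNet or of our evaluation code: the function names are hypothetical, and it assumes the logarithms are base 10, as is conventional for logS and logD. The compound counts are the ones stated in this section.

```python
# Compound counts as stated in this section.
DATASET_SIZES = {"ESOL": 1128, "FreeSolv": 642, "Lipophilicity": 4200}


def log_solubility_to_mol_per_litre(log_s: float) -> float:
    """Convert an ESOL-style label to solubility in mol/L.

    Assumes the label is log10 of the solubility in mol/L.
    """
    return 10.0 ** log_s


def log_d_to_distribution_ratio(log_d: float) -> float:
    """Convert a logD label to the octanol/water distribution ratio.

    Assumes the label is log10 of the concentration ratio
    (octanol phase over water phase) measured at pH 7.4.
    """
    return 10.0 ** log_d


if __name__ == "__main__":
    total = sum(DATASET_SIZES.values())
    print(f"Total compounds across the three datasets: {total}")
    # A log solubility of -2 corresponds to 0.01 mol/L.
    print(log_solubility_to_mol_per_litre(-2.0))
    # A logD of 0 means equal concentrations in octanol and water.
    print(log_d_to_distribution_ratio(0.0))
```

A positive logD indicates that a compound partitions preferentially into octanol (more lipophilic), while a negative value indicates a preference for the water phase.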