Multiple Machine Learning Comparisons of HIV Cell-Based and Reverse Transcriptase Datasets

  • Kimberley M Zorn, Thomas R Lane, Daniel P Russo, Alex M. Clark, Vadim Makarov, Sean Ekins
  • Molecular Pharmaceutics, February 2019, American Chemical Society (ACS)
  • DOI: 10.1021/acs.molpharmaceut.8b01297

Comparing different machine learning models for HIV whole cell and reverse transcriptase datasets

What is it about?

We curated selected whole cell and reverse transcriptase datasets from the NIAID ChemDB database and used these, together with other datasets, to build and validate machine learning models (deep learning, Bayesian, support vector machines, etc.). We performed 5-fold cross-validation as well as external validation with different test sets.
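For readers unfamiliar with the term, 5-fold cross-validation can be sketched as below. This is a generic illustration in plain Python, not the code used in the paper; the names `k_fold_indices`, `cross_validate`, and `fit_and_score` are hypothetical and chosen only for this example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit_and_score, k=5, seed=0):
    """Hold out each fold once and return one score per fold.

    fit_and_score is a hypothetical callable supplied by the user: it
    receives (train_indices, test_indices), trains a model on the first
    and returns a score measured on the second.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for held_out, test_idx in enumerate(folds):
        # Training set: every index that is not in the held-out fold.
        train_idx = [
            j
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for j in fold
        ]
        scores.append(fit_and_score(train_idx, test_idx))
    return scores
```

Each compound is used for testing exactly once and for training in the other four rounds, so the five scores can be averaged to compare methods on the same data. External validation differs in that the test set is never part of the training data at all.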

Why is it important?

We show that the different machine learning methods do not differ greatly from one another. We also describe a new metric that can be used to summarize many different model scores. Finally, we demonstrate that Assay Central Bayesian models may be a useful starting point for drug discovery efforts.


Dr Sean Ekins
Collaborations in Chemistry

Machine learning methods have rarely been applied to HIV and reverse transcriptase (RT) datasets, and there have been few efforts to use the ChemDB database. This suggests that the database is worthy of assessment and further curation to make its data ready for modeling.


The following have contributed to this page: Dr Sean Ekins