What is it about?

Databases are becoming ever more important in the biomedical sciences. This paper compares and contrasts four key resources broadly related to drug discovery research.

Why is it important?

As databases proliferate relentlessly, comparative utility judgments become more difficult. The problem is exacerbated because, while short descriptions of databases abound in review articles, quantitative comparisons of content are rare. This paper is also unusual in that we evaluate both key entity types: chemistry and proteins. As expected, it has been somewhat superseded by post-2013 updates, but we hope the approaches and general conclusions remain useful.

Perspectives

As all database teams and experienced users intuit, the only way to really understand occupancy rules, entity content, the relationship matrix and, crucially, real-world utility is to analyse the statistics and distributions in detail. This can be difficult for teams to get round to internally, let alone to take the time to make it a comparative exercise (i.e. doing the same analyses on domain-neighbour resources). There is of course a social problem in that teams (certainly in cheminformatics and bioinformatics) are generally "nice" to each other. This is obviously good (especially for collaboration and cooperation), but it does mean that much less technical comparison ever surfaces (including pinpointing frank errors in database entries) because folk feel this could be misinterpreted as criticism. I thus suggest that more of these types of comparisons should be undertaken (n.b. linked data should make this easier).

Dr Christopher Southan

Read the Original

This page is a summary of: Comparing the Chemical Structure and Protein Content of ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database, Molecular Informatics, December 2013, Wiley,
DOI: 10.1002/minf.201300103.
