Extracting and Modeling a Large Melting Point Dataset (300k) from a Patent Collection
What is it about?
Text mining was used to automatically extract melting point data from published patents. Almost 300,000 data points were collected and used to develop models that predict melting and pyrolysis (decomposition) points. The models are available for everyone to use.
Why is it important?
This paper shows that it is now possible to text-mine property data directly from a large corpus and, following automated curation and validation, to use that data as the basis for building models. This work focused on melting point data but could be extended to other properties, such as logP and NMR data.
The following have contributed to this page: Dr Antony John Williams