Readers of my blog know that I am an advocate for Open Data, whereby scientists permit the widespread distribution of their data without restrictions. Data should be available to anyone, at any time, at no cost, and without legal (i.e., copyright or IP) restrictions. This will enhance our ability to follow up on research and to reuse the data in whatever way we wish. In particular, reuse of data should be seamless and lossless.

I have noted many times that the Supporting Materials in today’s journals are far from ideal. Often authors do not include data at all! Sometimes the data is corrupted, especially when it is deposited solely as a pdf, which loses almost all semantic information about the data. Unfortunately, data is very rarely deposited in a form that is readily reusable. For example, I make use of 3-D coordinates of molecules for this blog, and these are invariably deposited simply as text within a pdf. I then have to copy and paste this data into a new file formatted for use in a molecular viewer of my choice (for me, typically GaussView or Avogadro).
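To give a sense of what that copy-and-paste step involves, here is a minimal Python sketch that turns pasted text into an XYZ-format string, which a viewer such as Avogadro can open. The helper name `pasted_text_to_xyz` is hypothetical, and the assumed line layout (an element symbol followed by three decimal numbers) is only one of many layouts found in Supporting Materials, so treat this as an illustration rather than a general solution.

```python
import re

# Assumed layout of a coordinate line: element symbol, then x, y, z as
# decimal numbers, e.g. "O   0.000000   0.000000   0.117300".
# Real pdfs vary widely; lines that do not fit this pattern are skipped.
_COORD_LINE = re.compile(
    r"^\s*([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)


def pasted_text_to_xyz(text, comment=""):
    """Build an XYZ-format string from text pasted out of a pdf.

    Hypothetical helper for illustration only.
    """
    atoms = []
    for line in text.splitlines():
        match = _COORD_LINE.match(line)
        if match:
            symbol, x, y, z = match.groups()
            atoms.append((symbol, float(x), float(y), float(z)))

    # XYZ format: atom count, one comment line, then one line per atom.
    lines = [str(len(atoms)), comment]
    lines += [f"{s:<2} {x:12.6f} {y:12.6f} {z:12.6f}" for s, x, y, z in atoms]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pasted = """Optimized geometry
O 0.000000 0.000000 0.117300
H 0.000000 0.757200 -0.469200
H 0.000000 -0.757200 -0.469200
"""
    print(pasted_text_to_xyz(pasted, comment="water, pasted from a pdf"))
```

Even this small script has to guess at the layout and silently drops anything it does not recognize, which is exactly the kind of loss that depositing the data in a reusable format would avoid.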

The leader in advocating and demonstrating chemical data reuse is Henry Rzepa (see his blog for many examples). He and his group have published a paper describing a system for separating data from the paper narrative, a process they call data emancipation, as part of the scientific publication process (1). I strongly encourage readers of this blog to take a look at this paper for the publication model they propose, which places data at the nexus of the scientific process and makes it available for widespread reuse. Take a look at the web-enhanced objects, such as this one (you might need a subscription to access this, but this link takes you to the Figshare site, which is open), to see how data can be deposited for search, retrieval, and direct reuse. This is a model I hope many computational chemists will adopt. We also need to advocate with journal editors and publishers to establish similar procedures for all manuscript submissions.


(1) Harvey, M. J.; Mason, N. J.; Rzepa, H. S. "Digital Data Repositories in Chemistry and Their Integration with Journals and Electronic Notebooks," J. Chem. Inf. Model. 2014, ASAP, DOI: 10.1021/ci500302p.