Arguments in favour of research data sharing stress the need for verification and reproducibility. It is fundamental to the scientific method and to good research practice for other researchers to be able to test the evidence underpinning the hypotheses and interpretations presented in a given scholarly publication.
In recognition of this, a number of journals have recommended or mandated that research data be deposited in appropriate data repositories prior to publication. In parallel, there is a growing number of initiatives that explicitly link journal articles with the underlying data or that may be characterised as data journals, championing the publication of research data sets with commentary, analysis and visualisation.
Technical, procedural and cultural challenges exist around the use of identifiers, exchange of metadata, effective linking and data citation. There is also a need to establish sustainable partnerships between journals, data centres and research organisations which are necessary to underpin innovative forms of data publication.
Innovative data publications are likely to provide researchers with recognition and reward for making datasets available, and thus to encourage data to be viewed as a first-class research output and data publication to be considered an essential part of the scholarly process. Likewise, it seems likely that, as well as making it easier for researchers to locate and access datasets, linking between publications and supporting data will provide a means for established data centres, or even institutional data repositories, to enhance and draw attention to well-curated research outputs.
For partnerships around data publication to become established, there are important questions to be considered:
- What policies are required on the part of journals’ editorial boards to achieve greater levels of data sharing, citation and linkage between publications and datasets?
- What partnerships between journals, data centres and research organisations are necessary to establish sustainable solutions, and what business models are appropriate?
- How may the costs of long-term data archiving be met and appropriately distributed in models that stress the importance of publishing data and linking data sets to published outputs?
- What characterises a suitable repository, and what criteria of quality and assurance are required of the data archive underpinning such collaborations?
- What, if any, peer review of data is appropriate before publication?
The JISC Managing Research Data Programme 2011-13 has, therefore, funded two projects to design and implement innovative technical models and organisational partnerships to encourage and enable publication of research data. These projects will also explore the questions listed above and thereby shed light on solutions that will enable the further development of data publication.
PREPARDE: Peer REview for Publication & Accreditation of Research Data in the Earth sciences
PREPARDE will capture the processes and procedures required to publish a scientific dataset, ranging from ingestion into a data repository through to formal publication in a data journal. It will also address key issues arising in the data publication paradigm, namely: how to peer-review a dataset, what criteria a repository must meet to be considered objectively trustworthy, and how datasets and journal publications can be effectively cross-linked for the benefit of the wider research community.
Project website: http://proj.badc.rl.ac.uk/preparde
PRIME: Publisher, Repository and Institutional Metadata Exchange
PRIME will enable the automated exchange of metadata between publishers and subject-based and institutional repositories. A partnership between UCL, the Archaeology Data Service and Ubiquity Press, a campus-based open access publisher located at UCL, PRIME will ensure that each stakeholder has a record of content relevant to them, even when the data itself is held elsewhere.
As previously noted, scholarly journals are increasingly recommending or requiring, as a condition of publication, that research data be made available in an appropriate repository. A service to collate and summarise journal research data policies would give researchers, managers of research data services and other stakeholders an easy source of reference for understanding the requirements and recommendations made by journal editorial boards with regard to data sharing. Such a service would provide a useful information and advocacy tool for a variety of stakeholders in this area (including exponents of open data, research data infrastructure providers, institutional managers with responsibilities for research data management, etc.). It is also likely to provide a helpful incentive for the increasing systematisation and codification of such policies and for their more regular review.
JISC and other stakeholders need to understand precisely what is required in such a service and what business models are available to maintain a sustainable service, including a consideration of sources of funding and cost recovery.
The third project funded by the JISC Managing Research Data Programme in the area of data publication is a feasibility study for a service to collate and summarise journal data policies, which will consider requirements and present possible business models.
JoRD: Journal Research Data Policy Bank
The Journal Research Data Policy Bank (JoRD) project will conduct a feasibility study into the scope and shape of a sustainable service to collate and summarise journal policies on Research Data. The aim of this service will be to provide researchers, managers of research data and other stakeholders with an easy source of reference to understand and comply with Research Data policies.
Project website: http://crc.nottingham.ac.uk/projects/jord.php