Higher education institutions are coming under increasing pressure to manage the research data generated by their researchers that cannot be curated by subject-based data centres – and many are unsure how to proceed given the absence of clear good practice. To address such concerns, JISC’s Managing Research Data programme has, with an investment of nearly £2M, funded eight projects to provide the UK Higher Education sector with examples of good research data management.
The projects are first identifying requirements to manage data created by researchers within an institution, or across a group of institutions, and then piloting research data management infrastructures at institutional, departmental or research group level, to address these requirements. In order better to understand the investment and change that may be required, cost-benefit analysis is included in the projects’ work.
It is intended that these pilots will demonstrate the varied ways in which research data management infrastructure can generate benefits for the HE sector: e.g. savings of time, more efficient research, better allocation of resources to active research, greater opportunities for sharing and reuse. It is hoped that these projects, individually and as a group, will contribute to a broader case for improving data management practices in UK HE.
ADMIRAL: A Data Management Infrastructure for Research Activities in the Life sciences, University of Oxford, Department of Zoology
The ADMIRAL Project will create a complete ‘front-to-back’ data management infrastructure to facilitate the saving, management, annotation, sharing and archiving of heterogeneous research datasets arising from biological research. The ADMIRAL system will comprise (a) a secure and regularly backed-up local research data filestore, the Life Science Data Store, to which users can easily save datasets; (b) a Web file system built on top of that, accessible from any location using a Web browser, that will provide password-protected collaborative access to the data files for group members and trusted collaborators; (c) a Web-based annotation system for entering metadata describing datasets, based on SHUFFL (http://code.google.com/p/shuffl/), which will be developed to permit structured metadata entry using Web services and global ontologies, and which will save the created metadata as RDF in a form suitable for open access publication as Open Linked Data; and (d) an automated upload system to the Oxford University DataBank for archival preservation, with options for embargo. A key principle underlying the ADMIRAL data management infrastructure, and the secret of its successful uptake by the user community, is that of ‘sheer curation’, namely enabling principled data curation in ways that employ file formats and user interfaces familiar to the user, do not disrupt researchers’ conventional workflows and software choices, and impose little by way of cognitive overhead. Adopting a user-led and iterative design approach, ADMIRAL will build on several former and current JISC projects, particularly SHUFFL, and will safeguard research council and charitable research funding investments by enabling both the long-term institutional preservation and the open access publication of hard-won research datasets.
To facilitate citation and personal recognition for data publication, ADMIRAL will assign DOIs to datasets in collaboration with the British Library, and grant open Creative Commons licenses in collaboration with Science Commons. While developed to serve a local user community, the ADMIRAL Project will establish methodologies and demonstrate good practice for the semantic enrichment, archiving and publication of research datasets, and will create a generic SOA data management infrastructure that could be re-deployed to other areas of scholarly activity, facilitating a sea change in community attitudes towards better annotation, greater archiving and increased open access publishing of research datasets.
Project Manager: Graham Klyne, graham.klyne at zoo.ox.ac.uk
Project Director: David Shotton, david.shotton at zoo.ox.ac.uk
FISHNet (Freshwater Information SHaring Network), King’s College London
FISHNet will address the data management requirements of a particular research community, in the field of freshwater biology, and then develop, pilot and evaluate a data management infrastructure to address those requirements. The community is relatively small, but is highly dispersed; unlike other disciplines, there are no large freshwater biology university departments. Instead, freshwater research is represented by small groups scattered across university departments, as well as in government agencies and industry. FISHNet will take academic researchers at King’s as a focus, but an essential part of our approach will be to engage the broad and diverse freshwater research community. We will achieve this through the involvement of the Freshwater Biological Association (FBA), a charity with very strong links with freshwater researchers in all sectors. Addressing only the needs of researchers within a single institution would be of limited benefit to the research community; the project will therefore look beyond institutional confines to investigate solutions that address the needs of this research community as a whole, at the same time providing a model for other disciplines with similar issues.
Project Manager: Stephen Grace, stephen.grace at kcl.ac.uk
Project Directors: Mark Hedges, mark.hedges at kcl.ac.uk, and Keiron McNicol, kmcnicol at fba.org.uk
IDMB (Institutional Data Management Blueprint), University of Southampton
The aim of the Institutional Data Management Blueprint (IDMB) project is to create a practical and attainable institutional framework for managing research data that facilitates ambitious national and international e-research practice. The objective is to produce a framework for managing research data that encompasses a whole institution (exemplified by the University of Southampton) and is based on an analysis of current data management requirements for a representative group of disciplines with a range of different data. Building on the developed policy and service-oriented computing framework, the project will scope and evaluate a pilot implementation plan for an institution-wide data model, which can be integrated into existing research workflows and extend the potential of existing data storage systems, including those linked to discipline and national shared service initiatives. The project will build upon a decade of previous open access repository initiatives at Southampton to create a coherent set of next actions for an institutional, cross-discipline 10-year roadmap, which will be flexible in accommodating future moves to shared services, and provide a seamless transition of data management from the desktop to national/international repositories. The outcomes from this project, which will draw together technical, organisational and professional expertise from across the institution, will be widely disseminated within the sector as a form of HEI Data Management “Business Plan How-To”.
Project Manager: Kenji Takeda, ktakeda at soton.ac.uk
I2S2 (Infrastructure for Integration in Structural Sciences), University of Bath
I2S2 will identify requirements for a data-driven research infrastructure in “Structural Science”, focusing on the domain of Chemistry, but with a view towards inter-disciplinary application. I2S2 will develop use cases that explore perspectives of scale, complexity and research discipline throughout the data lifecycle.
Two research data management pilots based on use cases will examine the business processes of research, and highlight the benefits of an integrated approach. Both pilots will address the traversal of administrative boundaries between institutions and national facilities, in addition to issues of scale (local laboratory to national facilities, DIAMOND synchrotron and ISIS respectively). Pilot 2 will, in particular, apply the approach to Earth Sciences and demonstrate the benefit to scientific disciplines other than Chemistry.
A key component of the infrastructure will be a harmonised Integrated Information Model to include all stages of the Data Life Cycle.
A “before and after” cost-benefit analysis will be performed using the Keeping Research Data Safe (KRDS2) model, which will be extended to address inter-disciplinary requirements in I2S2.
Project Manager: Manjula Patel, m.patel at ukoln.ac.uk
Project Director: Liz Lyon, e.j.lyon at ukoln.ac.uk
Incremental (a step by step approach to informing, improving, and increasing research data curation practice), University of Cambridge
The Incremental project addresses a crucial step in informing, improving, and increasing data curation activity within UK HEIs. We aim to improve data curation activities using a bottom-up approach by building researchers’ capacity to better understand the data curation lifecycle and how it relates to the management of their data. The project team will work with research groups based at both the University of Cambridge and the University of Glasgow to identify their current data curation practices and data management requirements as well as their current support mechanisms. From here, the project team will work with the research groups to support the improvement of their current practices using existing tools and resources wherever possible, and work to identify gaps where additional support is required.
By working from the bottom up, our research groups will be better equipped to inform and influence the development and workable implementation of data repository policies and higher-level, institutional strategies. The Incremental methodology and findings will be promoted, and embedded wherever possible, via Digital Curation Centre (DCC) activity to support other UK HEIs embarking on similar exercises.
Project Director: Grant Young, gy219 at cam.ac.uk
MaDAM: A Pilot Data Management Infrastructure for Biomedical Researchers at the University of Manchester, University of Manchester
The MaDAM project will capture requirements and develop a pilot infrastructure as a first step in introducing a university-wide data management service for the University of Manchester. This encompasses data capture, data storage and data curation, and is designed to add value both across the full lifecycle of research projects and by making data readily available for reuse. The project will focus on a specific domain area (Life & Medical Sciences) as input to a wider strategic activity to address the needs of the whole of the University research community.
Project Manager: June Finch, june.finch at manchester.ac.uk
PEG-BOARD, University of Bristol, School of Geography
The PEG-BOARD project is topic-led, focusing on management of palæoclimate data, an important research area today as a result of the worldwide focus on anthropogenic climate change. This data is presently reused by many communities: palæoclimate research, predictive climate models, oceanography, atmospheric and earth science, biology and ecology, mathematics, archaeology, teaching in HE, and the media, publishing scientific communications for a global audience. The project focuses on enabling open access to historical climate data in a systematic, managed environment. PEG-BOARD explores the data management needs of a palæoclimate research group and the linked ecosystem of researchers, including named project partners and associates active in Earth Sciences (University of Leeds), Archaeology (University of Southampton) and journalism/broadcasting (BBC). It focuses on the identification of requirements (social, policy and technical) within and beyond the core institutions that make up BRIDGE, and on the adaptation (or development) and deployment of a pilot data management infrastructure. Outcomes will be evaluated via a user-centred process of agile development and feedback centred around the named user communities. Finally, a business model is developed exploring the future sustainability of project outputs in light of the findings, a process involving relevant stakeholders such as the British Atmospheric Data Centre.
Project Manager: Greg Tourte, g.j.l.tourte at bristol.ac.uk
Sudamih (Supporting Data Management Infrastructure for the Humanities), University of Oxford
The Supporting Data Management Infrastructure for the Humanities (Sudamih) Project aims to address a coherent range of requirements for the more effective management of data (broadly defined) within the Humanities at an institutional level. Whilst the project is fully embedded within the institutional context of Oxford University, the methodologies, outputs and outcomes will be of relevance to other research-led universities, especially, but not only, in their support of research within the Humanities. The project places emphasis on two particular areas: recognition and support for the “life’s work” nature of much Humanities research; and recognition and support for the simple and effective creation of online databases for typical data-types within the Humanities (Database as Service for e.g. text, image and geo-data). The Sudamih Project is driven by the requirements of researchers within the Humanities Division at Oxford; will operate as a collaborative project between the research community and institutional service providers; builds on existing internal and JISC-funded strategic activities within Oxford; and will work closely with the Digital Curation Centre (DCC), the Research Information Network (RIN) and the UK Research Data Service (UKRDS) initiative.
Project Manager: James Wilson, james.wilson at oucs.ox.ac.uk
Project Director: Paul Jeffreys, paul.jeffreys at odit.ox.ac.uk