Research Data Management Infrastructure Projects

Higher education institutions are coming under increasing pressure to manage the research data generated by their researchers that cannot be curated by subject-based data centres - and many are unsure how to proceed given the absence of clear good practice.  To address such concerns, JISC’s Managing Research Data programme has, with an investment of nearly £2M, funded eight projects to provide the UK Higher Education sector with examples of good research data management.

The projects are first identifying requirements to manage data created by researchers within an institution, or across a group of institutions, and then piloting research data management infrastructures at institutional, departmental or research group level, to address these requirements. In order better to understand the investment and change that may be required, cost-benefit analysis is included in the projects’ work.

It is intended that these pilots will demonstrate the varied ways in which research data management infrastructure can generate benefits for the HE sector: e.g. savings of time, more efficient research, better allocation of resources to active research, greater opportunities for sharing and reuse.  Individually and as a group, it is hoped that these projects will contribute to a broader case for improving data management practices in UK HE.

ADMIRAL: A Data Management Infrastructure for Research Activities in the Life sciences, University of Oxford, Department of Zoology

The ADMIRAL Project will create a complete ‘front-to-back’ data management infrastructure to facilitate the saving, management, annotation, sharing and archiving of heterogeneous research datasets arising from biological research. The ADMIRAL system will comprise (a) a secure and regularly backed up local research data filestore, the Life Science Data Store, to which saving datasets will be easy for users; (b) a Web file system built on top of that, accessible from any location using a Web browser, that will provide password-protected collaborative access to the data files for group members and trusted collaborators; (c) a Web-based annotation system for entering metadata describing datasets based on SHUFFL (http://code.google.com/p/shuffl/), which will be developed to permit structured metadata entry using Web services and global ontologies, and which will save the created metadata as RDF in a form suitable for open access publication as Open Linked Data; and (d) an automated upload system to the Oxford University DataBank for archival preservation, with options for embargo. A key principle underlying the ADMIRAL data management infrastructure, and the secret of its successful uptake by the user community, is that of ‘sheer curation’, namely enabling principled data curation in ways that employ file formats and user interfaces familiar to the user, do not disrupt researchers’ conventional workflows and software choices, and impose little by way of cognitive overhead. Adopting a user-led and iterative design approach, ADMIRAL will build on several former and current JISC projects, particularly SHUFFL, and will safeguard research council and charitable research funding investments by enabling both the long-term institutional preservation and the open access publication of hard-won research datasets.
To facilitate citation and personal recognition for data publication, ADMIRAL will assign DOIs to datasets in a collaboration with the British Library, and grant open Creative Commons licenses in collaboration with Science Commons. While developed to serve a local user community, the ADMIRAL Project will establish methodologies and demonstrate good practice for the semantic enrichment, archiving and publication of research datasets, and will create a generic SOA data management infrastructure, that could be re-deployed to other areas of scholarly activity, facilitating a sea change in community attitudes towards better annotation, greater archiving and increased open access publishing of research datasets.

Project Manager: Graham Klyne, graham.klyne at zoo.ox.ac.uk

Project Director: David Shotton, david.shotton at zoo.ox.ac.uk

FISHNet (Freshwater Information SHaring Network), King’s College London

FISHNet will address the data management requirements of a particular research community – in the field of freshwater biology – then develop, pilot and evaluate a data management infrastructure to address those requirements. The community is relatively small, but is highly dispersed; unlike other disciplines, there are no large freshwater biology university departments. Instead freshwater research is represented by small groups scattered across university departments, as well as in government agencies and industry. FISHNet will take academic researchers at King’s as a focus, but an essential part of our approach will be to engage the broad and diverse freshwater research community. We will achieve this through the involvement of the Freshwater Biological Association (FBA), a charity with very strong links with freshwater researchers in all sectors. To maximise the benefits to the research community, there is limited point in addressing just the needs of researchers within a single institution; this proposal will look beyond institutional confines to investigate solutions that address the needs of this research community as a whole, at the same time providing a model for other disciplines with similar issues.

Project Manager: Stephen Grace, stephen.grace at kcl.ac.uk

Project Directors: Mark Hedges, mark.hedges at kcl.ac.uk, and Keiron McNicol, kmcnicol at fba.org.uk

IDMB (Institutional Data Management Blueprint), University of Southampton

The aim of the Institutional Data Management Blueprint (IDMB) project is to create a practical and attainable institutional framework for managing research data that facilitates ambitious national and international e-research practice. The objective is to produce a framework for managing research data that encompasses a whole institution (exemplified by the University of Southampton) and is based on an analysis of current data management requirements for a representative group of disciplines with a range of different data. Building on the developed policy and service-oriented computing framework, the project will scope and evaluate a pilot implementation plan for an institution-wide data model, which can be integrated into existing research workflows and extend the potential of existing data storage systems, including those linked to discipline and national shared service initiatives. The project will build upon a decade of previous open access repository initiatives at Southampton to create a coherent set of next actions for an institutional, cross-discipline 10-year roadmap, which will be flexible in accommodating future moves to shared services, and provide a seamless transition of data management from the desktop to national/international repositories. The outcomes from this project, which will draw together technical, organisational and professional expertise from across the institution, will be widely disseminated within the sector as a form of HEI Data Management “Business Plan How-To”.

Project Manager: Kenji Takeda, ktakeda at soton.ac.uk

Website: http://www.southamptondata.org

I2S2 (Infrastructure for Integration in Structural Sciences), University of Bath

I2S2 will identify requirements for a data-driven research infrastructure in “Structural Science”, focusing on the domain of Chemistry, but with a view towards inter-disciplinary application. I2S2 will develop use cases that explore perspectives of scale and complexity and research discipline throughout the data lifecycle.

Two research data management pilots, based on use cases, will examine the business processes of research and highlight the benefits of an integrated approach. Both pilots will address traversing administrative boundaries between institutions to national facilities in addition to issues of scale (local laboratory to national facilities, DIAMOND synchrotron and ISIS respectively). Pilot 2 will, in particular, apply the approach to Earth Sciences and demonstrate the benefit to scientific disciplines other than Chemistry.

A key component of the infrastructure will be a harmonised Integrated Information Model to include all stages of the Data Life Cycle.

A “before and after” cost-benefit analysis will be performed using the Keeping Research Data Safe (KRDS2) model, which will be extended to address inter-disciplinary requirements in I2S2.

Project Manager: Manjula Patel, m.patel at ukoln.ac.uk

Project Director: Liz Lyon, e.j.lyon at ukoln.ac.uk

Website: http://www.ukoln.ac.uk/projects/I2S2

Incremental (a step by step approach to informing, improving, and increasing research data curation practice), University of Cambridge

The Incremental project addresses a crucial step in informing, improving, and increasing data curation activity within UK HEIs. We aim to improve data curation activities using a bottom-up approach by building researchers’ capacity to better understand the data curation lifecycle and how it relates to the management of their data. The project team will work with research groups based at both the University of Cambridge and the University of Glasgow to identify their current data curation practices and data management requirements as well as their current support mechanisms. From here, the project team will work with the research groups to support the improvement of their current practices using existing tools and resources wherever possible, and work to identify gaps where additional support is required.

By working from the bottom up, our research groups will be better equipped to inform and influence the development and workable implementation of data repository policies and higher-level, institutional strategies. The Incremental methodology and findings will be promoted, and embedded wherever possible, via Digital Curation Centre (DCC) activity to support other UK HEIs embarking on similar exercises.

Project Director: Grant Young, gy219 at cam.ac.uk

MaDAM: A Pilot Data Management Infrastructure for Biomedical Researchers at the University of Manchester, University of Manchester

The MaDAM project will capture requirements and develop a pilot infrastructure as a first step in introducing a university-wide data management service for the University of Manchester.  This encompasses data capture, data storage and data curation, and is designed to add value both to the full lifecycle of research projects and also by making data readily available for reuse.  The project will focus on a specific domain area (Life & Medical Sciences) as input to a wider strategic activity to address the needs of the whole of the University research community.

Project Manager: June Finch, june.finch at manchester.ac.uk

Website: http://www.library.manchester.ac.uk/aboutus/projects/MaDAM

PEG-BOARD, University of Bristol, School of Geography

The PEG-BOARD project is topic-led, focusing on management of palæoclimate data, an important research area today as a result of the worldwide focus on anthropogenic climate change. This data is presently reused by many communities: palæoclimate research, predictive climate models, oceanography, atmospheric and earth science, biology and ecology, mathematics, archaeology, teaching in HE, and the media, publishing scientific communications for a global audience. The project focuses on enabling open access to historical climate data in a systematic, managed environment. PEG-BOARD explores the data management needs of a palæoclimate research group and the linked ecosystem of researchers, including named project partners and associates active in Earth Sciences (University of Leeds), Archaeology (University of Southampton) and journalism/broadcasting (BBC). It examines the identification of requirements (social, policy and technical) within and without the core institutions that make up BRIDGE, and the adaptation (or development) and deployment of a pilot data management infrastructure. Outcomes will be evaluated via a user-centred process of agile development and feedback centred around the named user communities. Finally, a business model is developed exploring the future sustainability of project outputs in light of the findings, a process involving relevant stakeholders such as the British Atmospheric Data Centre.

Project Manager: Greg Tourte, g.j.l.tourte at bristol.ac.uk

Sudamih (Supporting Data Management Infrastructure for the Humanities), University of Oxford

The Supporting Data Management Infrastructure for the Humanities (Sudamih) Project aims to address a coherent range of requirements for the more effective management of data (broadly defined) within the Humanities at an institutional level. Whilst the project is fully embedded within the institutional context of Oxford University, the methodologies, outputs and outcomes will be of relevance to other research-led universities, especially, but not only, in their support of research within the humanities. The project places emphasis on two particular areas: recognition and support for the “life’s work” nature of much of humanities research; recognition and support for the simple and effective creation of online databases for typical data-types within the Humanities (Database as Service for e.g. text, image and geo-data). The Sudamih Project is driven by the requirements of researchers within the Humanities Division at Oxford; will operate as a collaborative project between the research community and institutional service providers; builds on existing internal and JISC-funded strategic activities within Oxford; and will work closely with the Digital Curation Centre (DCC), the Research Information Network (RIN) and the UK Research Data Service (UKRDS) initiative.

Project Manager: James Wilson, james.wilson at oucs.ox.ac.uk

Project Director: Paul Jeffreys, paul.jeffreys at odit.ox.ac.uk

Press Release Announces the Managing Research Data Programme #jiscmrd

Press Release: JISC helps researchers to meet the research data challenge

JISC marked the launch of its research data management programme at the UK e-Science All Hands meeting this week. The programme helps researchers, institutions, funders and policy makers meet the challenge of keeping research data for re-use in the future. A new briefing paper is also published.

Researchers in almost all disciplines now create ‘data’ in digital form. Some data are well-managed and kept so that they can be re-used in future. But the majority go un-catalogued, are stored in an ad hoc fashion and are thus lost to posterity, resulting in a failure to reap the full potential of investment from present day research.

JISC’s Managing Research Data programme is addressing the research data challenge from new perspectives, plugging notable gaps in current knowledge and making links between the needs of researchers, institutions and policy makers.

The programme, which ends in 2011, will provide researchers and institutions with case studies of good research data management, better tools for managing research data and training materials.  Improved tools for citing, integrating and linking data, including to publications, will also be developed. Policy makers will gain a clearer understanding of researchers’ and institutions’ requirements, a better view of the value of different types of data and a roadmap outlining the steps needed to achieve a coherent UK policy for research data.

‘Research and scientific innovation depend on finding, integrating and re-using the products of previous research,’ argues Dr Simon Hodson, JISC e-research programme manager.  ‘As research becomes increasingly reliant upon information and data held in a digital form the question of how these resources are maintained becomes of vital importance.

‘The Managing Research Data programme is addressing the sector’s need for both the infrastructure required to manage research data effectively and the skills and knowledge-base needed to make the most out of the research data asset,’ he concluded.

Dr William Kilbride, director of the Digital Preservation Coalition, introduced the programme during the poster session at All Hands, encouraging participants, all of whom create research data, to think about how they can ensure that their work has a lasting impact.

JISC is a major sponsor at All Hands this year. See more details of the meeting: http://www.jisc.ac.uk/events/2009/12/allhands09

Find out more about ‘Meeting the Research Data Challenge’ in the briefing paper: http://www.jisc.ac.uk/publications/documents/bpresearchdatachallenge

Read about the Managing Research Data programme, #jiscmrd: http://www.jisc.ac.uk/whatwedo/programmes/mrd

Explore JISC’s research 3.0 activity: http://www.jisc.ac.uk/whatwedo/campaigns/res3

Prospective Partners for Data Management Infrastructure Bids

Below is a list of organisations which, at the 6 July Briefing Day, made themselves known as keen to be partners in a Bid to the Data Management Infrastructure Call.

British Geological Survey (BGS): Jeremy Giles (jrag@bgs.ac.uk) or Garry Baker (grba@bgs.ac.uk)

The British Geological Survey is part of the Natural Environment Research Council.  We manage the ‘National Geoscience Data Centre’ and have a wealth of experience in Information and Data Management including managing the output of NERC funded geoscience programmes. We would welcome the opportunity to partner with HEIs in this JISC Data Management call. (Links: http://www.bgs.ac.uk/ and http://www.bgs.ac.uk/services/ngdc/home.html)

The British Library: Adam Farquhar (adam.farquhar@bl.uk) or Max Wilkinson (max.wilkinson@bl.uk) for projects relating to data citation.

The British Library’s Dataset Programme seeks to define and implement services to reduce the divide between traditional research publications and the data that underlies them.

As part of this programme, the Digital Library Technology department considers the JISC ‘Data Management Infrastructure Call for Projects’ as an excellent opportunity to identify and further develop the requirements for those who generate and consume data with those that persist and present high value datasets.

As a partner, the BL can provide:

  1. The Researcher Information Centre (RIC).  Developed with Microsoft, the RIC provides a collaborative environment to support the full research life-cycle.
  2. UK PubMedCentral (UKPMC).  The UKPMC service provides a stable, permanent, and free-to-access online digital archive of full-text, peer-reviewed research publications.
  3. NAMES.  The JISC-funded NAMES project is developing a pilot name authority service for researchers.
  4. DataCite, the international data citation initiative.  DataCite enables data centres to assign digital object identifiers (DOIs) to datasets and is developing essential services to cite, find, and reuse data.

The BL is particularly interested in working with partners to understand the requirements of researchers and other stakeholders around data citation and use, identify best practices in several disciplines, and extend these practices to a broader community.

Bull Information Systems: Gordon Nother (gordon.nother@bull.co.uk)

Bull are the largest European-owned System Integrator and have been providing our customers with end to end solutions for over 70 years. As a major supporter of Open Source and Green Computing, we have been bringing a professional and structured approach to further develop open ecosystems, and promote new ways of helping organisations to innovate, collaborate, and become more competitive.

We have also carried this philosophy into our ‘StoreWay’ offering. We provide a full lifecycle of services which encompass Data Management & Storage Infrastructure:

We deliver true and measurable transformation into the DataCentre, and enable secure Data Management Efficiency through optimisation, consolidation, virtualisation, active tiered storage & management, and collaboration.

Digital Archives (at ULCC): Kevin Ashley (k.ashley@ulcc.ac.uk) or enquiries@ulcc.ac.uk

ULCC’s Digital Archives department is a cross-disciplinary team of developers, archivists, data specialists and repository specialists. We have experience with audit and assessment methodologies relevant to this programme such as DAF, DRAMBORA and AIDA (the last of which we developed) and in guiding and implementing organisational and technological changes in relation to digital asset management of all sorts, including research data. We also collaborate with the DCC on training in data curation and digital preservation, and have experience with JISC project delivery and management. We seek partners with research data who are looking for assistance with the problem scoping phase, the specification of pilot solutions, and project evaluation and dissemination.

Ex Libris UK Ltd: Robert Bley (robert.bley@exlibrisgroup.com)

National Grid Service (NGS): Andrew Richards (andrew.richards@stfc.ac.uk)

DCC 101-Lite Course for Prospective Bidders to Data Management Infrastructure Call

JISC and DCC Joint Workshop: Digital Curation 101 Lite
15 July 2009
Park Plaza Hotel,
Leeds, England
http://www.dcc.ac.uk/events/digital-curation-101-leeds-2009/register

About the Course
This course is being offered jointly by JISC and the DCC to support new bids under the JISC Data Management Infrastructure Programme call. Using our DCC Curation Lifecycle Model (http://www.dcc.ac.uk/lifecycle-model/) as a reference point, this one-day course will introduce participants to the range and nature of data curation activities and provide hands-on experience in making use of the Data Audit Framework (DAF) and Digital Repository Audit Method Based on Risk Assessment (DRAMBORA).  The event will run from 11:00-16:00 to allow for travel.

Background
The majority of scientific research is carried out through short-term, funded projects. Accordingly, principal investigators and researchers must constantly be on the lookout for new funding opportunities to continue their research activity. This, coupled with often limited staffing resources, has meant that data management and curation activities have not generally been given a high priority within research projects. However, research councils and funding bodies are becoming increasingly aware of the value of sharing and reusing data and now require evidence of adequate and appropriate provisions for data management and curation in new grant funding applications. To assist researchers in developing sound data management and curation plans, we developed this workshop to provide an introduction to digital curation and the range of activities that should be considered when planning and implementing new projects.

Benefits of Participation
Upon completion of this workshop, participants will have gained an insight into the range and nature of data management and curation activities that should be considered when planning new research projects using the digital curation lifecycle model as a reference point, and be better equipped to develop bids that reflect the recommendations cited in the JISC Data Management Infrastructure Programme call.

Target Audience
The target audience for this workshop is prospective bidders for the JISC Data Management Infrastructure Programme call.

Objectives:

Costs
This course is offered free of charge but places are limited to 25 participants. Register at http://www.dcc.ac.uk/events/digital-curation-101-leeds-2009/register.

Further courses may be added over the life of the Data Management Infrastructure Programme. Details will be posted via the DCC and JISC websites as they are confirmed.

Please note: This course aims to introduce participants to the range of activities and stakeholders that should be considered for active data curation, from conceptualisation of research projects through to access and reuse of data generated. If you are more interested in learning about organisational and technological issues with regard to digital preservation, we highly recommend the Digital Preservation Training Programme (DPTP) (http://www.ulcc.ac.uk/dptp/about-dptp.html), which is targeted at managers in institutions who are grappling with fundamental digital preservation issues.

Joy Davidson, DCC, British.Editor@erpanet.org

STFC e-Science Centre Looking to Partner a Bid to Research Data Management Call

The STFC e-Science centre would like to partner an institution in a bid to the JISC Research Data Management Call.

The e-science centre at The Science and Technology Facilities Council (See http://www.e-science.stfc.ac.uk/) would like to hear from partners with experience in delivering data management solutions and operational services.  STFC (previously CCLRC and PPARC) have significant expertise in this area. In particular the e-science centre at STFC:

Please contact Gordon Brown (Gordon.brown@stfc.ac.uk) at the e-science centre at STFC for more information.

Edinburgh Napier University Looking for Project Partners

Edinburgh Napier University would like to hear from any potential partners interested in taking advantage of Napier’s expertise.

David Telford, Deputy Director of C&IT Services writes: ‘We are implementing an information architecture with change management.  This supports our data integration and reporting.  It may be that our growing expertise may contribute some useful input and we may be a potential partner for those eligible to submit.’

Interested parties should contact David Telford (d.telford@napier.ac.uk).

Eduserv Looking for Project Partners for a Data Management Infrastructure Bid

Andy Powell of Eduserv is looking for Project Partners in a Bid to the Data Management Infrastructure Call.  On the eFoundation Blog, he explains what Eduserv would bring to any consortium.

Data Management Infrastructure Call

The Research Data Programme 2009-11 will build upon existing work, within and without the JISC, both nationally and internationally, to establish the foundations for the UK research data infrastructure. A briefing paper about the Programme will be available shortly.  One of the Programme’s principal aims is to encourage more effective management of research data across the HE sector and within institutions. The recent Call for Projects 07/09 Data Management Infrastructure addresses this objective.

The Call seeks projects which will identify requirements to manage data created by researchers, and then will deploy a pilot data management infrastructure to address these requirements. Projects may work at any appropriate level (e.g. research group, department, faculty, school etc) within an institution, or they may work with researchers collaborating across institutions.

Up to £250,000 per project is available. 6-8 projects will be funded for a total of £1.5M. A community Briefing Event will be held on Monday 6 July 2009. The deadline for submission of bids is 12 noon on Thursday 6 August 2009.

The Digital Curation Centre (DCC) will provide general support for this Call and for the JISC Research Data Programme more generally.  Bidders are invited to consult with the DCC in preparing their bids.  A specialised e-mail address and telephone number for enquiries will shortly be set up: please watch out for announcements.  In the meantime, bidders may use the contact details provided on the DCC’s general Helpdesk Page.

Please follow this blog for updates and ongoing announcements relating to this Call and the Research Data Programme more generally.

In advance of the Briefing Day, you may post queries relating to the Call on Twitter (#datman).  This will help us assemble a useful set of FAQs in advance of that event.

Looking for consortium partners?  Contact the DCC, the Programme Manager (s.hodson@jisc.ac.uk) or tweet (#datman).