Research at Risk

Research Data Metadata Workshop

On 7 July Jisc hosted an event to discuss research data metadata, with the broad aim of starting to build community consensus around a discipline-neutral metadata specification that would be compatible with multiple research systems. Just under 40 people attended, from institutions and research system providers.

The event was a discussion-based workshop and resources collected on the day are available on the event padlet.

John Kaye from Jisc started the day with an introduction explaining why metadata is key to Jisc’s Research at Risk plan, summarising the response to the initial community consultation and the funder policy drivers in this area. He then set out the objectives for the workshop and the programme for the day, and introduced Jisc’s draft architecture for research data systems and data and metadata flows, which highlighted the importance of metadata in this area.

Chris Gutteridge from Southampton briefly introduced the new UK Research Data Community Wiki. This was originally set up to capture knowledge from the Jisc Research Data Spring projects, but can also be used to share knowledge about standards, including ‘unofficial and draft’ ones that institutions are currently using around research data.

Institutional experiences

To set the scene for a substantive discussion of what institutions are currently using for research data metadata, Valerie McCutcheon from Glasgow introduced the specification that they are implementing. Rachel Proudfoot from Leeds then talked about a similar specification that they are using internally and outlined the experiences of the N8 consortium in working together to define a common metadata schema for the consortium.

Group-based discussions occupied most of the morning session. They served to demonstrate that we in the UK are strongly placed to achieve consensus on a discipline-neutral research data metadata specification, thanks to the many relevant projects and developments in this field over the past few years. There is a tendency at present to relate metadata choices to resources such as DataCite, the work done by the UK Data Service (which is embodied in the EPrints ReCollect plugin), and the efforts of the N8 research partnership and the C4D project.

At a local level there was discussion around the challenges of sourcing and re-using metadata from different systems, so that the number of metadata fields depositors have to complete is kept to a minimum. Some hold the view that discipline-specific metadata should be accommodated, and the means by which that might be achieved were explored. Institutions face many challenges as they move towards offering the full range of services thought necessary to manage and store research data effectively. Even so, delegates recognised the importance of metadata in enabling interoperation, discovery and re-use, and there is a strong desire to continue to make progress in this area.

Requirements of ‘metadata consumers’

After the group discussion there was a session on how a discipline-neutral metadata schema would need to map to national and international aggregators. Rachael Kotarski from the British Library outlined the latest developments with the DataCite metadata schema. Veerle Van den Eynden from UK Data Service outlined the latest thinking around the metadata work package for the UK Research Data Discovery Service. A schema for this service is still being worked up by the project’s metadata advisory group, but should map to an agreed discipline-neutral schema. John Kaye from Jisc outlined the metadata requirements for the European Commission’s Research Data pilot in OpenAIRE, which asks Horizon 2020 projects to use the DataCite 3.0 metadata schema with some mandatory fields in addition to those of the standard DataCite schema. There was some discussion of fields that would make research data metadata more effective for scholarly communications. These included ORCIDs, links and identifiers to related publications, and organisational IDs, possibly ISNIs.
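As a rough illustration of what mapping a local schema to an aggregator involves, the sketch below renames the fields of an institutional record to DataCite-style property names and reports which mandatory properties are still missing. The five mandatory properties listed are those of the DataCite 3.x schema. The local field names, the mapping table, the flat dictionary layout and the example values are all assumptions made for illustration, and none of them comes from the specifications discussed at the workshop.

```python
# Mandatory properties of the DataCite 3.x metadata schema (flattened names).
DATACITE_MANDATORY = ["identifier", "creators", "title", "publisher", "publicationYear"]

# Hypothetical mapping from a local, discipline-neutral record to DataCite
# property names. The local field names are invented for this sketch.
LOCAL_TO_DATACITE = {
    "doi": "identifier",
    "authors": "creators",
    "dataset_title": "title",
    "holding_institution": "publisher",
    "year_published": "publicationYear",
    "related_publications": "relatedIdentifiers",
}


def to_datacite(local_record):
    """Rename local fields to DataCite names, dropping empty or unmapped values."""
    return {
        LOCAL_TO_DATACITE[field]: value
        for field, value in local_record.items()
        if field in LOCAL_TO_DATACITE and value not in (None, "", [])
    }


def missing_mandatory(datacite_record):
    """List the mandatory DataCite properties absent from the mapped record."""
    return [prop for prop in DATACITE_MANDATORY if prop not in datacite_record]


# Placeholder values only: the DOI, ORCID and names are not real.
example = {
    "doi": "10.1234/example-dataset",
    "authors": [{"name": "Smith, Jane", "orcid": "0000-0000-0000-0000"}],
    "dataset_title": "Example dataset",
    "holding_institution": "",  # left blank by the depositor
    "year_published": 2015,
    "related_publications": ["10.1234/example-paper"],
}

mapped = to_datacite(example)
print(missing_mandatory(mapped))
```

A check of this kind could run at deposit time, so that gaps are caught before a record is passed to an aggregator.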

Links to disciplinary metadata

To start the afternoon there was a discussion around links between discipline-neutral metadata and disciplinary metadata. Veerle Van den Eynden set the scene with a demo of the UK Data Service’s ReShare self-deposit service, which builds on the EPrints ReCollect plugin to allow social science researchers to easily deposit their data with UKDS. This led to a discussion on how technical or discipline-specific metadata could be linked to the discipline-neutral metadata schema. The University of Bath, for example, has an ‘upload technical metadata’ button on its ingest form, and others linked to existing published disciplinary metadata where available. It was seen to be useful to have an equipment ID field in the schema, as used in the service.

Metadata collection automation

Veerle’s ReShare demonstration also introduced a discussion on the automation of metadata collection, as ReShare uses the RCUK Gateway to Research API to pull in details about the grants and funding associated with a dataset. There was a wider discussion around whether additional sources such as institutional Current Research Information Systems (CRIS), funder systems and Data Management Plans could be used to populate fields in metadata profiles. Some work has been done in this area, but more is needed to link up systems and reduce the amount of data a researcher has to enter at data ingest. This session also featured Chris Gutteridge introducing the Organisational Profile Document (OPD), a simple text file located on an institution’s home page that could potentially list the repositories at the institution, the metadata specifications used and the location of the API or OAI feed for sharing metadata. Extending the OPD to cover RDM is also the subject of a DCC-led Research Data Spring project.
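The idea of populating a deposit form from existing systems can be sketched as a simple merge: take records from several sources in priority order, keep the first non-empty value for each field, and ask the researcher only for what is left. The source names, field names and example values below are hypothetical, and no real CRIS or funder API is called. In practice each dictionary would be filled from the relevant system.

```python
# Fields the deposit form needs. This list is invented for the sketch.
REQUIRED_FIELDS = ["title", "creator", "funder", "grant_reference", "description"]


def prefill(sources):
    """Merge metadata from several systems.

    `sources` is a list of (source_name, fields) pairs, highest priority
    first. The first non-empty value found for a field wins, and the name
    of the source that supplied it is recorded.
    """
    record, provenance = {}, {}
    for source_name, fields in sources:
        for field, value in fields.items():
            if value and field not in record:
                record[field] = value
                provenance[field] = source_name
    return record, provenance


def fields_for_researcher(record):
    """Return the required fields that no system could supply."""
    return [field for field in REQUIRED_FIELDS if field not in record]


# Placeholder data standing in for responses from three systems.
cris = {"title": "Example project dataset", "creator": "Smith, Jane"}
funder_system = {
    "title": "Example project",
    "funder": "Example Research Council",
    "grant_reference": "EX/000000/1",
}
data_management_plan = {"description": ""}

record, provenance = prefill(
    [("cris", cris), ("funder", funder_system), ("dmp", data_management_plan)]
)
print(fields_for_researcher(record))
```

Recording which system supplied each value makes it easier to trace and correct errors later, at the cost of a little extra bookkeeping.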

Feedback and next steps

Towards the end of the day we heard from commercial and open source research systems suppliers about their thoughts on the day and whether a discipline-neutral specification could be incorporated into their products. The broad consensus from suppliers was that they could be flexible and work with any agreed metadata specification that met the needs of their clients.

To close the day there was a brief discussion on a way forward. There seemed to be a need for a community-led consultation, including a document outlining the work carried out so far and the lessons learned. There were helpful suggestions around engaging with standards bodies such as CASRAI and getting input from the international community through the Research Data Alliance. The full report of discussions from the day will be available on this blog within the next few weeks, and Jisc would like to thank everybody who participated in this useful and informative event.