Metadata focus groups - report published

In the latter half of 2016, the Research Data Shared Service project commissioned a series of focus groups with the pilot institutions. Nicky Ferguson from Clax Ltd. joined Research Consulting to conduct these focus groups. Alongside wider data and cultural issues, Nicky concentrated specifically on metadata issues.

Photo: User requirements created by researchers in the focus groups were captured using the classic Post-It method (photo courtesy of Research Consulting)

The focus groups consisted of brief presentations, discussions, sometimes in small groups, and individual written feedback using various prompts. In some cases, if the institution preferred it, individual interviews were conducted, covering the same topics as the focus groups. The outputs from Nicky’s work are available at:

Report: https://doi.org/10.5281/zenodo.193018

Use case dataset: https://doi.org/10.5281/zenodo.193011

The use cases were produced from all the participant activity in the focus groups, some transcribed directly from handwritten participant forms and some noted by us during the discussions and interviews. Nine focus groups were held and participants came from 11 of the 12 Research Data Shared Service pilot institutions and from CREST, a consortium of 22 smaller and/or specialist institutions, which is also part of the pilot.

Emerging themes

Focus groups expressed concern about a number of areas with regard to metadata. Some can be addressed by training and support; many can be addressed by suppliers working with institutions and the Research Data Shared Service. A few require new technologies or culture change.

Early creation and collection of metadata was often mentioned. This can be achieved through the use of dynamic data management plans, so that metadata is collected from the planning stage and updated throughout the data collection and analysis process. Systems should preserve the form and content of the deposited data while allowing the metadata to be updated to link to related datasets, subsequent publications and other materials which may have been created after the data was deposited. They should also allow updating of keywords and descriptive materials to reflect changes in the discipline. The facility to include links in the metadata to other Digital Object Identifiers (DOIs), and to URLs where a DOI does not exist, is essential.
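As a rough illustration of that requirement, the Python sketch below keeps the deposited data fixed while leaving the metadata open to later updates. It is not drawn from the report or from any Research Data Shared Service system: the class names, field names and relation labels are all hypothetical, and the DOI check is a deliberately simple pattern rather than a full validator.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Simple, illustrative DOI shape check (assumption: not a complete validator).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass(frozen=True)
class DepositedData:
    """Hypothetical description of the deposited files; frozen so it cannot change."""
    checksum: str
    size_bytes: int


@dataclass
class DatasetRecord:
    """Hypothetical record: fixed data, updatable metadata."""
    data: DepositedData
    title: str
    keywords: List[str] = field(default_factory=list)
    related: List[Dict[str, str]] = field(default_factory=list)

    def add_related(self, identifier: str, relation: str) -> None:
        """Link to a related item by DOI, or by URL where no DOI exists."""
        # Accept DOIs written as resolver links or with a "doi:" prefix.
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if identifier.startswith(prefix):
                identifier = identifier[len(prefix):]
                break
        if DOI_PATTERN.match(identifier):
            id_type = "DOI"
        elif identifier.startswith(("http://", "https://")):
            id_type = "URL"
        else:
            raise ValueError("expected a DOI or an http(s) URL: " + identifier)
        self.related.append(
            {"identifier": identifier, "type": id_type, "relation": relation}
        )


record = DatasetRecord(
    data=DepositedData(checksum="abc123", size_bytes=1024),
    title="Example dataset",
    keywords=["metadata"],
)
# Metadata can be extended after deposit...
record.add_related("https://doi.org/10.5281/zenodo.193018", "IsDocumentedBy")
record.keywords.append("research data management")
# ...whereas assigning to record.data.checksum would raise an error.
```

Separating the frozen data description from the mutable record is one possible design; a real repository would more likely enforce this through versioning and access controls.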

It is often assumed that the collection of metadata will involve researchers in arduous and time-consuming form-filling at data deposit time. This is undesirable and unlikely to produce good metadata. Instead, automating tools, collection processes and equipment, and integrating metadata collection into researchers’ workflow throughout the research will, ideally, allow a push-button submission of the data to the repository, with metadata already attached.
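To make the idea of automated capture concrete, here is a minimal Python sketch that derives basic technical metadata from a file without any form-filling. It is only an assumption about what such a step might look like: the function name and the chosen fields are hypothetical, and descriptive metadata (subject, methods, funder) would still need to come from elsewhere in the workflow.

```python
import hashlib
import os
from datetime import datetime, timezone


def capture_file_metadata(path, chunk_size=65536):
    """Return basic technical metadata for one file (hypothetical field names)."""
    digest = hashlib.sha256()
    # Read in chunks so that large files are not loaded into memory at once.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "sha256": digest.hexdigest(),
        "modified": modified.isoformat(),
    }
```

A step like this could run whenever an instrument or script writes a file, so that the record already exists by the time the data is submitted.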

Concerns were frequently expressed by users and creators of very large data and of sensitive data, which carries specific confidentiality and ethical considerations. The Research Data Shared Service will need to bear these considerations in mind if it is to provide a service to all academic researchers.

In order to fit with researchers’ existing workflows, the Research Data Shared Service will need to provide some level of integration with local, national and international services. Current Research Information Systems (CRIS), DataCite and GitHub were frequently mentioned.

Many thanks to the research support staff at all the institutions Nicky visited for setting up these consultations; and of course to the researchers, technicians, research managers, publishers, funders, repository managers and support staff who wholeheartedly gave up their time to discuss frankly the difficulties, benefits and future requirements for creating, recording, submitting, sharing, storing, securing and searching their research data.