Research data shared service at IDCC 16

In this post Paul Stokes reports on the Jisc Research Data Shared Service workshop held at IDCC 2016 in Amsterdam on 22nd February 2016.

At IDCC 2016 Jisc held a workshop to share some of our early experiences, and those of others, in putting together the requirements for the Jisc Research Data Shared Service Pilot. Apart from sharing what we knew and the work we had carried out, we also wanted to ask the wider community, especially those from beyond the UK, for their perspective on the challenges experienced when scoping and implementing similar shared services (if indeed such services exist).

Before describing the interactive part of the workshop, it’s worth briefly recapping the presentations that set the scene at the start of the session. We were fortunate to have a number of experienced practitioners involved in the pilot to set us off on the right path.

We started with Rachel Bruce (Deputy chief innovation officer at Jisc) outlining the Research @ Risk program, the drivers behind it and, in particular, Jisc’s motivation for embarking on the research data shared service project [http://www.slideshare.net/JiscRDM/research-at-risk-developing-a-shared-research-data-management-service-for-uk-universities]. This provided John Kaye (Senior Co-Design Manager – Research Data) with an opening to show a little more detail about the shared service project itself [http://www.slideshare.net/JiscRDM/jisc-research-data-shared-service-overview-idcc-2016], revealing to the audience the identities of the pilot institutions, the timeline and the eight core lots making up the underlying framework.

Jenny Mitcham (Digital archivist at the University of York) and Marta Teperek (Research Data Facility Manager at the University of Cambridge) spoke of the current RDM provisions at their institutions and their motivations for being involved as pilot institutions. Both managed to squeeze an enormous number of facts and figures into the brief time allotted to them, so do check out their presentations for more information (http://www.slideshare.net/JiscRDM/jisc-research-data-management-shared-service-workshop-an-institutional-perspective and http://www.slideshare.net/JiscRDM/research-data-management-and-cambridge-and-our-motivations-for-the-pilot).

Then it was time to get the audience working (it was, after all, a workshop). We had a full house, with representatives from all over the globe and a wide range of experience. We divided them into a few ad hoc groups and asked them to consider the procurement lots and the shape of the service introduced by John earlier. Suggested discussion topics included the following (although people were encouraged to “go off on one” if their group felt so inclined):

Have we missed anything? Are there gaps?
Key issues around User Experience
Repository, preservation and reporting platforms for RDM
Hooks and incentives for researchers for using services
Lessons learned, advice for the project and success factors
Any potential linkages to other national and international projects

So what did the workshop come up with? Encouragingly, a lot of what was fed back at the end of the session matched or complemented our earlier findings with UK-centric audiences. Standards and interoperability (the latter, of course, very much dependent upon standards) were a leitmotif throughout the discussions, as was the question of cost, particularly when it comes to transitioning from a pilot project to a service.

The complete set of responses is too long for a short blog post such as this, but some of the more interesting points included:

Ethics: the interoperability of ethical restrictions between systems (in particular at the interface between current research systems and the repository) appears to be a particularly thorny problem.
Costs: working on shared infrastructure projects may help overcome the hurdle of moving from project costs to operational costs.
Post-project funding: how can you fund the retention of data after a project has ended? This is seen to be a particular problem with EC projects.
Big data: on the whole you need to accumulate big data for a long time before it becomes useful (for data mining, etc.). This means that there is a potentially significant period where there is no benefit to offset the cost of preservation.
Dark archive: having a dark archive within a repository could help offset the risk associated with premature release of data or the complete failure to deposit data.
Versioning PIDs: there needs to be a way to connect earlier and later versions of the same data.
Linking data to papers: there’s an obvious need to link the underlying data to the published paper (and vice versa). However, any linking system needs to cope not only with the versioning problem alluded to above, but also with cases where subsequent new research and new papers are produced from the same data.

So what have we done with all this information since? For a start we’ve fed some of it back into our requirements and use cases for the service. Its real impact, however, will be seen in the next stage of the project where we will be working on the detailed pilot institution requirements and entering the development phase.

Have we missed out on considering your particular requirement? Do let us know (email john.kaye@jisc.ac.uk).

Finally, to those of you who attended: thank you for your input.

And everyone, watch this space!

About the author: Paul Stokes is a senior co-design manager at Jisc. He has responsibility for supporting the exploration and, where appropriate, the take-up of innovative technologies and standards that will have a demonstrable positive impact on further and higher education in the UK.