Arriving in St Andrews for the first time last Wednesday (on St Andrew’s Day!) for the research data network meeting, I could see the town nestling off to the east as the taxi swung around to some modern buildings just off the main road. I was to spend the whole day in the excellent facilities of the Medical School, so I will need to come again to see St Andrews itself! Still, roomy meeting rooms, a large comfortable lecture theatre, and an airy public area for breaks all afforded comfort and space for discussion and learning.
The third Jisc Research Data Network meeting was hosted at St Andrews as they are one of the pilot projects for the current research data shared service development. John MacColl, University Librarian and Director of Library Services, kicked off the presentations for the day. His keynote took the library perspective on research data – where it fits in and why it matters. He asked whether research data is an output in its own right, and whether it validates a researcher. He offered the view that unless data is going to be rewarded, there may be no point in holding onto it!
Challenges for data librarians
John focussed on some published reports, including the Ithaka UK Survey of Academics 2015. Extracts from this analysis of relevance to the research data network day included:
- A substantial increase in the number of academics who preserve data in an online repository, and a decrease in local preservation
- Scientists more likely to collect scientific, quantitative or computational data; Non-RLUK academics more frequently collect qualitative data
- <5% of all respondents used their library in organising and managing data
- 20% find it difficult to preserve data long term
On the whole the trends showed greater preservation of research data.
John highlighted the message from the Royal Society Science as an open enterprise report that “data must be accessible, intelligible, assessable and usable”. Further recommendations from this report included that data should be open by default, research councils should include the costs of data preparation and metadata in the costs of research, and that journals should require the underpinning data for publication. The report brings us back to the role of libraries in research data management, and that libraries have a potential relationship with all academic and research staff in this respect.
Considerations for using active data
In the next session that I attended we had the opportunity to hear from two researchers about their experiences with active data and research procedures. Their research fields in NMR spectroscopy and marine mammal research brought a fresh perspective to questions about research data and what to do with it. In particular both fields generate vast quantities of data, and lots of waste. In both these areas, and probably most other fields, one of the main challenges is making datasets discoverable as well as maintaining access to active data.
Parallel sessions at this time were delving into issues and strategies around engaging with arts and humanities researchers who are convinced that they don’t produce data, and using community-generated rubrics to evaluate data management plans. For the arts and humanities, research data can be harder to define. It is important for research data managers and researchers to determine together which data can be archived.
Jisc research data shared service
Back in plenary after lunch, Rachel Bruce and John Kaye updated us on developments for the research data shared service. Rachel described the creation of the University of Jisc as a development environment. Systems have been installed on this framework, in addition to EPrints and DSpace. This has also enabled us to work with other systems and tools from other Jisc development areas such as scholarly communications. John went on to outline the technical architecture, which will be added to our research data network site soon, and to report on developments in integration between systems. The Data Asset Framework report and toolkit will be published soon with good intelligence about researchers’ current views and attitudes towards data.
Birds of a feather…
…flock together! We had asked everyone earlier in the day to sign up to one of our break-out sessions. BoFs are “informal gatherings of like-minded individuals who wish to discuss a certain topic without a pre-planned agenda”. The five topics on offer covered issues around sharing sensitive data; measuring impact for the REF, and the value of the research and its output; a demo from one of our suppliers to the research data shared service, Preservica; a discussion about Jisc’s next co-design research challenge areas; and a session about CRIS and research data. The c.100 delegates spread themselves across the sessions and lively discussions were later reported on in the notes from the meeting.
Further workshop sessions
Another set of workshop sessions was run next. In the Minting DOIs session that I attended, Rosie Higman used case studies from her experiences at Cambridge University. These illustrated various points about managing both the DOI lifecycle and workflow issues. We ended on an interesting discussion about the authorship of data, which is generally not well understood. As data is a separate output, the authorship of the data is likely to be different to that of the publications arising from it. Researchers need to think about who is a ‘creator’ and who is a ‘contributor’.
In parallel with this, Rachel Bruce and Christopher Brown outlined Jisc’s work with the Research Data Alliance (RDA). Jisc is working with RDA Europe to ensure that UK research and its outputs are part of the global research infrastructure. This involves helping to inform the best practices and standards that Jisc and other related UK infrastructure can implement to support the creation, management and sharing of research data as a primary research output and knowledge foundation.
A workshop on frictionless data was also available at this time. Frictionless data is a set of specifications for packaging data, with the containerisation based on open source software. It is trying to solve not just the problem of openness, but also different types of “friction” or barriers such as:
- Legal – compliance and open data sharing agreements
- Data Quality – variable!
- Interoperability – no standardised schema/ontology
- Accessibility – data are hard to find
- No tool integration – manual processes required
There was a demonstration of how anyone can meaningfully add simple metadata to their research data.
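The report does not record what the demonstration looked like, so the following is only a minimal sketch of the general idea: describing a CSV file with a small JSON descriptor in the style of the frictionless data package specifications. The dataset name, file path and column names are all hypothetical, and the helper `build_descriptor` is invented here for illustration; it is not part of any frictionless library.

```python
import json


def build_descriptor(name, csv_path, fields):
    """Return a minimal data-package-style descriptor as a plain dict.

    `fields` is a list of (column name, type) pairs. All values passed
    in below are hypothetical examples, not data from the workshop.
    """
    return {
        "name": name,
        "resources": [
            {
                "name": name + "-table",
                "path": csv_path,
                # The schema lists each column with a simple type so that
                # tools (and people) know how to read the file.
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Hypothetical dataset: counts of marine mammal sightings per site.
descriptor = build_descriptor(
    "seal-sightings",
    "sightings.csv",
    [("date", "date"), ("site", "string"), ("count", "integer")],
)

# Serialise the descriptor; saved next to the CSV, this is the metadata.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Saving the printed JSON as `datapackage.json` alongside the CSV file would, under these assumptions, be enough to give the data a name, a location and a description of each column.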
And finally…
…after a welcome coffee break, we were back in the big lecture theatre together again to hear the results of a recent survey for the Wellcome Trust and ESRC into open research practice, experiences, barriers and opportunities from Veerle Van den Eynden (UK Data Service) and Gareth Knight (London School of Hygiene and Tropical Medicine). This was a preview of the mass of data gathered from c. 800 researchers. The data are in part focussed on the barriers that inhibit or prevent researchers from sharing their data. Some headlines include:
- 95% of researchers report generating research data.
- 77% say they reuse existing data for background, validation, methodology development and new analysis
- Levels of sharing differ between early-career and established researchers – more established researchers are happier to share their data, or are under less pressure to advance their careers.
Overall initial recommendations for data sharing and reuse:
- policy development (provide guidelines, address contradictions between government and funder data sharing policy)
- rewards (recognise in career progress evaluation)
- promotion of the benefits
- infrastructure development
- funding!
There are also similar recommendations for code sharing and reuse. However, the definition of code needs widening, and existing services (GitHub) need to be investigated to see how they are being used and whether they are fit for purpose in the long term.
Hometime
My colleague Catherine Grout closed a very successful day with a round of thanks for all the speakers. She also mentioned the excellent support and venue organisation, and thanked all the participants for making the effort to get to such a far-flung place in late November! She sent us out into a glorious sunset for our onward journeys home.