Research Data Management

Tibor Koltay
Copyright: © 2023 | Pages: 12
DOI: 10.4018/978-1-7998-9220-5.ch072

Abstract

This chapter explains the main aspects of research data management (RDM) and the importance of being familiar with it, stressing the centrality of the research data lifecycle and the importance of data sharing, whereby the ideas of the FAIR principles are characterized as a main driver of reuse. The importance of data management plans is emphasized as well. RDM processes such as adding metadata, citing, retrieving, and curating datasets are portrayed. The need for cooperation between disciplinary researchers and different data professionals, such as data librarians and data curators, is highlighted. Moreover, it is underlined that a number of educational programs in data science also cover data management.

Background

The consistent management of research data is especially crucial for the success of any long-term and large-scale collaborative research. RDM is also “the basis for efficiency, continuity, and quality of the research, as well as for maximum impact and outreach, including the long-term publication of data and their accessibility” (Finkel et al., 2020, p. 1).

As emphasized by several researchers, as well as by funding agencies and publishers, organizing and sharing research data is a fundamental part of the research process. Comprehensive RDM is therefore indispensable: it not only fosters reproducible and open scientific research, but also increases citation rates for publications (Borycz, 2021).

The main drivers of RDM are Open Science and Open Data. These initiatives gained momentum with the introduction of the Fair Access to Science and Technology Research Act in the US Congress in 2013 (US Congress, 2013), and this culture change is also strongly supported by the European Commission through the European Open Science Cloud (EOSC, n.d.).

RDM is often viewed as a set of mechanical, managerial, and technical handling processes (Ojanen et al., 2020). However, by encouraging collaboration between researchers and fostering better science, it can also lead to better decision-making (USGS, n.d.).

RDM falls mainly within the domain and responsibility of researchers and data professionals. The latter group includes (data) librarians, data curators, and data stewards. RDM is unimaginable without the services offered by research offices, academic libraries, and computing (information technology) service units. Besides these professionals, data scientists must be knowledgeable about RDM because it is one of the main services that provide them with the data they work with. Conversely, RDM is also a data science issue: data science should not be restricted to machine learning or statistics, so data scientists need to face the challenges of organizing and storing data. As declared by Davenport and Patil (2012, p. 73), a data scientist’s job is “bringing structure to large quantities of formless data and making analysis possible.” In light of this, we need to acknowledge that their job goes beyond formal data analysis and includes, among other things, RDM.

Strongly related to the broad and varied aspects of RDM and data science are the activities of the Research Data Alliance, which is a “community-driven organization dedicated to the development and use of technical, social, and community infrastructure promoting data sharing and data-driven exploration.” This organization is particularly important for “the global academic community where research infrastructure is often ad hoc, may have a short shelf-life, and hard to fund” (Berman & Crosas, 2020).

Data stewardship also deserves attention because data stewards take care of data assets that do not belong to them. Their aim is to ensure that data-related work is performed in accordance with policies and practices. Together with the preservation of data, their activities also fall within the domain of data science.

Key Terms in this Chapter

Research Data Management: An integral part of the research process, helping to ensure that datasets pertaining to a project are properly organized, described, preserved, and shared. Because it involves the care and maintenance of the data, RDM is exercised throughout the given research cycle.

Data Curation: A complex activity for the active and ongoing management of data through its lifecycle in order to preserve, share, and discover it.

Data Reuse: A set of activities for making data available in order to eliminate or reduce repeated access to the same data, bringing varied benefits to scientific research.

Data Retrieval: A process of identifying and extracting data through queries made against a database.

Data Sharing: A complex activity for making data available to other investigators in order to allow its use and reuse for scholarly research.

Data Management Plan: A formal document that states what researchers intend to do with the data during and after their research project.
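
To make the Data Retrieval term above more concrete, the following is a minimal sketch of identifying and extracting dataset records by a query made against a database. It is an illustration only and is not taken from the chapter: the `datasets` table, its columns, the sample records, and the helper names `build_demo_catalogue` and `find_datasets` are all hypothetical, and an in-memory SQLite database stands in for a real repository catalogue.

```python
import sqlite3


def build_demo_catalogue():
    """Create a small in-memory catalogue (hypothetical schema and records)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datasets (identifier TEXT PRIMARY KEY, title TEXT, license TEXT)"
    )
    conn.executemany(
        "INSERT INTO datasets VALUES (?, ?, ?)",
        [
            ("ds-001", "Soil moisture survey", "CC-BY-4.0"),
            ("ds-002", "Interview transcripts", "restricted"),
            ("ds-003", "Soil temperature logs", "CC0-1.0"),
        ],
    )
    conn.commit()
    return conn


def find_datasets(conn, keyword):
    """Return (identifier, title) pairs whose title contains the keyword.

    The keyword is passed as a bound parameter rather than pasted into the
    SQL string, so user input cannot alter the query itself.
    """
    cursor = conn.execute(
        "SELECT identifier, title FROM datasets "
        "WHERE title LIKE ? ORDER BY identifier",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    catalogue = build_demo_catalogue()
    for identifier, title in find_datasets(catalogue, "soil"):
        print(identifier, title)
```

In this sketch the query identifies matching records by title and extracts only the fields needed to locate each dataset. A production catalogue would more likely query richer descriptive metadata, which is one reason the metadata step of RDM matters for later retrieval.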
