1. Introduction
Access control is an important component of information security to protect data from unauthorized access. It is therefore linked to trust, privacy and data integrity (Herzberg, Mass, Mihaeli, Naor, & Ravid, 2002).
The recent explosion of data produced in every field of science has resulted in the emergence of highly data-driven sciences. These sciences, in turn, generate new data resources, adding to the complexity of storing, managing and sharing this data. To cope with this "data tsunami", information platforms are needed that can deal with the increased volume, scale and complexity of managing, preserving and sharing these data. In 2010, the Riding the Wave report identified the need for an e-infrastructure for data management enabling seamless access, re-use and trust of data (Wood et al., 2010). The authors present challenges to improve scientific discovery and support collaboration across disciplinary and geographical boundaries. These challenges include: data preservation and curation; linking people and data; describing data through adequate metadata and semantics for data discovery; enabling interoperability and data exchange across scientific domains; and establishing trust through adequate authentication and authorization platforms.
The Epidemic Marketplace (EM) is an information platform that aims to address the above challenges in the epidemiology domain (Lopes et al., 2010). It stores and manages health-related resources, including sensitive data, such as epidemic incidence datasets (see http://www.epimarketplace.net). A major challenge for platforms like the Epidemic Marketplace is that the data is often sensitive in nature and must therefore be protected by adequate access control mechanisms. While some previous approaches only hint at giving resource owners more control over their resources, the standard practice for managing sensitive health-related resources is to give this responsibility to the resource owner. In some countries, this is also a legal requirement (European Commission, 1995, 2012). Therefore, there is a need to empower resource owners in the EM to control who has access to their resources.
Scientific communities, like other publishing communities, have previously adopted several approaches to access control. Some provide variations of Role-Based Access Control (Ferraiolo, Sandhu, Gavrila, Kuhn, & Chandramouli, 2001). However, such approaches are tailored to specific organizational needs and limited to administrator-created roles. They do not adequately address the sharing needs of typical scientific resource creators, particularly in a vast user community where every user is potentially a dataset generator. Furthermore, permission and role assignment become increasingly complex as the number of roles required by users increases.
Other access control approaches explore the necessity of user groups for larger granularity in user permissions (Chan, 2004; Kapica, 2014). However, while providing valuable insight into the problem of group-based access control, they continue to rely on administrators for the creation of user groups and for user-group assignments, in a similar way to POSIX access control lists (Grunbacher & Nuremberg, 2003). These group-based approaches differ in how permissions are assigned: some require this to be done by users with administrative roles, while others give this responsibility to resource owners or creators.
We adopt a decentralized and discretionary approach to permission assignment as well as to the management of user groups. Additionally, our access control model includes an object structure that separates data from metadata, enabling the search of resources without exposing sensitive data.
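To make the idea concrete, the following is a minimal Python sketch of a group-based, owner-controlled model with separate metadata and data permissions. It is an illustration only and not the Epidemic Marketplace implementation: the class names (Group, Resource, Repository), the method names, and the choice to grant metadata and data access as two distinct permissions are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical user group, created and managed by an ordinary user."""
    owner: str
    members: set = field(default_factory=set)


@dataclass
class Resource:
    """Hypothetical resource that keeps metadata apart from the data itself."""
    owner: str
    metadata: dict
    data: bytes
    metadata_groups: set = field(default_factory=set)  # groups that may see metadata
    data_groups: set = field(default_factory=set)      # groups that may read the data


class Repository:
    def __init__(self):
        self.groups = {}
        self.resources = []

    def create_group(self, owner, name, members=()):
        # Decentralized: any user can create a group, no administrator needed.
        self.groups[name] = Group(owner=owner, members=set(members))

    def add_member(self, actor, name, user):
        # Only the group's owner manages its membership.
        if actor != self.groups[name].owner:
            raise PermissionError("only the group owner may change membership")
        self.groups[name].members.add(user)

    def add_resource(self, owner, metadata, data):
        resource = Resource(owner=owner, metadata=metadata, data=data)
        self.resources.append(resource)
        return resource

    def grant(self, actor, resource, group_name, include_data=False):
        # Discretionary: only the resource owner assigns permissions.
        if actor != resource.owner:
            raise PermissionError("only the resource owner may grant access")
        if group_name not in self.groups:
            raise KeyError(group_name)
        resource.metadata_groups.add(group_name)
        if include_data:
            resource.data_groups.add(group_name)

    def _in_any(self, user, group_names):
        return any(user in self.groups[g].members for g in group_names)

    def search(self, user, term):
        # Search matches metadata values only and never returns the data.
        hits = []
        for r in self.resources:
            visible = user == r.owner or self._in_any(user, r.metadata_groups)
            if visible and any(term.lower() in str(v).lower() for v in r.metadata.values()):
                hits.append(r.metadata)
        return hits

    def read_data(self, user, resource):
        if user == resource.owner or self._in_any(user, resource.data_groups):
            return resource.data
        raise PermissionError("no data permission for this user")
```

In this sketch, a user who belongs to a group with metadata permission can discover a dataset through search, while reading the dataset itself requires a separate grant from its owner.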
In this paper, we explore the implementation of access control mechanisms from these technologies enabling users to share their created resources while protecting sensitive data. We identify the access control requirements of an epidemiological resource repository and propose a group-based approach to access control over repository resources, which could also be used in similar scientific environments.