Introduction
Big Data have received a great deal of attention in recent years. Not only is the amount of data on a completely different scale than before, but the data are also more varied in format, structure, and source. In addition, the speed at which these data must be collected and analyzed is increasing. This has directly affected the tools required to store Big Data, and new kinds of data management tools, namely NoSQL systems, have emerged (Cattell, 2011). Compared to traditional systems, NoSQL systems are commonly accepted to support larger volumes of data and to provide faster data access, better scalability, and higher flexibility (Angadi et al., 2013).
One of the key features of NoSQL systems is that databases can be schema-less. This means that, in a table, the attribute names and types are specified at the moment a row is inserted. In relational systems, the user must first define the schema and create the tables, and only then insert data; by contrast, the schema-less property offers considerable flexibility that facilitates the evolution of the physical schema. End-users are able to add information without the help of a database administrator. For instance, in a medical program that follows up patients suffering from a chronic pathology (the case study detailed in the ILLUSTRATIVE EXAMPLE Section), one of the benefits of using NoSQL databases is that the data (and the schema) can evolve smoothly. To follow the evolution of the pathology, information is entered regularly for a cohort of patients. However, a patient’s situation can change rapidly, which requires recording new information. Thus, a few months later, each patient will have their own set of information, and this is how the data evolve over time. Therefore, the data model (i) differs from one patient to another and (ii) evolves in an unpredictable way over time. We should highlight that this flexibility concerns only the physical level, i.e., the stored database (Herrero et al., 2016).
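The schema-less behavior described above can be illustrated with a minimal Java sketch. It is not tied to any particular NoSQL product: each record is modeled as a plain in-memory map, and the class name `SchemalessPatients`, the helper methods, and the patient attributes (`bloodPressure`, `glucose`, `allergy`) are hypothetical names chosen only for illustration. The sketch shows that attribute names are fixed per record at insertion time, so two patients in the same collection can carry different attributes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SchemalessPatients {
    // Builds one record; attribute names and values are supplied at insertion time
    // as alternating key/value arguments (no schema is declared beforehand).
    static Map<String, Object> record(Object... keyValues) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            row.put((String) keyValues[i], keyValues[i + 1]);
        }
        return row;
    }

    // Collects every attribute name that appears in at least one record.
    static Set<String> observedAttributes(List<Map<String, Object>> rows) {
        Set<String> names = new TreeSet<>();
        for (Map<String, Object> row : rows) {
            names.addAll(row.keySet());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical patients: the second one has attributes the first lacks.
        List<Map<String, Object>> patients = List.of(
            record("id", 1, "name", "A", "bloodPressure", "120/80"),
            record("id", 2, "name", "B", "glucose", 5.4, "allergy", "penicillin"));
        // prints [allergy, bloodPressure, glucose, id, name]
        System.out.println(observedAttributes(patients));
    }
}
```

The set of observed attributes grows as new kinds of information are recorded, which is the sense in which the data model differs between patients and evolves over time.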
In information systems, the importance and necessity of conceptual models are widely recognized. A conceptual model provides a high level of abstraction and semantic knowledge close to human comprehension, which supports efficient data management (Abelló, 2015). Furthermore, this model serves as an exchange document between end-users and designers, and between designers and developers. The conceptual model is also used for system maintenance and evolution, which can concern business needs and/or the deployment platform. The Unified Modeling Language (UML) is widely accepted as the standard for information system modeling.
On the one hand, NoSQL systems have proven their efficiency in handling Big Data. On the other hand, the need for a conceptual modeling and design approach remains relevant. Therefore, we are convinced that it is important to provide a precise and automatic approach that guides and facilitates the Big Database implementation task within NoSQL systems. This approach will assist developers in mapping a Big Database UML conceptual model into NoSQL physical models. A tool is also required to maintain data consistency, since most NoSQL systems lack constraint checking and enforcement mechanisms.
For this purpose, we propose the “Object2NoSQL” MDA-based approach. The Model Driven Architecture (MDA) is well known as a framework for automatic model transformations. The Object2NoSQL approach starts from a conceptual model (PIM), consisting of a UML class diagram and OCL constraints, and transforms it into a unified logical model compatible with the four types of NoSQL databases (column, document, graph, and key-value). The conceptual model is automatically transformed into the logical model using QVT rules. Then, the logical model is transformed into a physical model (PSM) after choosing one of the four platforms: Cassandra, MongoDB, Neo4J, or Redis. In this paper, we focus on how to automatically transform a UML/OCL conceptual model into a NoSQL physical model. As discussed in the related work, few solutions have dealt with the conceptual modeling of NoSQL databases. To the best of our knowledge, none of the existing contributions has addressed OCL constraints and their implementation in NoSQL databases.
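To give a rough idea of what checking an OCL constraint in Java can look like, the following sketch hand-codes a single invariant as a predicate over a record. It is only an illustration under stated assumptions and does not reproduce the code generated by the approach: the invariant `self.age >= 0`, the class name `ConstraintCheckSketch`, and the method `canInsert` are hypothetical, and a record with a missing or non-numeric `age` is treated here as violating the constraint.

```java
import java.util.Map;
import java.util.function.Predicate;

class ConstraintCheckSketch {
    // Hypothetical OCL invariant (not taken from the paper):
    //   context Patient inv nonNegativeAge: self.age >= 0
    // Assumption: a missing or non-numeric age counts as a violation.
    static final Predicate<Map<String, ?>> NON_NEGATIVE_AGE = row -> {
        Object age = row.get("age");
        return age instanceof Number && ((Number) age).doubleValue() >= 0;
    };

    // Returns true only if the record satisfies the invariant, so that the
    // application can reject it before writing to a store with no native checks.
    static boolean canInsert(Map<String, ?> row) {
        return NON_NEGATIVE_AGE.test(row);
    }

    public static void main(String[] args) {
        System.out.println(canInsert(Map.of("name", "A", "age", 42))); // prints true
        System.out.println(canInsert(Map.of("name", "B", "age", -3))); // prints false
    }
}
```

Placing the check in application code, before the write, reflects the observation that most NoSQL systems do not enforce such constraints themselves.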
The remainder of the paper is structured as follows. The ILLUSTRATIVE EXAMPLE Section motivates our work using a case study in the healthcare field. The OBJECT2NOSQL APPROACH Section defines our model transformation approach. The OCL2JAVA APPROACH Section introduces our OCL constraint transformation approach and presents two transformation processes: the first creates a logical model starting from an OCL conceptual model, and the second generates the Java code required to check the constraints within a NoSQL database. The RELATED WORK Section reviews previous work. Finally, the CONCLUSION Section summarizes the objectives achieved and outlines future work.