1. Introduction
A key aspect of enterprise applications is metrics collection, which in turn is used to determine performance indicators that can be tuned and improved to provide higher software quality. For a monolith, these metrics can easily be collected via one entry point, for example a bootstrap URL or application log files. In the case of microservices, however, there is an increased number of interconnections and more intercommunication between processes. A computation that once took place within a single process may now be fragmented into many micro-computations across different services, processes and even threads (Christudas, 2019).
In a monolithic architecture, method invocations and interactions between processes are localized, allowing processes to make assumptions and optimizations for the complete execution of their operations. When the same system is built as microservices, however, the implementation and possibly the deployment are split across processes (Bakshi, 2017) and even locations. This increases the number of metrics collection channels in the application and therefore requires an enhanced approach to metrics collection and feedback.
This is illustrated by the Hospital Information System (HIS) scenario in Figure 1, which shows a system of four components and a single entry point, the Admissions Service. Since this entry point interfaces with the other services, gathering system-wide metrics from this service becomes a trivial task, because all the required data gathering and optimization can be achieved via a single interface. The data stream is sure to reach the remaining three services in a synchronous manner.
Figure 1.
Single entry point for monoliths
In a microservice architecture, each component can be considered a full-fledged, independently deployable application and can therefore generate its own dataset. For example, in Figure 2, the metrics generated by the Patients service at http://path1:port(1) need to be evaluated relative to the metrics generated by the Admissions service at http://path3:port(3), and so on. These data then need to be fed back into the system for the purpose of optimizing system performance.
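The idea of evaluating one service's metrics relative to another's can be sketched as follows. This is only an illustrative sketch, not the approach developed in this research: the `MetricSample` type, the `summarize` and `relative_to` helpers, the choice of latency as the metric, and all sample values are hypothetical. The endpoint strings are the placeholders from Figure 2 and are used purely as labels; no network call is made.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MetricSample:
    """One observation reported by a service (hypothetical structure)."""
    service: str
    endpoint: str       # placeholder label from Figure 2, never dereferenced
    latency_ms: float   # assumed metric; any numeric indicator would do


def summarize(samples):
    """Group samples by service and return the mean latency per service."""
    by_service = {}
    for sample in samples:
        by_service.setdefault(sample.service, []).append(sample.latency_ms)
    return {name: mean(values) for name, values in by_service.items()}


def relative_to(summary, baseline):
    """Express each service's mean latency as a ratio of the baseline's."""
    base = summary[baseline]
    return {name: value / base for name, value in summary.items()}


# Made-up sample values, for illustration only.
samples = [
    MetricSample("patients", "http://path1:port(1)", 10.0),
    MetricSample("patients", "http://path1:port(1)", 30.0),
    MetricSample("admissions", "http://path3:port(3)", 40.0),
    MetricSample("admissions", "http://path3:port(3)", 60.0),
]

summary = summarize(samples)
ratios = relative_to(summary, baseline="admissions")
```

Here each service contributes its own dataset, and the comparison only becomes possible once the separate channels are collected in one place, which is the step a single entry point provides for free in a monolith.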
Therefore, the objective of this research is to develop an approach that can be used to integrate data generation, collection, analysis, feedback and optimization in a microservices architecture. This model adopts a data mining approach for microservices monitoring, data collection, analysis and feedback for performance metrics tuning.
Figure 2.
Multiple channels for microservices architecture
This research is arranged as follows: Chapter 2 provides an overview of the theoretical and architectural framework of data mining in applications. Chapter 3 discusses the concept of metrics gathering and performance tuning for microservices. Some existing research is presented as well.
Then, in Chapter 4, the methodology and proof of concept for the SAFAO approach are covered. Results and discussions are given in Chapter 5. Finally, Chapter 6 summarizes the achievements of the research, its challenges and the possibilities for further research in the area.
2. Data Mining Overview
This section presents a general data mining framework and its application in enterprise systems. Some existing research in this area is also presented.
Data mining, in the context of the recent growth in data generation, combines the principles of statistics with ideas, tools and techniques from computer science, machine learning, database technology and other classical data analysis methods for the collection and discovery of interesting, unexpected or valuable structures in large datasets (Hand, 2007). Data mining can largely be divided into three main classes: