Introduction
One of the advantages of service-oriented computing is that it allows business processes to be composed by executing distributed web services (Jordan et al., 2007). Unlike traditional distributed transaction processing, however, each service is autonomous and platform-independent, so the commit of a service execution is controlled by the site where the service resides rather than by the global process. As a result, processes composed of web services do not generally execute as transactions that conform to the concept of serializability. Because a service can commit before its global process completes, dirty reads and dirty writes can occur among concurrently executing processes.
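As an illustrative sketch (the process names, data items, and history format below are hypothetical, not from the paper), the dirty-read anomaly described above can be seen in a small operation history: a service of process P1 commits a write locally, a service of P2 reads that data, and P1 later fails and must recover.

```python
# Hypothetical operation history: (process, action, data_item).
history = [
    ("P1", "write", "inventory"),  # P1's service writes and commits locally
    ("P2", "read",  "inventory"),  # P2 reads data P1 has committed locally
    ("P1", "abort", None),         # P1's global process later fails
]

def dirty_reads(history):
    """Return (reader, writer, item) triples where a process read data
    written by another process that aborted at the global level."""
    aborted = {p for p, action, _ in history if action == "abort"}
    last_writer = {}  # data item -> process that last wrote it
    dirty = []
    for proc, action, item in history:
        if action == "write":
            last_writer[item] = proc
        elif action == "read":
            writer = last_writer.get(item)
            if writer and writer != proc and writer in aborted:
                dirty.append((proc, writer, item))
    return dirty

print(dirty_reads(history))  # [('P2', 'P1', 'inventory')]
```

In a serializable execution this history would be disallowed; in a service composition, where each service commits independently, detecting such dependencies after the fact is exactly the analysis problem the paper addresses.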
From an application point of view, dirty reads and dirty writes do not necessarily indicate an incorrect execution, and a relaxed form of correctness based on application semantics can yield better throughput and performance. User-defined correctness of a process can be specified as in related work on advanced transaction models (Rolf, Klas, & Veijalainen, 1998) and transactional workflows (Worah & Sheth, 1997), using concepts such as compensation to semantically undo a process. Even when one process determines that it needs to execute compensating procedures, however, information about global data dependencies is needed to determine how the data changes caused by the recovery of one process can affect other processes that have read or written data modified by the services of the failed process. The ability to capture and analyze such data dependencies does not exist in current service-oriented architectures, creating data consistency problems for concurrent execution and limiting the effectiveness of recovery procedures for failed processes.
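The notion of compensation mentioned above can be sketched minimally as follows; the driver function and step names are hypothetical and are not part of the paper's model, which defines compensation at the level of process and service semantics.

```python
def run_with_compensation(steps):
    """Execute (action, compensator) pairs in order; if a step fails,
    run the compensators of the completed steps in reverse order to
    semantically undo their already-committed effects."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for comp in reversed(completed):
                comp()
            raise

# Usage: the second step fails, so the first step is compensated.
log = []

def charge():
    raise RuntimeError("charge failed")

steps = [
    (lambda: log.append("reserve"), lambda: log.append("cancel_reserve")),
    (charge,                        lambda: log.append("refund")),
]
try:
    run_with_compensation(steps)
except RuntimeError:
    pass
print(log)  # ['reserve', 'cancel_reserve']
```

Note that compensation here only undoes the failed process's own steps; as the paragraph above points out, it does not by itself reveal which *other* processes read or wrote the data that the compensation changes back.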
This paper presents the results of our investigation of an approach that performs decentralized data dependency analysis among concurrently executing processes in a service-oriented environment. In particular, we present the concept of Process Execution Agents (PEXAs) and the manner in which multiple PEXAs communicate to discover data dependencies that can be used to support recovery activities. A PEXA is responsible for controlling the execution of processes that are composed of web services. Each PEXA is associated with a specific distributed site and is also responsible for capturing, and exchanging with other PEXAs, information about the data changes that occur at that site in the context of service executions.
The ability to capture data changes, known as deltas, builds on our previous work with Delta-Enabled Grid Services (DEGS) (Blake, 2006; Urban, Xiao, Blake, & Dietrich, 2009). A DEGS is a Grid Service that has been extended with the capability to record and externalize incremental data changes using features such as Oracle Streams (Tumma, 2004). Whereas the work in Urban, Xiao, et al. (2009), Xiao (2006), and Xiao and Urban (2008a, in press) forwarded streaming deltas from multiple DEGSs to a single, time-ordered delta object schedule for a centralized approach to data dependency analysis, the work presented in this paper extends the data dependency analysis process to support decentralized communication among multiple PEXAs. Each PEXA creates its own local delta object schedule that can be used to construct process dependency graphs. Since a process can execute services associated with multiple PEXAs, however, the data dependency analysis process requires a global view of distributed process dependency graphs.
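To make the idea of deriving a process dependency graph from a time-ordered delta object schedule concrete, the following sketch shows one possible derivation under simplified assumptions: each delta is reduced to a timestamp, the process whose service produced it, and the data item changed, and an edge is drawn from a process to any earlier process that modified the same item. The data layout and function names are illustrative, not the paper's actual schedule or graph structures.

```python
# Hypothetical local delta object schedule: (timestamp, process, data_item),
# as a PEXA might accumulate it from the DEGSs at its site.
deltas = [
    (1, "P1", "x"),
    (2, "P2", "x"),   # P2 changed x after P1 did
    (3, "P2", "y"),
    (4, "P3", "y"),   # P3 changed y after P2 did
]

def dependency_edges(deltas):
    """Return edges (later, earlier) meaning process `later` is
    write-dependent on `earlier`: it modified a data item that
    `earlier` had modified before it in the schedule."""
    last_writer = {}  # data item -> most recent writing process
    edges = set()
    for _, proc, item in sorted(deltas):
        prev = last_writer.get(item)
        if prev is not None and prev != proc:
            edges.add((proc, prev))
        last_writer[item] = proc
    return edges

print(sorted(dependency_edges(deltas)))  # [('P2', 'P1'), ('P3', 'P2')]
```

In the decentralized setting described above, each PEXA would compute such edges only from its local schedule; because a single process can touch data at several sites, the PEXAs must exchange their local graphs to obtain the global view the paper calls for.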