Deep Reinforcement Learning for Adaptive Stock Trading: Tackling Inconsistent Information and Dynamic Decision Environments

Lei Zhao, Bowen Deng, Liang Wu, Chang Liu, Min Guo, Youjia Guo

Source Title: Journal of Organizational and End User Computing (JOEUC) 36(1)

DOI: 10.4018/JOEUC.335083


Abstract

In this study, the authors explore how financial institutions make decisions about stock trading strategies in a rapidly changing and complex environment. These decisions are made with limited, often inconsistent information and depend on the current and future strategies of both the institution itself and its competitors. The authors develop a dynamic game model that accounts for this imperfect information and the evolving nature of decision-making. To model reward transitions, they combine t-Copula simulation of a non-stationary Markov chain, probabilistic fuzzy regression, and chaos optimization algorithms. They then apply a deep Q-network (DQN), a method from deep reinforcement learning, to ensure the effectiveness of the chosen strategy during ongoing decision-making. The approach is relevant both to researchers across fields and to practitioners in the finance industry.

1. Introduction

Decision making is the process of choosing a course of action based on imperfect information about the environment and one's opponents. Because it matters in many fields, it has received much attention from scientists. The need to make decisions with imperfect information, under uncertainty, in inconsistent and dynamic environments is particularly acute for institutions trading in the stock market. Matters become even more complicated when the participants in the game (competition) must make their strategic business decisions in a conflicting or cooperative way (A.R. Heidari, 2010; Sun et al., 2022a). A cooperative competition strategy evaluates the effects of a strategy not only on the institution itself but also on its opponent, whereas a conflicting strategy maximizes the reward only for the institution itself (J.-Y. Kim, J.Y. Kwon, 2017). As the simulation results in this study demonstrate, financial institutions should adopt the conflicting competition strategy in some scenarios and the cooperative competition strategy in others. When decision making in a real environment becomes complex, the optimum equilibrium of the game cannot easily be reached without the aid of intelligent computational algorithms.
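The difference between the two competition strategies can be illustrated with a toy two-institution game. The sketch below is only an illustration under stated assumptions: the payoff matrices are hypothetical numbers, not data from the study, and "cooperative" is modeled here as maximizing the sum of both institutions' rewards, which is one possible reading of a strategy that accounts for the opponent.

```python
import numpy as np

# Hypothetical payoff matrices (illustrative numbers, not from the study).
# Rows index institution A's action, columns index institution B's action.
payoff_a = np.array([[3.0, 1.0],
                     [5.0, 0.0]])
payoff_b = np.array([[3.0, 4.0],
                     [0.0, 1.0]])


def best_action_conflicting(own_payoff, opponent_action):
    """Pick the action that maximizes only the institution's own reward."""
    return int(np.argmax(own_payoff[:, opponent_action]))


def best_action_cooperative(own_payoff, other_payoff, opponent_action):
    """Pick the action that maximizes the joint (summed) reward.

    Using the sum as the cooperative objective is an assumption made
    for this sketch.
    """
    joint = own_payoff + other_payoff
    return int(np.argmax(joint[:, opponent_action]))


# Institution A responds to B playing action 0.
conflicting_choice = best_action_conflicting(payoff_a, opponent_action=0)
cooperative_choice = best_action_cooperative(payoff_a, payoff_b, opponent_action=0)
print(conflicting_choice, cooperative_choice)
```

With these made-up payoffs the two objectives recommend different actions for institution A, which is the kind of scenario dependence the paragraph above describes.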

To deal with inconsistent information and a dynamic decision-making environment in isolation, without interference from other conditions, this study chose a small-scale listed company. According to the company's annual report, only two institutions traded its shares over a long period of time. In this study, a dynamic imperfect-information game model and algorithm are built to address the inconsistency of information and the dynamic nature of the decision-making process, with the aim of maximizing rewards under two scenarios, conflicting (algorithm 1) and cooperative (algorithm 2). Next, to model the risk transition probabilities used in the Markov Decision Process of Reinforcement Learning, Probabilistic Fuzzy Regression (PFR), the Chaos Optimization Algorithm (COA), and t-Copula simulation of a Non-Stationary Markov Chain model (algorithm 3) are implemented to study the effect of external risk factors on the transition probabilities. Finally, the Deep Q-Network (DQN) of Reinforcement Learning is used to estimate the optimum actions (strategies) derived from the progressive game decision process under the two assumptions, conflicting or cooperative. Each institution has either three opportunities to adjust its stock order placement strategies (Algorithm 3) or unlimited opportunities (Algorithm 4). The former setting focuses on the impact of competition type and reaction speed, while the latter focuses on comparing compensation under different competition setups. In addition, to study the long-term dynamic decision-making of participants in an inconsistent, stochastic, complex environment, the optimal order placement strategies and the optimal competition strategies are estimated under the infinite-adjustment scenario. The method proposed in this paper is therefore of significance both to interdisciplinary research and to practitioners.
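The interplay between time-varying transition probabilities and value-based reinforcement learning can be sketched in a few lines. The example below is a simplified stand-in, not the authors' method: it uses tabular Q-learning in place of a Deep Q-Network, and a hand-written oscillating risk factor in place of the PFR, COA, and t-Copula pipeline. The state and action counts, the reward function, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 3, 2            # hypothetical problem size
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.2  # assumed hyperparameters


def transition_matrix(t):
    """Return a time-varying (non-stationary) transition matrix.

    A hypothetical external risk factor oscillates over time and shifts
    probability mass toward the last (high-risk) state. For simplicity,
    the transition ignores the chosen action.
    """
    risk = 0.5 * (1.0 + np.sin(0.1 * t))            # value in [0, 1]
    weights = np.ones((N_STATES, N_STATES))
    weights[:, -1] += 2.0 * risk
    return weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1


def reward(state, action):
    """Hypothetical reward: action 1 pays more in low-risk states."""
    return 1.0 + action * (1.0 - state)


q_table = np.zeros((N_STATES, N_ACTIONS))
state = 0
for t in range(2000):
    # Epsilon-greedy action selection.
    if rng.random() < EPSILON:
        action = int(rng.integers(N_ACTIONS))
    else:
        action = int(np.argmax(q_table[state]))
    # Sample the next state from the matrix valid at time t.
    next_state = int(rng.choice(N_STATES, p=transition_matrix(t)[state]))
    # Q-learning target: r + gamma * max over a' of Q(s', a').
    target = reward(state, action) + GAMMA * np.max(q_table[next_state])
    q_table[state, action] += ALPHA * (target - q_table[state, action])
    state = next_state

print(q_table)
```

A DQN replaces `q_table` with a neural network that approximates the same action values, which becomes necessary when the state space is too large to enumerate.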

The approach proposed in this study addresses several obstacles common in dynamic decision-making with inconsistent information under stochastic decision outcomes (T. Zu, M. Wen, R. Kang, 2017; Sun et al., 2021). First, in real business competition, information is seldom complete. In most circumstances, the quality of the information is insufficient at best. Participants in the game often deliberately release false (hence inconsistent) information about their strategies in order to mislead their opponents (Sun et al., 2022b; Sun et al., 2022c). As a result, an imperfect dynamic information game model with different competition scenarios (conflicting or cooperative) is proposed to isolate the untrustworthy information and abstract the true behavior pattern from the actual events. Since the reward is an audited key risk indicator as well as a well-known performance measure, even its lagged figure can be quite useful in collecting information. In addition, by comparing the simulation results under conflicting scenarios with those under cooperative scenarios, the effect of misleading information can be kept to a minimum. Detailed descriptions and reasoning for the game model and competition scenarios are given in Section 2.
