Real-Time Pose Recognition for Billiard Players Using Deep Learning

DOI: 10.4018/979-8-3693-1738-9.ch010

Abstract

In this book chapter, the authors propose a method for player pose recognition in billiards matches that combines keypoint extraction with an optimized transformer. Because conventional human pose analysis methods usually incur high labour costs, the authors explore deep learning methods to achieve real-time, high-precision pose recognition. First, they use human keypoint detection to extract the players' keypoints from real-time video and generate keypoint data. The keypoint data is then fed into the transformer model for pose analysis and recognition. In addition, the authors design a human skeletal alignment method for comparison with standard poses. The experimental results show that the method performs well in recognizing players' poses in billiards matches and provides real-time feedback on players' pose information. This research provides a new and efficient tool for training billiard players and opens up new possibilities for applying deep learning in sports analytics. A further contribution is the creation of a dataset for pose recognition.

Introduction

In recent years, deep learning has developed rapidly, and computer vision in particular has become one of the core areas of computing (He et al., 2016). Human pose estimation is one of the essential branches of computer vision, and deep learning has proved effective in dealing with diverse poses, complex scenes, and occlusion (Sun et al., 2019). The purpose of human pose estimation is to predict or detect the pose of a human body from an image or video; traditional pose estimation methods, however, rely on manually designed features and statistical methods (Cao et al., 2017). With the emergence of deep neural networks such as Convolutional Neural Networks (CNN), Transformers, and Recurrent Neural Networks (RNN), more accurate and robust methods for human pose estimation have been presented (Chen et al., 2018).

With the growing popularity of computer vision technology, human posture estimation has been widely used in behaviour recognition, medical rehabilitation, multimedia, and other fields, and motion analysis has become an active research direction. For example, Wang et al. used a convolutional neural network to identify human posture in a single RGB image and offered personalized suggestions to skiers, which improved the training experience (Wang et al., 2019). There is therefore an urgent need for a real-time, accurate human posture analysis method that can collect athletes' posture data in real time and analyze it automatically. With such a method, athletes can immediately correct posture faults, improve the effect of training, and avoid sports injuries.

Despite the significant advances in human pose estimation achieved with deep learning, many challenges remain, especially in specific domains such as billiards. Most existing deep learning methods rely on large-scale labelled data (Wen et al., 2016), and obtaining high-quality labelled data in real scenarios is both time-consuming and laborious. In billiards games, occlusion by the player's arms, the cue, the balls, and the tabletop makes labelling even more difficult (Andriluka et al., 2014).

Furthermore, although deep learning models perform well at human pose estimation on single images, ensuring the temporal continuity and accuracy of pose estimation across continuous actions or video sequences remains a challenge that has yet to be fully addressed (Carreira & Zisserman, 2017). Temporal continuity and accuracy are fundamental in applications that demand high timeliness, such as sports, e.g., real-time analysis of and recommendations on players' poses or stroke strategies (Choutas et al., 2018). In addition, in billiards and other sports where human poses and environments are variable and complex, it may be difficult for a single deep learning model to cover all scenarios and actions, which calls for solutions that generalize well and are robust (Yang et al., 2017).

The transformer is an effective deep learning model. Through its self-attention mechanism, it can capture the temporal and spatial relationships in the input sequence and thereby improve model performance. Its capacity for parallel computation also makes the transformer more efficient at processing large-scale human keypoint data (Vaswani et al., 2017). Building on the transformer, we propose a new method that combines it with keypoint recognition. Keypoint recognition continuously outputs the coordinates of human keypoints, which carry spatio-temporal relationships. The transformer processes this coordinate-based keypoint data, captures the spatial relationships between body parts, and takes into account the posture relevance across different nodes, so that human posture can be recognized accurately and efficiently.
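To make the self-attention idea above concrete, the following is a minimal sketch of scaled dot-product self-attention (Vaswani et al., 2017) applied to a short sequence of keypoint frames. It is an illustration only and not the chapter's optimized transformer: it assumes identity query, key, and value projections (a real transformer learns these and adds multiple heads and positional encoding), and the function names and toy coordinates are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity projections.

    frames: list of T feature vectors of equal length d, where each
    vector is one video frame's flattened keypoint coordinates.
    Returns (outputs, weights); weights[i][j] is how strongly frame i
    attends to frame j.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    weights = []
    for q in frames:
        # Similarity of this frame to every frame, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in frames]
        weights.append(softmax(scores))
    # Each output is an attention-weighted average of all frames.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, frames)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical toy input: 3 frames, 2 keypoints each, as (x, y) pairs
# in normalized image coordinates.
frames = [
    [0.10, 0.20, 0.30, 0.40],
    [0.12, 0.21, 0.31, 0.42],
    [0.90, 0.80, 0.70, 0.60],
]
outputs, weights = self_attention(frames)
```

Because every frame attends to every other frame in a single step, all rows of the weight matrix can be computed independently, which is the source of the parallelism mentioned above.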

Real-time posture recognition rests on scientific and technological methods. Human keypoint recognition can collect a large volume of player posture data, and the transformer can analyse player posture from this data to obtain accurate results. This research can help to better understand the performance of billiards players. At the same time, real-time posture recognition can provide valuable data for coaches and other professionals to study and improve billiards techniques and teaching methods.
