Author: 林聖傑
Lin, Sheng-Jie
Thesis Title: 基於深度學習之羽球動作分析系統
A Badminton Pose Analysis System Based on Deep Learning
Advisor: 方瓊瑤
Fang, Chiung-Yao
Committee: 方瓊瑤
Fang, Chiung-Yao
陳世旺
Chen, Sei-Wang
黃仲誼
Huang, Zhong-Yi
羅安鈞
Luo, An-Chun
吳孟倫
Wu, Meng-Luen
Approval Date: 2024/07/12
Degree: 碩士
Master
Department: 資訊工程學系
Department of Computer Science and Information Engineering
Thesis Publication Year: 2024
Academic Year: 112
Language: Chinese
Number of pages: 48
Keywords (in Chinese): 羽球, 羽球動作辨識, 羽球動作分析, 3D人體模型分析, 資料增強, 電腦視覺
Keywords (in English): Badminton, Badminton Motion Recognition, Badminton Motion Analysis, 3D Human Model Analysis, Data Augmentation, Computer Vision
Research Methods: Experimental design, Comparative research, Observational research, Phenomenon analysis
DOI URL: http://doi.org/10.6345/NTNU202401359
Thesis Type: Academic thesis/ dissertation

Taiwan won one gold medal and one silver medal in badminton at the 2020 Tokyo Olympics, and the number of badminton players in Taiwan has continued to rise since then. This study therefore proposes a deep learning-based badminton motion analysis system. The user inputs a video of a badminton movement, and the system analyzes whether the movement is performed correctly, which helps to avoid injury. It can also save users expensive coaching and venue fees.
The badminton motion analysis system consists of three parts: data preprocessing, a badminton motion recognition subsystem, and a 3D human model construction and analysis subsystem. Badminton is the fastest racket sport in the world, so fast-moving objects are often blurred when filmed; the data preprocessing step addresses these blurry images. The recognition subsystem uses the Frame Flexible Network architecture to learn feature maps from different frequencies, then applies the Temporal Shift Module, which shifts the feature maps of a subset of channels along the time axis to achieve temporal fusion. The analysis subsystem uses recent 3D human model technology: it takes the model's 24 human keypoints and applies Procrustes analysis to output the joints that are prone to injury.
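To make the temporal shift and Procrustes steps more concrete, the following is a minimal NumPy sketch, not the thesis implementation (this record does not include code). The function names, the 1/8 channel split, the (T, C, H, W) feature layout, and the idea of comparing a player's 24 joints against a reference pose are all assumptions made for illustration.

```python
import numpy as np


def temporal_shift(features, fold_div=8):
    """Shift part of the channels along time (assumed layout: T, C, H, W).

    1/fold_div of the channels take values from the next frame,
    another 1/fold_div from the previous frame, the rest stay in place.
    Frames with no neighbour are zero-padded.
    """
    fold = features.shape[1] // fold_div
    out = np.zeros_like(features)
    out[:-1, :fold] = features[1:, :fold]                    # from next frame
    out[1:, fold:2 * fold] = features[:-1, fold:2 * fold]    # from previous frame
    out[:, 2 * fold:] = features[:, 2 * fold:]               # unshifted channels
    return out


def procrustes_align(source, target):
    """Align source joints (J, 3) to target joints (J, 3).

    Finds the scale, rotation, and translation that minimise the
    squared distance between the two point sets (similarity Procrustes).
    """
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    s0, t0 = source - mu_s, target - mu_t
    u, sing, vt = np.linalg.svd(s0.T @ t0)
    # Guard against reflections so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    fix = np.array([1.0, 1.0, d])
    rotation = (u * fix) @ vt
    scale = (sing * fix).sum() / (s0 ** 2).sum()
    return scale * s0 @ rotation + mu_t


def per_joint_deviation(source, target):
    """Euclidean distance per joint after Procrustes alignment."""
    return np.linalg.norm(procrustes_align(source, target) - target, axis=1)
```

Under these assumptions, the joints with the largest deviation after alignment would be the candidates to report. How the thesis chooses the reference pose and the decision threshold is not stated in the abstract.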
This study built a badminton motion dataset, named CVIU Badminton Datasets, which contains seven common badminton actions: backhand stroke, forehand stroke, right lift, left lift, low serve, high serve, and defensive action. On this dataset the system reaches a Top-1 accuracy of 91.87% and a class accuracy of 85.71%. Further experiments show that each of the improvements proposed in this study increases performance.
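The two reported metrics can differ when the classes are imbalanced. The sketch below assumes that class accuracy means the average of the per-class accuracies (the abstract does not define it) and uses made-up labels rather than any data from the thesis; the function names are hypothetical.

```python
def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracies (assumed meaning of class accuracy)."""
    per_class = []
    for label in sorted(set(y_true)):
        preds = [p for t, p in zip(y_true, y_pred) if t == label]
        per_class.append(sum(1 for p in preds if p == label) / len(preds))
    return sum(per_class) / len(per_class)


# Made-up example: class 0 has four clips, class 1 has two.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
print(top1_accuracy(y_true, y_pred))        # 5 of 6 correct
print(mean_class_accuracy(y_true, y_pred))  # (1.0 + 0.5) / 2 = 0.75
```

In this toy case the rarer class is recognized less reliably, so the class-averaged figure falls below the Top-1 figure.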

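Chapter 1 Introduction
  Section 1 Research Motivation and Purpose
  Section 2 Research Difficulties and Limitations
  Section 3 Research Contributions
  Section 4 Thesis Organization
Chapter 2 Literature Review
  Section 1 Introduction to Common Badminton Actions
  Section 2 Badminton Motion Recognition Systems
  Section 3 Human Skeleton Analysis
  Section 4 3D Human Model Reconstruction and Applications
Chapter 3 Badminton Motion Analysis System
  Section 1 System Flow
  Section 2 Data Preprocessing
  Section 3 Badminton Motion Recognition Subsystem
  Section 4 3D Human Model Construction and Analysis Subsystem
  Section 5 Data Augmentation
Chapter 4 Experimental Results and Discussion
  Section 1 Experimental Environment and Dataset Construction
  Section 2 Analysis of the Badminton Motion Recognition Subsystem
  Section 3 Analysis of the Frame Flexible Framework
  Section 4 Discussion of Experiments
Chapter 5 Conclusion and Future Work
  Section 1 Conclusion
  Section 2 Future Work
References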

Z. Liu, J. Ning, Y. Cao, Y. Wei, Z. Zhang, S. Lin, and H. Hu, “Video Swin Transformer,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3202-3211, 2022.
M. Kocabas, N. Athanasiou, and M. J. Black, “VIBE: Video Inference for Human Body Pose and Shape Estimation,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, pp. 5252-5262, 2020.
Z. Liu, L. Wang, W. Wu, C. Qian, and T. Lu, “TAM: Temporal Adaptive Module for Video Recognition,” Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 13708-13718, 2021.
Y. Zhang, Y. Bai, C. Liu, H. Wang, S. Li, and Y. Fu, “Frame Flexible Network,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10504-10513, 2023.
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” Proceedings of International Conference on Learning Representations (ICLR), pp. 1-22, 2021.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, pp. 2278-2324, 1998.
G. Pavlakos, V. Choutas, N. Ghorbani, T. Bolkart, A. A. A. Osman, D. Tzionas, and M. J. Black, “Expressive Body Capture: 3D Hands, Face, and Body from a Single Image,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10975-10985, 2019.
M. Loper, N. Mahmood, J. Romero, G. Pons-Moll, and M. J. Black, “SMPL: A Skinned Multi-Person Linear Model,” ACM Transactions on Graphics, pp. 248:1-248:16, 2015.
S. Ji, W. Xu, M. Yang, and K. Yu, “3D Convolutional Neural Networks for Human Action Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 221-231, 2013.
A. Wang, H. Chen, Z. Lin, J. Han, and G. Ding, “RepViT: Revisiting Mobile CNN From ViT Perspective,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15909-15920, 2024.
X. Sun, P. Chen, L. Chen, C. Li, T. H. Li, M. Tan, and C. Gan, “Masked Motion Encoding for Self-Supervised Video Representation Learning,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2235-2245, 2023.
Y. Li, B. Ji, X. Shi, J. Zhang, B. Kang, and L. Wang, “TEA: Temporal Excitation and Aggregation for Action Recognition,” Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 909-918, 2020.
R. Wang, D. Chen, Z. Wu, Y. Chen, X. Dai, M. Liu, Y.-G. Jiang, L. Zhou, and L. Yuan, “BEVT: BERT Pretraining of Video Transformers,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14733-14743, 2022.
Z. Liu, R. Feng, H. Chen, S. Wu, B. Yang, S. Ji, and X. Wang, “Deep Dual Consecutive Network for Human Pose Estimation,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 525-534, 2021.
X. Ma, J. Su, C. Wang, W. Zhu, and Y. Wang, “3D Human Mesh Estimation from Virtual Markers,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 534-543, 2023.
Z. Liu, R. Feng, H. Chen, S. Wu, Y. Gao, Y. Gao, and X. Wang, “Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11006-11016, 2022.
Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, “Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7291-7299, 2017.
K. Li, Y. Wang, Y. He, Y. Li, Y. Wang, Y. Liu, Z. Wang, J. Xu, G. Chen, P. Luo, L. Wang, and Y. Qiao, “MVBench: A Comprehensive Multi-modal Video Understanding Benchmark,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 22195-22206, 2024.
Y. Zhao, W. Lv, S. Xu, J. Wei, G. Wang, Q. Dang, Y. Liu, and J. Chen, “DETRs Beat YOLOs on Real-time Object Detection,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16965-16974, 2024.
T. Cheng, L. Song, Y. Ge, W. Liu, X. Wang, and Y. Shan, “YOLO-World: Real-Time Open-Vocabulary Object Detection,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16901-16911, 2024.
W. Li, M. Liu, H. Liu, P. Wang, J. Cai, and N. Sebe, “Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 604-613, 2024.
