Author: | 張吳嘉 Chang, Wu-Jia |
---|---|
Thesis Title: | 基於攝影機的多人健身運動偵測與辨識 Camera-Based Multi-Person Fitness Detection And Identification |
Advisor: | 李忠謀 Greg C Lee |
Committee: | 李忠謀 Greg C Lee; 江政杰 Chiang, Cheng-Chieh; 劉寧漢 Liu, Ning-Han; 蔣宗哲 Chiang, Tsung-Che; 柯佳伶 Koh, Jia-Ling |
Approval Date: | 2024/10/07 |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Thesis Publication Year: | 2024 |
Academic Year: | 113 |
Language: | Chinese |
Number of pages: | 47 |
Keywords (in Chinese): | 人體姿態估計 、物件偵測 、動作辨識 、遮擋補償 、多人健身追蹤 |
Keywords (in English): | human posture estimation, object detection, motion recognition, occlusion compensation, multi-person fitness tracking |
Research Methods: | Experimental design |
DOI URL: | http://doi.org/10.6345/NTNU202401945 |
Thesis Type: | Academic thesis/dissertation |
大多數健身運動追蹤研究主要以單人追蹤進行,單人追蹤可以記錄的訓練資訊,包含:動作辨識、動作計數、重量辨識、訓練時間以及準確度分析。然而在健身房環境中同時會有多人使用各種器材,僅進行單人追蹤會無法捕捉到多人的運動情形。利用攝影機的多人追蹤技術,可以同時對健身區域進行大範圍的偵測與追蹤,而不局限於單人追蹤。
本研究提出基於攝影機的多人健身運動偵測與辨識方法,拍攝廣角的畫面藉以同時涵蓋多樣健身器材,處理這些器材使用者的訓練影像資訊。首先,對健身影片進行人體偵測,找出畫面中所有人物的位置。由於訓練器材的位置固定而人員則是隨意走動,利用物件交集(Intersection Over Union, IOU)方法,可以定位出正在使用健身器材的人物。對於這些訓練者,利用人體姿態估計方法記錄使用者的運動資訊,辨識划船、肩推、胸推、上斜胸推、腿部屈伸等五種不同的健身動作,並計算運動者在該器材的動作次數。除此之外,由於健身動作可能被其他移動的人員所遮蔽,造成健身動作辨識與計次的判斷出現錯誤,因此藉由多攝影機的協調建立補償機制,改善在多人環境中因為遮擋產生的辨識問題。
本研究根據健身房的實際情形拍攝,未刻意安排訓練過程,使用者根據自身習慣自由的選擇訓練動作與次數。為了驗證補償機制是否改善遮擋產生的問題,會確保拍攝的每部影片中都有遮擋情形發生。最後設計三項實驗用以驗證偵測與辨識方法之效果。根據實驗結果,系統可以利用物件交集方式區分出不同人物的運動過程,在多人的環境中區分出運動與非運動之人員,並且辨識使用者訓練的動作與次數。在發生遮擋情況時,加入補償機制減少運動次數漏檢情形發生,補償後的次數回復率為52%,改善因為遮擋產生的辨識問題。
Most research on fitness exercise tracking focuses on a single person. Single-person tracking can record training information such as exercise recognition, repetition counting, weight recognition, training time, and accuracy analysis. In a gym, however, many people use different equipment at the same time, and tracking only one person cannot capture what everyone is doing. Camera-based multi-person tracking can detect and track people across a large fitness area simultaneously, rather than being limited to a single person.
This study proposes a camera-based method for detecting and recognizing multi-person fitness exercise. Wide-angle footage covers several pieces of fitness equipment at once, and the system processes the training video of the people using them. First, human detection is run on the fitness video to locate every person in the frame. Because the training equipment is fixed in place while people move around freely, the Intersection over Union (IoU) method is used to identify the people who are actually using the equipment. For these users, human pose estimation records their movement, recognizes five fitness exercises (rowing, shoulder press, chest press, incline chest press, and leg flexion and extension), and counts the repetitions each user performs on the equipment. In addition, an exercising person may be occluded by other people moving through the scene, which causes errors in exercise recognition and repetition counting. A compensation mechanism that coordinates multiple cameras is therefore established to reduce the recognition problems caused by occlusion in a multi-person environment.
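The IoU step described above can be illustrated with a minimal sketch. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` pixel form, hand-annotated equipment zones, and an arbitrary threshold of 0.3; the function names, zone names, and threshold are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_users(
    equipment: Dict[str, Box],
    people: List[Box],
    threshold: float = 0.3,  # assumed value, not from the thesis
) -> Dict[str, Optional[int]]:
    """Map each equipment zone to the index of the best-overlapping person.

    A zone maps to None when no detected person reaches the threshold,
    i.e. the machine is treated as unused in this frame.
    """
    result: Dict[str, Optional[int]] = {}
    for name, zone in equipment.items():
        best_idx, best_score = None, threshold
        for idx, person in enumerate(people):
            score = iou(zone, person)
            if score >= best_score:
                best_idx, best_score = idx, score
        result[name] = best_idx
    return result


# Hypothetical frame: two fixed zones, one person on the rowing machine
# and one person walking between machines.
equipment = {"rowing": (0, 0, 100, 200), "chest_press": (300, 0, 400, 200)}
people = [(10, 20, 90, 190), (150, 0, 200, 180)]
users = assign_users(equipment, people)
print(users)  # {'rowing': 0, 'chest_press': None}
```

Because the equipment zones are fixed, they only need to be annotated once per camera; the person boxes would come from whatever detector is run on each frame.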
The footage was recorded under real gym conditions. Training sessions were not staged: users freely chose their exercises and repetition counts according to their own habits. To verify whether the compensation mechanism mitigates the problems caused by occlusion, every recorded video was ensured to contain occlusion. Finally, three experiments were designed to evaluate the detection and recognition methods. The results show that the system can use IoU to separate the exercise sessions of different people, distinguish exercising from non-exercising people in a multi-person environment, and recognize each user's exercises and repetition counts. When occlusion occurs, the compensation mechanism reduces missed repetitions. The count recovery rate after compensation is 52%, which mitigates the recognition problems caused by occlusion.
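As a rough illustration of repetition counting with occlusion compensation, the sketch below counts repetitions from a per-frame joint-angle series and fills occluded frames from a second camera. Everything here is an assumption made for illustration: the hysteresis thresholds, the use of a single joint angle, the frame-aligned and comparable angle values across cameras, and in particular the definition of the recovery rate, which the abstract reports (52%) but does not define.

```python
from typing import List, Optional

Angles = List[Optional[float]]  # joint angle in degrees; None = occluded frame


def count_reps(angles: Angles, low: float = 70.0, high: float = 150.0) -> int:
    """Count flexed-then-extended cycles using two hysteresis thresholds.

    The thresholds are assumed values; occluded frames (None) are skipped,
    which is how a repetition can be missed.
    """
    reps = 0
    flexed = False
    for angle in angles:
        if angle is None:
            continue
        if angle <= low:
            flexed = True
        elif angle >= high and flexed:
            reps += 1
            flexed = False
    return reps


def compensate(primary: Angles, secondary: Angles) -> Angles:
    """Fill occluded frames of the primary camera from a second camera.

    Assumes both series are frame-aligned and that the angle measured from
    either viewpoint is comparable.
    """
    return [p if p is not None else s for p, s in zip(primary, secondary)]


def recovery_rate(true_count: int, before: int, after: int) -> float:
    """Assumed definition: share of missed repetitions recovered."""
    missed = true_count - before
    return (after - before) / missed if missed > 0 else 0.0


# Hypothetical data: the second repetition's flexed phase is occluded
# in the primary view but visible in the secondary view.
primary = [160.0, 60.0, 160.0, None, 160.0]
secondary = [158.0, 62.0, 155.0, 58.0, 157.0]

before = count_reps(primary)                        # 1
after = count_reps(compensate(primary, secondary))  # 2
print(before, after, recovery_rate(2, before, after))  # 1 2 1.0
```

The two-threshold design avoids double counting when the angle jitters around a single cut-off; a real pipeline would likely also smooth the pose signal before counting.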
[1] Bochkovskiy, A., Wang, C.-Y., & Liao, H.-Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. https://arxiv.org/abs/2004.10934
[2] Cao, Z., Hidalgo, G., Simon, T., Wei, S.-E., & Sheikh, Y. (2018). OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. arXiv preprint arXiv:1812.08008. https://arxiv.org/abs/1812.08008
[3] Cattuzzo, M. T., dos Santos Henrique, R., Ré, A. H. N., de Oliveira, I. S., Melo, B. M., de Souza, E. A., ... & Stodden, D. (2016). Motor competence and health related physical fitness in youth: A systematic review. Journal of Science and Medicine in Sport, 19(2), 123-129. https://doi.org/10.1016/j.jsams.2014.12.004
[4] Cheng, J., Zhang, Y., & Wang, Z. (2022). Research and development of intelligent recognition system for pull-up action norms based on OpenPose. In 2022 4th International Conference on Advances in Computer Technology, Information Science and Communications (CTISC) (pp. 61-64). IEEE. https://doi.org/10.1109/CTISC54820.2022.9800751
[5] Chaudhri, R., Bajracharya, S., & Wang, R. (2008). An RFID based system for monitoring free weight exercises. In Proceedings of the 6th International Conference on Embedded Networked Sensor Systems (pp. 327-328). https://doi.org/10.1145/1460412.1460432
[6] Ding, F., Zhao, Y., Wang, Q., Wang, X., & Zhang, W. (2018). TTBA: An RFID-based tracking system for two basic actions in free-weight exercises. In Proceedings of the 14th ACM International Symposium on QoS and Security for Wireless and Mobile Networks (pp. 7-14). https://doi.org/10.1145/3267204.3267206
[7] Girdhar, R., & Ramanan, D. (2018). Detect-and-Track: Efficient pose estimation in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3506-3515). https://doi.org/10.1109/CVPR.2018.00370
[8] Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV) (pp. 1440-1448). Santiago, Chile. https://doi.org/10.1109/ICCV.2015.169
[9] Khurana, R., Pandit, A., & Sarangi, S. (2018). GymCam: Detecting, recognizing and tracking simultaneous exercises in unconstrained scenes. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(4), 1-17. https://doi.org/10.1145/3287054
[10] Li, T. (2021). Community application of wearable sports fitness equipment in the embedded system environment of the Internet of Things. Journal of Ambient Intelligence and Humanized Computing, 12(3), 3257-3264. https://doi.org/10.1007/s12652-020-02523-5
[11] Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2020). Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2), 318-327. https://doi.org/10.1109/TPAMI.2018.2858826
[12] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 21-37). Springer, Cham. https://doi.org/10.1007/978-3-319-46448-0_2
[13] Militaru, C., Militaru, M.-D., & Benta, K.-I. (2020). Physical exercise form correction using neural networks. In Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 240-244). https://doi.org/10.1145/3395035.3425206
[14] Nagarkoti, A., & Khakurel, S. (2019). Realtime indoor workout analysis using machine learning and computer vision. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 5811-5814). IEEE. https://doi.org/10.1109/EMBC.2019.8856480
[15] Newell, A., Huang, Z., & Deng, J. (2017). Associative embedding: End-to-end learning for joint detection and grouping. In Advances in Neural Information Processing Systems 30 (NIPS 2017) (pp. 2274-2284).
[16] Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 779-788). https://doi.org/10.1109/CVPR.2016.91
[17] Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 28, 91-99. https://doi.org/10.5555/2969239.2969250
[18] Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 580-587). https://doi.org/10.1109/CVPR.2014.81
[19] Smith, J. R., Philipose, M., & Roy, S. (2006). A wirelessly-powered platform for sensing and computation. In Proceedings of the 8th International Conference on Ubiquitous Computing (pp. 495-506). https://doi.org/10.1007/11853565_30
[20] Squires, R. W., Shultz, A. M., & Herrmann, J. (2018). Exercise training and cardiovascular health in cancer patients. Current Oncology Reports, 20(27). https://doi.org/10.1007/s11912-018-0685-8
[21] Toshev, A., & Szegedy, C. (2014). DeepPose: Human pose estimation via deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1653-1660). https://doi.org/10.1109/CVPR.2014.214
[22] Wang, C.-Y., Bochkovskiy, A., & Liao, H.-Y. M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696. https://arxiv.org/abs/2207.02696
[23] Xiao, B., Wu, H., & Wei, Y. (2018). Simple baselines for human pose estimation and tracking. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 466-481). Springer, Cham. https://doi.org/10.1007/978-3-030-01237-3_28
[24] Chen, Y., Wang, Z., Peng, Y., Zhang, Z., Yu, G., & Sun, J. (2018). Cascaded pyramid network for multi-person pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7103-7112). IEEE. https://doi.org/10.1109/CVPR.2018.00742
[25] Zhou, H., Hu, R., Zhang, Y., & Yan, W. (2020). Posture tracking meets fitness coaching: A two-phase optimization approach with wearable devices. In 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS) (pp. 633-640). IEEE. https://doi.org/10.1109/MASS50613.2020.00082
[26] Ultralytics. (2023). YOLOv8. Ultralytics Documentation. https://docs.ultralytics.com/
[27] Li, Y., & Wu, Y. (2021). SimCC: A simple coordinate classification perspective for human pose estimation. arXiv preprint arXiv:2107.03332. https://arxiv.org/abs/2107.03332
[28] Jiang, T., Zhang, C., & Tao, Z. (2023). RTMPose: Real-time multi-person pose estimation based on MMPose. arXiv preprint arXiv:2303.07399. https://arxiv.org/abs/2303.07399
[29] 林厚廷. (2024). 基於攝影機的自由重量訓練追蹤 [Camera-based free-weight training tracking] (Master's thesis, National Taiwan Normal University).