Graduate student: | 胡碩宸 Shuo-Chen Hu |
---|---|
Thesis title: | 利用RGB-D Sensors進行人類動作的分析 HUMAN ACTION ANALYSIS USING RGB-D SENSORS |
Advisors: | 梁祐銘 Liang, Yu-Ming; 陳世旺 Chen, Sei-Wang |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2013 |
Academic year of graduation: | 101 |
Language: | Chinese |
Number of pages: | 60 |
Keywords (Chinese): | 流形學習, 等構映圖, 動作辨識, Kinect |
Keywords (English): | manifold learning, Isomap, action recognition, Kinect |
Thesis type: | Academic thesis |
This study analyzes human actions from video data, with the goal of developing a general-purpose action analysis technique that can be applied in different fields. One example is public safety: automated surveillance systems in airports, subways, stadiums, shopping malls, and other public areas or buildings, which detect abnormal behavior such as vandalism. Another is home care: systems that detect dangerous behavior by children or elderly people at home, such as falling or climbing to high places, and notify family members and caregivers when it occurs.
We use RGB-D sensors (i.e., Kinect), developed by Microsoft, to capture 3D information about the body joints and compute joint angles as the feature vector of a human posture. Because these feature vectors are high-dimensional, we apply Isomap, a manifold learning algorithm, to reduce the dimension. Atomic actions are then segmented and clustered in the low-dimensional Isomap space. Each cluster of atomic actions is given a semantic description, and the clusters together form a codebook. Finally, the codebook is used to recognize the actions of a test subject.
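The abstract does not say which joints are used, how the angles are defined, or how a test pose is matched against the codebook. The following is only a minimal sketch of two of the steps it describes: turning 3D joint positions into joint-angle features, and looking up the nearest codebook entry. It assumes the angle at a joint is the angle between the two adjacent limb segments, and that recognition is a nearest-centroid lookup. The joint names, the `triples` list, the action labels, and the centroid values are all hypothetical, and the Isomap projection between the two steps is omitted.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b, between segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        raise ValueError("degenerate joint: zero-length segment")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def pose_features(skeleton, triples):
    """Build a feature vector of joint angles from named 3D joint positions."""
    return [joint_angle(skeleton[a], skeleton[b], skeleton[c]) for a, b, c in triples]


def nearest_codeword(point, codebook):
    """Return the label whose centroid is closest to point (Euclidean distance)."""
    def sq_dist(label):
        return sum((p - q) ** 2 for p, q in zip(point, codebook[label]))
    return min(codebook, key=sq_dist)


# Hypothetical skeleton frame: a right arm bent at 90 degrees.
skeleton = {
    "shoulder": (0.0, 0.0, 0.0),
    "elbow": (1.0, 0.0, 0.0),
    "wrist": (1.0, 1.0, 0.0),
}
triples = [("shoulder", "elbow", "wrist")]
features = pose_features(skeleton, triples)

# Hypothetical codebook: one labeled centroid per cluster in a 2D embedded space.
codebook = {"raise_arm": (0.0, 1.0), "lower_arm": (0.0, -1.0)}
label = nearest_codeword((0.1, 0.8), codebook)
```

In a fuller pipeline, the feature vectors of all frames would first be embedded in a low-dimensional space (for example with an off-the-shelf Isomap implementation), and the centroids in `codebook` would come from clustering the segmented atomic actions in that space.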