Graduate student: | 蕭怡涵 |
---|---|
Thesis title: | 基於 Kinect 之台灣手語單字辨識 (Kinect Based Taiwanese Sign Language Vocabulary Recognition) |
Advisor: | 李忠謀 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Publication year: | 2013 |
Academic year of graduation: | 101 (ROC calendar) |
Language: | Chinese |
Pages: | 50 |
Keywords: | Kinect, Sign language recognition, Gesture recognition |
Document type: | Academic thesis |
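Gesture recognition has long been applied in research on human-computer interaction interfaces, and sign language is one of the most popular topics within it. Image-based recognition of isolated sign vocabulary has been under development for some time and has achieved good recognition results, but it still has limitations and problems, so recent work has begun to add depth information to assist recognition.
However, depth cameras on the market are usually expensive or hard to wear, and the Kinect controller that Microsoft released in 2010 offers a new option. This thesis therefore proposes a Kinect-based method that recognizes Taiwanese Sign Language vocabulary in real time.
A gesture in sign language consists of three main components: position, direction, and shape. We use Kinect's built-in skeleton tracking and the depth information it provides to extract features matched to the characteristics of each component, such as trajectory features and hand-shape features, and recognize each component with a different method. Finally, the recognition results for the three components are combined by the vocabulary decision method designed in this thesis to recognize the signed word.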
Gesture recognition systems are widely used in human-computer interaction research, and sign language recognition is one of the most popular topics in the field. Vision-based sign language recognition approaches have been in development for a long time and have achieved good results, but they still have limitations, so recent research has started adding depth information to address them.
However, depth cameras are usually expensive and hard to obtain, while the Microsoft Kinect offers an affordable depth camera that has made depth a viable option for more researchers. Therefore, we propose a Kinect-based Taiwanese Sign Language recognition method.
A sign consists of three main parts: hand position, direction, and shape. We use Kinect skeleton tracking and depth information to extract features for recognizing each part. Finally, we use our proposed decision method to determine the most likely sign language vocabulary item.
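The abstract does not spell out how the three component results are combined, so the following is only a minimal sketch of one plausible late-fusion scheme, not the decision method designed in the thesis. It assumes each component recognizer returns a probability per label and that every word in the lexicon is described by one label per component. The lexicon entries, the label names, and the `decide_word` function are all hypothetical and do not describe real Taiwanese Sign Language signs.

```python
from math import prod

# Hypothetical lexicon: one label per component for each word.
# These entries are illustrative only, not real sign descriptions.
LEXICON = {
    "thanks": {"position": "chest", "direction": "down", "shape": "thumb_up"},
    "good": {"position": "chest", "direction": "still", "shape": "thumb_up"},
    "home": {"position": "head", "direction": "still", "shape": "flat"},
}


def decide_word(component_scores, lexicon=LEXICON, floor=1e-6):
    """Pick the word whose component labels have the highest joint score.

    component_scores maps a component name ("position", "direction",
    "shape") to a dict of label -> probability produced by that
    component's recognizer. Labels a recognizer did not report get a
    small floor value so one missing label does not zero out a word.
    """
    best_word, best_score = None, 0.0
    for word, labels in lexicon.items():
        # Treat the components as independent and multiply their scores.
        score = prod(
            component_scores[component].get(label, floor)
            for component, label in labels.items()
        )
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Example recognizer outputs (made-up numbers).
example_scores = {
    "position": {"chest": 0.8, "head": 0.2},
    "direction": {"down": 0.7, "still": 0.3},
    "shape": {"thumb_up": 0.9, "flat": 0.1},
}
word, score = decide_word(example_scores)
print(word)  # prints "thanks"
```

Multiplying the scores is the simplest choice when the components are treated as independent; a weighted sum or a learned classifier over the three outputs would be equally reasonable alternatives.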