
Graduate student: 蕭怡涵
Thesis title: Kinect Based Taiwanese Sign Language Vocabulary Recognition (基於 Kinect 之台灣手語單字辨識)
Advisor: 李忠謀
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Number of pages: 50
Keywords: Kinect, Sign language recognition, Gesture recognition
Thesis type: Academic thesis
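  • Gesture recognition has long been applied in research on human-computer interaction interfaces, and sign language is one of its most popular topics. Vision-based recognition of isolated sign language words has been under development for some time and has achieved good results, but it still has limitations and problems, so recent studies have begun adding depth information to assist recognition.
    However, depth cameras on the market are usually expensive or difficult to wear, whereas the Kinect controller released by Microsoft in 2010 offers a new option. This thesis therefore proposes a Kinect-based method that recognizes Taiwanese Sign Language words in real time.
    A sign gesture consists of three main components: position, direction, and shape. Using Kinect's built-in skeleton tracking and the depth information it provides, we extract features suited to each component, such as trajectory features and hand-shape features, and recognize each component with a different method. Finally, the recognition results for the three components are combined through the vocabulary decision method designed in this thesis to identify the sign language word.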

    Gesture recognition systems are widely used in human-computer interaction research, and sign language recognition is one of the most popular topics in this area. Vision-based sign language recognition approaches have been under development for a long time and have achieved good results, but they still have limitations and problems, so recent research has started adding depth information to address them.
    However, depth cameras have usually been expensive and difficult to wear, while the Microsoft Kinect has recently offered an affordable depth camera that makes depth a viable option for more researchers. Therefore, we propose a Kinect-based Taiwanese Sign Language recognition method.
    A sign consists of three main parts: hand position, direction, and shape. We use Kinect skeleton tracking and depth information to extract features for recognizing each part. Finally, we use our proposed decision method to determine the most likely sign language word.
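
    The three-part decomposition described in the abstract suggests a simple way to picture the final decision step. The following is a minimal Python sketch, not the decision method designed in the thesis: it assumes each word is defined by one position label, one direction label, and one hand-shape label, that each recognizer outputs a probability per label, and that the three are combined by a plain product. The function name decide_word, the label names, the lexicon, and all probability values are hypothetical.

```python
def decide_word(lexicon, position_probs, direction_probs, shape_probs, floor=1e-6):
    """Pick the word whose (position, direction, shape) labels score highest.

    Assumption: the three channels are treated as independent, so the
    score of a word is the product of its three label probabilities.
    Labels missing from a channel's output fall back to a small floor.
    """
    best_word, best_score = None, 0.0
    for word, (position, direction, shape) in lexicon.items():
        score = (
            position_probs.get(position, floor)
            * direction_probs.get(direction, floor)
            * shape_probs.get(shape, floor)
        )
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Hypothetical lexicon: word -> (position label, direction label, shape label).
lexicon = {
    "word_a": ("chest", "right", "flat"),
    "word_b": ("head", "up", "fist"),
    "word_c": ("chest", "up", "fist"),
}

# Hypothetical recognizer outputs for one observed sign.
position_probs = {"chest": 0.7, "head": 0.3}
direction_probs = {"right": 0.2, "up": 0.8}
shape_probs = {"flat": 0.4, "fist": 0.6}

word, score = decide_word(lexicon, position_probs, direction_probs, shape_probs)
print(word, round(score, 3))
```

    With these made-up numbers the chest/up/fist entry wins because it is the only word whose three labels are all the most probable ones. A product rule penalizes a word heavily when any single channel disagrees; a weighted sum would be a more forgiving alternative.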

    Abstract (Chinese)
    Abstract
    List of Figures
    List of Tables
    Chapter 1 Introduction
      Section 1 Research Motivation
      Section 2 Research Objectives
      Section 3 Scope and Limitations
      Section 4 Thesis Organization
    Chapter 2 Literature Review
      Section 1 Related Work on Sign Language Recognition
      Section 2 Kinect Architecture and Principles
    Chapter 3 Methodology
      Section 1 System Architecture and Workflow
      Section 2 Sign Position Determination
      Section 3 Sign Direction Recognition
        3.3.1 Feature Extraction
        3.3.2 Direction Recognition
      Section 4 Sign Hand-Shape Recognition
        3.4.1 Palm Extraction
        3.4.2 Palm Extraction Interference Detection
        3.4.3 3D Hand-Shape Feature Extraction
        3.4.4 Continuous Hand-Shape Recognition
      Section 5 Sign Vocabulary Decision
        3.5.1 Recognition Probability Table
        3.5.2 Vocabulary Decision
    Chapter 4 Experimental Results and Analysis
      Section 1 Experimental Objectives and Methods
      Section 2 Direction Recognition Experiments
      Section 3 Hand-Shape Recognition Experiments
      Section 4 Sign Vocabulary Decision Experiments
        4.4.1 Comparison of Recognition Methods
        4.4.2 Comparison of Vocabulary Sizes
      Section 5 Banking Service Application System
    Chapter 5 Conclusion
      Section 1 Conclusion
      Section 2 Future Work
    References
    Appendix A Word Composition Table
    Appendix B Table of the 60 Recognized Words

