Author: 張筠婕
Jhang, Yun-Jie
Thesis title: 基於PairNet的連續手勢辨識
Continuous Hand Gesture Recognition Based on PairNet
Advisor: 黃文吉
Hwang, Wen-Jyi
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Pages: 39
Keywords: Human-machine Interface, Deep Learning, Convolutional Neural Networks, Hand Gesture Recognition
DOI URL: http://doi.org/10.6345/THE.NTNU.DCSIE.013.2018.B02
Document type: Academic thesis
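Abstract: This thesis proposes a system that recognizes continuous hand gestures. Its input is a time series produced by a 3-axis accelerometer and a 3-axis gyroscope. The proposed recognition algorithm, PairNet, is a variant of the convolutional neural network (CNN). It differs from a typical CNN in three ways: (1) the kernels in the convolutional layers have size 2×1; (2) the stride is changed from the commonly used 1 to the kernel size (here, 2); (3) a global average pooling layer is added after the last convolutional layer, which gives the output of the overall network an ensembling effect. The experiments use datasets collected with two sensor devices, the HTC One M9 and Google Daydream, containing 11 and 14 gesture classes respectively. In continuous-gesture tests on the two datasets, PairNet achieves accuracies of 97.81% and 99.38% respectively, outperforming long short-term memory (LSTM) recurrent neural networks and conventional CNNs.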
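
The three structural differences named in the abstract (a kernel of length 2, a stride equal to the kernel length, and global average pooling after the last convolutional layer) can be illustrated with a small pure-Python sketch. This is not the thesis's implementation: the number of layers, the channel widths, the ReLU activation, the random weights, and the 16-sample window are all assumptions chosen only to make the example runnable, and the names `pair_conv`, `global_average_pool`, `random_layer`, and `pairnet_forward` are hypothetical.

```python
import random


def pair_conv(x, w, b):
    """Kernel length 2, stride 2 along the time axis, followed by ReLU.

    x: list of T frames, each a list of c_in floats
    w: weights indexed as w[k][i][o], with k in {0, 1}
    b: list of c_out biases
    Returns T // 2 frames of c_out floats (a trailing odd frame is dropped).
    """
    c_out = len(b)
    out = []
    # Stride 2 means consecutive kernels never overlap: frames are paired up.
    for t in range(0, len(x) - 1, 2):
        frame = []
        for o in range(c_out):
            s = b[o]
            for k in range(2):
                for i, v in enumerate(x[t + k]):
                    s += w[k][i][o] * v
            frame.append(max(0.0, s))  # ReLU is an assumption
        out.append(frame)
    return out


def global_average_pool(x):
    """Average every channel over the remaining time positions."""
    n = len(x)
    return [sum(frame[c] for frame in x) / n for c in range(len(x[0]))]


def random_layer(c_in, c_out, rng):
    """Random placeholder weights; a real model would learn these."""
    w = [[[rng.uniform(-0.5, 0.5) for _ in range(c_out)]
          for _ in range(c_in)] for _ in range(2)]
    return w, [0.0] * c_out


def pairnet_forward(x, layers):
    """Stack pair convolutions, then pool the last layer's outputs."""
    for w, b in layers:
        x = pair_conv(x, w, b)
    return global_average_pool(x)


rng = random.Random(0)
# Assumed input window: 16 time steps x 6 channels (3 accel + 3 gyro axes).
window = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(16)]
# Assumed widths 6 -> 8 -> 8 -> 11; 11 matches the first dataset's classes.
layers = [random_layer(6, 8, rng), random_layer(8, 8, rng),
          random_layer(8, 11, rng)]
scores = pairnet_forward(window, layers)
predicted = max(range(len(scores)), key=scores.__getitem__)
print(len(scores), predicted)
```

Because each layer halves the time axis, three layers reduce the 16-step window to 2 positions, and the pooling step averages the per-class scores of those 2 positions into one vector. That averaging over positions is what the abstract refers to as an ensembling effect.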

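    Table of contents:
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1-1 Research Background
        1-2 Research Motivation
        1-3 Research Contributions
    Chapter 2 Related Work
        2-1 Types of Hand Gesture Recognition
        2-2 Hand Gesture Recognition Based on Classical Algorithms
        2-3 Deep Learning and Hand Gesture Recognition
            • Long Short-Term Memory (LSTM)
            • Gated Recurrent Units (GRU)
            • Convolutional Neural Network (CNN)
        2-4 PairNet's Improvements over Typical Deep Learning Models
    Chapter 3 Method
        3-1 Architecture Overview
        3-2 Preprocessing
        3-3 Recognition Model: PairNet
        3-4 Postprocessing
    Chapter 4 Experiments and Analysis
        4-1 Experimental Setup
        4-2 Models Compared in the Experiments
        4-3 Experimental Results
            • Dataset 1: Eleven Gestures
            • Dataset 2: Fourteen Gestures
            • Comparison of PairNet Variables
    Chapter 5 Conclusion and Future Work
    References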

