Author: | 王雅慶 Wang, Ya-Ching |
---|---|
Thesis Title: | 以FPGA實現摺積神經網路及應用於人臉辨識之研究 The implementation of CNN-based face recognition systems based on FPGA |
Advisor: | 吳榮根 Wu, Jung-Gen; 黃文吉 Hwang, Wen-Jyi |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Thesis Publication Year: | 2016 |
Academic Year: | 104 |
Language: | Chinese |
Number of pages: | 78 |
Keywords (in Chinese): | 摺積神經網路 (convolutional neural network) |
DOI URL: | https://doi.org/10.6345/NTNU202204484 |
Thesis Type: | Academic thesis/ dissertation |
Reference times: | Clicks: 183 Downloads: 42 |
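This study proposes a hardware architecture based on a field-programmable gate array (FPGA) [1] for fast image recognition. The architecture implements the recognition stage with the forward propagation of a convolutional neural network (CNN). Most existing CNN systems are implemented on GPUs, which have the drawback of high power consumption. Most existing FPGA circuit designs for CNN computation cover only a few layers of the network, implementing only the convolutional layers or only the fully connected layers. Using an FPGA as the platform, this study designs the complete architecture of the LeNet-5 CNN model, with the advantages of low power consumption and a very high recognition rate.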
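The proposed architecture is a hardware accelerator in a system on a programmable chip (SOPC) that performs image recognition. Face images are used as the recognition input, and the faces of 28 people in total are recognized. Experimental results show that the proposed CNN architecture is well suited to vision applications that require high portability, a high recognition rate, and high computation speed. The LeNet-5 architecture implemented in this thesis is best suited to face surveillance systems for residential communities. When many people must be recognized, the LeNet-5 model performs somewhat worse than other CNNs such as VGG Net [2] and GoogLeNet [3], but for about 30 people its recognition rate is still sufficient. A community face surveillance system only needs to recognize the people who belong to the community. Because recognition is fast, a misrecognized image can be noticed immediately, so people from outside the community are not admitted. This is an interesting application of this thesis.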
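The recognition stage described in the abstract uses only the forward pass of a trained network. As a rough software illustration of what such a pass computes, here is a minimal sketch in plain Python. The abstract does not give the layer sizes, activation, or pooling type, so the classical LeNet-5 dimensions (32x32 input, 6 and 16 feature maps, 120 and 84 hidden units), ReLU, and 2x2 max pooling are assumptions. Only the 28 output classes come from the abstract. The weights are random placeholders rather than trained values, and all function and parameter names are hypothetical.

```python
import random

NUM_CLASSES = 28  # number of people recognized, per the abstract


def conv2d(maps, kernels, biases):
    """Valid 2-D convolution (cross-correlation form) over a list of input maps.
    kernels[o][c] is the k x k kernel linking input map c to output map o."""
    k = len(kernels[0][0])
    h = len(maps[0]) - k + 1
    w = len(maps[0][0]) - k + 1
    out = []
    for o, bank in enumerate(kernels):
        fmap = [[biases[o]] * w for _ in range(h)]
        for c, kern in enumerate(bank):
            src = maps[c]
            for y in range(h):
                for x in range(w):
                    acc = 0.0
                    for i in range(k):
                        for j in range(k):
                            acc += src[y + i][x + j] * kern[i][j]
                    fmap[y][x] += acc
        out.append(fmap)
    return out


def relu(maps):
    return [[[max(0.0, v) for v in row] for row in m] for m in maps]


def max_pool2(maps):
    """2x2 max pooling with stride 2."""
    return [[[max(m[y][x], m[y][x + 1], m[y + 1][x], m[y + 1][x + 1])
              for x in range(0, len(m[0]) - 1, 2)]
             for y in range(0, len(m) - 1, 2)] for m in maps]


def flatten(maps):
    return [v for m in maps for row in m for v in row]


def dense(vec, weights, biases):
    """Fully connected layer: one dot product plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def lenet5_forward(image, p):
    x = max_pool2(relu(conv2d([image], p["c1_w"], p["c1_b"])))  # 6 maps, 14x14
    x = max_pool2(relu(conv2d(x, p["c3_w"], p["c3_b"])))        # 16 maps, 5x5
    x = [max(0.0, v) for v in dense(flatten(x), p["c5_w"], p["c5_b"])]  # 120
    x = [max(0.0, v) for v in dense(x, p["f6_w"], p["f6_b"])]           # 84
    return dense(x, p["out_w"], p["out_b"])                     # 28 class scores


rng = random.Random(0)


def rand_kernels(n_out, n_in, k):
    return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(n_in)] for _ in range(n_out)]


def rand_matrix(n_out, n_in):
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


# Placeholder (untrained) parameters; a real system would load trained weights.
params = {
    "c1_w": rand_kernels(6, 1, 5), "c1_b": [0.0] * 6,
    "c3_w": rand_kernels(16, 6, 5), "c3_b": [0.0] * 16,
    "c5_w": rand_matrix(120, 400), "c5_b": [0.0] * 120,
    "f6_w": rand_matrix(84, 120), "f6_b": [0.0] * 84,
    "out_w": rand_matrix(NUM_CLASSES, 84), "out_b": [0.0] * NUM_CLASSES,
}

image = [[rng.random() for _ in range(32)] for _ in range(32)]  # stand-in for a face image
scores = lenet5_forward(image, params)
predicted = max(range(NUM_CLASSES), key=lambda i: scores[i])
print("predicted class index:", predicted)
```

Every layer reduces to multiply-accumulate loops with fixed bounds, which is the kind of regular computation a hardware accelerator can unroll and pipeline. The sketch makes no claim about how the thesis circuit is actually organized.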
[1] C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong (2015). Optimizing FPGA-Based Accelerator Design for Deep Convolutional Neural Networks. In Proc. ACM/SIGDA Int. Symp. on Field-Programmable Gate Arrays, pp. 161-170.
[2] L. Wang, S. Guo, W. Huang, and Y. Qiao (2015). Places205-VGGNet Models for Scene Recognition; Available online: https://github.com/wanglimin/Places205-VGGNet (accessed on 20 Nov. 2015).
[3] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich (2015). Going Deeper with Convolutions. In Proc. IEEE Int. Conf. on Comp. Vis. and Pattern Recogn.
[4] Y. LeCun, Y. Bengio, and G. Hinton (2015). Deep Learning. Nature, 521, pp. 436-444.
[5] J. Fan, W. Xu, Y. Wu, and Y. Gong (2010). Human Tracking Using Convolutional Neural Networks. IEEE Trans. Neural Networks, 21, pp. 1610-1623.
[6] S. Ji, W. Xu, M. Yang, and K. Yu (2013). 3D Convolutional Neural Networks for Human Action Recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, 35, pp. 221-231.
[7] K. Simonyan and A. Zisserman (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proc. Int. Conf. on Learning Representations.
[8] Y. Jia, http://caffe.berkeleyvision.org/.
[9] S. Sun and J. Zambreno (2009). A Floating-point Accumulator for FPGA-based High Performance Computing Applications. In Proc. IEEE Int. Conf. on Field-Programmable Tech., pp. 493-499.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Inform. Processing Syst., 25, pp. 1106-1114.
[11] S. Himavathi, D. Anitha, and A. Muthuramalingam (2007). Feedforward Neural Network Implementation in FPGA Using Layer Multiplexing for Effective Resource Utilization. IEEE Trans. Neural Networks, 18, pp. 880-888.
[12] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell (2014). Caffe: Convolutional Architecture for Fast Feature Embedding. In Proc. ACM Int. Conf. on Multimedia, pp. 675-678.
[13] Y. LeCun and Y. Bengio (1995). Convolutional Networks for Images, Speech, and Time Series. In The Handbook of Brain Theory and Neural Networks, M. A. Arbib, Ed., MIT Press, pp. 255-258.
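[14] 紀凱文 (2016). 摺積神經網路全連結層FPGA實現之研究 [A study on the FPGA implementation of the fully connected layers of a convolutional neural network]. Master's thesis, National Taiwan Normal University.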