Graduate student: | 周世耀 Jou, Shyh-Yaw |
---|---|
Thesis title: | 高性能之輕量級卷積神經網路之設計 Design of A Novel Lightweight CNN with High Performance |
Advisor: | 蘇崇彥 Su, Chung-Yen |
Oral defense committee: | 瞿忠正 Chiu, Chung-Cheng; 賴穎暉 Lai, Ying-Hui; 蘇崇彥 Su, Chung-Yen |
Oral defense date: | 2021/06/15 |
Degree: | Master |
Department: | 電機工程學系 Department of Electrical Engineering |
Year of publication: | 2021 |
Academic year of graduation: | 109 (ROC calendar) |
Language: | Chinese |
Number of pages: | 47 |
Keywords (Chinese): | 深度學習、卷積神經網路、影像辨識、物件偵測 |
Keywords (English): | deep learning, CNN, image recognition, object detection |
Research methods: | experimental design, comparative study, observational study |
DOI URL: | http://doi.org/10.6345/NTNU202100586 |
Thesis type: | Academic thesis |
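Because of its strong analytical capability, deep learning is often used as a tool for image recognition and object detection. Many well-known models based on deep learning have been proposed to date, for example SENet, EfficientNet, DenseNet, MobileNet, ResNet, ShuffleNet, GhostNet, and YOLO.
The performance of a deep learning model can be examined along four main dimensions: the number of parameters, the ability to analyze data, the data-processing speed, and the model's generalization ability. In general, it is difficult for a model to perform well on all four.
In this thesis, we design ExquisiteNetV2, a deep learning model that performs well on all of these dimensions. We run experiments on 15 credible image recognition datasets and 1 object detection dataset, using the well-known models listed above as baselines. We experiment with two different weight update methods; according to the results, regardless of which weight update method is used, ExquisiteNetV2 ranks first in classification accuracy on more than half of the datasets. ExquisiteNetV2 has far fewer parameters than the other models, yet its data analysis ability and computing speed are better than theirs. ExquisiteNetV2 is therefore a high-performance lightweight convolutional neural network that can be applied generally to image classification and object detection.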
Deep learning is often applied to object detection and image recognition because of its outstanding ability to analyze images. Many well-known deep learning models have been proposed so far, such as SENet, EfficientNet, DenseNet, MobileNet, ResNet, ShuffleNet, GhostNet, and YOLO.
Model performance can be evaluated in four respects: the number of parameters, the ability to analyze data, computing speed, and generalization. In general, it is difficult for a single model to perform well in all four.
In this thesis, we propose a model called ExquisiteNetV2 that performs well in all four respects. We test ExquisiteNetV2 and the aforementioned models on fifteen credible datasets. According to the experimental results, ExquisiteNetV2 achieves the highest accuracy on more than half of the datasets. Moreover, ExquisiteNetV2 has the fewest parameters, and its ability to analyze data is better than that of the other models. These results show that ExquisiteNetV2 is a high-performance lightweight model suitable for image recognition and object detection.
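Of the four performance criteria named in the abstract, the number of parameters is the one that can be computed directly from a layer's shape. The following is a minimal sketch of that calculation for a standard 2D convolution layer. It is not taken from the thesis: the function name `conv2d_param_count` and the layer sizes are hypothetical and chosen only for illustration.

```python
def conv2d_param_count(in_channels: int, out_channels: int,
                       kernel_size: int, bias: bool = True) -> int:
    """Count the learnable parameters of a standard 2D convolution layer.

    Each output channel owns one kernel of shape
    (in_channels, kernel_size, kernel_size), plus one bias term if enabled.
    """
    if min(in_channels, out_channels, kernel_size) <= 0:
        raise ValueError("channel counts and kernel size must be positive")
    weights = in_channels * out_channels * kernel_size * kernel_size
    biases = out_channels if bias else 0
    return weights + biases


# Hypothetical layer sizes, for illustration only.
small_layer = conv2d_param_count(3, 16, 3)                 # 3*16*3*3 + 16
large_layer = conv2d_param_count(64, 128, 3, bias=False)   # 64*128*3*3
print(small_layer, large_layer)
```

Summing this count over every layer gives a model's total parameter count, which is the figure a lightweight design tries to keep small.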