
Graduate Student: Jou, Shyh-Yaw (周世耀)
Thesis Title: Design of a Novel Lightweight CNN with High Performance (高性能之輕量級卷積神經網路之設計)
Advisor: Su, Chung-Yen (蘇崇彥)
Committee Members: Chiu, Chung-Cheng (瞿忠正); Lai, Ying-Hui (賴穎暉); Su, Chung-Yen (蘇崇彥)
Oral Defense Date: 2021/06/15
Degree: Master
Department: Department of Electrical Engineering (電機工程學系)
Year of Publication: 2021
Academic Year: 109
Language: Chinese
Number of Pages: 47
Keywords: deep learning, convolutional neural network (CNN), image recognition, object detection
Research Methods: experimental design, comparative research, observational research
DOI URL: http://doi.org/10.6345/NTNU202100586
Thesis Type: Academic thesis
  • Owing to its powerful analytical capability, deep learning is often used as a tool for image recognition and object detection. To date, many well-known deep-learning models have been proposed, such as SENet, EfficientNet, DenseNet, MobileNet, ResNet, ShuffleNet, GhostNet, and YOLO.
    The performance of a deep-learning model can be evaluated mainly from four aspects: the number of parameters, the ability to analyze data, processing speed, and generalization ability. In general, it is difficult for a model to perform well in all four of these aspects.
    In this thesis, we design a deep-learning model, ExquisiteNetV2, that performs well in all of these aspects. We conduct experiments on fifteen credible image recognition datasets and one object detection dataset, using the well-known models mentioned above as baselines. We run the experiments with two different weight-update methods. According to the results, regardless of which weight-update method is used, ExquisiteNetV2 achieves the highest classification accuracy on more than half of the datasets. ExquisiteNetV2 has far fewer parameters than the other models, yet its data-analysis ability and computing speed are superior. Therefore, ExquisiteNetV2 is a high-performance lightweight convolutional neural network that can be applied to both image classification and object detection.

    Deep learning is often applied to object detection and image recognition because of its outstanding ability to analyze images. So far, many famous deep learning models have been proposed, such as SENet, EfficientNet, DenseNet, MobileNet, ResNet, ShuffleNet, GhostNet, YOLO and so on.
    There are four aspects of model performance to be evaluated, namely the number of parameters, the ability to analyze data, computing speed, and model generalization. Generally, it is difficult for a model to perform well in all four aspects.
    In this paper, we propose an outstanding model called ExquisiteNetV2, which performs well in all four aspects. We test ExquisiteNetV2 and the aforementioned models on fifteen credible image recognition datasets and one object detection dataset. According to the experimental results, ExquisiteNetV2 achieves the highest accuracy on more than half of the datasets. Moreover, ExquisiteNetV2 has the fewest parameters, and its ability to analyze data is better than that of the other models. The experimental results show that ExquisiteNetV2 is a high-performance lightweight model suitable for image recognition and object detection.
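    The lightweight networks compared above (e.g., MobileNet, ShuffleNet) keep their parameter counts low largely by replacing standard convolutions with depthwise-separable ones. A minimal back-of-the-envelope sketch of that saving follows; the layer sizes used here are illustrative and not taken from ExquisiteNetV2.

    ```python
    def conv_params(k, c_in, c_out, bias=False):
        """Weight count of a standard k x k convolution
        with c_in input and c_out output channels."""
        return k * k * c_in * c_out + (c_out if bias else 0)

    def depthwise_separable_params(k, c_in, c_out, bias=False):
        """Weight count of a depthwise k x k convolution
        (one filter per input channel) followed by a
        1 x 1 pointwise convolution."""
        depthwise = k * k * c_in + (c_in if bias else 0)
        pointwise = c_in * c_out + (c_out if bias else 0)
        return depthwise + pointwise

    # Illustrative layer: 3x3 convolution, 128 -> 256 channels
    std = conv_params(3, 128, 256)
    sep = depthwise_separable_params(3, 128, 256)
    print(std, sep, round(std / sep, 1))  # 294912 33920 8.7
    ```

    For this layer the substitution cuts the weights by roughly a factor of nine, which is why models built from such blocks can stay in the low millions of parameters.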

    Acknowledgments
    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1 Motivation and Purpose
      1.2 Thesis Organization
    Chapter 2  Related Work
      2.1 Introduction to ResNet
      2.2 Introduction to MobileNetV3
      2.3 Introduction to SE-ResNet
      2.4 Introduction to ShuffleNetV2
      2.5 Introduction to DenseNet
      2.6 Introduction to EfficientNet
      2.7 Introduction to GhostNet
      2.8 Introduction to ExquisiteNetV1
    Chapter 3  Design of ExquisiteNetV2
      3.1 Introduction to ExquisiteNetV2
      3.2 Extreme-Value Expansion Block
      3.3 Feature Condensation Block
      3.4 SE-LN Block
      3.5 Double-Residual SE-LN Block
      3.6 Complete Architecture of ExquisiteNetV2
    Chapter 4  Datasets
      4.1 Dataset Selection Criteria
      4.2 Texture Dataset: DTD
      4.3 Pediatric Chest X-Ray Dataset: Chest X-Ray Dataset
      4.4 Retinal Image Dataset: OCT
      4.5 Pediatric Leukemia Dataset: Leukemia
      4.6 Facial Emotion Dataset: Real-world Affective Faces Database
      4.7 Scene Dataset: Scene Categorization
      4.8 Fruit Dataset: Fruits-360
      4.9 Food Dataset: Food-101
      4.10 Online Shopping Dataset: Stanford Online Products
      4.11 257-Class Dataset: Caltech256
      4.12 10-Class Dataset: STL-10
      4.13 Bird Dataset: Caltech-UCSD Birds-200-2011
      4.14 Flower Dataset: VGG Flower Datasets
      4.15 Crop Seedling Dataset: Plant Seedlings DatasetV2
      4.16 Colored Digit Dataset: Synthetic Digits
    Chapter 5  Experimental Design and Results
      5.1 Experimental Setup
      5.2 Experiment Items
      5.3 Experimental Results
      5.4 Analysis of Experimental Results
      5.5 Limitations of ExquisiteNetV2
      5.6 Application of ExquisiteNetV2 to Object Detection
    Chapter 6  Conclusion and Future Work
    References
    Autobiography
    Academic Achievements

