Graduate student: | 高汎宜 Kao, Fan-Yi |
---|---|
Thesis title: | MiniNet:密集擠壓之深度可分離卷積於圖像分類 MiniNet: Dense Squeeze with Depthwise Separable Convolutions for Image Classification |
Advisor: | 曾繁勛 Tseng, Fan-Hsun |
Degree: | Master |
Department: | 科技應用與人力資源發展學系 Department of Technology Application and Human Resource Development |
Year of publication: | 2020 |
Academic year of graduation: | 108 (ROC calendar, 2019-2020) |
Language: | Chinese |
Number of pages: | 85 |
Keywords (Chinese): | 深度學習、卷積神經網路、深度可分離卷積、密集連接 |
Keywords (English): | Deep Learning, Convolutional Neural Network, Depthwise Separable Convolution, Densely Connected Convolutional Networks |
DOI URL: | http://doi.org/10.6345/NTNU202001540 |
Thesis type: | Academic thesis |
Artificial intelligence (AI) has developed rapidly in recent years, and deep learning has flourished since the convolutional neural network was proposed, with researchers contributing a steady stream of improved and innovative techniques. Compared with other scientific fields, deep learning research is conducted in a fully open manner. The Google Brain team released the open-source library TensorFlow, whose core library supports the high-level deep learning framework Keras and thereby helps developers build and train deep learning models. AI applications such as self-driving cars, unmanned shops, and smart cities are expected to become ubiquitous. To make such applications widely available, providing a neural network model that runs quickly at low computational cost on limited hardware has become an important research issue.
This thesis proposes MiniNet, a dense squeeze architecture with depthwise separable convolutions. MiniNet is based on the MobileNet architecture and integrates both dense connections and the Squeeze-and-Excitation concept of SENet. MiniNet is built and trained with Keras and compared against three existing convolutional neural network architectures on five different datasets. The experimental results show that MiniNet uses markedly fewer parameters and shortens training time, and that it achieves the highest accuracy among the compared architectures particularly when the dataset has few classes and few samples.
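To make the parameter-saving claim concrete, the following minimal sketch counts weights for the three building blocks named in the abstract: a depthwise separable convolution, a Squeeze-and-Excitation block, and dense concatenation. It is not the thesis's MiniNet configuration. The function names, the kernel size, the channel counts, the growth rate, and the reduction ratio of 16 are all illustrative assumptions, and biases are omitted.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k depthwise filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes channels (bias omitted)."""
    return k * k * c_in + c_in * c_out


def se_block_params(channels, reduction=16):
    """Two fully connected layers of a Squeeze-and-Excitation block.
    The reduction ratio is an assumed value, not taken from the thesis."""
    hidden = channels // reduction
    return channels * hidden + hidden * channels


def dense_block_input_channels(c_in, growth_rate, num_layers):
    """Input channels seen by each layer when every layer's output
    (growth_rate channels) is concatenated onto all earlier features."""
    return [c_in + i * growth_rate for i in range(num_layers)]


if __name__ == "__main__":
    k, c_in, c_out = 3, 64, 128  # hypothetical layer shape
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print("standard conv weights:", std)
    print("depthwise separable weights:", sep)
    print("SE block weights:", se_block_params(c_out))
    print("dense block input channels:", dense_block_input_channels(32, 16, 4))
```

For a k x k kernel the separable form needs k·k·C_in + C_in·C_out weights instead of k·k·C_in·C_out, which is where most of the saving comes from. The Squeeze-and-Excitation block adds comparatively few weights on top of that.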