| Field | Value |
|---|---|
| Graduate student | 蔡侑霖 (Tsai, You-Lin) |
| Thesis title | 透過可控的知識蒸餾區域改進細粒度影像辨識的準確率 (Improve Fine-grained Visual Classification Accuracy by Controllable Location Knowledge Distillation) |
| Advisor | 林政宏 (Lin, Cheng-Hung) |
| Oral defense committee | 林政宏 (Lin, Cheng-Hung); 劉一宇 (Liu, Yi-Yu); 賴穎暉 (Lai, Ying-Hui) |
| Defense date | 2024/07/22 |
| Degree | Master |
| Department | Department of Electrical Engineering |
| Publication year | 2024 |
| Graduation academic year | 112 |
| Language | Chinese |
| Pages | 47 |
| Chinese keywords | 細粒度影像辨識, 知識蒸餾, 類別激活映射圖 |
| English keywords | Fine-grained Visual Classification, Knowledge Distillation, CAM |
| Research method | Experimental design |
| DOI | http://doi.org/10.6345/NTNU202401564 |
| Thesis type | Academic thesis |
Abstract:

State-of-the-art neural network models achieve impressive performance, but their large architectures make them difficult to deploy on edge devices. Knowledge distillation addresses this shortcoming by transferring the features learned by a large network to a simpler model, effectively reducing model complexity. While past approaches achieve good transfer results, few distillation methods have been designed specifically for fine-grained visual classification. In this thesis, we add to the distillation process a strategy that adjusts the model's attention distribution over image regions, so that knowledge transfer focuses on the fine-grained features of images. The strategy adjusts the spatial characteristics of the class activation heatmaps: it enhances secondary feature regions, making them better suited as distilled knowledge, and filters out low-response regions, effectively reducing noise and improving learning focus. Experiments on multiple fine-grained datasets show that on CUB200-2011 our approach improves accuracy by 4.86% over the original non-distilled model and by 1.05% over conventional knowledge distillation.
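The pipeline the abstract describes can be sketched in three pieces: a class activation map (CAM, Zhou et al., 2016) from the final convolutional features, a heatmap adjustment that suppresses low-response regions and lifts secondary regions, and the standard Hinton-style distillation loss. The sketch below is a minimal numpy illustration, not the thesis's exact formulation; in particular the `low_thresh` and `gamma` values and the power-transform adjustment in `adjust_cam` are illustrative assumptions about how "enhance secondary regions, filter low-feedback regions" might be realized.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """CAM (Zhou et al., 2016): weighted sum of final conv feature maps.
    features: (C, H, W) activations; class_weights: (C,) FC weights
    for the target class. Result is min-max normalized to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=1)  # -> (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

def adjust_cam(cam, low_thresh=0.2, gamma=0.5):
    """Illustrative adjustment in the spirit of the abstract:
    zero out low-activation (noisy) regions, then apply a concave
    power transform (gamma < 1) that lifts secondary regions toward
    the peak. Threshold and gamma are assumed values, not the
    thesis's actual parameters."""
    out = np.where(cam < low_thresh, 0.0, cam)
    return np.power(out, gamma)

def softmax(z, T=1.0):
    """Temperature-scaled softmax."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Hinton et al. (2015) distillation loss:
    KL(teacher_T || student_T) scaled by T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

In a full training loop, the adjusted teacher heatmap would weight or mask the feature regions over which the student is supervised, alongside the usual cross-entropy and `kd_loss` terms; that combination is where the thesis's controllable-location contribution lives.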
(ImageNet) Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. International journal of computer vision, 115, 211-252.
(ImageNet) Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.
(RNN) Sherstinsky, A. (2020). Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenomena, 404, 132306.
(Detection) Zaidi, S. S. A., Ansari, M. S., Aslam, A., Kanwal, N., Asghar, M., & Lee, B. (2022). A survey of modern deep learning based object detection models. Digital Signal Processing, 126, 103514.
(RPN) Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28.
(DeepLAC) Lin, D., Shen, X., Lu, C., & Jia, J. (2015). Deep lac: Deep localization, alignment and classification for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1666-1674).
(P-RCNN) Zhang, N., Donahue, J., Girshick, R., & Darrell, T. (2014). Part-based R-CNNs for fine-grained category detection. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13 (pp. 834-849). Springer International Publishing.
Liu, J., Kanazawa, A., Jacobs, D., & Belhumeur, P. (2012). Dog breed classification using part localization. In Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I 12 (pp. 172-185). Springer Berlin Heidelberg.
Yang, S., Bo, L., Wang, J., & Shapiro, L. (2012). Unsupervised template learning for fine-grained object recognition. Advances in neural information processing systems, 25.
Branson, S., Van Horn, G., Belongie, S., & Perona, P. (2014). Bird species categorization using pose normalized deep convolutional nets. arXiv preprint arXiv:1406.2952.
(PS-CNN) Huang, S., Xu, Z., Tao, D., & Zhang, Y. (2016). Part-stacked CNN for fine-grained visual categorization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1173-1182).
Ge, Z., McCool, C., Sanderson, C., & Corke, P. (2015). Subset feature learning for fine-grained category classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 46-52).
(MixDCNN) Ge, Z., Bewley, A., McCool, C., Corke, P., Upcroft, B., & Sanderson, C. (2016, March). Fine-grained classification via mixture of deep convolutional neural networks. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 1-6). IEEE.
(OR-Loss) Zhang, S., Du, R., Chang, D., Ma, Z., & Guo, J. (2021, July). Knowledge transfer based fine-grained visual classification. In 2021 IEEE International Conference on Multimedia and Expo (ICME) (pp. 1-6). IEEE.
(MGD) Wang, D., Shen, Z., Shao, J., Zhang, W., Xue, X., & Zhang, Z. (2015). Multiple granularity descriptors for fine-grained categorization. In Proceedings of the IEEE international conference on computer vision (pp. 2399-2406).
(AFGC) Sermanet, P., Frome, A., & Real, E. (2014). Attention for fine-grained categorization. arXiv preprint arXiv:1412.7054.
(FCN) Liu, X., Xia, T., Wang, J., Yang, Y., Zhou, F., & Lin, Y. (2016). Fully convolutional attention networks for fine-grained recognition. arXiv preprint arXiv:1603.06765.
(TransFG) He, J., Chen, J. N., Liu, S., Kortylewski, A., Yang, C., Bai, Y., & Wang, C. (2022, June). Transfg: A transformer architecture for fine-grained recognition. In Proceedings of the AAAI conference on artificial intelligence (Vol. 36, No. 1, pp. 852-860).
(MGECNN) Zhang, L., Huang, S., Liu, W., & Tao, D. (2019). Learning a mixture of granularity-specific experts for fine-grained categorization. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 8331-8340).
(ADL) Choe, J., & Shim, H. (2019). Attention-based dropout layer for weakly supervised object localization. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2219-2228).
(ACoL) Zhang, X., Wei, Y., Feng, J., Yang, Y., & Huang, T. S. (2018). Adversarial complementary learning for weakly supervised object localization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1325-1334).
Ba, J., & Caruana, R. (2014). Do deep nets really need to be deep? Advances in neural information processing systems, 27.
(KD) Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
(CAM) Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2921-2929).
(ResNet) He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
(Batch Normalization) Ioffe, S., & Szegedy, C. (2015, June). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning (pp. 448-456). PMLR.
(KDsurvey) Gou, J., Yu, B., Maybank, S. J., & Tao, D. (2021). Knowledge distillation: A survey. International Journal of Computer Vision, 129(6), 1789-1819.
(Pruning) Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. Advances in neural information processing systems, 28.
(Inception) Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).
(VggNet) Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
(BinaryConnect) Courbariaux, M., Bengio, Y., & David, J. P. (2015). Binaryconnect: Training deep neural networks with binary weights during propagations. Advances in neural information processing systems, 28.
(Q-CNN) Wu, J., Leng, C., Wang, Y., Hu, Q., & Cheng, J. (2016). Quantized convolutional neural networks for mobile devices. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4820-4828).
(DeiT) Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jégou, H. (2021, July). Training data-efficient image transformers & distillation through attention. In International conference on machine learning (pp. 10347-10357). PMLR.
You, S., Xu, C., Xu, C., & Tao, D. (2017, August). Learning from multiple teacher networks. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1285-1294).
(Softmax) Bridle, J. (1989). Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters. Advances in neural information processing systems, 2.
(CrossEntropy) Shannon, C. E. (1948). A mathematical theory of communication. The Bell system technical journal, 27(3), 379-423.
Zhang, F., Zhu, X., & Ye, M. (2019). Fast human pose estimation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 3517-3526).
(ViT) Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
(Fitnets) Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., & Bengio, Y. (2014). Fitnets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550.
(FSP) Yim, J., Joo, D., Bae, J., & Kim, J. (2017). A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4133-4141).
(DFV) Lee, S. H., Kim, D. H., & Song, B. C. (2018). Self-supervised knowledge distillation using singular value decomposition. In Proceedings of the European conference on computer vision (ECCV) (pp. 335-350).
(Darkrank) Chen, Y., Wang, N., & Zhang, Z. (2018, April). Darkrank: Accelerating deep metric learning via cross sample similarities transfer. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1).
(LP) Chen, H., Wang, Y., Xu, C., Xu, C., & Tao, D. (2020). Learning student networks via feature embedding. IEEE Transactions on Neural Networks and Learning Systems, 32(1), 25-35.
(AT) Zagoruyko, S., & Komodakis, N. (2016). Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv preprint arXiv:1612.03928.
(RKD) Park, W., Kim, D., Lu, Y., & Cho, M. (2019). Relational knowledge distillation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 3967-3976).
(WRN) Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. arXiv preprint arXiv:1605.07146.
(SelfKD) Zhang, L., Song, J., Gao, A., Chen, J., Bao, C., & Ma, K. (2019). Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 3713-3722).
(SAD) Hou, Y., Ma, Z., Liu, C., & Loy, C. C. (2019). Learning lightweight lane detection cnns by self attention distillation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 1013-1021).
(SD) Yang, C., Xie, L., Su, C., & Yuille, A. L. (2019). Snapshot distillation: Teacher-student optimization in one generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2859-2868).
(SKD) Hahn, S., & Choi, H. (2019). Self-knowledge distillation in natural language processing. arXiv preprint arXiv:1908.01851.
(Pytorch) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., ... & Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32.
(CUB200-2011) Wah, C., Branson, S., Welinder, P., Perona, P., & Belongie, S. (2011). The caltech-ucsd birds-200-2011 dataset.
(Stanford Dogs) Khosla, A., Jayadevaprakash, N., Yao, B., & Li, F. F. (2011, June). Novel dataset for fine-grained image categorization: Stanford dogs. In Proc. CVPR workshop on fine-grained visual categorization (FGVC) (Vol. 2, No. 1).
(Stanford Cars) Krause, J., Stark, M., Deng, J., & Fei-Fei, L. (2013). 3d object representations for fine-grained categorization. In Proceedings of the IEEE international conference on computer vision workshops (pp. 554-561).
(FGVC Aircraft) Maji, S., Rahtu, E., Kannala, J., Blaschko, M., & Vedaldi, A. (2013). Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151.
(NABirds) Van Horn, G., Branson, S., Farrell, R., Haber, S., Barry, J., Ipeirotis, P., ... & Belongie, S. (2015). Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 595-604).
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision (pp. 1026-1034).