Author: | 黃騰寬 Huang, Teng-Kuan |
---|---|
Thesis title: | 通過對比式學習改進半監督物件偵測 Enhancing Semi-Supervised Object Detection with Contrastive Learning |
Advisor: | 葉梅珍 Yeh, Mei-Chen |
Committee members: | 王鈺強 Wang, Yu-Chiang; 康立威 Kang, Li-Wei; 葉梅珍 Yeh, Mei-Chen |
Defense date: | 2023/07/24 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Publication year: | 2023 |
Academic year of graduation: | 111 (ROC calendar) |
Language: | Chinese |
Pages: | 19 |
Keywords (Chinese): | 半監督物件偵測、對比式學習 |
Keywords (English): | Semi-supervised Object Detection, Contrastive Learning |
DOI URL: | http://doi.org/10.6345/NTNU202301516 |
Thesis type: | Academic thesis |
In this thesis, we investigate how contrastive learning can improve semi-supervised object detection. Training on unlabeled data in semi-supervised object detection resembles self-supervised learning, so we draw on the contrastive learning frameworks commonly used in self-supervised learning and propose a representation-level contrastive learning method.
Semi-supervised object detection commonly relies on pseudo-labeling, whose core idea is that strongly and weakly augmented versions of an image should yield consistent predictions. Contrastive learning makes a similar assumption but does not require one augmentation to be strong and the other weak. We therefore take a semi-supervised object detection model based on pseudo-labeling and add a contrastive learning framework that supplies additional feature-level information during training. The method computes the similarity between the features produced from two augmented views of each unlabeled image. Experiments show that it improves mAP by 2.92%, 1.88%, and 0.99% when 1%, 5%, and 10% of the MS-COCO labeled data are used, respectively.
These results show that contrastive learning can capture additional feature-level information in semi-supervised object detection, and that adding it to the existing semi-supervised object detection pipeline requires little extra computation. We hope this work offers new ideas and directions for future research on semi-supervised learning and object detection.
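The abstract describes comparing the features of two augmented views of the same unlabeled image, but it does not state which similarity measure or loss the thesis uses. The following is only a minimal sketch under stated assumptions: it uses mean negative cosine similarity as a stand-in, and the function names, the plain-list feature format, and the `weight` parameter are hypothetical and not taken from the thesis.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_u * norm_v)


def contrastive_similarity_loss(
    feats_view_a: Sequence[Sequence[float]],
    feats_view_b: Sequence[Sequence[float]],
) -> float:
    """Mean negative cosine similarity over paired features.

    Assumption: feats_view_a[i] and feats_view_b[i] come from two
    augmentations of the same unlabeled image. The loss is lowest (-1)
    when every pair of features points in the same direction.
    """
    if len(feats_view_a) != len(feats_view_b) or not feats_view_a:
        raise ValueError("need the same non-zero number of features per view")
    sims = [cosine_similarity(a, b) for a, b in zip(feats_view_a, feats_view_b)]
    return -sum(sims) / len(sims)


def total_unlabeled_loss(
    pseudo_label_loss: float, contrastive_loss: float, weight: float = 1.0
) -> float:
    """Add the contrastive term to an existing pseudo-label loss.

    Assumption: a simple weighted sum; `weight` is a hypothetical
    hyperparameter, not a value reported in the thesis.
    """
    return pseudo_label_loss + weight * contrastive_loss


if __name__ == "__main__":
    view_a = [[1.0, 0.0], [0.5, 0.5]]
    view_b = [[2.0, 0.0], [0.5, 0.5]]
    loss = contrastive_similarity_loss(view_a, view_b)
    print(total_unlabeled_loss(pseudo_label_loss=0.8, contrastive_loss=loss, weight=0.5))
```

Because the extra term only needs one similarity computation per feature pair on top of the existing pseudo-labeling pass, a sketch like this is consistent with the abstract's remark that the added computation is small.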