Graduate student: | 劉又寧 Liu, You-Ning |
---|---|
Thesis title: | Unsupervised Clustering Based on Alpha-Divergence |
Advisor: | 黃聰明 Huang, Tsung-Ming |
Oral defense committee: | 黃聰明 Huang, Tsung-Ming; 陳建隆; 林敏雄 |
Oral defense date: | 2022/01/25 |
Degree: | Master |
Department: | Department of Mathematics |
Year of publication: | 2022 |
Academic year of graduation: | 110 |
Language: | English |
Number of pages: | 22 |
Keywords: | Alpha-Divergence, Deep Learning, Deep Clustering, Contrastive Learning, ResNet, Tsallis Entropy, KL Divergence, Shannon Entropy |
DOI URL: | http://doi.org/10.6345/NTNU202200195 |
Thesis type: | Academic thesis |
Recently, many deep learning methods have been proposed to learn representations or perform clustering without labelled data. Using the well-known ResNet[1] backbone as an effective feature extractor, we present an efficient deep clustering method that optimizes the data representation and learns the clustering map jointly. Despite the many successful applications of Kullback–Leibler divergence and Shannon entropy, we use alpha-divergence and Tsallis entropy as extensions of these common loss functions. For a more detailed interpretation, we further analyze the relation between clustering accuracy and the choice of alpha value. We achieve 53.96% test accuracy on the CIFAR-10[2] dataset and 27.24% accuracy on the CIFAR-100-20[2] dataset in unsupervised tasks.
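The sense in which alpha-divergence and Tsallis entropy extend Kullback–Leibler divergence and Shannon entropy can be illustrated numerically. The sketch below is not the thesis's loss function: the abstract does not state which parameterization is used, so it assumes the Amari form of the alpha-divergence for normalized distributions (as in Cichocki and Amari [45]) and the standard Tsallis definition. All function names are hypothetical, and both input distributions are assumed to be strictly positive and to sum to 1.

```python
import math


def shannon_entropy(p):
    """Shannon entropy -sum p_i ln p_i (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def tsallis_entropy(p, q):
    """Tsallis entropy (1 - sum p_i^q) / (q - 1); falls back to Shannon at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


def kl_divergence(p, r):
    """Kullback-Leibler divergence KL(p || r)."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r) if pi > 0)


def alpha_divergence(p, r, alpha):
    """Amari alpha-divergence (assumed form) for normalized, strictly positive p and r.

    D_alpha(p || r) = (sum p_i^alpha * r_i^(1 - alpha) - 1) / (alpha * (alpha - 1)).
    The limits alpha -> 1 and alpha -> 0 give KL(p || r) and KL(r || p).
    """
    if abs(alpha - 1.0) < 1e-12:
        return kl_divergence(p, r)
    if abs(alpha) < 1e-12:
        return kl_divergence(r, p)
    s = sum(pi ** alpha * ri ** (1.0 - alpha) for pi, ri in zip(p, r))
    return (s - 1.0) / (alpha * (alpha - 1.0))


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # illustrative distributions, not data from the thesis
    r = [0.5, 0.3, 0.2]
    for a in (0.5, 0.9, 0.999, 1.0):
        print(f"alpha={a}: D_alpha={alpha_divergence(p, r, a):.6f}")
    print(f"KL(p||r)={kl_divergence(p, r):.6f}")
    for q in (0.5, 0.9, 0.999, 1.0):
        print(f"q={q}: S_q={tsallis_entropy(p, q):.6f}")
    print(f"Shannon={shannon_entropy(p):.6f}")
```

Under these assumed definitions, the printed values approach the KL divergence and the Shannon entropy as alpha and q approach 1, which is why alpha can be treated as a tunable extension of the usual losses.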
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. arXiv:1512.03385, 2015.
[2] A. Krizhevsky and G. Hinton. Learning Multiple Layers of Features from Tiny Images. Master’s Thesis, Department of Computer Science, University of Toronto, 2009.
[3] A. Coates, A. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature learning. In AISTATS, pages 215–223, 2009.
[4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In IEEE CVPR, 2009.
[5] J. Yang, D. Parikh, and D. Batra. Joint unsupervised learning of deep representations and image clusters. In CVPR, pages 5147–5156, 2016.
[6] J. Wang, J. Wang, J. Song, X. Xu, H. Shen, and S. Li. Optimized cartesian k-means. IEEE Trans. Knowl. Data Eng., 27(1):180–192, 2015.
[7] L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. In NIPS, pages 1601–1608, 2004.
[8] K. Gowda and G. Krishna. Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition, 10(2):105–112, 1978.
[9] D. Cai, X. He, X. Wang, H. Bao, and J. Han. Locality preserving nonnegative matrix factorization. In IJCAI, pages 1010–1015, 2009.
[10] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In NIPS, pages 153–160, 2006.
[11] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 11:3371–3408, 2010.
[12] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434, 2015.
[13] M. Zeiler, D. Krishnan, G. Taylor, and R. Fergus. Deconvolutional networks. In CVPR, pages 2528–2535, 2010.
[14] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv:1312.6114, 2013.
[15] Junyuan Xie, Ross Girshick, and Ali Farhadi. Unsupervised deep embedding for clustering analysis. In ICML, pages 478–487, 2016.
[16] Jianlong Chang, Lingfeng Wang, Gaofeng Meng, Shiming Xiang, and Chunhong Pan. Deep adaptive image clustering. In IEEE ICCV, pages 5879–5887, 2017.
[17] Jianlong Wu, Keyu Long, Fei Wang, Chen Qian, Cheng Li, Zhouchen Lin, and Hongbin Zha. Deep Comprehensive Correlation Mining for Image Clustering. arXiv:1904.06925, 2019.
[18] Jiabo Huang, Shaogang Gong, and Xiatian Zhu. Deep Semantic Clustering by Partition Confidence Maximisation. In CVPR, pages 8849–8858, 2020.
[19] Chuang Niu, Jun Zhang, Ge Wang, and Jimin Liang. GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering. In ECCV, pages 735–751, 2020.
[20] Sungwon Han, Sungwon Park, Sungkyu Park, Sundong Kim, and Meeyoung Cha. Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification. ECCV 2020. Lecture Notes in Computer Science, vol 12369.
[21] Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans, and Luc Van Gool. SCAN: Learning to Classify Images without Labels. ECCV 2020. Lecture Notes in Computer Science, vol 12355.
[22] Yaling Tao, Kentaro Takagi, and Kouta Nakata. Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation. arXiv:2106.00131, ICLR 2021.
[23] Yunfan Li, Peng Hu, Zitao Liu, Dezhong Peng, Joey Tianyi Zhou, and Xi Peng. Contrastive Clustering. AAAI 2021 Conference Paper.
[24] Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, and Heng Huang. Nearest Neighbor Matching for Deep Clustering. In CVPR, pages 13693–13702, 2021.
[25] Tsung Wei Tsai, Chongxuan Li, and Jun Zhu. MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering. ICLR 2021 Conference Paper.
[26] Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, and Meeyoung Cha. Improving Unsupervised Image Clustering With Robust Learning. CVPR 2021 Conference Paper.
[27] Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, and Yuval Kluger. SpectralNet: Spectral Clustering using Deep Neural Networks. ICLR 2018 Conference Paper.
[28] Fengfu Li, Hong Qiao, Bo Zhang, and Xuanyang Xi. Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders. arXiv:1703.07980, 2017.
[29] Xifeng Guo, Long Gao, Xinwang Liu, and Jianping Yin. Improved Deep Embedded Clustering with Local Structure Preservation. IJCAI 2017 Conference Paper.
[30] Philip Haeusser, Johannes Plapp, Vladimir Golkov, Elie Aljalbout, and Daniel Cremers. Associative Deep Clustering: Training a Classification Network with No Labels. GCPR 2018 Conference Paper.
[31] Xu Ji, J. F. Henriques, and Andrea Vedaldi. Invariant Information Clustering for Unsupervised Image Classification and Segmentation. In IEEE, pages 9865–9874, 2021.
[32] Zhirong Wu, Yuanjun Xiong, Stella Yu, and Dahua Lin. Unsupervised feature learning via nonparametric instance discrimination. In IEEE, pages 3733–3742, 2018.
[33] Adrian Bulat, Enrique Sánchez-Lozano, and Georgios Tzimiropoulos. Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining. arXiv:2102.04442, 2021.
[34] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum Contrast for Unsupervised Visual Representation Learning. arXiv:1911.05722, 2019. In CVPR, 2020.
[35] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
[36] Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. In NeurIPS, 2020.
[37] Xinlei Chen and Kaiming He. Exploring Simple Siamese Representation Learning. arXiv:2011.10566, 2020.
[38] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709, 2020.
[39] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709, 2020.
[40] Chengyue Gong, Dilin Wang, and Qiang Liu. AlphaMatch: Improving Consistency for Semi-supervised Learning with Alpha-divergence. arXiv:2011.11779, 2020.
[41] Raia Hadsell, Sumit Chopra, and Yann LeCun. Dimensionality Reduction by Learning an Invariant Mapping. In CVPR, 2006.
[42] Uri Shaham and Roy Lederman. Common Variable Learning and Invariant Representation Learning using Siamese Neural Networks. Pattern Recognition, 74:52–63, 2018.
[43] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation Learning with Contrastive Predictive Coding. arXiv:1807.03748, 2018.
[44] Chuhan Wu, Fangzhao Wu, and Yongfeng Huang. Rethinking InfoNCE: How Many Negative Samples Do You Need? arXiv:2105.13003, 2021.
[45] Andrzej Cichocki and Shun-ichi Amari. Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities. Entropy, 12(6):1532–1568, 2010. https://doi.org/10.3390/e12061532
[46] Thomas M. Cover and Joy A. Thomas. Elements of information theory. John Wiley & Sons, 2012.
[47] Frank Nielsen. The α-divergences associated with a pair of strictly comparable quasi-arithmetic means. arXiv:2001.09660, 2020.
[48] Roberto J. V. dos Santos. Generalization of Shannon’s theorem for Tsallis entropy. J. Math. Phys., 38:4104, 1997.
[49] Sumiyoshi Abe. Axioms and uniqueness theorem for Tsallis entropy. Physics Letters A, 271(1–2):74–79, 2000.
[50] Shinto Eguchi and Shogo Kato. Entropy and Divergence Associated with Power Function and the Statistical Application. Entropy, 12(2):262–274, 2010. https://doi.org/10.3390/e12020262
[51] Feng Wang, Tao Kong, Rufeng Zhang, Huaping Liu, and Hang Li. Self-Supervised Learning by Estimating Twin Class Distributions. arXiv:2110.07402, 2021.
[52] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. arXiv:1412.6980, 2017.
[53] Elie Aljalbout, Vladimir Golkov, Yawar Siddiqui, Maximilian Strobel, and Daniel Cremers. Clustering with Deep Learning: Taxonomy and New Methods. arXiv:1801.07648, 2018.