Graduate Student: Liu, You-Ning (劉又寧)
Thesis Title: Unsupervised Clustering Based on Alpha-Divergence
Advisor: Huang, Tsung-Ming (黃聰明)
Committee Members: Huang, Tsung-Ming (黃聰明), 陳建隆, 林敏雄
Date of Oral Defense: 2022/01/25
Degree: Master
Department: Department of Mathematics
Year of Publication: 2022
Academic Year of Graduation: 110
Language: English
Number of Pages: 22
English Keywords: Alpha-Divergence, Deep Learning, Deep Clustering, Contrastive Learning, ResNet, Tsallis Entropy, KL Divergence, Shannon Entropy
DOI URL: http://doi.org/10.6345/NTNU202200195
Document Type: Academic thesis

Recently, many deep learning methods have been proposed to learn representations or to cluster data without labels. Using the well-known ResNet[1] backbone as an effective feature extractor, we present an efficient deep clustering method that jointly optimizes the data representation and learns the clustering map. While the Kullback–Leibler divergence and Shannon entropy have been applied successfully in many settings, we adopt the alpha-divergence and Tsallis entropy as extensions of these common loss functions. For a more detailed interpretation, we further analyze the relation between clustering accuracy and different alpha values. Our method achieves 53.96% test accuracy on the CIFAR-10[2] dataset and 27.24% accuracy on the CIFAR-100-20[2] dataset in the unsupervised setting.
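For context, the two generalized quantities named in the abstract are commonly defined as follows; this is a reference sketch only, and the thesis may use a different but equivalent parameterization of the alpha-divergence:

    D_\alpha(P \,\|\, Q) = \frac{1}{\alpha(1-\alpha)} \Big( 1 - \sum_i p_i^{\alpha} q_i^{1-\alpha} \Big), \qquad \alpha \neq 0, 1,

    S_\alpha(P) = \frac{1}{\alpha - 1} \Big( 1 - \sum_i p_i^{\alpha} \Big), \qquad \alpha \neq 1.

In the limit \alpha \to 1, D_\alpha(P \| Q) recovers the Kullback–Leibler divergence \mathrm{KL}(P \| Q) and S_\alpha(P) recovers the Shannon entropy -\sum_i p_i \log p_i, which is why the abstract describes them as extensions of the usual loss functions.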

List of Figures ii
List of Tables ii
1. Introduction 2
2. Related Works 4
2.1. Contrastive Learning 4
2.2. Deep Clustering 5
2.3. Alpha-Divergence 7
2.4. Tsallis Entropy 8
3. Method 9
3.1. Problem Formulation 9
3.2. Representation Learning 9
3.3. Cluster Assignment 10
3.4. Loss Function 11
4. Experiment 13
4.1. Datasets 13
4.2. Implementation Details 13
4.3. Evaluation Metrics 13
4.4. Results 14
5. Ablation Study 16
5.1. Effect of each loss term 16
5.2. Effect of α Values 18
6. Conclusion 19
References 20

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. arXiv:1512.03385, 2015.
[2] A. Krizhevsky and G. Hinton. Learning Multiple Layers of Features from Tiny Images. Master’s Thesis, Department of Computer Science, University of Toronto, 2009.
[3] A. Coates, A. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature
learning. In AISTATS, pages 215–223, 2009.
[4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale
hierarchical image database. In IEEE CVPR, 2009.
[5] J. Yang, D. Parikh, and D. Batra. Joint unsupervised learning of deep representations and
image clusters. In CVPR, pages 5147–5156, 2016.
[6] J. Wang, J. Wang, J. Song, X. Xu, H. Shen, and S. Li. Optimized cartesian k-means. IEEE Trans. Knowl. Data Eng., 27(1):180–192, 2015.
[7] L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. In NIPS, pages 1601–1608, 2004.
[8] K. Gowda and G. Krishna. Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition, 10(2):105–112, 1978.
[9] D. Cai, X. He, X. Wang, H. Bao, and J. Han. Locality preserving nonnegative matrix factorization. In IJCAI, pages 1010–1015, 2009.
[10] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In NIPS, pages 153–160, 2006.
[11] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 11:3371–3408, 2010.
[12] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434, 2015.
[13] M. Zeiler, D. Krishnan, G. Taylor, and R. Fergus. Deconvolutional networks. In CVPR, pages 2528–2535, 2010.
[14] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv:1312.6114,
2013.
[15] Junyuan Xie, Ross Girshick, and Ali Farhadi. Unsupervised deep embedding for clustering analysis. In ICML, pages 478–487, 2016.
[16] Jianlong Chang, Lingfeng Wang, Gaofeng Meng, Shiming Xiang, and Chunhong Pan. Deep adaptive image clustering. In IEEE ICCV, pages 5879–5887, 2017.
[17] Jianlong Wu, Keyu Long, Fei Wang, Chen Qian, Cheng Li, Zhouchen Lin, and Hongbin Zha.
Deep Comprehensive Correlation Mining for Image Clustering. arXiv:1904.06925, 2019.
[18] Jiabo Huang, Shaogang Gong, and Xiatian Zhu. Deep Semantic Clustering by Partition Confidence Maximisation. In CVPR, pages 8849–8858, 2020.
[19] Chuang Niu, Jun Zhang, Ge Wang, and Jimin Liang. GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering. In ECCV, pages 735–751, 2020.
[20] Sungwon Han, Sungwon Park, Sungkyu Park, Sundong Kim, and Meeyoung Cha. Mitigating
Embedding and Class Assignment Mismatch in Unsupervised Image Classification. ECCV
2020. Lecture Notes in Computer Science, vol 12369.
[21] Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans, and
Luc Van Gool. SCAN: Learning to Classify Images without Labels. ECCV 2020. Lecture
Notes in Computer Science, vol 12355.
[22] Yaling Tao, Kentaro Takagi, and Kouta Nakata. Clustering-friendly Representation Learning
via Instance Discrimination and Feature Decorrelation. arXiv:2106.00131, ICLR 2021.
[23] Yunfan Li, Peng Hu, Zitao Liu, Dezhong Peng, Joey Tianyi Zhou, and Xi Peng. Contrastive
Clustering. AAAI 2021 Conference Paper.
[24] Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, and Heng Huang. Nearest Neighbor Matching for Deep Clustering. In CVPR, pages 13693–13702, 2021.
[25] Tsung Wei Tsai, Chongxuan Li, and Jun Zhu. MiCE: Mixture of Contrastive Experts for
Unsupervised Image Clustering. ICLR 2021 Conference Paper.
[26] Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, and Meeyoung Cha. Improving Unsupervised Image Clustering With Robust Learning. CVPR 2021 Conference Paper.
[27] Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, and Yuval Kluger. SpectralNet: Spectral Clustering using Deep Neural Networks. ICLR 2018 Conference Paper.
[28] Fengfu Li, Hong Qiao, Bo Zhang, and Xuanyang Xi. Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders. arXiv:1703.07980, 2017.
[29] Xifeng Guo, Long Gao, Xinwang Liu, and Jianping Yin. Improved Deep Embedded Clustering
with Local Structure Preservation. IJCAI 2017 Conference Paper.
[30] Philip Haeusser, Johannes Plapp, Vladimir Golkov, Elie Aljalbout, and Daniel Cremers. Associative Deep Clustering: Training a Classification Network with No Labels. GCPR 2018 Conference Paper.
[31] Xu Ji, J. F. Henriques, and Andrea Vedaldi. Invariant Information Clustering for Unsupervised Image Classification and Segmentation. In IEEE ICCV, pages 9865–9874, 2019.
[32] Zhirong Wu, Yuanjun Xiong, Stella Yu, and Dahua Lin. Unsupervised feature learning via nonparametric instance discrimination. In IEEE CVPR, pages 3733–3742, 2018.
[33] Adrian Bulat, Enrique Sánchez-Lozano, and Georgios Tzimiropoulos. Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining. arXiv:2102.04442, 2021.
[34] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum Contrast for Unsupervised Visual Representation Learning. In CVPR, 2020. arXiv:1911.05722.
[35] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
[36] Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand
Joulin. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. In
NeurIPS, 2020.
[37] Xinlei Chen, and Kaiming He. Exploring Simple Siamese Representation Learning.
arXiv:2011.10566, 2020.
[38] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709, 2020.
[39] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A Simple Framework for Contrastive Learning of Visual Representations. arXiv:2002.05709, 2020.
[40] Chengyue Gong, Dilin Wang, and Qiang Liu. AlphaMatch: Improving Consistency for Semisupervised Learning with Alpha-divergence. arXiv:2011.11779, 2020.
[41] Raia Hadsell, Sumit Chopra, and Yann LeCun. Dimensionality Reduction by Learning an
Invariant Mapping. In CVPR, 2006.
[42] Uri Shaham, and Roy Lederman. Common Variable Learning and Invariant Representation Learning using Siamese Neural Networks. Pattern Recognition, 74:52–63, 2018.
[43] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation Learning with Contrastive
Predictive Coding. arXiv:1807.03748, 2018.
[44] Chuhan Wu, Fangzhao Wu, and Yongfeng Huang. Rethinking InfoNCE: How Many Negative
Samples Do You Need? arXiv:2105.13003, 2021.
[45] Andrzej Cichocki, and Shun-ichi Amari. Families of Alpha-, Beta- and Gamma-Divergences: Flexible and Robust Measures of Similarities. Entropy, 12(6):1532–1568, 2010. https://doi.org/10.3390/e12061532
[46] Thomas M. Cover and Joy A. Thomas. Elements of information theory. John Wiley & Sons,
2012.
[47] Frank Nielsen. The α-divergences associated with a pair of strictly comparable quasi-arithmetic means. arXiv:2001.09660, 2020.
[48] Roberto J. V. dos Santos. Generalization of Shannon’s theorem for Tsallis entropy. J. Math.
Phys. 38, 4104, 1997.
[49] Sumiyoshi Abe. Axioms and uniqueness theorem for Tsallis entropy. Physics Letters A, 271(1–2):74–79, 2000.
[50] Shinto Eguchi, and Shogo Kato. Entropy and Divergence Associated with Power Function and the Statistical Application. Entropy, 12(2):262–274, 2010. https://doi.org/10.3390/e12020262
[51] Feng Wang, Tao Kong, Rufeng Zhang, Huaping Liu, and Hang Li. Self-Supervised Learning
by Estimating Twin Class Distributions. arXiv:2110.07402, 2021.
[52] Diederik P. Kingma, and Jimmy Ba. Adam: A Method for Stochastic Optimization.
arXiv:1412.6980, 2017.
[53] Elie Aljalbout, Vladimir Golkov, Yawar Siddiqui, Maximilian Strobel, and Daniel Cremers.
Clustering with Deep Learning: Taxonomy and New Methods. arXiv:1801.07648, 2018.
