簡易檢索 / 詳目顯示

研究生: 林益民
Yi-Ming Lin
論文名稱: 整合退火演算法與正交實驗設計法改善K-Means演算法之分類
Integrated Simulated Annealing and Orthogonal Experiment Design for Improving the K-Means for Classification
指導教授: 洪欽銘
Hong, Chin-Ming
學位類別: 碩士
Master
系所名稱: 工業教育學系
Department of Industrial Education
論文出版年: 2008
畢業學年度: 97
語文別: 中文
論文頁數: 53
中文關鍵詞: k-means演算法正交實驗設計模擬退火法
英文關鍵詞: k-means algorithm, orthogonal experimental design, simulated annealing algorithm
論文種類: 學術論文
相關次數: 點閱:251下載:15
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 在資料探勘研究領域中,資料分群的方法扮演相當重要的角色,被廣泛應在不同的領域中。一個好的分群演算法能快速而正確的分群,進而可以突顯群集的特性,以便後續的資料處理分析動作。k-means是最為常用的分群演算法,雖具有快速分群以及簡單運用之特性,但也存在許多缺點導致其分群效果未必能達成最佳的分群結果。本文為了增進k-means的分群結果,因此提出以模擬退火演算法為基礎的分群方式,此方法簡稱為HSAKM。
    首先找出第一次k-means分群後之中心點,在每群中找最接近原中心點的m個資料點作為新的中心,再經過退火擾動分群動作,可增進K-Means分群的效能。本研究將與K-Means以及與其它的分群分群演算法相互比較,以驗證所提的出方法較能有更好的效能。

    The data clustering acted as crucial role for data mining, and it is used in more different fields extensively. A good clustering algorithm can quickly and correctly for partition the dataset into clusters. Also, it can point out the cluster characteristics, and then may be provided for further data analysis work. The k-means algorithm is one of the most commonly used clustering algorithms. However, the k-means clustering result doesn’t reach the best clustering result, so in order to promote the k-means algorithm to get the best or near-best clustering result. Therefore, we propose an integrated approach based on SA algorithm called HSAKM for data clustering.
    First k-means clustering to find cluster centers. Then search m data with the shortest distance to the cluster center of each cluster as new cluster centers. Moreover, perform the SA clustering to improving the efficiency for the k-means. Furthermore, compare HSAKM algorithm with each other clustering algorithms to demonstrate our proposed clustering algorithm is much better than others.

    中文摘要.......................................................I 英文摘要...................................................... II 目錄.......................................................... III 圖目錄........................................................V 表目錄........................................................ VI 第一章 緒論....................................................1 1.1 研究背景與動機........................................2 1.2 研究目的..............................................3 1.3 研究步驟..............................................4 第二章 相關研究................................................5 2.1 資料分群技術用........................................5 2.2 K-Means 演算法......................................8 2.3 模擬退火演算法........................................9 2.3.1 模擬退火演算法介紹............................. 10 2.3.2 模擬退火演算法的擾動機制.......................11 2.3.3 模擬退火法流程.................................12 第三章 研究方法...............................................14 3.1正交實驗設計...........................................14 3.1.1 正交實驗設計法之理論...........................15 3.1.2 正交表的特性與產生方...........................17 3.1.3 因素分析.......................................18 3.2增強退火演算法搜尋能力.................................19 3.2.1 影響退火演算法效能的因素.......................20 3.2.2 柯西勞倫茲機率分布.............................21 3.2.3 採用實驗設計法的區域搜尋產生新解...............22 3.2.4 HSAKM 流程步驟...............................23 第四章 實驗分析與分析.........................................26 4.1 實驗資料與參數設計....................................26 4.1.1 人工資料.......................................26 4.1.2 實際資料.......................................28 4.2 資料庫操作與結果......................................32 第五章 結論與未來研究方向.....................................48

    [1] UCI(University of California,Irvine) Repository of Machine Learning Databases.
    Available:ftp://ftp.ics.uci.edu/pub/machine-learning-databases/
    [2] W.T.A. Lopes, Madeiro,M.S. Alencar, and B.G. Aguiar Neto, “Simulated Annealing for Robust VQ: Improving Image Transmission through a Fading Channel,”IEEE Computer Society Press,VI Brazilian Symposium on Neural Networks,pp.243-248,Rio de Janeiro,Brazil,2000.
    [3] J. Ning, S. McClean, and K. Cranley, “3D Reconstruction from Two Orthogonal Views Using Simulated Annealing Approach,”IEEE Computer Society Press,Third International Conference on 3-D Digital Imaging and Modeling, pp.309-319, 2001.
    [4] X.Y. Wang, J.M. Garibaldi, “Simulated Annealing Fuzzy Clustering in Cancer Diagnosis,”Information,Vol.29,pp.61-70,Number 2005.
    [5] N. Metropolis, A.W. Rosenbluth, A. Teller, and E. Teller, “Wquation of State Calculation by Fast computing Machines,”Journal of Chemical Physics, Vol.21, No. 6, pp. 1087-1092,1953.
    [6] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, “Optimization by Simulated Annealing,”Science, vol.220, N0.4589, pp.671-680, 1983.
    [7] Q. Zhang and Y.W. Leung, “An orthogonal genetic algorithm for multimedia multicast routine,”IEEE Trans, Evolutionary Computation, col.3, no.1, pp.41-53, Apr. 1998.
    [8] Y.W. Leung, Y. Wang, “An Orthogonal Genetic algorithm with Quantization for Global Numerical Optimization,”IEEE Trans. Evolutionary Computers. , Vol.5, pp.41-53,Feb. 2001.
    [9] J.T. Tsai, J.H Chou, and T.K. Lin, “Tuning the Structure and Parameters of a Neural Network by using Hybrid Taguchi-genetic Algorithm,”IEEE Trans. Neural Networks,Vol.17, pp.69-80, Jan. 2006.
    [10] S.Y. HO, Y.K. Lin, “OSA: Orthogonal Simulated Annealing Algorithm and Its Application to Designing Mixed / Optimal Controllers,”IEEE Trans. Syst., Man, Cybern. A,Vol. 34,pp.588-600, Sept. 2004.
    [11] S.Y Ho , Y.K. Lin, “Orthogonal Simulated Annealing for Floorplanning,”Artificial Intelligence and Application Conference,pp.469-474, 2002.
    [12] D.C. Montgomery, Design and Analysis of Experiments,3rd ed. New York: Wiley, 1991.
    [13] C.R. Hicks, Fundamental Concepts in the Design of Experiments,4th ed .TX:Saunders College Publishing, 1993.
    [14] T.S. Chen, Y.T. Chen, C.C. Lin,and R.C. Chen, “A combined K-Means and Hierarchical Clustering Method for Improving the Clustering Efficiency of Microarray,” Intelligent Signal Processing and Communications Systems, pp.405-408, 2005.
    [15] M.H. Dunham, Data Ming: Introductory and Advanced Topics, Prentice Hall, 2003.
    [16] R.J. Roiger, M.W. Geatz, Data Mining:A Tutorial-Based Primer,Addision Wesley, 2003.
    [17] H. Szu, R. Hartley,“Fast simulated annealing,”Phys.Lett. , Vol. 122,pp.157-162, 1987.
    [18] V. Petridis, S. Kazarlis, and A. Bakirtzis, “Varying fitness function in genetic algorithm constrained optimization:The cutting stock and unit commitment problems,”IEEE Trans. Syst. , Man,Cybern.B, Vol. 28,pp.629-540, Oct.1998.
    [19] P.S. Shelokar, V. K. Jayaraman, B. D. Kulkarni, “An ant colony approach for cluster,”Analytica Chimica Acta, Vol.509, Issue 2, pp. 187-195,May. 2004.
    [20] Lizhong Xiao, Zhiqing Shao, Gang Liu, “K-means Algorithm Based on Particle Swarm Optimization Algorithm for Anomaly Intrusion Detection,” Intelligent Control and Automation Vol.2, pp.5854-5858, June. 2006
    [21] Fun Ye, C.Y. Chen, “Alternative KPSO-Clustering Algorithm,” Journal of Science and Engineering, Vol. 8, No 2, pp.165- 174, 2005.
    [22] Ujjwal Maulik, Sanghamitra Bandyopadhyay, “Genetic algorithm- based clustering technique,”Pattern Recognition, Vol. 33, pp.1455- 1465, 2000.
    [23] Ujjwal Maulik, Sanghamitra Bandyopadhyay, “Cluster Using Simulated Annealing with Probabilistic Redistribution,” Journal of Pattern Recognition and Artificial Intelligence, Vol.15, pp.269- 285, 2001.
    [24] Zülal Güngör, Alper Ünler, “K-harmonic means data clustering with simulated annealing heuristic,”Applied Mathematics and Computation, Vol. 184, Issue 2, 15, pp.199-209, Jane. 2007.
    [25] Krista Rizman Žalik, “An efficient k′-means clustering algorithm,”Pattern Recognition Letters, Volume 29, Issue 9, 1, pp. 1385-1391,July 2008.
    [26] Adil M. Bagirov, “Modified global k-means algorithm for minimum sum-of-squares clustering problems,” Pattern Recognition, Vol.41, Issue 10, pp. 3192-3199, Oct. 2008.
    [27] 林育臣(2002),群聚技術之研究,碩士論文,朝陽科技大學資訊管理系,台中。

    下載圖示
    QR CODE