Graduate student: Tsai, Yung-Ming (蔡詠名)
Thesis title: An Unsupervised Machine Learning Approach for Multi-Dimensional Network Data Mining (應用非監督式機器學習於多維度路網資料之探勘)
Advisor: Chang, Kuo-Chen (張國楨)
Degree: Master's
Department: Department of Geography (地理學系)
Year of publication: 2020
Academic year of graduation: 108
Language: English
Number of pages: 48
Chinese keywords: 路網分析, 多維度, 非監督式機器學習, K-Medoids
English keywords: network analysis, multi-dimension, unsupervised machine learning, K-Medoids
DOI URL: http://doi.org/10.6345/NTNU202000822
Thesis type: Academic thesis
  • In recent years, advances in intelligent transportation systems (ITS), the Internet of Things (IoT), and wireless communication technology, together with government support for open data, have made a large amount of traffic data available. These data contain detailed spatial and temporal information, and even features with more complex data dimensions. Extracting the useful information hidden within them requires a multi-dimensional data analysis technique.
    This study designs an unsupervised machine learning approach for multi-dimensional network data. The algorithm adopts the concepts of the network weight matrix and the space-time matrix to calculate multi-dimensional distances in network space. These distances are combined with the K-Medoids algorithm, which can handle discrete data, to build a clustering algorithm. Two methods address the sensitivity of K-Medoids to the initial seeds and to the value of K. First, seeds are generated by systematic sampling, which reduces the randomness of the algorithm. Second, a cluster splitting and merging method compensates for poor seed selection in the initial phase.
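The seeding and clustering steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the multi-dimensional distances have already been computed into a matrix `dist`, it omits the cluster splitting and merging step, and the helper names `systematic_seeds`, `assign`, and `k_medoids` as well as the toy data are hypothetical.

```python
def systematic_seeds(n_items, k):
    """Return k evenly spaced item indices; no random draw is involved."""
    step = n_items / k
    return [int(i * step + step / 2) for i in range(k)]


def assign(dist, medoids):
    """Label each item with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(dist, k, max_iter=100):
    """Cluster items from a precomputed symmetric distance matrix (k <= n)."""
    medoids = systematic_seeds(len(dist), k)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total
            # distance to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical toy data: six items on a line, forming two obvious groups.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]
medoids, labels = k_medoids(dist, k=2)
```

Because the seeds come from fixed positions rather than a random draw, repeated runs on the same matrix return the same medoids and labels.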
    In a case study of highway traffic clustering, the algorithm demonstrates several advantages. First, it is consistent and robust: because systematic sampling removes the randomness from seed generation, repeated experiments with the same inputs and parameters give the same results. Second, it respects the topology of the highway network: features that are close in space but distant in network space are not assigned to the same cluster. The algorithm can also recognize cross-system traffic patterns. Finally, the clustering results show that the algorithm can identify differences in the temporal dimension and in the data dimension of traffic: features that share distinctive temporal and traffic patterns are grouped together.
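The difference between spatial proximity and network proximity can be made concrete with a small sketch. It assumes network distance is measured as shortest-path length, computed here with Dijkstra's algorithm; the thesis may define its network weight matrix differently. The toy network, its coordinates, and its edge lengths are hypothetical.

```python
import heapq
import math


def network_distance(graph, source):
    """Shortest-path lengths from source to every reachable node (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical toy network: A and B lie on two parallel roads that
# only meet at a distant interchange X.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "X": (10.0, 0.5)}
graph = {
    "A": [("X", 10.0)],
    "B": [("X", 10.0)],
    "X": [("A", 10.0), ("B", 10.0)],
}

euclid_ab = math.dist(coords["A"], coords["B"])  # straight-line distance
network_ab = network_distance(graph, "A")["B"]   # distance along the roads
```

Here A and B are 1 unit apart in a straight line but 20 units apart along the network, so a clustering based on network distance would be unlikely to place them together.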
    This study provides an approach for systematically analysing space-time or multi-dimensional network data, which can be used in research areas such as transportation management, logistics, and transportation geography. The medoids of the clusters can serve as rules for traffic patterns, and the clusters can be used as operational units for further decision making.

    Chapter 1 Introduction
    Chapter 2 Literature Review
      (1) Application of network analysis in geographic information system
      (2) Data mining
      (3) Machine learning
    Chapter 3 Methodology
      (1) Algorithm
    Chapter 4 Model Evaluation
      (1) Study Area and Materials
      (2) Case Study Design
    Chapter 5 Results and Discussion
      (1) Consistency and Robustness
      (2) Topological Rules
      (3) Multi-Dimensional Patterns
    Chapter 6 Conclusion and Future Works
      (1) Conclusion
      (2) Future Works

