Graduate student: Liu, Hao-Hsuan (劉浩萱)
Thesis title: AlphaZero Algorithm Combined with Quick Win or Threat Space for Gomoku
Advisor: Lin, Shun-Shii (林順喜)
Degree: Master's
Department: Department of Computer Science and Information Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Number of pages: 59
Keywords: AlphaZero, Neural Network, Quick Win, Threat-space Search
DOI URL: http://doi.org/10.6345/NTNU202001222
Thesis type: Academic thesis
  • AlphaZero is a general-purpose reinforcement learning algorithm that achieves excellent results after training, given no human knowledge beyond the game rules. To help this architecture learn the winning information needed for Gomoku early in training, this thesis presents the Quick Win strategy and Threat-space Search.
    The Quick Win method aims to teach the neural network to win faster: when several legal moves show the same winning probability, it chooses the one that wins in the fewest moves. The Threat-space Search method searches for threats at every move, so that the neural network learns to create threats and the training period is shortened.
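
The tie-breaking idea behind Quick Win can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the function name `pick_quick_win`, the `(win_estimate, moves_to_win)` pairs, and the tolerance `tol` are all hypothetical, and the abstract does not say how the number of moves to a win is obtained.

```python
from typing import Dict, Tuple

Move = Tuple[int, int]  # (row, column) on the board


def pick_quick_win(candidates: Dict[Move, Tuple[float, int]],
                   tol: float = 1e-6) -> Move:
    """Pick the move with the best win estimate; break ties by speed.

    `candidates` maps each move to (win_estimate, moves_to_win).
    Both fields are assumed inputs for this sketch.
    """
    best = max(estimate for estimate, _ in candidates.values())
    # Keep only the moves whose estimate is (almost) equal to the best one.
    tied = {m: v for m, v in candidates.items() if best - v[0] <= tol}
    # Among tied moves, prefer the fewest moves to win; the move
    # coordinates act as a final, deterministic tie-breaker.
    return min(tied, key=lambda m: (tied[m][1], m))


candidates = {
    (7, 7): (1.0, 5),  # wins, but only after 5 more moves
    (7, 8): (1.0, 1),  # wins immediately
    (3, 3): (0.4, 1),  # fast, but a much weaker estimate
}
print(pick_quick_win(candidates))
```

In this example the two moves with a win estimate of 1.0 are tied, so the one that wins sooner is selected; the weaker move is never considered even though it is "fast".
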
    In this thesis, we present four experiments on Gomoku: linear distance weight, exponential distance weight, Threat-space Search combined with distance weight, and Threat-space Search combined with Monte Carlo Tree Search. We examine whether these implementations, built on the AlphaZero algorithm, effectively improve playing strength by choosing a faster winning move or a threat move during the game.
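
To make "threat move" concrete, the sketch below finds moves that create a four, i.e. a move after which some five-cell line segment holds four of the player's stones and one empty cell. It is a simplified illustration under stated assumptions, not the search used in the thesis: the board encoding (0 for empty, 1 and -1 for the players), the square board, and the names `makes_four` and `threat_moves` are hypothetical, and only this single threat type is detected.

```python
from typing import List, Tuple

Board = List[List[int]]  # assumed encoding: 0 = empty, 1 / -1 = players
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]  # row, column, two diagonals


def makes_four(board: Board, row: int, col: int, player: int) -> bool:
    """True if playing (row, col) leaves a 5-cell window with
    four of `player`'s stones and one empty cell."""
    if board[row][col] != 0:
        return False
    n = len(board)  # square board assumed
    board[row][col] = player  # try the move
    try:
        for dr, dc in DIRECTIONS:
            # Slide a 5-cell window so that it always covers the new stone.
            for offset in range(-4, 1):
                cells = [(row + (offset + k) * dr, col + (offset + k) * dc)
                         for k in range(5)]
                if not all(0 <= r < n and 0 <= c < n for r, c in cells):
                    continue
                values = [board[r][c] for r, c in cells]
                if values.count(player) == 4 and values.count(0) == 1:
                    return True
        return False
    finally:
        board[row][col] = 0  # undo the trial move


def threat_moves(board: Board, player: int) -> List[Tuple[int, int]]:
    """All empty cells where `player` would create a four."""
    n = len(board)
    return [(r, c) for r in range(n) for c in range(n)
            if makes_four(board, r, c, player)]


board = [[0] * 9 for _ in range(9)]
for c in (2, 3, 4):
    board[4][c] = 1  # three stones in a row for player 1
print(threat_moves(board, 1))
```

The window test also catches broken fours (four stones with a gap inside the window). A move that completes five in a row can be reported as well, so a real search would check for immediate wins first.
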

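    Table of contents
    Chapter 1 Introduction
        1.1 Research Background
        1.2 Research Objectives
        1.3 Significance of the Research
    Chapter 2 Literature Review
        2.1 Neural Networks
        2.2 AlphaZero
        2.3 Gomoku
        2.4 Quick Win
        2.5 Threat-space Search
    Chapter 3 Methods and Procedures
        3.1 Alpha-zero-general
        3.2 Linearly Increasing Distance Weight
        3.3 Exponentially Increasing Distance Weight
        3.4 Combining Threat-space Search with Distance Weight
        3.5 Combining Threat-space Search with Monte Carlo Tree Search
    Chapter 4 Experiments and Results
        4.1 Results for Linearly Increasing Distance Weight
        4.2 Exponentially Increasing Distance Weight
        4.3 Combining Threat-space Search with Distance Weight
        4.4 Combining Threat-space Search with Monte Carlo Tree Search
    Chapter 5 Conclusions and Future Work
    References
    Appendix 1: Alpha-zero-general Classes and Methods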

