Graduate student: Chiu, Hsuan-Kai (邱宣凱)
Thesis title: Multi-Objective Optimization Based on Open-ended Learning Applies to Connection Games (開放式學習應用於優化多目標的連子棋類遊戲)
Advisor: Shun-Shii Lin (林順喜)
Oral defense committee: Shun-Shii Lin (林順喜); I-Chen Wu (吳毅成); Hsin-Hung Chou (周信宏); Jr-Chang Chen (陳志昌); Shi-Jim Yen (顏士淨)
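Oral defense date: 2024/07/01
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar, i.e. 2023-24)
Language: Chinese
Number of pages: 50
Chinese keywords: 連子棋, AlphaZero, 開放式學習
English keywords: Connection Games, AlphaZero, Open-ended Learning
Research method: Experimental design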
DOI URL: http://doi.org/10.6345/NTNU202401569
Document type: Academic thesis

    Open-ended learning is an approach to AI proposed by Google DeepMind in 2021. Unlike conventional AI, which is trained to optimize a single task, an open-ended learning agent does not push any one task to its optimum but can perform many different tasks; its goal is multi-objective optimization. Because the concept is so new, the literature on it is still sparse and there is little established guidance on how to implement it. This study therefore uses relatively familiar techniques and game rules to attempt to build an AI that is similar to, or the same as, an open-ended learning agent.
    A connection game is a two-player board game in which the players take turns placing stones on a Go board, and the first player to line up a specified number of their own stones horizontally, vertically, or diagonally wins. This study uses five-in-a-row, four-in-a-row, and three-in-a-row variants; besides the different target counts of five, four, and three stones, these variants are also played on smaller boards.
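
The winning condition described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `has_k_in_a_row`, the board encoding (1 and -1 for the two players, 0 for an empty point), and the choice to count a line of k or more stones as a win are all assumptions made for illustration.

```python
from typing import List

# Directions to scan: horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def has_k_in_a_row(board: List[List[int]], player: int, k: int) -> bool:
    """Return True if `player` has at least k consecutive stones in a line.

    Assumed encoding (hypothetical): 1 and -1 are the players, 0 is empty.
    """
    rows = len(board)
    cols = len(board[0]) if rows else 0
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != player:
                continue
            for dr, dc in DIRECTIONS:
                # Skip directions whose k-th cell would fall off the board.
                end_r, end_c = r + (k - 1) * dr, c + (k - 1) * dc
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue
                if all(board[r + i * dr][c + i * dc] == player for i in range(k)):
                    return True
    return False


# Usage: player 1 holds the main diagonal of a 3x3 board.
example_board = [[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]]
print(has_k_in_a_row(example_board, 1, 3))   # True
print(has_k_in_a_row(example_board, -1, 3))  # False
```

Passing `k` as a parameter lets the same check serve the five-, four-, and three-stone variants on boards of any size.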
    Because the training data for an open-ended learning AI is generated by programs, this study uses alpha-zero-general, which produces its training data through self-play, as the core of the implementation. The experiments modify the self-play component of alpha-zero-general so that the trained AI can play under multiple rule sets.
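
The abstract does not say how the self-play step was modified, so the following Python sketch shows only one plausible reading: each self-play episode samples a rule, and every training example is tagged with that rule. It does not call alpha-zero-general; all names (`wins`, `self_play_episode`, `generate_multi_rule_data`), the random move order standing in for MCTS-guided play, and the board sizes in the usage example are hypothetical.

```python
import random
from typing import List, Tuple

Board = List[List[int]]
# One example: (board snapshot, player to move, rule k, outcome for that player)
Example = Tuple[Board, int, int, int]

DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def wins(board: Board, player: int, k: int) -> bool:
    """True if `player` has k stones in a line on a square board."""
    n = len(board)
    for r in range(n):
        for c in range(n):
            for dr, dc in DIRECTIONS:
                cells = [(r + i * dr, c + i * dc) for i in range(k)]
                if all(0 <= x < n and 0 <= y < n and board[x][y] == player
                       for x, y in cells):
                    return True
    return False


def self_play_episode(k: int, size: int, rng: random.Random) -> List[Example]:
    """Play one game under rule k and return rule-tagged examples."""
    board = [[0] * size for _ in range(size)]
    moves = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(moves)  # assumption: random play stands in for MCTS
    history = []        # (snapshot before the move, player to move)
    player, winner = 1, 0
    for r, c in moves:
        history.append(([row[:] for row in board], player))
        board[r][c] = player
        if wins(board, player, k):
            winner = player
            break
        player = -player
    # Outcome from the mover's point of view: +1 win, -1 loss, 0 draw.
    return [(snap, p, k, winner * p) for snap, p in history]


def generate_multi_rule_data(rules: List[Tuple[int, int]], episodes: int,
                             seed: int = 0) -> List[Example]:
    """Mix episodes from several (k, board size) rules into one data set."""
    rng = random.Random(seed)
    data: List[Example] = []
    for _ in range(episodes):
        k, size = rng.choice(rules)
        data.extend(self_play_episode(k, size, rng))
    return data


# Usage: the board sizes here are placeholders, not values from the thesis.
data = generate_multi_rule_data([(3, 3), (4, 5), (5, 7)], episodes=10)
print(len(data), "examples; rules seen:", sorted({k for _, _, k, _ in data}))
```

Tagging each example with `k` is one way a single network could be told which rule applies; whether the thesis encodes the rule this way is not stated in the abstract.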

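    1 Introduction
        1.1 Research Background
        1.2 Research Objectives
    2 Connection Games
        2.1 Game Rules
        2.2 Game Terminology
    3 Literature Review
        3.1 Forcing-Move Strategies in Five-in-a-Row
        3.2 AlphaZero
        3.3 Open-ended learning
        3.4 Alpha-zero-general
        3.5 Five-in-a-Row on a 7x7 Board
        3.6 Multi-Strategy MCTS
    4 Methods and Procedure
        4.1 Implementing Connection Games in alpha-zero-general
            4.1.1 alpha-zero-general
            4.1.2 Board Design and Win Detection
            4.1.3 Symmetric Boards
            4.1.4 Neural Network Architecture
        4.2 Single-Rule Connection Game Training
            4.2.1 Five-in-a-Row Training
            4.2.2 Four-in-a-Row Training
            4.2.3 Three-in-a-Row Training
        4.3 Multi-Rule Model Training
    5 Experiments and Analysis of Results
        5.1 Equipment and Parameter Settings
        5.2 Experiment Plan
        5.3 Two-Rule Model open4-5 versus Plain alpha-zero-general
            5.3.1 Five-in-a-Row Rules
            5.3.2 Four-in-a-Row Rules
        5.4 Three-Rule Model open3-4-5 versus Plain alpha-zero-general
            5.4.1 Five-in-a-Row Rules
            5.4.2 Four-in-a-Row Rules
            5.4.3 Three-in-a-Row Rules
        5.5 Three-Rule Model open3-4-5 versus Two-Rule Model open4-5
            5.5.1 Five-in-a-Row Rules
            5.5.2 Four-in-a-Row Rules
        5.6 Summary and Analysis of Experimental Results
            5.6.1 Two-Rule Model open4-5
            5.6.2 Three-Rule Model open3-4-5
    6 Future Work
        6.1 Dynamic Training Data
        6.2 Training in Batches
        6.3 Extending the Rules and Boards
    References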

    Electronic full text embargoed until 2025/08/19