
Author: Yu, Rong-Tai (余榮泰)
Thesis title: Indoor Air Quality Control System Based on Deep Reinforcement Learning (基於深度強化學習之室內空氣品質控制系統研究)
Advisor: Ho, Yao-Hua (賀耀華)
Committee members: Liu, Yu-Lun (劉宇倫); Ho, Yao-Hua (賀耀華); Chen, Ling-Jyh (陳伶志)
Oral defense date: 2022/12/09
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2023
Graduation academic year: 111
Language: Chinese
Pages: 59
Chinese keywords: 深度強化學習、遷移學習、暖通空調、物聯網
English keywords: Deep Reinforcement Learning, Transfer Learning, HVAC, Internet of Things
Research method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202300030
Thesis type: Academic thesis

    In recent years, the world has been severely affected by COVID-19. Many governments have enacted measures to prevent the spread of the virus and reduce the risk of infection, but some of these measures cannot be fully implemented in certain settings, such as maintaining a safe distance in schools or wearing masks in restaurants. This study therefore attempts to reduce the risk of infection without interfering with people's behavior, by building a system that automatically controls indoor air quality while keeping the space comfortable and energy efficient.
    The system, called Indoor Air Quality Auto Control (IAQAC), uses Deep Reinforcement Learning (DRL) so that it can find, on its own, the control strategy best suited to a given space for maintaining comfort, low infection risk, and energy savings. In addition, transfer learning is used to train the agent first in a simulated environment and then in the real environment, avoiding excessive training time. Finally, a platform was built to automatically collect sensor data and control equipment, supplying the information the agent needs and executing its decisions.
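As a rough illustration of the reinforcement-learning control loop described in the abstract, the sketch below trains a tabular Q-learning agent (the simplest algorithm the thesis reviews in Chapter 2; the thesis itself uses SACD with an EnergyPlus-based simulation) on a toy ventilation problem. The environment dynamics, reward weights, and hyperparameters here are invented for the sketch and are not taken from the thesis.

```python
import random

# Toy indoor-air model: state is a discretized CO2 level (0 = low ... 4 = high).
# Actions: 0 = ventilation off (saves power), 1 = ventilation on (lowers CO2).
# All dynamics and penalties below are illustrative assumptions.
N_STATES, N_ACTIONS = 5, 2

def step(state, action):
    """Return (next_state, reward) under the toy dynamics."""
    if action == 1:                                 # ventilating drives CO2 down
        nxt = max(state - 1, 0)
        power_penalty = 0.2
    else:                                           # occupants slowly raise CO2
        nxt = min(state + 1, N_STATES - 1)
        power_penalty = 0.0
    comfort_penalty = 1.0 if nxt >= 3 else 0.0      # high CO2 is uncomfortable
    return nxt, -(comfort_penalty + power_penalty)

# Tabular Q-learning: Q[s][a] += lr * (r + gamma * max_a' Q[s'][a'] - Q[s][a])
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
lr, gamma, epsilon = 0.1, 0.95, 0.1
random.seed(0)
state = 0
for _ in range(20000):
    if random.random() < epsilon:                   # epsilon-greedy exploration
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
    nxt, reward = step(state, action)
    Q[state][action] += lr * (reward + gamma * max(Q[nxt]) - Q[state][action])
    state = nxt

policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
print("learned policy (0 = off, 1 = on):", policy)
```

The agent learns to turn ventilation on once CO2 is high, trading the power penalty against the comfort penalty; the IAQAC reward balances the same comfort/energy objectives (plus infection risk) at a much larger scale, with a deep network replacing the Q table.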

    Chapter 1  Introduction  1
      1-1  Research Background and Motivation  1
      1-2  Research Objectives and Contributions  2
    Chapter 2  Literature Review  4
      2-1  Virus Transmission and Indoor Air Quality  4
        2-1-1  Basic Reproduction Number (R0)  4
        2-1-2  Predicted Mean Vote (PMV)  6
        2-1-3  Carbon Dioxide (CO2)  8
      2-2  Reinforcement Learning Background  8
        2-2-1  Basic Reinforcement Learning Framework  9
        2-2-2  Q-learning  13
        2-2-3  Deep Q-learning  14
        2-2-4  Actor-Critic  17
        2-2-5  Soft Actor-Critic (SAC)  18
        2-2-6  Soft Actor-Critic for Discrete Action Settings (SACD)  21
        2-2-7  HVAC Strategies Based on Deep Reinforcement Learning  21
    Chapter 3  Methodology  23
      3-1  Problem Description  24
      3-2  Simulation Data Generation  24
      3-3  Data Preprocessing  27
      3-4  Reinforcement Learning Applied to HVAC  28
      3-5  Training Procedure in the Simulated Environment  30
      3-6  Hardware Development  31
      3-7  Transfer Learning  35
    Chapter 4  Experiments and Analysis  36
      4-1  Experimental Setup  36
      4-2  Simulation Environment Setup and Analysis  36
        4-2-1  Simulation Environment Setup  36
        4-2-2  Simulation Experiments  38
      4-3  Real Environment Setup and Analysis  46
        4-3-1  Real Environment Setup  46
        4-3-2  Real Experiment Analysis  47
    Chapter 5  Conclusion and Future Work  51
    Chapter 6  References  53

    To avoid cluster infections, it is recommended that indoor gatherings of more than 100 people and outdoor gatherings of more than 500 people be suspended. Taiwan Centers for Disease Control. Retrieved November 14, 2022, from https://www.cdc.gov.tw/Category/ListContent/EmXemht4IT-IRAPrAnyG9A?uaid=OGvZ8a1qdqdNo5mUgeaOqw
    Disease introduction. Taiwan Centers for Disease Control, Ministry of Health and Welfare. Retrieved November 16, 2022, from https://www.cdc.gov.tw/Category/Page/vleOMKqwuEbIMgqaTeXG8A
    van Doremalen, N., Bushmaker, T., Morris, D. H., Holbrook, M. G., Gamble, A., Williamson, B. N., Tamin, A., Harcourt, J. L., Thornburg, N. J., Gerber, S. I., Lloyd-Smith, J. O., de Wit, E., & Munster, V. J. (2020). Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1. New England Journal of Medicine, 382(16), 1564–1567.
    National Academies of Sciences, Engineering, and Medicine. (2020). Rapid Expert Consultation on the Possibility of Bioaerosol Spread of SARS-CoV-2 for the COVID-19 Pandemic (April 1, 2020). https://doi.org/10.17226/25769
    Thatiparti, D. S., Ghia, U., & Mead, K. R. (2016). Computational fluid dynamics study on the influence of an alternate ventilation configuration on the possible flow path of infectious cough aerosols in a mock airborne infection isolation room.
    World Health Organization. (2009). Natural ventilation for infection control in health care settings. World Health Organization. https://apps.who.int/iris/handle/10665/44167
    Zhang, Z., & Lam, K. P. (2018). Practical implementation and evaluation of deep reinforcement learning control for a radiant heating system. Proceedings of the 5th Conference on Systems for Built Environments, 148–157.
    Fang, X., Gong, G., Li, G., Chun, L., Peng, P., Li, W., Shi, X., & Chen, X. (2022). Deep reinforcement learning optimal control strategy for temperature setpoint real-time reset in multi-zone building HVAC system. Applied Thermal Engineering.
    Fraser, C., Riley, S., Anderson, R. M., & Ferguson, N. M. (2004). Factors that make an infectious disease outbreak controllable. Proceedings of the National Academy of Sciences of the United States of America, 101(16), 6146–6151
    Riley, E. C., Murphy, G., & Riley, R. L. (1978). Airborne spread of measles in a suburban elementary school. American Journal of Epidemiology, 107(5), 421–432. https://doi.org/10.1093/oxfordjournals.aje.a112560
    Rudnick, S., & Milton, D. (2003). Risk of indoor airborne infection transmission estimated from carbon dioxide concentration. Indoor Air, 13(3), 237–245.
    Chen, S., Chang, C., & Liao, C. (2006). Predictive models of control strategies involved in containing indoor airborne infections. Indoor Air, 16(6), 469–481.
    Bazant, M. Z., & Bush, J. W. M. (2021). A guideline to limit indoor airborne transmission of COVID-19. Proceedings of the National Academy of Sciences, 118(17), e2018995118. https://doi.org/10.1073/pnas.2018995118
    Dai, H., & Zhao, B. (2020). Association of the infection probability of COVID-19 with ventilation rates in confined spaces. Building Simulation, 13(6), 1321–1327. https://doi.org/10.1007/s12273-020-0703-5
    Fanger, P. O. (1970). Thermal comfort: Analysis and applications in environmental engineering. Danish Technical Press.
    ANSI/ASHRAE. (2017). Standard 55-2017: Thermal Environmental Conditions for Human Occupancy. ASHRAE.
    American Society of Heating Refrigerating and Air-Conditioning Engineers. (2005). 2005 Ashrae handbook : fundamentals (S.I.). ASHRAE.
    Thermal Comfort. (2002). Innova Air Tech Instruments.
    Schaudienst, F., & Vogdt, F. U. (2017). Fanger's model of thermal comfort: A model suitable just for men? Energy Procedia, 132, 129–134.
    Ainsworth, B. E., Haskell, W. L., Whitt, M. C., Irwin, M. L., Swartz, A. M., Strath, S. J., O'Brien, W. L., Bassett, D. R., Schmitz, K. H., Emplaincourt, P. O., & others. (2000). Compendium of physical activities: An update of activity codes and MET intensities. Medicine & Science in Sports & Exercise, 32(9), S498–S504.
    Dawe, M., Raftery, P., Woolley, J., Schiavon, S., & Bauman, F. (2020). Comparison of mean radiant and air temperatures in mechanically-conditioned commercial buildings from over 200,000 field and laboratory measurements. Energy and Buildings.
    Indoor Air Quality Standards. Laws and regulations retrieval system, Environmental Protection Administration, Executive Yuan. Retrieved November 16, 2022, from https://oaout.epa.gov.tw/law/LawContentSource.aspx?id=FL068252
    Satish, U., Mendell, M. J., Shekhar, K., Hotchi, T., Sullivan, D., Streufert, S., & Fisk, W. J. (2012). Is CO2 an indoor pollutant? Direct effects of low-to-moderate CO2 concentrations on human decision-making performance. Environmental Health Perspectives, 120(12), 1671–1677.
    Sutton, R. S. (1984). Temporal credit assignment in reinforcement learning. University of Massachusetts Amherst.
    Cunningham, P., Cord, M., & Delany, S. J. (2008). Supervised learning. In M. Cord & P. Cunningham (Eds.), Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval (pp. 21–49). Springer.
    Puterman, M. L. (1990). Markov decision processes. Handbooks in Operations Research and Management Science, 2, 331–434.
    Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT press.
    Bellman, R. (1966). Dynamic programming. Science, 153(3731), 34–37.
    Spaan, M. T. (2012). Partially observable Markov decision processes. In Reinforcement Learning (pp. 387–414). Springer.
    Kuvayev, L., & Sutton, R. S. (1996). Model-based reinforcement learning with an approximate, learned model. Proceedings of the Ninth Yale Workshop on Adaptive and Learning Systems, 101–105.
    Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3), 279–292.
    Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44.
    Metropolis, N., & Ulam, S. (1949). The monte carlo method. Journal of the American Statistical Association, 44(247), 335–341.
    Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning (arXiv:1312.5602). arXiv. https://doi.org/10.48550/arXiv.1312.5602
    Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., & others. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
    Van Hasselt, H., Guez, A., & Silver, D. (2016). Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1).
    Wang, Z., Schaul, T., Hessel, M., Van Hasselt, H., Lanctot, M., & De Freitas, N. (2016). Dueling network architectures for deep reinforcement learning. International Conference on Machine Learning, 1995–2003.
    Schaul, T., Quan, J., Antonoglou, I., & Silver, D. (2015). Prioritized experience replay. ArXiv Preprint ArXiv:1511.05952.
    Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. International Conference on Machine Learning, 1861–1870.
    Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2015). Continuous control with deep reinforcement learning. ArXiv Preprint ArXiv:1509.02971.
    Fujimoto, S., Hoof, H., & Meger, D. (2018). Addressing function approximation error in actor-critic methods. International Conference on Machine Learning, 1587–1596.
    Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., & others. (2018). Soft actor-critic algorithms and applications. ArXiv Preprint ArXiv:1812.05905.
    Haarnoja, T., Ha, S., Zhou, A., Tan, J., Tucker, G., & Levine, S. (2018). Learning to walk via deep reinforcement learning. ArXiv Preprint ArXiv:1812.11103.
    Christodoulou, P. (2019). Soft actor-critic for discrete action settings. ArXiv Preprint ArXiv:1910.07207.
    Coraci, D., Brandi, S., Piscitelli, M. S., & Capozzoli, A. (2021). Online implementation of a soft actor-critic agent to enhance indoor temperature control and energy efficiency in buildings. Energies, 14(4), 997.
    Roderick, M., MacGlashan, J., & Tellex, S. (2017). Implementing the Deep Q-Network. arXiv. https://doi.org/10.48550/ARXIV.1711.07478
    EnergyPlus. Retrieved November 16, 2022, from https://energyplus.net/
    Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI Gym. arXiv. https://doi.org/10.48550/ARXIV.1606.01540
    3D Design Software | 3D Modeling on the Web. SketchUp. Retrieved November 22, 2022, from https://www.sketchup.com/page/homepage
    OpenStudio. Retrieved November 22, 2022, from https://openstudio.net/
    Climate.onebuilding.org. Retrieved November 16, 2022, from https://climate.onebuilding.org/
    Journal of Physical and Chemical Reference Data 1982 1988 Lot of 3 Thermodynamic. EBay. Retrieved November 17, 2022, from https://www.ebay.com/itm/185269889303
    Tartarini, F., Schiavon, S., Cheung, T., & Hoyt, T. (2020). CBE Thermal Comfort Tool: Online tool for thermal comfort calculations and visualizations. SoftwareX, 12, 100563. https://doi.org/10.1016/j.softx.2020.100563
    LASS Environmental Sensing Network System | A community-built environmental sensing network. Retrieved November 16, 2022, from https://lass-net.org/
    TWN_Taiwan, WMO Region 2 Asia. Climate.onebuilding.org. Retrieved November 22, 2022, from https://climate.onebuilding.org/WMO_Region_2_Asia/TWN_Taiwan/index.html
    Carbon Dioxide | Vital Signs – Climate Change: Vital Signs of the Planet. Retrieved November 22, 2022, from https://climate.nasa.gov/vital-signs/carbon-dioxide/
    Observation Data Inquiry System V7.2. Central Weather Bureau. Retrieved December 5, 2022, from https://e-service.cwb.gov.tw/HistoryDataQuery/
    Taiwan Power Company. (2017, November 14). Electricity rate tables (customer services). https://www.taipower.com.tw/tc/page.aspx?mid=238
    Hsinchu Branch. (2009, December 25). 31. Q: What is "one kWh (one degree) of electricity"? Bureau of Standards, Metrology and Inspection. https://www.bsmi.gov.tw/wSite/ct?xItem=20167&ctNode=3320&mp=1
