Graduate student: | 洪翊誠 Hung, Yi-Cheng |
---|---|
Thesis title: | 以異質網路圖學習病況事件表示法進行死亡風險預測 Data Representation Learning from Heterogeneous Network of Medical Data for Mortality Prediction |
Advisor: | 柯佳伶 Koh, Jia-Ling |
Oral defense committee: | 吳宜鴻; 徐嘉連; 柯佳伶 Koh, Jia-Ling |
Oral defense date: | 2022/01/24 |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Year of publication: | 2022 |
Academic year of graduation: | 110 |
Language: | Chinese |
Number of pages: | 93 |
Chinese keywords: | 異質網路圖、資料特徵表示法、死亡預測模型 |
English keywords: | heterogeneous network, data representation, mortality prediction model |
DOI URL: | http://doi.org/10.6345/NTNU202200245 |
Thesis type: | Academic thesis |
In recent years, learning feature representations automatically from data has been shown to improve the accuracy of prediction tasks. In this thesis, an attribute in the electronic medical record (EMR) combined with its value, which implicitly describes a patient's condition, is called a clinical event. We construct a heterogeneous network of clinical events of different types, linking events that occur for the same patient within a specified time interval. Event sequences are then sampled from the network by walking along different meta-paths, and representations of chart events (features measured by bedside instruments) are learned from the sampled sequences. The learned representations are fed to an LSTM neural network that predicts the mortality risk of ICU patients from the EMR data of their first 48 hours in the ICU. In the experiments, we compare the prediction effectiveness of the learned event representations under different time intervals for network construction and under homogeneous versus heterogeneous meta-path walks. Preliminary results show that chart-event representations learned from heterogeneous meta-paths improve recall and AUROC on both in-hospital mortality prediction and short-term mortality prediction.
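The network construction and meta-path sampling described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the toy records, the event types (`chart`, `lab`, `drug`), the discretized event names such as `HR_high`, and the helper names `build_network` and `metapath_walk` are all hypothetical. A meta-path is modeled here as a cycle of node types, and the subsequent embedding step (for example, skip-gram training on the sampled walks) and the LSTM classifier are omitted.

```python
import random
from collections import defaultdict

# Hypothetical toy records: (patient_id, hour, event_type, event_name).
RECORDS = [
    ("p1", 0, "chart", "HR_high"),
    ("p1", 1, "lab", "lactate_high"),
    ("p1", 2, "drug", "norepinephrine"),
    ("p1", 30, "chart", "HR_normal"),
    ("p2", 0, "chart", "HR_high"),
    ("p2", 3, "lab", "lactate_high"),
]


def build_network(records, window_hours):
    """Link two distinct events of the same patient if their recorded
    times differ by at most window_hours. Nodes are (type, name) pairs."""
    by_patient = defaultdict(list)
    for patient, hour, etype, name in records:
        by_patient[patient].append((hour, (etype, name)))
    neighbors = defaultdict(set)
    for events in by_patient.values():
        for i, (hour_a, node_a) in enumerate(events):
            for hour_b, node_b in events[i + 1:]:
                if node_a != node_b and abs(hour_a - hour_b) <= window_hours:
                    neighbors[node_a].add(node_b)
                    neighbors[node_b].add(node_a)
    return neighbors


def metapath_walk(neighbors, start, type_cycle, length, rng):
    """Random walk in which the i-th node must have type
    type_cycle[i % len(type_cycle)]. A one-element cycle gives a
    homogeneous walk; several types give a heterogeneous walk."""
    if start[0] != type_cycle[0]:
        raise ValueError("start node type must match the first type in the cycle")
    walk = [start]
    while len(walk) < length:
        wanted = type_cycle[len(walk) % len(type_cycle)]
        # Sort so that results are reproducible for a seeded rng.
        candidates = sorted(n for n in neighbors.get(walk[-1], ()) if n[0] == wanted)
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


if __name__ == "__main__":
    net = build_network(RECORDS, window_hours=6)
    rng = random.Random(0)
    start = ("chart", "HR_high")
    # Heterogeneous walk alternating chart and lab events.
    print(metapath_walk(net, start, ["chart", "lab"], 4, rng))
    # Homogeneous walk restricted to chart events.
    print(metapath_walk(net, start, ["chart"], 4, rng))
```

With a 6-hour window the two chart events in the toy data are never linked, so the chart-only walk stops immediately, while the chart/lab walk can continue. Widening the window adds more co-occurrence edges, which is the kind of trade-off the experiments on time intervals examine.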