Graduate student: 游雅雯 (Yu, Ya-Wen)
Thesis title: Disease Prediction and Topic Phrase Extraction from Clinical Reports by Attention-based LSTM model
Advisor: 柯佳伶 (Koh, Jia-Ling)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 55
Keywords: disease prediction, self-attention, attention interpretation
DOI URL: http://doi.org/10.6345/NTNU202000386
Thesis type: Academic thesis
  • This thesis addresses how to predict a given disease from a pathology report when the pathologist's diagnosis paragraph is withheld. It also aims to identify the relevant diagnostic features within the report's paragraphs and to extract clinical phrases that serve as clinical interpretations of the prediction model. An attention-based LSTM model performs binary prediction for the given disease. The attention weights learned by the model are then extracted to generate attention terms, which are grouped under MeSH (Medical Subject Headings) terms defined by the United States National Library of Medicine. Finally, topic phrases are generated with the frequency pattern method to represent each group. The extracted topic phrases can serve as clinical interpretations of the prediction.
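
    The two interpretation steps named in the abstract, turning attention weights into attention terms and then deriving phrases by frequency, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the sample tokens and scores, the top-k cutoff, and the use of contiguous n-gram counting as a stand-in for the frequency pattern method are all assumptions made for illustration, and the MeSH grouping step is omitted.

```python
import math
from collections import Counter


def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_attention_terms(tokens, scores, k=3):
    """Return the k (token, weight) pairs with the highest attention weight.

    Hypothetical helper: the top-k cutoff is an assumption, not the
    selection rule used in the thesis.
    """
    weights = softmax(scores)
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return ranked[:k]


def frequent_phrases(term_lists, n=2, min_support=2):
    """Keep contiguous n-grams that occur in at least min_support reports.

    Simple n-gram counting stands in here for the frequency pattern
    method mentioned in the abstract (an assumption).
    """
    counts = Counter()
    for terms in term_lists:
        # Use a set so each report supports a phrase at most once.
        grams = {tuple(terms[i:i + n]) for i in range(len(terms) - n + 1)}
        counts.update(grams)
    return {" ".join(g): c for g, c in counts.items() if c >= min_support}


# Made-up example data, not taken from the thesis.
tokens = ["mesangial", "proliferation", "with", "iga", "deposits"]
scores = [2.0, 1.5, -1.0, 2.5, 1.0]
terms = top_attention_terms(tokens, scores, k=3)

reports = [
    ["iga", "deposits", "mesangial"],
    ["iga", "deposits"],
    ["mesangial", "proliferation"],
]
phrases = frequent_phrases(reports, n=2, min_support=2)
print([t for t, _ in terms], phrases)
```

    In this sketch a phrase's support is the number of reports containing it, so one long report cannot dominate the counts.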

    Chapter 1 Introduction
      1.1 Motivation
      1.2 Goal
      1.3 Limitation
      1.4 Method
    Chapter 2 Related Works
      2.1 Clinical Information Extraction (IE)
      2.2 EHR Outcome Predictions
      2.3 Representation Learning on Clinical Field
    Chapter 3 Disease Prediction
      3.1 Pre-processing
      3.2 Word Encoding
      3.3 Attention + biLSTM model
    Chapter 4 Attention Interpretation
      4.1 Candidate Attention Term Generation
      4.2 Grouping Attention Terms
      4.3 Topic Phrases Generation
    Chapter 5 Performance Evaluation
      5.1 Data Description
      5.2 Disease Prediction Evaluation
      5.3 Attention Terms Evaluation
      5.4 The Results of Clinical Phrases Extraction for Prediction
    Chapter 6 Conclusion
    Reference
    Appendixes

