Graduate student: 陳佩瑄
Chen, Pei-Hsuan
Thesis title: 以混合式方法自生醫文獻擷取藥物-藥物交互作用之研究
A Hybrid Method for Drug-Drug Interaction Extraction from Biomedical Literature
Advisor: 侯文娟
Hou, Wen-Juan
Degree: Master's
Department: 資訊工程學系
Department of Computer Science and Information Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 79
Chinese keywords: 藥物-藥物交互作用, 生醫文獻, 機器學習, 規則為基
英文關鍵詞: Drug-Drug Interaction, Biomedical Literature, Machine Learning, Rule-based
DOI URL: https://doi.org/10.6345/NTNU202202907
Thesis type: Academic thesis
  • A disease is often accompanied by several symptoms, and each symptom is usually treated with its own drug. A cold, for example, can bring coughing, a stuffy nose, or a headache, so curing it often requires several drugs. A drug-drug interaction (DDI) occurs when drugs taken during the same course of treatment affect one another undesirably, for example when an effect becomes too strong or the drugs counteract each other; this can cause the treatment to fail and, in severe cases, lead to death. Many DDIs are still buried in the large body of biomedical literature, and finding them manually costs researchers a great deal of time. Natural language processing (NLP) techniques for extraction and analysis can uncover large numbers of these hidden DDIs and reduce that search time.
    The corpus used in this thesis is provided by SemEval 2013 Task 9 (Extraction of Drug-Drug Interactions from Biomedical Texts) and consists of MedLine abstracts and texts from the DrugBank database. The task is to extract DDIs from biomedical texts and assign each one to one of five classes: Advice (ADV), Effect (EFF), Mechanism (MEC), Int (INT), or no interaction. Results are evaluated with precision, recall, and F1-measure for both detection and classification.
    This study uses a hybrid method, combining a machine learning method with a rule-based method, to detect and classify DDIs. Because the corpus is imbalanced, the work is split into two stages. The first stage performs detection over all instances (positive versus negative), and the second stage classifies the positive DDIs into ADV, EFF, MEC, or INT. In the experiments, the method reaches an F-score of 70.8% for detection and 62.5% for classification. Although this is still below the FBK-irst team's results in both DDI detection and classification, it is above the average performance of all participating teams. In future work, we hope to apply the hybrid method to other areas of information extraction research.
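    The two-stage flow and the evaluation measures described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the thesis's implementation: the keyword rules `toy_detect` and `toy_classify`, the `NONE` label, and the example sentences are all invented stand-ins, and the scores are assumed to be micro-averaged over the positive classes. In the thesis, the two stages would be filled by the trained machine learning model and the rule-based component.

```python
from typing import Callable, List, Sequence, Tuple

NEGATIVE = "NONE"  # hypothetical label for "no interaction"


def two_stage_predict(sentence: str,
                      detect: Callable[[str], bool],
                      classify: Callable[[str], str]) -> str:
    """Stage 1 decides positive vs. negative; stage 2 types the positives."""
    if not detect(sentence):
        return NEGATIVE
    return classify(sentence)


def prf(gold: Sequence[str], pred: Sequence[str],
        detection_only: bool = False) -> Tuple[float, float, float]:
    """Precision, recall, F1 (assumed micro-averaged over positive classes).

    With detection_only=True any positive prediction on a positive gold
    label counts as correct; otherwise the predicted type must match too.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g_pos, p_pos = g != NEGATIVE, p != NEGATIVE
        correct = g_pos and p_pos and (detection_only or g == p)
        if correct:
            tp += 1
        else:
            fp += p_pos  # spurious or wrongly typed prediction
            fn += g_pos  # missed or wrongly typed gold interaction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy stand-ins (assumptions): simple keyword cues, not the thesis's rules.
def toy_detect(sentence: str) -> bool:
    s = sentence.lower()
    return any(cue in s for cue in ("should", "increase", "inhibit", "interact"))


def toy_classify(sentence: str) -> str:
    s = sentence.lower()
    if "should" in s:
        return "ADV"
    if "inhibit" in s:
        return "MEC"
    if "increase" in s:
        return "EFF"
    return "INT"


# Invented sentences with invented gold labels; not taken from the corpus.
EXAMPLES: List[Tuple[str, str]] = [
    ("DrugA should not be taken with DrugB.", "ADV"),
    ("DrugA may increase the sedative effect of DrugB.", "EFF"),
    ("DrugA inhibits the metabolism of DrugB.", "MEC"),
    ("DrugA interacts with DrugB.", "INT"),
    ("DrugA and DrugB were both measured in plasma.", NEGATIVE),
    ("DrugA may increase the clearance of DrugB.", "MEC"),
    ("Doses of DrugA and DrugB should be recorded.", NEGATIVE),
    ("DrugA reduces the effect of DrugB.", "EFF"),
]

if __name__ == "__main__":
    gold = [label for _, label in EXAMPLES]
    pred = [two_stage_predict(s, toy_detect, toy_classify) for s, _ in EXAMPLES]
    print("detection      P/R/F1:", prf(gold, pred, detection_only=True))
    print("classification P/R/F1:", prf(gold, pred))
```

    Splitting the decision this way means only the first stage has to cope with the many negative instances. A wrongly typed positive still counts as detected, which is why a detection score can sit above the corresponding classification score.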

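    Abstract (Chinese)
    Abstract (English)
    List of Tables
    List of Figures
    Chapter 1 Introduction
        Section 1 Research Background
        Section 2 Research Motivation and Objectives
        Section 3 Thesis Organization
    Chapter 2 Literature Review
        Section 1 SemEval 2013 Task 9
        Section 2 Recent Drug-Drug Interaction Extraction Methods and Results
        Section 3 Support Vector Machine
        Section 4 Handling Imbalanced Data
    Chapter 3 Research Methods and Procedures
        Section 1 Research Framework
        Section 2 Data Preprocessing
        Section 3 Feature Extraction
        Section 4 Feature Selection
        Section 5 Machine Learning Method
        Section 6 Rule-Based Method
    Chapter 4 Data Sources and Evaluation Methods
        Section 1 Data Sources
        Section 2 Evaluation Methods
    Chapter 5 Experimental Results and Discussion
        Section 1 Detection Results and Discussion
        Section 2 Classification Results and Discussion
        Section 3 Combined Detection and Classification Results and Discussion
    Chapter 6 Conclusions and Future Work
    References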

