National Taiwan Normal University Theses and Dissertations Full-Text System


Graduate student:	李柏勳 Lee, Bo-Syun
Thesis title:	生醫文獻中疾病與藥物關係之樣式自動化擷取 Automatic Pattern Extraction of Disease-Drug Association from Biomedical Texts
Advisor:	侯文娟 Hou, Wen-Juan
Degree:	Master
Department:	Department of Computer Science and Information Engineering
Year of publication:	2017
Academic year of graduation:	105 (ROC calendar)
Language:	Chinese
Number of pages:	57
Keywords:	disease-drug association, pattern extraction, biomedical literature, chi-square test
DOI URL:	https://doi.org/10.6345/NTNU202202301
Thesis type:	Academic thesis

This study aims to identify associations between human diseases and drugs in the biomedical literature and to derive rules that characterize those associations. If disease-drug associations can be predicted automatically from the literature, biomedical researchers surveying publications on diseases and drugs can use them to grasp these relationships quickly, which saves labor and time and may accelerate progress in biomedical science.
The data come from two sources: official U.S. records of completed disease studies and their paired drugs, provided on the ClinicalTrials.gov website (https://clinicaltrials.gov/), and abstracts of biomedical articles from the PubMed database (https://www.ncbi.nlm.nih.gov/pubmed/). In this thesis, we first search PubMed abstracts for sentences that contain a disease and a drug paired on ClinicalTrials.gov and treat them as positive sentences; sentences that contain the same disease but a different drug are treated as negative sentences. Two models are built according to sentence structure: in the first, the disease mention precedes the drug mention; in the second, the drug mention precedes the disease mention. This makes it possible to analyze the verbs, nouns, and other relevant words that occur between the disease and the drug.
These words are divided into three categories: pure association, pure non-association, and mixed (neutral) words. Neutral words that meet a threshold are then classified once more with the chi-square test, which yields the pattern rules for disease-drug relationships. Finally, the patterns are matched against the test data and evaluated. The best experimental result achieves a precision of 100%, a recall of 89%, and an F-score of 94%.
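
The word-classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the label strings, the use of per-sentence word counts, the 3.841 threshold (the chi-square critical value for 1 degree of freedom at the 0.05 level), and the choice of Yates's continuity correction (which the reference list mentions, but whose exact role in the thesis is not stated here) are all assumptions made for illustration.

```python
from collections import Counter

# Assumed threshold: chi-square critical value, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL = 3.841


def yates_chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 table with Yates's continuity correction.

    a: positive sentences containing the word
    b: negative sentences containing the word
    c: positive sentences without the word
    d: negative sentences without the word
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    # Clamp at zero so the correction never inflates the statistic.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / denom


def classify_words(pos_sentences, neg_sentences, threshold=CHI2_CRITICAL):
    """Label each word found between a disease mention and a drug mention.

    Each sentence is a list of (already stemmed) words; this input format
    is hypothetical.
    """
    n_pos, n_neg = len(pos_sentences), len(neg_sentences)
    pos_df = Counter(w for s in pos_sentences for w in set(s))
    neg_df = Counter(w for s in neg_sentences for w in set(s))
    labels = {}
    for word in set(pos_df) | set(neg_df):
        a, b = pos_df[word], neg_df[word]
        if b == 0:
            labels[word] = "association"        # pure association
        elif a == 0:
            labels[word] = "no_association"     # pure non-association
        else:
            # Mixed word: reclassify only if the test passes the threshold.
            chi2 = yates_chi_square(a, b, n_pos - a, n_neg - b)
            if chi2 < threshold:
                labels[word] = "neutral"
            elif a / n_pos > b / n_neg:
                labels[word] = "association"
            else:
                labels[word] = "no_association"
    return labels


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

As a consistency check on the reported figures, `f_score(1.0, 0.89)` is about 0.94, which agrees with the precision of 100%, recall of 89%, and F-score of 94% stated above.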

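Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1    Introduction
Section 1    Research Background
Section 2    Research Objectives
Section 3    Thesis Organization
Chapter 2    Related Work
Section 1    Literature Review
Section 2    Introduction to the Diseases
Section 3    Stanford Parser
Section 4    Drug Bank
Section 5    Stemming
Chapter 3    Methods and Procedures
Section 1    Introduction
Section 2    Background Knowledge Base
Section 3    Preprocessing
Section 4    Architecture of the Research Method
Section 5    Postprocessing
Chapter 4    Experiments and Results
Section 1    Experimental Data
Section 2    Evaluation Metrics
Section 3    Experimental Results
Section 4    Analysis and Discussion
Chapter 5    Conclusion and Future Work
References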

Introduction to COPD: http://epaper.ntuh.gov.tw/health/201509/health_2.html
Drug Bank: https://www.drugbank.ca/
Jang, D., Lee, S., Lee, J., Kim, K., & Lee, D. (2016). Inferring new drug indications using the complementarity between clinical disease signatures and drug effects. Journal of Biomedical Informatics, 59, 248-257.
MeSH terms: https://www.ncbi.nlm.nih.gov/mesh/
Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14(3), 130-137.
Porter stemmer: https://tartarus.org/martin/PorterStemmer/
PubMed database: https://www.ncbi.nlm.nih.gov/pubmed/
Stanford Parser: http://nlp.stanford.edu/software/lex-parser.shtml
Xu, R., & Wang, Q. (2013). Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing. BMC Bioinformatics, 14(1), 181.
Introduction to the chi-square test: http://amebse.nchu.edu.tw/new_page_659.htm
Introduction to non-small cell lung cancer: http://www2.cch.org.tw/lungcancer/LC_path.htm
Yates's continuity correction: http://terms.naer.edu.tw/detail/1312488/
