簡易檢索 / 詳目顯示

研究生: 王冠鈜
論文名稱: 以圖形處理器加速近似字串比對之位元平行演算法
Acceleration of Bit Parallel Algorithms for Approximate String Matching Using GPU
指導教授: 林政宏
Lin, Cheng-Hung
學位類別: 碩士
Master
系所名稱: 科技應用與人力資源發展學系
Department of Technology Application and Human Resource Development
論文出版年: 2015
畢業學年度: 103
語文別: 中文
論文頁數: 47
中文關鍵詞: 近似字串比對位元平行演算法階層平行化
英文關鍵詞: approximate string matching, bit-parallelism algorithm, hierarchical- parallelism
DOI URL: https://doi.org/10.6345/NTNU202205595
論文種類: 學術論文
相關次數: 點閱:205下載:11
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 近似字串比對被廣泛運用到很多領域,例如:網頁搜尋、網路入侵偵測系統、網路安全、去氧核糖核酸序列匹配等。近似字串比對可在輸入字串與樣式字串間容許因為插入、替代、刪除等操作所造成的誤差。因為近似字串比對是一種資料密集的運算,對於大數據的處理,加速近似字串比對成為非常重要的關鍵。本研究提出階層平行化的架構並實現於圖形處理器,以加速位元平行演算法。階層平行化的架構包括兩個階段。第一個階段使用多串流操作,用來隱藏記憶體IO的延遲,而第二個階段運用圖行處理器來加速位元平行演算法。位元平行演算法相較於CPU的平行化版本,GPU版本可獲得約3.5倍效能的改善。實驗結果顯示,相較於其他相關研究,本研究可以獲得5倍的改善。

    Approximate string matching has been widely used in many areas, such as web searching, network intrusion detection system, network security, and deoxyribonucleic acid sequence matching, etc. Approximate string matching allows difference between a string and a pattern caused by insertion, deletion and substitution. Because approximate string matching is a data-intensive task, accelerating approximate string matching has become crucial for processing big data. In this thesis, we propose a hierarchical parallel architecture to accerelate approximate string matching on GPUs. The hierarchical parallel architecture composes of two levels. The top level is to use mulitiple streams to hide the laterency of data transfer between CPU and GPU while the second level is to accelerate the kernel function of the bit-parallel algorithm on GPUs. The experimental results show that the bit-parallel algorithm performed on GPUs achieves 3.5 times faster than the multithreaded CPU counterparts. Compared to the state-of-the-art approaches, this proposed approach achieves 5 times improvement.

    中文摘要 i ABSTRACT iii 目錄 v 表次 vii 圖次 ix 第一章 緒論 1 第一節 研究背景 1 第二節 研究動機與目的 3 第三節 研究貢獻 4 第二章 位元平行演算法 5 第一節 非確定有限狀態自動機 6 第二節 Bit-Parallel Algorithm by Row 11 第三節 Bit-Parallel Algorithm by Diagonal 17 第四節 BPR與BPD之比較分析 23 第五節 各種GPU近似字串比對演算法 24 第三章 階層平行化架構 25 第一節 平行運算架構 25 第二節 CUDA實作流程 27 第三節 位元平行演算法之平行化方法 29 第四章 實驗方法與結果 35 第一節 實驗環境與簡介 35 第二節 資料片段平行化之片段大小分析 37 第三節 BPR與BPD在CPU和GPU中進行不同編輯距離和字串長度之分析 39 第四節 GPU演算法在單一串流與多串流之比較分析 41 第五節 評估不同的GPU近似字串演算法 42 第五章 結論 43 參考文獻 45

    Asaithambi, V., Zackariah, N., & Nirmala, K. (2012, March). Decision equations for efficient search algorithms for network security. Paper presented at the Advances in Engineering, Science and Management (ICAESM), Nagapattinam, Tamil Nadu.
    Baeza-Yates, R., & Navarro, G. (1999). Faster approximate string matching. Algorithmica, 23(2), 127-158.
    Dong, Y., & Qi, B. (2010, December). The technology of music retrieval by humming and its application in internet music search system. Paper presented at the Information Theory and Information Security (ICITIS), Beijing, China.
    Donghui, L., & Dewei, P. (2011, August). Spelling correction for chinese language based on pinyin-soundex algorithm. Paper presented at the Internet Technology and Applications (iTAP), Wuhan, China.
    Guo, L., Du, S., Ren, M., Liu, Y., Li, J., He, J., ... & Li, K. (2013, July). Parallel algorithm for approximate string matching with k differences. Paper presented at the Networking, Architecture and Storage (NAS), Xi'an, China.
    Ivanko, E. (2006, July). Fast approximate search in strings with rearrangements. Paper presented at the International Conference on Cognitive Informatics (ICCI), Beijing, China.
    Kaplan, K. M., & Kaplan, J. J. (2004, October). Multiple DNA sequence approximate matching. Paper presented at the Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), La Jolla, CA.
    Li, H., Ni, B., Wong, M. H., & Leung, K. S. (2011, June). A fast CUDA implementation of agrep algorithm for approximate nucleotide sequence matching. Paper presented at the Application Specific Processors (SASP), San Diego, CA.
    Lin, S. Y. (2013). A high performance parallel algorithm for approximate string matching on multi-core processor. master’s thesis, National Tsing Hua University, Hsinchu.
    Liu, T., Huang, X., Yang, L., & Zhang, P. (2009, September). Query by humming: comparing voices to voices. Paper presented at the Management and Service Science (MASS). Wuhan, China.
    Liu, Y., Guo, L., Li, J., Ren, M., & Li, K. (2012, May). Parallel algorithms for approximate string matching with k mismatches on CUDA. Paper presented at the Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), Shanghai, China.
    Man, D., Nakano, K., & Ito, Y. (2013, September). The approximate string matching on the hierarchical memory machine, with performance evaluation. Paper presented at the Embedded Multicore Socs (MCSoC), Tokyo, Japan.
    Myers, G. (1999). A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM (JACM), 46(3), 395-415.
    NVIDIA Corporation. (2012). CUDA C Programming Guide. Retrieved March 2, 2014, from http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
    NVIDIA Corporation. (2014). CUDA C Programming Guide. Retrieved December 12, 2014, from http://www.nvidia.com.tw/object/what-is-gpu-computing-tw.html
    Oh, S. I., Lee, I., & Kim, M. S. (2012, October). Fast filtering for intrusion detection systems with the shift-or algorithm. Paper presented at the Communications (APCC), Jeju, Korea.
    Onsjo, M., Aono, Y., & Watanabe, O. (2009). Online approximate string matching with CUDA, Technical Report of Tokyo Institute of Technology. 1-4. Retrieved March 27, 2014, from http://pds13.egloos.com/pds/200907/26/57/pattmatch-report.pdf
    Si, J., Yang, L., Lu, C., Sun, J., & Mei, S. (2009, June). Approximate dynamic programming for continuous state and control problems. Paper presented at the Mediterranean Conference on Control and Automation, Thessaloniki, Greece.
    Smith, T. F., & Waterman, M. S. (1981). Identification of common molecular subsequences. Journal of molecular biology, 147(1), 195-197.
    Tevatia, S., Prasad, R., & Rai, D. (2013, August). An offensive algorithm for multi-pattern parameterized string matching. Paper presented at the Control Computing Communication & Materials (ICCCCM), Allahabad, India.
    Wu, S., & Manber, U. (1992). Fast text searching: allowing errors. Communications of the ACM, 35(10), 83-91.
    Xu, K., Cui, W., Hu, Y., & Guo, L. (2013). Bit-parallel multiple approximate string matching based on GPU. Procedia Computer Science, 17, 523-529.

    下載圖示
    QR CODE