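Graduate student: | 許義淵 |
---|---|
Thesis title: | A Study of Access Techniques for Hakka Phonetic Transcriptions (客家語拼音存取技術之研究) |
Advisor: | 謝建成 |
Degree: | Master |
Department: | Graduate Institute of Library and Information Studies (圖書資訊學研究所) |
Year of publication: | 2007 |
Academic year of graduation: | 96 |
Language: | Chinese |
Pages: | 65 |
Keywords (Chinese): | 中國餘數定理 (Chinese remainder theorem), 赫序 (hash), 負載係數 (loading factor), 客家語 (Hakka) |
Keywords (English): | Chinese remainder theorem, hash, loading factor, Hakka dialect |
Thesis type: | Academic thesis |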
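Phonetic-transcription access techniques depend on a method that quickly maps keyword sets of initials, finals, and tones to their corresponding characters. Hashing is currently the fastest such search method: a mathematical function computes the address of the corresponding character directly. This thesis takes Hakka phonetic transcriptions as its subject, performs a preliminary comparative analysis of their keyword sets, and then constructs an optimal perfect hash function based on the Chinese remainder theorem.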
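As a rough illustration of the idea, the sketch below builds a collision-free hash from the Chinese remainder theorem: each key is paired with its own prime modulus, and one constant C is solved for so that C mod p equals the key's address. The keys, the helper names (`first_primes`, `crt_constant`, `crt_hash`), and the dictionary used as the key-to-modulus mapping are all assumptions made for this example; the thesis's actual keyword set and mapping are not given in the abstract.

```python
def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def crt_constant(moduli):
    """Return the smallest C >= 0 with C % moduli[i] == i + 1 for every i.

    The moduli must be pairwise coprime and moduli[i] must exceed i + 1.
    """
    c, m = 0, 1  # c satisfies every congruence seen so far; m is their product
    for i, p in enumerate(moduli):
        target = i + 1
        # Choose t so that c + m * t is congruent to target modulo p.
        t = ((target - c) * pow(m, -1, p)) % p  # modular inverse, Python 3.8+
        c += m * t
        m *= p
    return c


# Hypothetical keys: illustrative romanized syllables, not the thesis's keyword set.
keys = ["ngai", "hak", "ga", "fa"]
primes = first_primes(len(keys))      # [2, 3, 5, 7]
modulus_of = dict(zip(keys, primes))  # stand-in for the key-to-modulus mapping
C = crt_constant(primes)              # 53 for these four moduli


def crt_hash(key):
    """Address of key in 1..len(keys); distinct keys never collide."""
    return C % modulus_of[key]


addresses = [crt_hash(k) for k in keys]  # [1, 2, 3, 4]
```

Because every key receives a distinct address in 1..n, the table has no empty slots and no collisions. The cost is that C is bounded only by the product of the moduli, so it grows quickly as keys are added.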
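The benefit of applying the Chinese remainder theorem to hashing is that it avoids collisions. When the keyword set is large, however, the constant C becomes too large. This study divides the keyword set into suitable groups to control the size of C within each group. The additional memory cost is an extra parameter table of per-group C values, which slightly affects the loading factor but reduces the size of C.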
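The grouping trade-off can be sketched as follows, under stated assumptions. The keys are split into consecutive groups, each group gets its own constant C, and the C values are kept in an extra parameter table. The key names, the function names, and in particular the `loading_factor` formula (keys stored divided by slots used, counting one slot per stored C value) are assumptions for illustration and may differ from the definitions used in the thesis.

```python
def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def crt_constant(moduli):
    """Return the smallest C >= 0 with C % moduli[i] == i + 1 for every i."""
    c, m = 0, 1
    for i, p in enumerate(moduli):
        t = ((i + 1 - c) * pow(m, -1, p)) % p  # modular inverse, Python 3.8+
        c += m * t
        m *= p
    return c


def build_grouped_hash(keys, group_size):
    """Split keys into consecutive groups and compute one C per group."""
    primes = first_primes(group_size)
    table = {}      # key -> (group index, modulus); stand-in for the real mapping
    constants = []  # the extra per-group parameter table of C values
    for start in range(0, len(keys), group_size):
        group = keys[start:start + group_size]
        g = len(constants)
        constants.append(crt_constant(primes[:len(group)]))
        for key, p in zip(group, primes):
            table[key] = (g, p)
    return constants, table


def grouped_hash(key, constants, table, group_size):
    """Address = offset of the key's group + (that group's C mod the key's modulus)."""
    g, p = table[key]
    return g * group_size + constants[g] % p


def loading_factor(n_keys, n_groups):
    """Assumed definition: keys stored / (key slots + one slot per stored C)."""
    return n_keys / (n_keys + n_groups)


# Hypothetical keys, used only to show the effect of grouping.
keys = [f"key{i}" for i in range(1, 9)]
single_constants, single_table = build_grouped_hash(keys, group_size=8)
grouped_constants, grouped_table = build_grouped_hash(keys, group_size=4)

print("one group :", single_constants[0].bit_length(), "bits,",
      "loading factor", round(loading_factor(len(keys), 1), 3))
print("two groups:", max(grouped_constants).bit_length(), "bits,",
      "loading factor", round(loading_factor(len(keys), 2), 3))
```

In this sketch, splitting eight keys into two groups of four caps each C at the four-key value while adding one more entry to the parameter table, which is the direction of the trade-off the abstract describes: a smaller C in exchange for a slightly lower loading factor.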
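Finally, this thesis consolidates the romanized transcriptions of Mandarin, Taiwanese, and Hakka, the three major languages of Taiwan, and builds one common hash function, so that a separate hash function need not be constructed for each language. Merging the three languages enlarges the keyword set, however, so more groups are needed to keep the growth of C under control, and too many groups lower the efficiency of memory use. To address this, the thesis performs a regression analysis on the loading factor, C, and the number of primes used by the largest C. The result is a plot of the relationship between the loading factor and C, so that memory efficiency is not pursued at the expense of the size of C.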