Author: | 陳奕君 Yi-Chun Chen |
---|---|
Thesis title: | 超大型資料倉儲之設計與建置-以電信業固網通聯記錄為例 The Design and Development of a Super Data Warehouse–Using Telecom Call Records as an Example |
Advisors: | 鄭枸澺 Jeng, Jeu-Yih; 林順喜 Lin, Shun-Shii |
Degree: | Master's |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2007 |
Academic year of graduation: | 95 |
Language: | Chinese |
Number of pages: | 70 |
Chinese keywords: | 資料倉儲、關聯式資料庫、實體化視域、自我維護、維護成本、超大型資料庫 |
English keywords: | Data warehouse, Relational database, Materialized view, Self-maintainability, Maintenance cost, Very large database, VLDB |
Thesis type: | Academic thesis |
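As knowledge and information technology advance, enterprises face a rapidly changing environment, and their owners and managers have an ever more pressing need for decision-making information. Constrained by their architecture and limited scalability, traditional databases are increasingly unable to meet users' diverse and time-sensitive demands. Moreover, when the data needed for decision analysis is spread across different, heterogeneous databases, integrating it is a complex and time-consuming process. In an information age in which timeliness counts for everything, inefficient decision analysis can cause an enterprise to miss opportunities and lose competitiveness. Chunghwa Telecom, the leader of the telecommunications industry, is no exception and has in recent years worked steadily on building data warehouses of various kinds. Among its data, fixed-line call records were originally collected in a distributed manner in 32 operations-office databases; with that equipment reaching its replacement age, the company plans to consolidate the records into three databases at its northern, central, and southern branch offices.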
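Through a survey of similar research and related literature and a study of the existing legacy system, this research collects and summarizes the factors that affect database performance. Its aims are:
1. To propose a method for building an efficient and stable very large data warehouse, using Chunghwa Telecom's fixed-line call records as the experimental subject for implementation and performance verification.
2. To show, by comparing execution performance against the existing system, that processing performance holds up even under a huge data volume, and thereby to demonstrate that the designed method can indeed produce an efficient and stable very large data warehouse.
3. To offer, once the method has been tested against the large volume of real fixed-line call records and its performance verified by test data, a reference for building very large data warehouse systems.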
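Database performance can be judged in four respects: insertion, modification, deletion, and querying. This research designs a method that lets a data warehouse do its job effectively under limited resources. The method has been implemented in Chunghwa Telecom's fixed-line call record system and verified by testing; the data show that the new system performs significantly better than the old one, confirming that the proposed method meets the performance requirements. It also overcomes problems that other, similar studies were unable to handle.
Keywords: data warehouse, relational database, materialized view, self-maintainability, maintenance cost, very large database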
In an age of knowledge and information explosion, enterprises must cope with a fast-changing environment, and their owners and managers need more decision-making information every day. Because of limits on architecture and system scalability, the traditional database has become unable to keep up with diverse, time-sensitive user demands. Moreover, when the data needed for decision making is distributed over heterogeneous databases, integrating it is complex and time-consuming. Where timeliness is critical, inefficient decision analysis can cost an enterprise its competitiveness. Chunghwa Telecom Co., Ltd. (CHT), the leader of the telecommunications industry, is no exception, and in recent years it has continuously devoted itself to building various types of data warehouses. The call records of its fixed-line network were originally distributed over databases at 32 operations offices; as that equipment approached its replacement date, CHT planned to consolidate the call records into three databases at its northern, central, and southern branch offices in Taiwan.
Based on a survey of similar research and related literature and on a study of the previous system, this research identifies the factors that affect database performance. The main purposes are:
1. To propose a method for constructing an efficient and stable very large data warehouse, using CHT's fixed-line call records as the experimental subject for implementation and performance experiments.
2. To confirm, by comparing performance with the previous system, that efficiency is maintained even with a huge quantity of data, and thus to show that the method can really build an efficient and stable very large data warehouse.
3. To provide, after testing against a large volume of real call records and verifying performance with the measured data, a useful reference for establishing a very large data warehouse.
The performance of a database can be measured by four operations: insert, update, delete, and query. This research proposes a method that achieves the performance goals of a data warehouse under limited resources. The method has already been implemented in CHT's fixed-line call record system. The experiments indicate that the new system performs notably better than the old one, which confirms that the method meets the performance requirements. It also overcomes problems that other, similar studies could not handle.
Keywords: Data warehouse, Relational database, Materialized view, Self-maintainability, Maintenance cost, Very large database, VLDB
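The abstract names insert, update, delete, and query as the four operations by which database performance is judged. As a rough illustration only, the sketch below times each of them on a toy table. It is not the thesis's system or test procedure: the use of an in-memory SQLite database, the `call_records` table, its columns, and the function name `time_operations` are all assumptions made for this example.

```python
import sqlite3
import time


def time_operations(n_rows=10_000):
    """Time insert, update, delete, and query on a hypothetical call-record table.

    Returns (timings, remaining): a dict of elapsed seconds per operation
    and the number of rows left after the delete step.
    """
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real DBMS
    conn.execute(
        "CREATE TABLE call_records ("
        "id INTEGER PRIMARY KEY, caller TEXT, callee TEXT, seconds INTEGER)"
    )
    # Synthetic rows; the phone-number format is made up for illustration.
    rows = [(i, f"02-{i:07d}", f"04-{i:07d}", i % 600) for i in range(n_rows)]
    timings = {}

    # Insert: bulk-load all rows.
    start = time.perf_counter()
    conn.executemany("INSERT INTO call_records VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    timings["insert"] = time.perf_counter() - start

    # Update: modify every row with an even id.
    start = time.perf_counter()
    conn.execute("UPDATE call_records SET seconds = seconds + 1 WHERE id % 2 = 0")
    conn.commit()
    timings["update"] = time.perf_counter() - start

    # Delete: remove every row whose id is a multiple of 10.
    start = time.perf_counter()
    conn.execute("DELETE FROM call_records WHERE id % 10 = 0")
    conn.commit()
    timings["delete"] = time.perf_counter() - start

    # Query: count the rows that remain.
    start = time.perf_counter()
    remaining = conn.execute("SELECT COUNT(*) FROM call_records").fetchone()[0]
    timings["query"] = time.perf_counter() - start

    conn.close()
    return timings, remaining


if __name__ == "__main__":
    timings, remaining = time_operations()
    for name, elapsed in timings.items():
        print(f"{name}: {elapsed:.4f} s")
    print("rows remaining:", remaining)
```

Running the same function with increasing `n_rows` gives a simple way to see whether the per-operation times stay acceptable as the data volume grows, which is the kind of comparison the abstract describes between the old and new systems.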