Graduate student: | 楊繡如 |
---|---|
Thesis title: | 網頁表格資訊自動對話模式之研究 Automatic Table Dialog Model on VoiceXML |
Advisor: | 葉耀明 Yeh, Yao-Ming |
Degree: | Master |
Department: | Graduate Institute of Information and Computer Education |
Year of publication: | 2004 |
Academic year of graduation: | 92 (ROC calendar; 2003-2004) |
Language: | Chinese |
Number of pages: | 149 |
Keywords (Chinese): | 語音對話系統、多樣化模式存取網站、電話語音入口網頁、轉碼 |
Keywords (English): | VoiceXML, Multimodal Interaction, TelePortal, transcoding |
Thesis type: | Academic thesis |
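Abstract (translated from the Chinese)
In recent years, technological progress has made people increasingly dependent on network information services in daily life. In the past, people used these services through computers and had to accommodate the computer's traditional input/output interfaces, such as the keyboard and mouse. Now, advances in mobile Internet access and speech technology are creating a new trend in which people can browse web pages and use information services by telephone and voice. These technologies can also benefit people with disabilities, especially people with visual impairments, by letting them browse web pages and use information services through voice interaction. In the 1990s, touch-tone voice systems began to emerge, but they could only offer fixed voice services based on recorded audio. Then, in 2000, VoiceXML, a new generation of voice technology, arose. It not only uses speech recognition and speech synthesis to provide more flexible voice services but also integrates the information services of the telephone network and the Internet.
However, VoiceXML content is complex and difficult to develop, so this thesis investigates the theory and techniques for converting HTML web pages into VoiceXML. Starting from HTML table information, the research analyzes and classifies the tables on web pages into six types, designs a different dialog model for each type, and develops the VTG (Voice Table Generator) module, which converts tables into VoiceXML, and the VXPB (VoiceXML Portal Builder) system, which builds voice websites from web pages containing tables. With the help of VTG and VXPB, web designers can create a voice website through simple operations, and telephone users can interact with the voice platform by phone, so that table information visible on an ordinary website can also be presented to users as a voice website in a voice browser. In addition, this research uses VXPB and VTG to build working query systems, namely an "online bookstore" and a "department information voice portal", to verify the functionality of VXPB and VTG.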
ABSTRACT
Recently, as technology has advanced, people have come to rely more and more on Internet information services in daily life. In the past, people who accessed these services through computers had to accommodate the computer's traditional input/output interfaces, such as the keyboard and mouse. Now, advances in mobile telecommunication and speech technology enable people to browse web pages and use information services by voice over the telephone, and this has become a new trend. These technologies can also help people with disabilities, especially people with visual impairments, to browse web pages and access information services through spoken dialog. Touch-tone interactive voice response (IVR) systems emerged in the mid-1990s, but they could only provide fixed voice services based on recorded audio. In 2000, VoiceXML appeared. It not only provides more flexible voice services through speech recognition and speech synthesis but also integrates information services across the telephone network and the Internet.
However, VoiceXML content is complex and hard to develop. This thesis therefore proposes a methodology for transcoding HTML to VoiceXML. The research focuses on HTML table information and classifies the tables found on web pages into six types, designing a corresponding dialog model for each type. It also presents the VTG (Voice Table Generator) module, which converts HTML tables to VoiceXML, and the VXPB (VoiceXML Portal Builder) system, which helps users create a VoiceXML portal. With VTG and VXPB, web page designers can build a voice portal through simple operations, and telephone users can interact with it by voice to obtain HTML table information. People can thus obtain the information not only by "seeing" the web page but also by "listening to" its auditory version. To test and verify VXPB and VTG, this research also uses them to build voice portals with query functionality, such as "Web Bookstore Information" and "Portal of Department Information".
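To make the idea of table-to-VoiceXML transcoding concrete, the following is a minimal sketch in Python. It is not the VTG module described in the thesis: the abstract does not specify the six table types or their dialog models, so the sketch assumes only the simplest case, a table whose first row holds the column headers, and reads each remaining row aloud as "header: value" pairs. The names `TableReader` and `table_to_vxml` are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class TableReader(HTMLParser):
    """Collect the cell text of every <tr> row (nested tables are not handled)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_vxml(html):
    """Return a VoiceXML 2.0 document that reads the table row by row.

    Assumption (not from the thesis): the first row is the column header row.
    """
    reader = TableReader()
    reader.feed(html)
    reader.close()
    if not reader.rows:
        raise ValueError("no table rows found")
    header, *body = reader.rows
    prompts = []
    for row in body:
        sentence = ", ".join(f"{h}: {v}" for h, v in zip(header, row))
        prompts.append(f"      <prompt>{escape(sentence)}.</prompt>")
    return "\n".join([
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">',
        '  <form id="table">',
        '    <block>',
        *prompts,
        '    </block>',
        '  </form>',
        '</vxml>',
    ])
```

Calling `table_to_vxml` on a small bookstore table with the columns "Title" and "Price" yields one `<prompt>` per data row. A real dialog model would presumably also let the caller choose a row or column by voice, which this sketch leaves out.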