
Graduate student: 黃帝維 (Di-Wei Huang)
Thesis title: 非即時自動導播技術研究 (A Research of Off-line Automatic Virtual Director for Lecture Video)
Advisor: 李忠謀 (Lee, Chung-Mou)
Degree: Master
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Year of publication: 2010
Graduating academic year: 98 (ROC calendar, i.e., 2009-2010)
Language: Chinese
Number of pages: 72
Chinese keywords: 教學影片編輯, 教學影片產生, 教學影片內容分析, 虛擬導演
English keywords: Lecture video, video composition, lecture video content analysis, virtual director
Thesis type: Academic thesis
Access count: 64 views, 3 downloads
    As technology advances and network bandwidth grows rapidly, web-based e-learning and the transmission of conference videos are attracting increasing attention from both industry and academia. Although many systems already allow conference and lecture videos and slides to be watched online, the resulting footage is often tedious. It is therefore important to present viewers with the content they want to see, to make what is played meaningful and interesting, and to direct viewers to the important parts of the video. Many automated lecture recording systems exist, but few address how to select content that is suitable for the audience. The method proposed in this thesis detects events and, combined with a finite-state-machine state sequence, decides when to show which camera's footage to the viewer and provides visual hints about key moments.
    In our implementation, we analyze the video content. On the visual side, we detect the speaker's behavior and events on the projection screen; on the audio side, we estimate the direction of the sound source. We then evaluate the importance and order of each event and, together with the state sequence produced by a finite state machine, choose which camera's footage to play. In addition, suitable icons are inserted at appropriate times to hint to the viewer that a key moment may be occurring.
    The experimental results show that both the timing of our camera switches and the choice of which camera to switch to perform well, making lecture videos more interesting and meaningful.
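    The abstract describes an event-driven finite state machine that maps detected lecture events to camera shots. The following is a minimal Python sketch of that idea, written for illustration only; the event names, camera states, importance threshold, and minimum shot length are hypothetical simplifications and are not taken from the thesis itself.

    ```python
    from dataclasses import dataclass
    from enum import Enum, auto


    class Camera(Enum):
        """Hypothetical camera states for the virtual director."""
        SPEAKER = auto()    # close-up of the lecturer
        SCREEN = auto()     # projection screen / slides
        AUDIENCE = auto()   # audience camera
        OVERVIEW = auto()   # wide shot of the whole room


    @dataclass
    class Event:
        """A detected lecture event with a time stamp and an importance score."""
        time: float         # seconds from the start of the lecture
        kind: str           # e.g. "slide_change", "speaker_points_at_screen"
        importance: float   # higher means more worth showing

    # Hypothetical transition table: which camera each event should switch to.
    TRANSITIONS = {
        "slide_change": Camera.SCREEN,
        "speaker_points_at_screen": Camera.SCREEN,
        "speaker_walks_to_screen": Camera.OVERVIEW,
        "audience_speaks": Camera.AUDIENCE,
    }

    MIN_SHOT_LENGTH = 4.0   # seconds; avoids distracting rapid cuts


    def compose(events: list[Event]) -> list[tuple[float, Camera]]:
        """Turn a time-ordered event list into a (switch_time, camera) schedule."""
        schedule = [(0.0, Camera.SPEAKER)]          # default shot: the speaker
        for event in sorted(events, key=lambda e: e.time):
            target = TRANSITIONS.get(event.kind)
            last_time, last_cam = schedule[-1]
            if target is None or target == last_cam:
                continue                            # no state change needed
            if event.time - last_time < MIN_SHOT_LENGTH and event.importance < 0.8:
                continue                            # keep the current shot a bit longer
            schedule.append((event.time, target))
        return schedule


    if __name__ == "__main__":
        demo = [
            Event(3.0, "slide_change", 0.6),
            Event(5.0, "speaker_points_at_screen", 0.5),
            Event(20.0, "audience_speaks", 0.9),
        ]
        for t, cam in compose(demo):
            print(f"{t:6.1f}s -> {cam.name}")
    ```

    Running the example prints a cut list starting on the speaker camera, switching to the screen camera once the speaker gestures at the slides, and cutting to the audience camera when a question is asked; a real implementation would drive such a schedule from the detected visual and audio events and overlay the hint icons at the corresponding times.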

    Table of Contents:
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Research Objectives
      1.3 Scope and Limitations
    Chapter 2 Literature Review
      2.1 Related Work from Microsoft Research
      2.2 Related Work on AutoAuditorium
      2.3 Systems for Automatic Composition of Sports Videos
      2.4 Other Video Composition and Automated Directing Systems
    Chapter 3 Research Method
      3.1 System Setup
      3.2 System Architecture and Workflow
      3.3 Video Content Analysis
        3.3.1 Visual Analysis
        3.3.2 Audio Analysis
      3.4 Video Content Composition
        3.4.1 Switching Rules
        3.4.2 FSM Design
        3.4.3 Rotation Mode
        3.4.4 Adding Content Hints to the Video
    Chapter 4 Experimental Results and Analysis
      4.1 Experimental Procedure
      4.2 Event Detection Evaluation Method
      4.3 Event Detection Results and Discussion
        4.3.1 Detection of the Speaker Pointing at the Screen
        4.3.2 Detection of the Speaker Walking Toward the Screen
        4.3.3 Slide Change and Animation Detection
        4.3.4 Audience Speech Detection
      4.4 Video Composition Evaluation
        Viewers who had used similar systems
        Viewers who had not used similar systems
    Chapter 5 Discussion and Future Work
      5.1 Conclusion
      5.2 Future Research
    References
    Appendix A
    Appendix B

