
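Graduate student: 莊淳雅
Chuang, Chun-Ya
Thesis title: 虛擬導播系統
Virtual Director System
Advisor: 陳世旺
Chen, Sei-Wang
Degree: Master's
Department: Department of Computer Science and Information Engineering
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Number of pages: 90
Chinese keywords: 虛擬導播 (virtual director), 攝影美學 (photographic aesthetics), 多核學習 (multiple kernel learning), 反傳遞類神經網路 (counterpropagation neural network), 時空聚集 (spatio-temporal aggregation), 運鏡 (camera steering)
English keywords: virtual director, photographic aesthetics, multiple kernel learning, CPN, STA, entropy, steering motion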
DOI URL: https://doi.org/10.6345/NTNU202203719
Thesis type: Academic thesis
  • A complete lecture recording usually relies on two or more video cameras that film different subjects, such as the speaker and the audience. The director responsible for shot selection picks the most suitable shot from among them to show to viewers. A professional director needs long training and practical experience before his or her choices meet viewers' expectations. To reduce the cost of training directors, this research proposes a system that simulates the operation and work of a real director, called the “virtual director system.”
    The proposed virtual director system has two main parts: shot selection and visual instruction. For shot selection, the system evaluates the content of each frame and classifies it according to nine viewing standards. It uses multiple kernel learning and spatio-temporal aggregation (STA) to learn from training data and to simulate a director who has a distinctive shooting style.
    The system includes three groups of virtual cameramen, which film the speaker, the audience, and an overview of the room, respectively. In visual instruction, the director gives the cameramen shooting advice based on the frames each of them produces. The system defines events from the speaker's gestures and from the size and range of moving points in the audience and overview frames, and then sends the corresponding instructions to recommend a steering mode. In tests on recorded lectures, including analysis and comparison with other methods, the output of this system matched viewers' expectations more closely.
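As a rough illustration of how the range of moving points in a frame might be turned into a steering recommendation, the sketch below bins moving points into a grid and computes the Shannon entropy of the resulting distribution. This is not the thesis's implementation: the function names `motion_entropy` and `recommend_steering`, the 4 x 4 grid, the threshold of 1.5 bits, and the two steering labels are all assumptions made for this example.

```python
import math
from collections import Counter


def motion_entropy(points, frame_w, frame_h, grid=4):
    """Shannon entropy (in bits) of moving points binned into a grid x grid layout.

    Hypothetical helper: low entropy means motion is concentrated in a few
    cells, high entropy means motion is spread across the frame.
    """
    if not points:
        return 0.0
    cells = Counter()
    for x, y in points:
        # Clamp so points on the right or bottom edge land in the last cell.
        col = min(int(x * grid / frame_w), grid - 1)
        row = min(int(y * grid / frame_h), grid - 1)
        cells[(row, col)] += 1
    total = len(points)
    return -sum((n / total) * math.log2(n / total) for n in cells.values())


def recommend_steering(points, frame_w, frame_h, threshold=1.5):
    """Map motion spread to a steering label (assumed rule, for illustration only)."""
    spread = motion_entropy(points, frame_w, frame_h)
    return "zoom_out" if spread > threshold else "zoom_in"


# Motion spread over four different cells of a 640 x 480 frame.
spread_points = [(10, 10), (200, 10), (400, 10), (600, 10)]
# Motion concentrated in a single cell.
tight_points = [(10, 10), (20, 20), (30, 15)]

print(recommend_steering(spread_points, 640, 480))
print(recommend_steering(tight_points, 640, 480))
```

In this sketch, widely scattered motion suggests a wider shot and concentrated motion suggests a tighter one; a real system would also need the motion size and the gesture events described above.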

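    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Literature Review
        1.2.1 Evaluation of Image Frames
        1.2.2 Shot Selection among Multiple Views
      1.3 Thesis Organization
    Chapter 2 System Architecture and Workflow
      2.1 System Architecture
      2.2 Workflow of the Virtual Director System
    Chapter 3 Shot Selection by the Virtual Director
      3.1 Content Analysis
        3.1.1 Aesthetic Analysis
        3.1.2 Optical Analysis
        3.1.3 Continuity Analysis
        3.1.4 Motion Analysis
      3.2 Decision among Multiple Views
        3.2.1 Multiple Kernel Learning (MKL)
        3.2.2 Counterpropagation Network (CPN)
      3.3 Applying Test Data to the Decision Model
      3.4 Summary
    Chapter 4 Visual Instruction by the Virtual Director
      4.1 Hand Gestures of the Speaker
        4.1.1 Template Database of Hand Gestures
        4.1.2 Recognition of Hand Gestures
        4.1.3 Definition of Hand Gesture Classes
      4.2 Size and Range of Moving-Point Motion
        4.2.1 Spatio-temporal aggregation (STA)
        4.2.2 Entropy
      4.3 Event Definition and Visual Instruction
      4.4 Summary
    Chapter 5 Experimental Results
      5.1 Preparation before the Experiments
        5.1.1 User Interface
        5.1.2 Training the Decision Model
      5.2 Experimental Equipment and Preliminary Results
        5.2.1 Equipment and Setup
        5.2.2 Preliminary Results
      5.3 Comparison and Analysis with Other Methods
        5.3.1 Experiment 1
        5.3.2 Experiment 2
        5.3.3 Experiment 3
        5.3.4 Experiment 4
    Chapter 6 Conclusions and Future Work
      6.1 Conclusions
      6.2 Future Work
    References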

