
Author: 廖軒嘉 (Hsuan-Chia Liao)
Title: Virtual Audience Cameraman System (虛擬觀眾攝影師系統)
Advisors: 陳世旺 (Chen, Sei-Wang); 方瓊瑤 (Fang, Chiung-Yao)
Degree: Master
Department: Computer Science and Information Engineering
Year of publication: 2014
Academic year of graduation: 102 (2013-2014)
Language: Chinese
Pages: 65
Chinese keywords: 虛擬觀眾攝影師、STA (Spatio-Temporal Attention neural model)、運鏡、專業攝影學
English keywords: Virtual Audience Cameraman System, STA (Spatiotemporal Attention) Neural Model, Camera Steering, Professional Photography
Document type: Academic thesis
    The aim of this study is to build a virtual audience cameraman system that imitates a professional photographer and takes the audience as its shooting subject. Much of today's information is disseminated through lectures, and the most direct way to let viewers watch a lecture at any time is to hire a professional video-recording team to record the whole event. However, labor costs keep rising, and hiring such a team is far from cheap. This study therefore develops a virtual audience cameraman system that saves this labor cost while applying professional photography techniques to produce high-quality videos.
    The experimental equipment is a pair of pan-tilt-zoom (PTZ) cameras: one is called the global-view camera and the other the local-view camera. The global-view camera plays the role of the photographer's eyes; it monitors the scene, detects subjects, and locates regions of interest (ROIs) in the frame. The local-view camera plays the role of the camera in the photographer's hands; once the system has determined the ROI and all the information needed for camera steering, the local-view camera actually performs the steering and shoots.
    The main purpose of the system is to imitate a professional photographer's shooting skills and perform camera steering automatically. To match professional technique, before each steering action the system decides the steering mode, the shot class, the subject, and other factors. First, the system extracts motion features that describe audience behavior from the image sequence supplied by the global-view camera, processes these features to find candidate ROIs in the frame, and feeds the candidates into a spatio-temporal attention (STA) neural model. The STA model records and supplies information that helps the system pick the ROI most suitable for shooting. The system then computes the correspondence between the chosen ROI and the center of the lens, and from this input outputs the steering mode and shot class best suited to the situation, which drive the local-view camera. Subject selection and the visual quality of the frames captured by the local-view camera are judged mainly through the analysis of aesthetic and optical features. Through this procedure the study simulates the shooting techniques of professional photography.
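The STA model itself is defined in the thesis body, not in this abstract. Purely to illustrate the idea of attention accumulated over time, the toy sketch below keeps a leaky per-ROI score so that persistently active regions beat one-frame flickers; the decay constant and the ROI labels are hypothetical, not taken from the thesis.

```python
def update_attention(scores, roi_activity, decay=0.5):
    """Leaky accumulation: new score = decayed old score + fresh activity."""
    return {r: decay * scores.get(r, 0.0) + a for r, a in roi_activity.items()}

def best_roi(scores):
    """ROI with the highest accumulated attention score."""
    return max(scores, key=scores.get)

scores = {}
# ROI 'A' flickers once; ROI 'B' stays mildly but persistently active.
for activity in ({'A': 5.0, 'B': 1.0}, {'A': 0.0, 'B': 1.0}, {'A': 0.0, 'B': 1.0}):
    scores = update_attention(scores, activity)

print(best_roi(scores))  # 'B'
```

With decay 0.5, the one-frame burst on 'A' fades to 1.25 after three frames, while 'B' accumulates to 1.75, so the persistent region is selected.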
    Experimental results show that the proposed method steers the camera in real time and smoothly, and accurately imitates a professional photographer's techniques, meeting the lecture-recording needs that would otherwise require a professional video-recording team.

    This thesis proposes a virtual audience cameraman system that captures videos of the audience automatically. Nowadays lecture content can be broadcast widely and rapidly as digital video, so capturing digital recordings of important lectures for viewers is essential work. However, hiring a video-recording team of professional photographers to produce good-quality recordings is very expensive. This study therefore develops a virtual audience cameraman system that obtains good-quality digital videos automatically and reduces the cost of hiring a professional video-recording team.
    In this study, two PTZ cameras are mounted together as a set: one is the global-view camera and the other is the local-view camera. The global-view camera can be regarded as the photographer's eyes; it monitors the whole audience and supports region-of-interest (ROI) detection. The local-view camera can be regarded as the camera in the photographer's hands; it captures video of the ROI after the system determines the ROI's location.
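As a rough illustration of the kind of processing the global-view camera supports, the sketch below detects a moving region by frame differencing and returns its bounding box as a candidate ROI. This is a minimal sketch only: the threshold, the synthetic frames, and the single-region assumption are illustrative and are not the motion features actually used in the thesis.

```python
import numpy as np

def motion_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grey-level change between frames exceeds a threshold."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def candidate_roi(mask):
    """Bounding box (top, left, bottom, right) of the moving pixels, or None."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())

# Two synthetic 8-bit global-view frames in which a small bright block appears,
# standing in for audience motion.
prev = np.zeros((120, 160), dtype=np.uint8)
curr = prev.copy()
curr[40:60, 70:90] = 200

roi = candidate_roi(motion_mask(prev, curr))
print(roi)  # (40, 70, 59, 89)
```

A real system would clean the mask with morphology and track several regions at once; the point here is only that audience motion in the global view yields frame coordinates for candidate ROIs.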
    Since the purpose of this system is to simulate the camera-control behaviors of professional photographers when capturing audience videos, the proposed system decides the camera-steering mode, the shot class, and the subjects before each steering action. First, the system obtains input video from the global-view camera and detects audience motion features to locate candidate ROIs. The candidates are then fed into a spatio-temporal attention (STA) neural model, which records and provides information that helps the system identify the most suitable ROI to shoot. The system then computes the offset between the ROI's location in the frame and the center of the camera lens, and outputs the appropriate steering mode for the local-view camera. The local-view camera then captures the output video of the ROI, with subject selection and frame quality judged from an aesthetic viewpoint and from the analysis of optical characteristics. Through this process the system simulates professional photography shooting skills.
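The offset computation and the choice of steering mode and shot class can be caricatured as follows. The dead zone, the area thresholds, and the class names here are illustrative assumptions; the thesis derives these decisions from professional photography rules not reproduced in this abstract.

```python
def steering_command(roi_center, lens_center, dead_zone=10):
    """Pan/tilt directions that move the lens center toward the ROI center.

    Offsets inside the dead zone are ignored to avoid jittery camera motion.
    """
    dx = roi_center[0] - lens_center[0]
    dy = roi_center[1] - lens_center[1]
    pan = 'hold' if abs(dx) <= dead_zone else ('right' if dx > 0 else 'left')
    tilt = 'hold' if abs(dy) <= dead_zone else ('down' if dy > 0 else 'up')
    return pan, tilt

def shot_class(roi_area, frame_area):
    """Coarse shot class from the fraction of the frame the ROI occupies."""
    ratio = roi_area / frame_area
    if ratio > 0.25:
        return 'close-up'
    if ratio > 0.05:
        return 'medium shot'
    return 'long shot'

print(steering_command((420, 300), (320, 240)))  # ('right', 'down')
print(shot_class(40 * 30, 640 * 480))            # 'long shot'
```

The dead zone mirrors the abstract's emphasis on smooth steering: a small ROI offset should not trigger a camera move at all.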
    The experimental results show that the proposed method steers the camera promptly, automatically, and smoothly, and that it accurately reproduces the shooting style of professional photographers.

    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1  Research Background and Purpose
      1.2  Research Difficulties
      1.3  Research Scope and Limitations
      1.4  Thesis Organization
    Chapter 2  Literature Review
      2.1  Related Work on Automated Cameraman Systems
      2.2  Related Techniques for Automated Cameraman Systems
        2.2.1  Subject Selection
        2.2.2  Subject Motion Detection
    Chapter 3  System Overview
      3.1  System Setup Environment
      3.2  System Flow
    Chapter 4  ROI Selection
      4.1  Audience Motion Analysis
      4.2  Candidate ROI Detection
      4.3  ROI Selection
    Chapter 5  Camera-Steering Path Planning
      5.1  Camera Control
      5.2  Steering Modes and Composition Rules
        5.2.1  Subject Size
        5.2.2  Image Composition Rules
    Chapter 6  Experimental Results
      6.1  Accuracy of ROI Selection
        6.1.1  Results on the First Video
        6.1.2  Results on the Second Video
        6.1.3  Results on the Third Video
      6.2  Evaluation of the System's Camera Steering
      6.3  Comparison with the Results of Previous Work
      6.4  Applicability of the System to Different Venues
        6.4.1  Results on the First Video
        6.4.2  Results on the Second Video
    Chapter 7  Conclusions and Future Work
      7.1  Conclusions
      7.2  Future Work
    References

    [Bae03] R. Baecker, “A Principled Design for Scalable Internet Visual Communications with Rich Media, Interactivity, and Structured Archives,” Proceedings of the 2003 Conference of the Centre for Advanced Studies on Collaborative Research, IBM, pp. 16-19, 2003.
    [Ber06] A. Berengolts and M. Lindenbaum, “On the Distribution of Saliency,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 12, pp. 543-549, 2006.
    [Bia98] M. Bianchi, “AutoAuditorium: A Fully Automatic, Multi-Camera System to Televise Auditorium Presentations,” Proceedings of the Joint DARPA/NIST Workshop on Smart Spaces Technology, 1998.
    [Bia04] M. Bianchi, “Automatic Video Production of Lectures Using an Intelligent and Aware Environment,” Proceedings of the 3rd International Conference on Mobile and Ubiquitous Multimedia, New York, pp. 117-123, 2004.
    [Bou00] J. Y. Bouguet, “Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm,” Intel Corporation Microprocessor Research Labs Report, 2000.
    [Cho10] H. Chou, J. Wang, C. Fuh, S. Lin, and S. Chen, “Automated Lecture Recording System,” Proceedings of the IEEE International Conference on System Science and Engineering (ICSSE), Taiwan, pp. 167-172, 2010.
    [Dat06] R. Datta, D. Joshi, J. Li, and J. Z. Wang, “Studying Aesthetics in Photographic Images Using a Computational Approach,” Proceedings of the 9th European Conference on Computer Vision, Vol. 3, Heidelberg, pp. 288-301, 2006.
    [Fan03] C. Y. Fang, S. W. Chen, and C. S. Fuh, “Automatic Change Detection of Driving Environments in a Vision-Based Driver Assistance System,” IEEE Transactions on Neural Networks, Vol. 14, No. 3, pp. 646-657, 2003.
    [Lan09] C. Lang, D. Xu, and Y. Jiang, “Shot Type Classification in Sports Video Based on Visual Attention,” Proceedings of the International Conference on Computational Intelligence and Natural Computing, Vol. 1, Wuhan, pp. 336-339, 2009.
    [Li05] T. Y. Li and X. Y. Xiao, “An Interactive Camera Planning System for Automatic Cinematographer,” Proceedings of the 11th International Conference on Multimedia Modeling, pp. 310-315, 2005.
    [Oni04] M. Onishi and K. Fukunaga, “Shooting the Lecture Scene Using Computer-Controlled Cameras Based on Situation Understanding and Evaluation of Video Images,” Proceedings of the 17th International Conference on Pattern Recognition, pp. 781-784, 2004.
    [Row01] L. A. Rowe, D. Harley, and P. Pletcher, “BIBS: A Lecture Webcasting System,” Berkeley Multimedia Research Center, 2001.
    [Rui01] Y. Rui, L. He, A. Gupta, and Q. Liu, “Building an Intelligent Camera Management System,” Proceedings of the 9th ACM International Conference on Multimedia, New York, pp. 2-11, 2001.
    [Rui03] Y. Rui, A. Gupta, and J. Grudin, “Videography for Telepresentations,” Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, New York, pp. 457-464, 2003.
    [Rui04] Y. Rui, A. Gupta, J. Grudin, and L. He, “Automating Lecture Capture and Broadcast: Technology and Videography,” Multimedia Systems, Vol. 10, No. 1, pp. 3-15, 2004.
    [Sol12] B. Solmaz, B. E. Moore, and M. Shah, “Identifying Behaviors in Crowd Scenes Using Stability Analysis for Dynamical Systems,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 10, pp. 2064-2070, 2012.
    [Vio04] P. Viola and M. J. Jones, “Robust Real-Time Face Detection,” International Journal of Computer Vision, Vol. 57, No. 2, pp. 137-154, 2004.
    [Wu10] S. Wu, B. E. Moore, and M. Shah, “Chaotic Invariants of Lagrangian Particle Trajectories,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, pp. 2054-2060, 2010.
    [Zha08] C. Zhang, Y. Rui, J. Crawford, and L. W. He, “An Automated End-to-End Lecture Capture and Broadcasting System,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 4, No. 1, pp. 2-11, 2008.
    [郭12] 郭彥佑, “Automated Lecture Recording System,” Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, June 2012.
    [李12] 李俊億, “A Virtual Cameraman for an Automated Lecture Recording System,” Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, June 2012.
    [蔡13] 蔡侑廷, “A Virtual Cameraman System Taking the Audience as the Shooting Subject,” Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, June 2013.
    [呂13] 呂佳儒, “A Virtual Director Subsystem for an Automated Lecture Recording System,” Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University, June 2013.
    [謝95] 謝怡竹, “An Optical-Flow-Based Automatic Facial Expression Recognition System,” Master's thesis, Department of Computer Science and Information Engineering, National Central University, July 1995.
    [1] Ambient Insight standard report
    http://www.ambientinsight.com/Resources/Documents/AmbientInsight-2011-2016-Worldwide-Self-paced-eLearning-Market-Premium-Overview.pdf
    [2] Central Taiwan Performing Arts Platform
    http://perform.culture.taichung.gov.tw/place/Details.aspx?Parser=99,5,16,,,,11
    [3] Shelton College International
    http://www.sheltoncollege.edu.sg/cn/aboutus-facilities.html
    [4] Photography basics for beginners: the secrets of color
    http://www.sj33.cn/dphoto/syxt/201211/32597.html
    [5] Using color to easily take beautiful photos
    http://shijue.me/show_text/4ffef3e0ac1d840d9001c13c
    [6] Shooting different dramatic effects: making good use of light direction
    http://www.fotobeginner.com/3789/light-direction/
    [7] Introductory photography lecture notes I: composition principles and theme expression
    http://www.wretch.cc/blog/jhyou/13908453
    [8] How to shoot striking photos: practicing basic composition rules
    http://www.eprice.com.tw/dc/talk/356/723/1/
    [9] Kinect for Windows
    http://www.microsoft.com/en-us/kinectforwindows/
    [10] AXIS Communications
    http://www.axis.com/zh/products/cam_215/index.htm
    [11] 陳慶瀚, lecture notes on artificial neural networks
    http://ccy.dd.ncu.edu.tw/~chen/course/Neural/ch3/SOM.pdf
    [12] Apple's new-generation iPhone launch event
    http://blog.chinatimes.com/honey_news/archive/2012/09/13/2984203.html
    [13] Paper presentation
    http://www.che.kuas.edu.tw/app/news.php?Sn=241
    [14] Straits Exchange Foundation
    http://www.mac.gov.tw/ct.aspxItem=72482&ctNode=6643&mp=110
    [15] Front-light and backlight shooting techniques
    http://www.hkzooo.com/backlight_photography/
