Author: Albert Y. C. Chen (陳彥呈)
Thesis title: Obscure Video-Shot Boundary Determination via Protagonist Trajectory Analysis (藉由主角移動軌跡之分析來測定隱晦不明顯之影片分鏡時間點)
Advisor: Lee, Chung-Mou (李忠謀)
Degree: Master
Department: Graduate Institute of Information and Computer Education (資訊教育研究所)
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 56
Chinese keywords: 影片檢索, 影片分割, 分鏡, 移動物體擷取, 背景消去
English keywords: Video Indexing, Video Segmentation, Video-Shot Boundary, Moving Object Segmentation, Background Subtraction
Thesis type: Academic thesis

Previous research on video indexing has centered primarily on videos with obvious video-shot boundaries, such as movies and news broadcast recordings. At a shot boundary in such videos, pixel values change noticeably across the whole frame, which allows a computer to detect the boundary and then segment and index the video automatically.
Our research focuses on detecting possible cut points in videos without obvious video-shot boundaries. Although these videos are often recorded continuously, in a single take with no cuts, their content and the people in them can change over the course of the recording. Traditional shot boundary detection algorithms cannot detect this kind of content change, which we call an obscure video-shot boundary, so a higher-level understanding of the video is needed to determine appropriate time points to serve as cuts.
Our proposed algorithm consists of two major modules: a moving object segmentation module and a protagonist-change detection module. Moving object segmentation algorithms fall into two categories, depending on whether the camera moves. Our method applies only to videos recorded by a static camera: it uses an improved single-Gaussian (univariate Gaussian) background model to build the scene background, then performs background subtraction to extract the moving foreground objects. The largest foreground object is detected and tracked, and the horizontal displacement of its central axis, together with the protagonist's average height, is used to determine when the protagonist changes. When a protagonist change is detected, the video content should have changed substantially and meaningfully, so that time point serves as a candidate segmentation point, that is, an obscure video-shot boundary. The algorithm automates the segmentation and indexing of videos without obvious shot boundaries, which reduces the time and effort needed for digital archiving and makes archived video easier to access.
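
The two modules described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. The abstract does not specify the "improved" Gaussian model, so the sketch uses a generic per-pixel single Gaussian with an exponentially weighted update. Locating the largest object through a column projection of the foreground mask is likewise an assumption. All names (`GaussianBackground`, `largest_object`, `protagonist_changes`, `detect_obscure_boundaries`) and all parameter values (`alpha`, `k`, `max_jump`, `height_tol`) are hypothetical, as is the decision rule that requires both a horizontal jump and a height change.

```python
import numpy as np


class GaussianBackground:
    """Generic per-pixel single-Gaussian background model (grayscale frames)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, min_var=4.0):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, 25.0)  # assumed initial variance
        self.alpha = alpha      # assumed learning rate of the moving average
        self.k = k              # assumed threshold in standard deviations
        self.min_var = min_var  # floor that keeps the variance from collapsing

    def apply(self, frame):
        """Return a boolean foreground mask and update background pixels."""
        diff = frame.astype(np.float64) - self.mean
        foreground = diff ** 2 > (self.k ** 2) * self.var
        bg = ~foreground
        # Exponentially weighted update, applied to background pixels only.
        self.mean[bg] += self.alpha * diff[bg]
        self.var[bg] = np.maximum(
            (1 - self.alpha) * self.var[bg] + self.alpha * diff[bg] ** 2,
            self.min_var,
        )
        return foreground


def largest_object(mask):
    """Return (center_x, height) of the heaviest run of foreground columns."""
    cols = mask.sum(axis=0)          # column projection of the mask
    active = np.append(cols > 0, False)  # sentinel closes a trailing run
    best, start = None, None
    for x, on in enumerate(active):
        if on and start is None:
            start = x
        elif not on and start is not None:
            mass = cols[start:x].sum()
            if best is None or mass > best[0]:
                best = (mass, start, x)
            start = None
    if best is None:
        return None                  # no foreground in this frame
    _, s, e = best
    rows = np.flatnonzero(mask[:, s:e].any(axis=1))
    return (s + e - 1) / 2.0, int(rows[-1] - rows[0] + 1)


def protagonist_changes(observations, max_jump=40.0, height_tol=0.25):
    """Return indices where the tracked object looks like a new protagonist."""
    changes, heights, prev_x = [], [], None
    for i, obs in enumerate(observations):
        if obs is None:
            continue
        x, h = obs
        if prev_x is not None:
            avg_h = sum(heights) / len(heights)
            jumped = abs(x - prev_x) > max_jump
            resized = abs(h - avg_h) > height_tol * avg_h
            if jumped and resized:   # assumed rule: both cues must agree
                changes.append(i)
                heights = []         # start a new height average
        heights.append(h)
        prev_x = x
    return changes


def detect_obscure_boundaries(frames):
    """Run both modules over a list of grayscale frames (2-D arrays)."""
    model = GaussianBackground(frames[0])
    observations = [None]            # frame 0 only initializes the background
    for frame in frames[1:]:
        observations.append(largest_object(model.apply(frame)))
    return protagonist_changes(observations)
```

Calling `detect_obscure_boundaries(frames)` on a list of equally sized grayscale frames returns the frame indices that would serve as candidate segmentation points. In practice the thresholds would need tuning to the frame size and to the noise level of the compressed video.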

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 - Introduction
  1.1 Overview of the Problem
  1.2 Key Terms
    1.2.1 Video Indexing
    1.2.2 Video Segmentation
    1.2.3 Temporal Video Segmentation and Shot Boundary Detection
    1.2.4 Video Object Segmentation, Foreground Object Segmentation, and Background Segmentation
  1.3 Research Questions
  1.4 Thesis Organization
CHAPTER 2 - Literature Review
  2.1 Overview of Temporal Video Segmentation Methods
    2.1.1 Temporal Video Segmentation in the Uncompressed Domain
    2.1.2 Temporal Video Segmentation in the Compressed Domain
  2.2 Review of Spatial Video Segmentation Approaches
    2.2.1 Moving Object Segmentation in the Uncompressed Domain
    2.2.2 Moving Object Segmentation in the Compressed Domain
  2.3 Related Work in Human Motion Understanding
    2.3.1 Human Posture Recognition
CHAPTER 3 - Protagonist Change Detection and Obscure Video-Shot Boundary Determination
  3.1 Overview of our Proposed Algorithm
  3.2 Moving Object Segmentation in Noisily Compressed Videos
    3.2.1 Analyzing the Video Noise in Compressed Videos
    3.2.2 Moving Object Segmentation via Background Subtraction
    3.2.3 Noise Elimination based on Voting Mechanism
    3.2.4 Background Update via Exponentially Weighted Moving Average
  3.3 Obscure Video-Shot Boundary Determination
    3.3.1 Protagonist-Change Decision
    3.3.2 Protagonist Boundary Determination on Projected Histogram
    3.3.3 Camera Vibration, Lens Occlusion, and Noise Handling
CHAPTER 4 - Experiments and Results
  4.1 Experimental Equipment and Settings
  4.2 Analysis of the Test Sequences & Experiment Results
CHAPTER 5 - Conclusion and Future Work
  5.1 Conclusion and Contributions
  5.2 Future Work
REFERENCES