Author: | 陳彥呈 Albert Y. C. Chen |
---|---|
Thesis Title: | 藉由主角移動軌跡之分析來測定隱晦不明顯之影片分鏡時間點 Obscure Video-Shot Boundary Determination via Protagonist Trajectory Analysis |
Advisor: | 李忠謀 Lee, Chung-Mou |
Degree: | 碩士 Master |
Department: | 資訊教育研究所 Graduate Institute of Information and Computer Education |
Thesis Publication Year: | 2007 |
Academic Year: | 95 |
Language: | English |
Number of pages: | 56 |
Keywords (in Chinese): | 影片檢索 、影片分割 、分鏡 、移動物體擷取 、背景消去 |
Keywords (in English): | Video Indexing, Video Segmentation, Video-Shot Boundary, Moving Object Segmentation, Background Subtraction |
Thesis Type: | Academic thesis/ dissertation |
過去對於影片檢索方面的研究,多半專注於「有明顯分鏡」的影片,像是電影與新聞錄影。此類影片在「分鏡」時,整個畫面的像素通常會有顯著的改變,而系統便可藉此判斷出分鏡的時間,進而對影片進行切割或是編輯索引。
本研究著重於「無顯著分鏡」影片的研究。此類影片雖為一鏡到底,但影片中的內容與主角,通常會隨著時間而有所變動。傳統的演算法,無法偵測出此「不明顯」的分鏡時機,進而必須藉由更高階的資訊來判斷何時分鏡。
本研究所提出的演算法,可概分為「移動物體的擷取」與「主角變換的偵測」兩大模組。移動物體的擷取,可依照相機本身的位移與否,分為兩大類型。本研究的相機為固定式,採取「改良式單一高氏分佈」來建立畫面背景,再藉由背景消去法來獲取屬於前景的移動中物體。取得移動物體後,藉由追蹤「最大前景物體」軸心於橫軸方向的位移量,與主角平均高度,來判斷主角何時換人。當偵測到主角換人時,影片內容應該已有劇烈的改變,因而我們將此時間點稱為「隱諱不明」的分鏡時間點。我們所提的這個演算法,能夠讓此類「無顯著分鏡」影片的切割與索引編輯過程自動化,進而使數位影片的典藏與存取更加的容易。
Previous research on video indexing has primarily centered on videos with obvious video-shot boundaries, such as movies and news broadcast recordings. At a video-shot boundary in such videos, the pixels across the whole frame change noticeably, which allows a computer to automatically segment and index the video according to its video-shot boundaries.
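As a rough illustration of the conventional approach described above, the following sketch flags a hard cut wherever the mean absolute pixel difference between consecutive grayscale frames exceeds a threshold. The function names, the nested-list frame representation, and the threshold value are assumptions made for this example and are not taken from the thesis.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute intensity difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def detect_hard_cuts(frames, threshold=40.0):
    """Return every index i where frame i differs strongly from frame i - 1.

    The threshold is a hypothetical value; in practice it would be tuned per dataset.
    """
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Two dark frames followed by two bright frames: the cut lies at index 2.
dark = [[10] * 4 for _ in range(3)]
bright = [[200] * 4 for _ in range(3)]
print(detect_hard_cuts([dark, dark, bright, bright]))
```

A detector of this kind responds only to frame-wide change, so it stays silent on a continuously recorded video in which the scene as a whole barely changes.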
Our research focuses on detecting possible cut points in videos without obvious video-shot boundaries. Although these videos are often recorded continuously without film cuts, the content and the people in them can change rapidly during the recording. Traditional video-shot boundary detection algorithms cannot detect this kind of content change, which we call an obscure video-shot boundary, so a higher-level understanding of the video is needed to determine appropriate time points to serve as cuts.
Our proposed algorithm consists of two major modules: a moving object segmentation module and a protagonist-change detection module. Moving object segmentation algorithms fall into two categories depending on whether the camera moves. Our method applies only to videos recorded by a static camera; it uses an improved univariate Gaussian background model to perform background subtraction and extract the moving foreground objects. The largest moving object is detected and tracked, and its average height and horizontal displacement are used to determine when the protagonist changes. Whenever a protagonist change occurs, we can conclude that the video content has changed rapidly and meaningfully, so that time point can serve as a candidate segmentation point. The proposed algorithm can automate the segmentation and indexing of videos without obvious video-shot boundaries, greatly reducing the time and effort needed for digital archiving.
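The two modules can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses a plain running single-Gaussian model per pixel rather than the author's improved variant, and the class and function names, the parameters (`alpha`, `k`, `min_var`, `max_shift`, `height_tol`), and the change rule itself are all hypothetical.

```python
from collections import deque


class GaussianBackground:
    """Per-pixel single-Gaussian background model for grayscale frames (assumed form)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=100.0, min_var=4.0):
        self.alpha, self.k, self.min_var = alpha, k, min_var
        self.mean = [[float(v) for v in row] for row in first_frame]
        self.var = [[init_var for _ in row] for row in first_frame]

    def apply(self, frame):
        """Return a boolean foreground mask and update the background statistics."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, v in enumerate(row):
                d = v - self.mean[y][x]
                fg = d * d > (self.k ** 2) * self.var[y][x]
                if not fg:  # adapt only where the pixel still looks like background
                    self.mean[y][x] += self.alpha * d
                    new_var = (1 - self.alpha) * self.var[y][x] + self.alpha * d * d
                    self.var[y][x] = max(new_var, self.min_var)
                mask_row.append(fg)
            mask.append(mask_row)
        return mask


def largest_blob(mask):
    """Return (center_x, height) of the largest 4-connected foreground blob, or None."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = None  # (area, center_x, height)
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            xs, ys = [], []
            while queue:  # breadth-first flood fill of one connected component
                y, x = queue.popleft()
                xs.append(x)
                ys.append(y)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if best is None or len(xs) > best[0]:
                best = (len(xs), (min(xs) + max(xs)) / 2, max(ys) - min(ys) + 1)
    return None if best is None else (best[1], best[2])


def protagonist_changes(tracks, max_shift=5.0, height_tol=0.3):
    """Flag frames where the tracked object jumps sideways or its height leaves the average.

    tracks holds one (center_x, height) tuple, or None, per frame.
    """
    changes = []
    prev_x, avg_h, n = None, None, 0
    for i, track in enumerate(tracks):
        if track is None:
            continue
        x, height = track
        if prev_x is not None:
            jumped = abs(x - prev_x) > max_shift
            resized = abs(height - avg_h) > height_tol * avg_h
            if jumped or resized:
                changes.append(i)
                avg_h, n = None, 0  # restart the height average for the new protagonist
        if avg_h is None:
            avg_h, n = float(height), 1
        else:
            n += 1
            avg_h += (height - avg_h) / n
        prev_x = x
    return changes
```

In use, each frame would be passed through `GaussianBackground.apply`, the resulting mask through `largest_blob`, and the per-frame results collected into the list given to `protagonist_changes`. The returned frame indices then play the role of candidate segmentation points.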