
Author: 杜珮瑩 (Tu Pei-Ying)
Thesis title: PDF文件影片關鍵畫面擷取之研究 (Key-Frames Extraction of PDF Document Videos)
Advisor: 李忠謀 (Lee, Chung-Mou)
Degree: Master
Department: Graduate Institute of Information and Computer Education
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Number of pages: 58
Keywords: key-frame, shot detection, video segmentation, motion vector, entropy
Thesis type: Academic thesis
    This thesis proposes a method for automatically extracting key-frames from videos that use PDF documents as teaching material, where a key-frame is a frame at which a page change occurs. The method consists of two stages. The first stage finds candidate page changes by analyzing the direction, magnitude, and proportion of the motion vectors in P- and B-frames, and filters out frames that are not page changes, such as frames with little motion, frames in which only a region moves, and frames whose motion is biased toward a single direction. The second stage removes redundant frames: the entropy of the DC-value histogram is used to find the salient regions of each frame, and the differences between these regions determine whether a candidate is a page change or a redundant frame, such as one produced by zooming or by moving the document over a large distance. The frames that survive both stages are the key-frames. Experiments were conducted on videos encoded with different parameter settings, on different types of PDF documents, and on different document movement speeds and actions. The experimental results show that the method performs well in both processing speed and detection accuracy.
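    The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the eight direction bins, the 16 histogram bins, and every threshold are hypothetical values chosen for illustration, and the motion vectors and DC values are assumed to have already been decoded from the MPEG stream. The sketch shows only the stage-one filter (proportion of moving blocks and spread of directions) and the Shannon entropy of a DC-value histogram. Selecting salient regions and comparing them between frames are not sketched.

```python
import math
from collections import Counter


def direction_histogram(vectors, bins=8, min_magnitude=1.0):
    """Count motion vectors per direction bin, skipping near-static blocks.

    `vectors` is a list of (dx, dy) pairs, one per macroblock (assumed input).
    """
    counts = [0] * bins
    for dx, dy in vectors:
        if math.hypot(dx, dy) < min_magnitude:
            continue  # block barely moves; ignore its direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        counts[int(angle / (2 * math.pi) * bins) % bins] += 1
    return counts


def is_candidate(vectors, min_magnitude=1.0,
                 min_moving_ratio=0.5, max_direction_share=0.6):
    """Stage-one style filter with hypothetical thresholds.

    Rejects frames with little or only regional motion, and frames whose
    motion is concentrated in a single direction (for example, scrolling).
    """
    if not vectors:
        return False
    hist = direction_histogram(vectors, min_magnitude=min_magnitude)
    moving = sum(hist)
    if moving == 0 or moving / len(vectors) < min_moving_ratio:
        return False  # too few blocks move
    if max(hist) / moving > max_direction_share:
        return False  # motion biased toward one direction
    return True


def histogram_entropy(dc_values, bins=16, max_value=256):
    """Shannon entropy (in bits) of a histogram of DC values.

    DC values are assumed to lie in the range [0, max_value).
    """
    counts = Counter(min(int(v * bins / max_value), bins - 1)
                     for v in dc_values)
    total = len(dc_values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    scrolling = [(5, 0)] * 10                          # all blocks move right
    scattered = [(5, 0), (0, 5), (-5, 0), (0, -5)] * 3  # mixed directions
    print(is_candidate(scrolling), is_candidate(scattered))
    print(histogram_entropy([0, 255]))
```

    Under these assumed thresholds, a frame in which every block moves in the same direction is rejected, while a frame with widely scattered motion directions is kept as a candidate; a histogram concentrated in one bin has zero entropy, so higher entropy indicates more varied content in a region.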

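    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Research Objectives
      1.3 Scope and Limitations
      1.4 Thesis Organization
    Chapter 2 Related Techniques and Literature Review
      2.1 Components of Video
      2.2 MPEG Coding Format
      2.3 PDF Document File Format
      2.4 Video Segmentation Techniques
        2.4.1 Computing Differences Between Pixels
        2.4.2 Computing Differences Between Histograms
        2.4.3 Computing Differences Between Blocks
        2.4.4 Methods Based on DCT Coefficients
        2.4.5 Methods Based on Macroblock Coding Modes
        2.4.6 Methods Based on Motion Vectors
      2.5 General Discussion
    Chapter 3 Extraction Method and Techniques
      3.2 Detecting Candidate Key-Frames
        3.2.1 Types of Changes Between Frames
        3.2.2 Extracting Motion Vectors
        3.2.3 Eliminating Frames with Little Motion or Partial-Region Motion
        3.2.3 Tallying the Distribution of Motion Vector Directions
        3.2.4 Evaluating the Distribution of Motion Vector Directions
      3.3 Removing Redundant Frames
        3.3.1 Computing Feature Values
        3.3.2 Finding the Salient Regions of a Frame
        3.3.3 Computing Differences Between Salient Regions
      3.4 Key-Frame Detection Algorithm
    Chapter 4 Experimental Results and Discussion
      4.1 Experimental Method and Evaluation
      4.2 Experimental Results and Discussion
        4.2.1 Detection Results for Different Parameter Settings
        4.2.2 Detection Results for Different Document Types
        4.2.3 Results for Different Document Movement Speeds and Actions
      4.3 Summary of Experimental Results
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References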

