Graduate student: | 謝克瑤 Ke-Yao Hsieh |
---|---|
Thesis title: | 以區塊特徵為基礎進行不完整視訊文件影像之比對 A Block-based Approach for Partial Document Image Matching |
Advisor: | 李忠謀 Lee, Chung-Mou |
Degree: | Master |
Department: | 資訊教育研究所 Graduate Institute of Information and Computer Education |
Year of publication: | 2005 |
Academic year of graduation: | 93 (ROC calendar) |
Language: | Chinese |
Number of pages: | 58 |
Keywords (Chinese): | 影像比對 (image matching) |
Keywords (English): | image matching, feature-based |
Thesis type: | Academic thesis |
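This study proposes a fast and effective method for matching videotaped document images. Without recognizing any text, it matches partial videotaped document images against the original document images. The videotaped document images are frames extracted from video recorded by a digital camcorder filming document pages projected onto a screen.
The matching method has three stages. First, the useful document content, such as the regions occupied by text or graphics, is identified as foreground information; that is, the foreground is separated from the background. Second, the document content is analyzed: the foreground image is segmented into blocks (bounding boxes), and features are sampled block by block, so that each image is represented by a set of feature vectors. Third, similarity is computed between the feature vectors of any two images, and the image with the highest similarity is taken as the match.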
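The first two stages can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how the foreground is separated or how blocks are formed, so the fixed threshold, the assumption of dark content on a light background, the 4-connected grouping, and the names `binarize` and `bounding_boxes` are all choices made here for illustration.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as foreground (1).

    Assumption: dark text or graphics on a light background.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def bounding_boxes(mask):
    """Return (x, y, width, height) for each 4-connected foreground component."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over one connected component,
            # tracking the extent of the pixels visited.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes
```

Calling `bounding_boxes(binarize(gray))` on a grayscale image stored as a list of rows yields one box per connected foreground region, in row-major order of discovery. A real layout analysis would typically merge neighboring components into word or line blocks first.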
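The matching experiments use 27 sets of presentation files comprising 1,020 slide images. With the proposed method, matching accuracy is 92% at a block percentage of 25%, 96% at a block percentage of 50%, and 98% at a block percentage of 75% or more. Partial-image matching was also tested on PDF document images, using 7 sets of PDF documents comprising 98 page images: accuracy is 89% when only the lower part of the image content appears, and 92% or higher when the upper or middle part appears.
By sampling features from local regions of document images, the proposed matching method effectively solves the partial image matching problem.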
We propose a fast algorithm for matching videotaped document images against original document images. Without recognizing any characters, it compares the original images with partial videotaped images, which are captured by tapping the video output from a computer connected to a projector.
The algorithm has three steps. First, foreground-background separation extracts the useful foreground information, such as text or figures, from each image. Second, page layout analysis segments the foreground image into blocks (bounding boxes), and feature vectors are taken from the blocks of each image. Finally, the algorithm matches the feature vectors and computes the least-squares error for each candidate image.
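The matching step can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the abstract does not list the block features or the exact error formula, so describing each block only by its width and height, pairing every query block with its closest candidate block, and the names `block_features`, `match_error`, and `best_match` are all hypothetical.

```python
def block_features(boxes):
    """Describe each (x, y, width, height) block by (width, height).

    Assumption: position is dropped because a partial image may be
    shifted relative to the original.
    """
    return [(float(w), float(h)) for (_x, _y, w, h) in boxes]


def squared_error(a, b):
    """Sum of squared differences between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_error(query, candidate):
    """Mean squared error from each query block to its closest candidate block."""
    if not query or not candidate:
        return float("inf")
    total = sum(min(squared_error(q, c) for c in candidate) for q in query)
    return total / len(query)


def best_match(query_boxes, originals):
    """Return the key of the original image with the smallest matching error."""
    query = block_features(query_boxes)
    return min(
        originals,
        key=lambda name: match_error(query, block_features(originals[name])),
    )
```

Here `originals` maps an image name to its list of blocks, and `query_boxes` holds the blocks found in the partial image. Because the error is averaged over the query blocks only, a query that covers part of a page is not penalized for the blocks it does not contain.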
The experiments use 27 sets of lecture slides comprising 1,020 images. The results show a precision rate of 92% when the block percentage is only 25% and 96% when it is 50%; at a block percentage of 75% or more, the precision rate reaches 98%. We also match partial images against PDF document images, using seven sets of PDF files comprising 98 page images. When only the lower part of the image appears, the precision rate is 89%; when the upper or middle part appears, the precision rate reaches 92% or higher.
Our algorithm solves the partial image matching problem effectively by sampling local features from every image.