National Taiwan Normal University Electronic Theses and Dissertations Full-Text System

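Graduate student:	謝克瑤 Ke-Yao Hsieh
Thesis title:	以區塊特徵為基礎進行不完整視訊文件影像之比對 A Block-based Approach for Partial Document Image Matching
Advisor:	李忠謀 Lee, Chung-Mou
Degree:	Master
Department:	資訊教育研究所 Graduate Institute of Information and Computer Education
Year of publication:	2005
Academic year of graduation:	93 (ROC calendar)
Language:	Chinese
Number of pages:	58
Keywords (Chinese):	影像比對 (image matching)
Keywords (English):	image matching, feature-based
Thesis type:	Academic thesis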

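This study proposes a fast and effective matching method for video document images: without recognizing any text, it matches partial (incomplete) video document images against the original document images. The video document images are frames extracted from video footage, which is recorded by a digital video camera filming document pages projected onto a screen.
The method has three stages. First, the useful document content, such as the regions occupied by text or graphics, is identified as foreground information; that is, the foreground is separated from the background. Second, the document content is analyzed: the foreground image is segmented into blocks (bounding boxes), and features are sampled block by block, so that each image is represented by a set of feature vectors. Third, a similarity score is computed between the feature vectors of any two images, and the image with the highest similarity is taken as the match.
Matching experiments were run on 27 presentation files comprising 1020 slide images. With the proposed method, matching accuracy is 92% when the block percentage is 25%, 96% when it is 50%, and 98% when it is 75% or more. Partial-image matching was also tested on PDF document images, using 7 PDF documents comprising 98 page images: accuracy is 89% when only the lower part of the page appears in the image, and above 92% when the image shows the upper or the middle part.
By sampling features from local regions of document images, the proposed matching method effectively solves the partial-image matching problem.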

We propose a fast algorithm for matching videotaped document images against the original document images. Without recognizing any characters, we compare the original images with videotaped partial images, which are captured by tapping the video output of a computer connected to a projector.
The algorithm has three steps. First, foreground-background separation extracts the useful foreground information, such as text or figures, from each image. Second, page layout analysis segments the foreground image into blocks (bounding boxes), and feature vectors are taken from the blocks of each image. Finally, the algorithm matches the feature vectors, computing the least-squares error for each candidate image.
The experiments use twenty-seven sets of lecture slides comprising 1020 images. The results show a precision of 92% when the block percentage is only 25% and 96% when it is 50%; when the block percentage is 75% or more, precision reaches 98%. We also test partial-image matching on PDF images, using seven PDF files comprising 98 page images. When only the lower part of the page appears in the image, precision is 89%; when the upper or middle part appears, precision is 92% or higher.
By sampling local features from every image, our algorithm solves the partial image matching problem effectively.
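
The three steps summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it treats the whole foreground as a single block, uses the vertical projection of that block as the feature vector, and picks the candidate with the smallest sum of squared differences. The function names (`foreground_mask`, `projection_feature`, `best_match`) and the parameters `threshold` and `dim` are hypothetical, chosen only for illustration, and the sketch does not reproduce the thesis's block sampling or its handling of partial images.

```python
import numpy as np

def foreground_mask(image, background, threshold=30):
    """Step 1 (assumed form): pixels that differ from the background are foreground."""
    diff = np.abs(image.astype(int) - background.astype(int))
    return diff > threshold

def bounding_box(mask):
    """Return (top, bottom, left, right) of the foreground, or None if it is empty."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1

def projection_feature(mask, dim=16):
    """Step 2 (simplified): vertical projection of one block, resampled to `dim` values."""
    box = bounding_box(mask)
    if box is None:
        return np.zeros(dim)
    top, bottom, left, right = box
    # Count foreground pixels in each column of the block.
    profile = mask[top:bottom, left:right].sum(axis=0).astype(float)
    # Resample to a fixed length so blocks of different widths are comparable.
    src = np.linspace(0.0, 1.0, profile.size)
    dst = np.linspace(0.0, 1.0, dim)
    feature = np.interp(dst, src, profile)
    total = feature.sum()
    return feature / total if total > 0 else feature

def best_match(query_feature, candidate_features):
    """Step 3: index of the candidate with the least squared error, plus all errors."""
    errors = [float(np.sum((query_feature - f) ** 2)) for f in candidate_features]
    return int(np.argmin(errors)), errors
```

Because the feature is computed relative to the block's bounding box, a block that is merely shifted inside the frame produces the same feature vector; handling frames in which part of the content is missing requires the per-block sampling described in the thesis itself.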

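Table of Contents

List of Figures

List of Tables

Chapter 1  Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Scope and Limitations
1.4 Thesis Organization
Chapter 2  Literature Review
2.1 Document Content Analysis
2.2 Components and Characteristics of Document Images
2.2.1 Components of Document Images
2.2.2 Characteristics of Document Images
2.3 Related Work on Document Image Matching
2.3.1 Matching Techniques for General Document Images
2.3.2 Matching Techniques between Video Images and Slide Images
2.4 Discussion
Chapter 3  Document Image Segmentation
3.1 System Architecture
3.2 Document Image Segmentation Techniques
3.2.1 Foreground-Background Separation of Document Images
3.2.2 Document Image Content Analysis
3.3 Conclusion
Chapter 4  Feature Extraction and Matching of Document Images
4.1 Feature Sampling Method for Document Images
4.2 Document Image Matching Using Feature Vectors
4.2.1 Obtaining the Feature Vectors of a Document Image
4.2.2 Computing the Minimum Difference between Feature Vectors
4.2.3 Confidence Value
4.3 Conclusion
Chapter 5  Experimental Results and Discussion
5.1 Experimental Data Sources
5.2 Experimental Validation
5.3 Summary
Chapter 6  Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References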

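List of Figures

Figure 1.1    The online learning system provided by the computer science department of National Taiwan Normal University (師大資訊系): the upper-left corner shows the video being played, and the right half shows the course material corresponding to the video
Figure 1.2    Slide images: (a) original document image, (b) partial video image
Figure 1.3    PDF document images: (a) original document image, (b) partial video image
Figure 2.1    Document images: (a) original document image, (b) text blocks extracted after binarization and skew correction, (c) feature values recorded for the blocks
Figure 2.2    The Component Block List matching method
Figure 2.3    Examples of different slide types: (a) ROI-Txt, (b) ROI-N, (c) N-Txt, (d) N-N
Figure 3.1    System flowchart
Figure 3.2    (a)(b)(c) are original slide images, (d) is the background image, and (e)(f)(g) are the foreground images of (a)(b)(c), respectively
Figure 3.3    (a)-(j) are slide images, (k) is the background image computed with Eq. 3.1, and (l) is the image of the actual design template of the presentation file
Figure 3.4    Horizontal and vertical projections of a binarized image
Figure 3.5    (a) Example image and its horizontal projection histogram, (b) image segmented using the RXYC method
Figure 3.6    Layout characteristics of edited content in slide images
Figure 3.7    (a) Foreground blocks extracted from a text-only image, (b) foreground blocks extracted from a non-text-only image
Figure 4.1    Blocks segmented from the foreground image
Figure 4.2    A line segment defined through two points that lie on the contour of an object in the image; the light gray region represents the object, and the dark gray region represents the blocks the line segment crosses
Figure 4.3    The vertical projection of a block taken as the feature vector; the two marked points are the leftmost and rightmost boundaries of the image
Figure 4.4    Flowchart of image matching using feature vectors
Figure 4.5    Flowchart of document image matching using feature vectors
Figure 5.1    Slide images: (a) video image before rectification, (b) video image after rectification
Figure 5.2    PDF document images: (a) video image before rectification, (b) video image after rectification
Figure 5.3    Classification of slide image content: (a) text-only content, (b) non-text-only content; the left column shows original document images and the right column shows video images
Figure 5.4    Classification of PDF document images: (a) original document image; (b), (c), (d) video images of the upper, middle, and lower parts, respectively
Figure 5.5    Matching accuracy for slide images
Figure 5.6    Matching accuracy for PDF document images
Figure 5.7    Relationship between feature vector dimension and matching accuracy
Figure 5.8    Relationship between feature vector dimension and matching accuracy
Figure 5.9    Relationship between the number of slide images and computation time
Figure 5.10    Relationship between text-only slide content and matching accuracy
Figure 5.11    Relationship between non-text-only slide content and matching accuracy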


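List of Tables

Table 2.1    Matching methods used for different image types
Table 5.1    Presentation file categories and number of slides
Table 5.2    PDF document file categories and number of pages
Table 5.3    Number of experimental slide images, classified by block percentage
Table 5.4    Matching accuracy for slide images
Table 5.5    Matching accuracy for PDF document images
Table 5.6    Relationship between feature vector dimension and matching accuracy
Table 5.7    Relationship between feature vector dimension and matching accuracy
Table 5.8    Matching accuracy for slide images
Table 5.9    Matching accuracy for PDF document images
Table 5.10    Relationship between text-only slide content and matching accuracy
Table 5.11    Relationship between non-text-only slide content and matching accuracy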
                                
