Author: 黃少遠 (Huang, Shao-Yuan)
Thesis title: 以電腦視覺為基礎之圖書數位化自動化流程之研究 (A Study of Computer Vision-Based Automated Processes for Book Digitization)
Advisor: 吳順德 (Wu, Shuen-De)
Oral defense committee: 李坤彥 (Lee, Kung-Yen), 呂有勝 (Lu, Yu-Sheng), 吳順德 (Wu, Shuen-De)
Oral defense date: 2024/07/15
Degree: Master
Department: Department of Mechatronic Engineering
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: Chinese
Number of pages: 62
Keywords: Computer Vision, Automation, Book Digitization
Research method: Action research
DOI URL: http://doi.org/10.6345/NTNU202401285
Thesis type: Academic thesis
  • This study digitizes printed books by photographing their pages with an overhead document camera (book scanner) and packaging the page images into e-books. The captured images often contain extraneous regions that must be cropped, or are skewed and must be straightened. The corrected page images then have to be converted to another file format and run through optical character recognition (OCR) to speed up digitization of the index. To meet these needs, the study combines computer vision with image processing methods such as contour finding and image dilation to build an automated assistive processing workflow. The workflow raises work efficiency to five times its original level and reduces the manpower that book digitization requires.
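    The cropping step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes an 8-bit grayscale page photographed on a dark background, uses Otsu thresholding (listed in the thesis outline) to separate page from background, and substitutes a simple bounding box of the bright pixels for full contour finding. The function names `otsu_threshold` and `crop_to_page` are hypothetical, and plain Python lists are used only to keep the example self-contained; a real pipeline would more likely rely on an image library.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for 8-bit gray values (pixels > t are foreground)."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray values at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def crop_to_page(image):
    """Crop a grayscale image (list of rows) to the bounding box of its bright region.

    Assumption: the page is brighter than the surrounding background.
    """
    flat = [v for row in image for v in row]
    t = otsu_threshold(flat)
    rows = [y for y, row in enumerate(image) if any(v > t for v in row)]
    cols = [x for x in range(len(image[0])) if any(row[x] > t for row in image)]
    if not rows or not cols:
        return image  # nothing brighter than the threshold; leave unchanged
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


if __name__ == "__main__":
    # Synthetic 6x8 frame: dark background (10) with a bright 4x5 "page" (200).
    frame = [[10] * 8 for _ in range(6)]
    for y in range(1, 5):
        for x in range(2, 7):
            frame[y][x] = 200
    page = crop_to_page(frame)
    print(len(page), len(page[0]))  # prints "4 5"
```

    A bounding box only handles pages that are already axis-aligned; a skewed page would additionally need its rotation angle estimated and corrected, which this sketch does not attempt.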

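    Chapter 1 Introduction
        1.1 Preface
        1.2 Research Motivation and Objectives
        1.3 Literature Review
            1.3.1 Finding Black-White Boundaries with Filters
            1.3.2 Detecting the Page Skew Angle through Dilation
    Chapter 2 Research Content and Methods
        2.1 Research Methods
        2.2 Image Processing Methods
            2.2.1 Otsu Binarization
            2.2.2 Image Dilation
        2.3 Overview of Thesis Chapters
    Chapter 3 Automated Correction with Computer Vision
        3.1 Folder Traversal Module
            3.1.1 Single-Folder Traversal
            3.1.2 Multi-Folder Traversal
        3.2 Black Background Removal
        3.3 Automatic Skew Correction
            3.3.1 Trapezoidal Transformation
            3.3.2 Pure Rotation
        3.4 Thread-Bound Book (線書) Cropping
    Chapter 4 TIFF Format Conversion and Text Recognition Window
        4.1 TIFF Format Conversion
            4.1.1 Single-Folder TIFF Conversion
            4.1.2 Multi-Folder Batch TIFF Conversion
        4.2 Window Application for Text Recognition and Editing
            4.2.1 Zoomable Adaptive Image Display Component
            4.2.2 Jump-to-Page Viewing Function
    Chapter 5 Experimental Results and Discussion
        5.1 Automatic Processing Accuracy
            5.1.1 Automatic Processing Accuracy: Black Background Removal
            5.1.2 Automatic Processing Accuracy: Correction Function
            5.1.3 Automatic Processing Accuracy: Thread-Bound Book (線書) Cropping
        5.2 Automatic Processing Time
            5.2.1 Automatic Processing Time: Black Background Removal
            5.2.2 Automatic Processing Time: Correction Function
            5.2.3 Automatic Processing Time: Thread-Bound Book (線書) Cropping
        5.3 User Feedback Data
        5.4 Discussion of Experimental Results
    Chapter 6 Conclusion
        6.1 Conclusion
        6.2 Future Work
    References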

