
Graduate Student: Chen, Xi-Wen (陳璽文)
Thesis Title: Gaze Tracking with Head Pose Estimation and Compensation (結合頭部姿態估計與補償的視線追蹤)
Advisor: Kao, Wen-Chung (高文忠)
Committee Members: Kao, Wen-Chung (高文忠); Chern, Jann-Long (陳建隆); Fan, Yu-Cheng (范育成)
Oral Defense Date: 2025/01/22
Degree: Master
Department: Department of Electrical Engineering (電機工程學系)
Year of Publication: 2025
Academic Year of Graduation: 113 (ROC calendar)
Language: Chinese
Number of Pages: 70
Keywords (Chinese): 凝視追蹤、頭部姿態估計、3D 眼球模型、深度學習
Keywords (English): Gaze Tracking, Head Pose Estimation, 3D Eye Model, Deep Learning
Research Methods: experimental design, comparative study, observational study, document analysis, content analysis
Thesis Type: Academic thesis
Abstract:

    This paper proposes a visible-light-based gaze tracking system that utilizes a single high-speed camera, replacing traditional systems that rely on infrared light sources or dedicated sensors, thereby significantly enhancing user experience. However, this configuration poses greater challenges in compensating for head movements. To address this issue, we designed a novel gaze tracking system that integrates an accurate head pose estimation method. The method identifies facial feature points and resolves the 2D-to-3D correspondence problem to obtain the 3D coordinates of these points, which are then used to estimate head motion. The system is capable of real-time updates to the eye model and precise calculation of the initial position of the iris region. Experimental results demonstrate that the system effectively improves gaze tracking accuracy and precision when users perform slight head movements or rotations.
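The core of the compensation step described above is to carry the calibrated 3D eye-center through the estimated head pose (rotation plus translation) and re-project it into the image to seed the iris search. A minimal numpy sketch of that idea follows; all function names, the pinhole camera parameters, and the rotation-vector convention are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def rodrigues(rvec):
    """Convert a rotation vector (axis * angle) to a 3x3 rotation
    matrix via Rodrigues' formula."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def compensate_eye_center(eye_center_head, rvec, tvec):
    """Map the calibrated eye center from head coordinates into camera
    coordinates using the estimated head pose (rvec, tvec)."""
    R = rodrigues(np.asarray(rvec, dtype=float))
    return R @ np.asarray(eye_center_head, dtype=float) + np.asarray(tvec, dtype=float)

def project_point(p_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D camera-space point to pixel coordinates;
    the projected pixel seeds the initial iris search region."""
    x, y, z = p_cam
    return np.array([fx * x / z + cx, fy * y / z + cy])
```

Once the head pose is re-estimated each frame from the 2D-to-3D facial-landmark correspondences, only `compensate_eye_center` and `project_point` need to be re-run, which is what allows the eye model to be updated in real time.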

Table of Contents:
    Acknowledgments
    Abstract
    Table of Contents / List of Figures / List of Tables
    Chapter 1: Introduction
        1.1 Research Background
        1.2 Research Problem and Purpose
        1.3 Research Objectives
    Chapter 2: Literature Review
        2.1 3D Eye Model and Iris Matching
        2.2 Eye Center Localization
        2.3 Head Pose Estimation
            2.3.1 Geometric-Feature-Based Methods
            2.3.2 Deep-Learning-Based Methods
            2.3.3 3D-Model-Based Methods
        2.4 Iris Segmentation Model
        2.5 Screen-Gaze Mapping Formula
            2.5.1 Establishing the Mapping Relationship
            2.5.2 Regression Analysis
            2.5.3 Error-Point Method
    Chapter 3: System Architecture and Design
        3.1 Facial Feature Point Localization and Extraction
        3.2 Head-Pose-Based Eye Center Correction
            3.2.1 Head Pose Estimation Techniques
            3.2.2 Initial Eye Center Computation
            3.2.3 Dynamic Correction Method
        3.3 Iris Region Segmentation
            3.3.1 Iris Region Detection Method
            3.3.2 Post-Segmentation Data Cleaning
            3.3.3 Iris Parameter Extraction
        3.4 Iris Feature Matching
            3.4.1 Search Range Optimization
            3.4.2 Scoring Region
            3.4.3 Iris Matching
        3.5 Mapping Between Gaze Direction and Screen Position
    Chapter 4: Experimental Results
        4.1 Experimental Environment and System Setup
        4.2 Head Pose Estimation Experiments
        4.3 Iris Segmentation Model Experiments and Results
            4.3.1 Dataset Processing and Modification
            4.3.2 Model Preprocessing and Training
            4.3.3 Quantitative Evaluation and Results
        4.4 Iris Matching Optimization Experiments
            4.4.1 Hierarchical Search Method
            4.4.2 Effect of Search Range Optimization on Computational Efficiency
        4.5 Gaze Estimation System Performance Evaluation
            4.5.1 Pixel-to-Visual-Angle Conversion
            4.5.2 System Accuracy
            4.5.3 System Precision
        4.6 Heat Map Comparison
            4.6.1 Nine-Point Validation
            4.6.2 Seventeen-Point Validation
            4.6.3 Differences Between Mapping Methods in Calibration
    Chapter 5: Conclusion and Future Work
        5.1 Conclusion
        5.2 Future Work
    References
    Autobiography
    Academic Achievements
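Sections 2.5.2 and 3.5 of the outline concern fitting the mapping from eye features to screen position by regression over calibration points. A common way to do this (an assumption here for illustration, not necessarily the thesis's exact formula) is a second-order polynomial least-squares fit from normalized iris-center coordinates to screen coordinates, calibrated on a pattern such as the nine- or seventeen-point grids the experiments use:

```python
import numpy as np

def poly_features(xy):
    """Second-order polynomial basis of iris-center coordinates:
    [1, x, y, x*y, x^2, y^2]."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_mapping(iris_xy, screen_xy):
    """Least-squares fit of screen coordinates from iris coordinates
    over the calibration points; returns a (6, 2) coefficient matrix."""
    A = poly_features(iris_xy)
    coeffs, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coeffs

def predict(coeffs, iris_xy):
    """Map new iris-center measurements to screen positions."""
    return poly_features(iris_xy) @ coeffs
```

With nine calibration points and six coefficients per axis the system is overdetermined, so the least-squares solution also absorbs some measurement noise; the seventeen-point pattern simply constrains the same fit more tightly.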

