
Graduate Student: 簡文浩 (Jian, Wen-Hau)
Thesis Title: 基於3D人臉辨識之擴增實境技術改善臉盲症社交輔助系統
An Augmented Reality Social Assistance System Based on 3D Face Recognition to Aid People with Face Blindness
Advisor: 陳美勇 (Chen, Mei-Yung)
Oral Defense Committee: 蘇順豐 (Su, Shun-Feng), 練光祐 (Lian, Kuang-Yow), 方瓊瑤 (Fang, Chiung-Yao), 陳美勇 (Chen, Mei-Yung)
Oral Defense Date: 2021/07/22
Degree: Master
Department: Department of Mechatronic Engineering
Publication Year: 2021
Graduation Academic Year: 109 (ROC calendar)
Language: Chinese
Number of Pages: 104
Chinese Keywords: 臉盲症、卷積神經網絡、3D可變模型、人臉辨識、擴增實境
English Keywords: Prosopagnosia, Convolutional Neural Network, 3D Morphable Model, Face Recognition, Augmented Reality
Research Method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202101188
Thesis Type: Academic thesis
    This thesis aims to develop an AR (Augmented Reality) glasses assistance system that helps people with prosopagnosia (face blindness) recognize others in social settings and in daily life. The main contributions of this study are to propose three-dimensional face models as the basis for data augmentation in face recognition, meeting the practical needs of patients in real social situations, and to integrate the strengths of the individual software and hardware platforms into a system design that gives patients a social assistance tool they can put to use immediately. The specific architecture consists of the following. First, structured-light technology is combined with a stereo vision camera: from the structured-light and 2D RGB inputs, faces are extracted from the 2D data by a deep neural network and their coordinates in three-dimensional space are determined; deep learning then performs real-time dense 3D face reconstruction from the 3D point cloud and the 2D image, producing face information from seven viewing angles, including frontal and profile views, which improves recognition accuracy for profile faces and large head motions. Second, the face information produced in the first stage is fed into a convolutional neural network, which outputs a 128-dimensional feature vector in place of a traditional high-dimensional classifier as the face descriptor; the computed feature vector is compared by Euclidean distance against a SQL (Structured Query Language) database in the system, and the entry with the minimum Euclidean distance yields the name corresponding to the face. Third, the face label and the face coordinates in space are projected through the camera model, so that the AR glasses display a real-time face recognition label and a face bounding box. The hope is that, once a person with prosopagnosia puts on the AR glasses, the glasses can scan faces from the environment in real time, identify the corresponding identities from the existing database, and annotate each person's position and name in the AR display, helping the wearer identify everyone in daily life, avoiding the confusion of not recognizing faces, overcoming social barriers, and reducing the possibility that such barriers lead to psychological conditions such as autism.
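    As a concrete illustration of the first stage described above, the following sketch grabs one aligned RGB-D frame and recovers the 3D coordinate of a detected face. It is a minimal example assuming the Python bindings of the Intel RealSense SDK (pyrealsense2) for the D435i depth camera listed in Chapter 4; the detect_face_center() helper is a hypothetical placeholder for the thesis's deep-neural-network face extractor, not its actual code.

import numpy as np
import pyrealsense2 as rs

def detect_face_center(color_image):
    # Hypothetical stand-in for the deep-neural-network face extractor;
    # here it simply returns the image centre as a placeholder.
    h, w = color_image.shape[:2]
    return w // 2, h // 2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)   # map depth pixels onto the color frame

try:
    frames = align.process(pipeline.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()

    color_image = np.asanyarray(color_frame.get_data())
    u, v = detect_face_center(color_image)      # 2D face location (pixels)
    depth_m = depth_frame.get_distance(u, v)    # depth at that pixel, meters

    # Back-project the pixel to a 3D point in camera coordinates (meters),
    # i.e. the face coordinate that is later projected into the AR display.
    intrin = color_frame.profile.as_video_stream_profile().get_intrinsics()
    x, y, z = rs.rs2_deproject_pixel_to_point(intrin, [u, v], depth_m)
    print(f"Face located at ({x:.2f}, {y:.2f}, {z:.2f}) m")
finally:
    pipeline.stop()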

    The thesis aims to develop an AR (Augmented Reality) glasses system to help people with face blindness recognize others in their daily social life. The framework consists of the following. First, structured-light technology is combined with a stereo vision camera: from the structured-light and 2D RGB inputs, faces are extracted from the 2D data by a deep neural network and their coordinates in 3D space are determined. The 3D point cloud and the 2D images are then used for real-time dense 3D face reconstruction by deep learning, and face information is generated from seven viewing angles, including frontal and profile views, to improve recognition accuracy for profile faces and large head movements. Second, the face information generated in the first part is fed into a convolutional neural network, which outputs a 128-dimensional feature vector in place of a traditional high-dimensional classifier as the face descriptor. The computed feature vector is compared by Euclidean distance against a SQL (Structured Query Language) database in the system, and the entry with the minimum Euclidean distance determines the name matched to the face. Third, the face label and the face coordinates in space are projected through the camera model, so that the AR glasses display a real-time recognition label and a face bounding box. The hope is that, after a person with face blindness puts on the AR glasses, the glasses can instantly scan faces from the environment, identify the corresponding identities from the existing database, and annotate each person's position and name in the AR display. This helps people with face blindness identify everyone in their lives, avoids the confusion caused by the inability to recognize faces, breaks through social barriers, and reduces the possibility of psychological disorders, such as autism, arising from those barriers.
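    To make the recognition and overlay steps concrete, the sketch below matches a 128-dimensional face embedding against a stored gallery by Euclidean distance and projects the matched face's 3D coordinate onto the display through a pinhole camera model. The gallery contents, distance threshold, and intrinsic parameters are illustrative assumptions rather than values from the thesis, and random vectors stand in for the CNN embeddings (in the real system these would come from the FaceNet-style network and the SQL database).

import numpy as np

def match_identity(embedding, gallery, threshold=1.0):
    # Return the gallery name with the smallest Euclidean distance to the
    # query embedding, or None if every distance exceeds the threshold
    # (i.e. the face is treated as unknown).
    best_name, best_dist = None, float("inf")
    for name, ref in gallery.items():
        dist = np.linalg.norm(embedding - ref)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return (best_name, best_dist) if best_dist < threshold else (None, best_dist)

def project_to_display(point_3d, fx=600.0, fy=600.0, cx=320.0, cy=240.0):
    # Project a 3D point (camera coordinates, meters) to display pixels
    # with a pinhole camera model; the intrinsics here are assumed values.
    x, y, z = point_3d
    return int(fx * x / z + cx), int(fy * y / z + cy)

# Example usage with a toy two-person gallery of 128-D reference vectors.
rng = np.random.default_rng(0)
gallery = {"Alice": rng.normal(size=128), "Bob": rng.normal(size=128)}
query = gallery["Alice"] + 0.05 * rng.normal(size=128)   # noisy probe vector

name, dist = match_identity(query, gallery)
u, v = project_to_display((0.1, -0.05, 1.2))             # face about 1.2 m away
print(f"Label '{name}' (distance {dist:.2f}) drawn at pixel ({u}, {v})")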

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
      1.1  Preface
      1.2  Literature Review
      1.3  Research Motivation and Objectives
      1.4  Contributions of This Study
      1.5  Thesis Organization
    Chapter 2  Theoretical Background
      2.1  Convolutional Neural Networks
        2.1.1  Convolution Layer
        2.1.2  Convolution Kernel (Kernel Map)
        2.1.3  Convolution Operation
        2.1.4  Padding
        2.1.5  Weight Sharing
        2.1.6  Stride
        2.1.7  Pooling Layer
        2.1.8  Flatten Layer
        2.1.9  Fully Connected Layer
      2.2  Binocular Stereo Vision
        2.2.1  Pinhole Camera Model
        2.2.2  Binocular Vision Model
        2.2.3  Fundamental Matrix
    Chapter 3  System Design
      3.1  System Architecture
      3.2  Software Architecture
        3.2.1  Software Dependencies
        3.2.2  System Program Flow
      3.3  3D Face Reconstruction
        3.3.1  Improved Fast PNCC Image Encoding Method
        3.3.2  Face Reconstruction Based on the 3DDFA Convolutional Network
        3.3.3  Loss Function for Face Reconstruction
        3.3.4  3D Face Reconstruction
      3.4  Face Color Projection
      3.5  Multi-Angle Face Generation
      3.6  Face Database (Face Detection Label)
      3.7  FaceNet-Based Convolutional Face Recognition
        3.7.1  Convolution Layer Architecture
        3.7.2  1x1 Convolution Filter
        3.7.3  Tensor Concatenation
        3.7.4  L2 Regularization
        3.7.5  Feature Vector Embedding and Recognition
        3.7.6  Loss Function for Face Recognition
      3.8  AR Image Generation and Projection
    Chapter 4  Experimental Results and Discussion
      4.1  Experimental Equipment
        4.1.1  Intel RealSense D435i Depth Camera
        4.1.2  NVIDIA Jetson Nano Embedded Computer
        4.1.3  Dream Glass 4K Augmented Reality Glasses
        4.1.4  Software Configuration
      4.2  Experimental Method
        4.2.1  Wearing Configuration
        4.2.2  Test Set
      4.3  Experimental Results
        4.3.1  3D Face Reconstruction and Multi-Angle Face Projection Results
        4.3.2  Real-Time Recognition Results for Large Face Poses
        4.3.3  Comparison of Recognition Results for Large Face Poses
        4.3.4  Recognition and Performance Comparison
        4.3.5  Real-Time Augmented Reality Output and Projection Results
        4.3.6  Practical Distance Test Results
    Chapter 5  Conclusions and Future Work
    References

