
Author: Chen, Hsin-Hung (陳薪鴻)
Title: Object Pose Estimation System for Pick and Place Automation (應用於自動化生產及分揀之物件姿態估測系統)
Advisors: Hsu, Chen-Chien (許陳鑑); Wang, Wei-Yen (王偉彥)
Degree: Master
Department: Department of Electrical Engineering
Year of Publication: 2020
Academic Year of Graduation: 108 (2019-2020)
Language: Chinese
Number of Pages: 75
Chinese Keywords: deep learning, Robot Operating System (ROS), object pose estimation, dataset generation, robotic arm, graphical user interface
English Keywords: deep learning, ROS, object pose estimation, synthetic data, robotic arm, GUI
DOI URL: http://doi.org/10.6345/NTNU202001192
Document Type: Academic thesis
Table of Contents:
    Acknowledgments i
    Abstract (in Chinese) ii
    Abstract iii
    Contents iv
    List of Tables vi
    List of Figures vii
    Chapter 1 Introduction 1
        1.1 Research Background and Motivation 1
        1.2 Literature Review 2
            1.2.1 Convolutional Neural Networks 2
            1.2.2 Object Recognition 3
            1.2.3 Pose Estimation 5
            1.2.4 Robotic Arm Systems 8
            1.2.5 Graphical User Interfaces 9
        1.3 Thesis Organization 10
    Chapter 2 Experimental Platform, Hardware, and Software 11
        2.1 Experimental Platform 11
        2.2 Hardware Environment 12
        2.3 Software Overview 16
    Chapter 3 Object Pose Estimation 19
        3.1 DOPE: Deep Object Pose Estimation 19
        3.2 Building 3D Object Models 23
        3.3 Constructing the Training Dataset 25
            3.3.1 Unreal Engine 25
            3.3.2 The NDDS Plugin 26
            3.3.3 Importing Models and Building Scenes 28
    Chapter 4 Object Pose Estimation System for Pick-and-Place Automation 32
        4.1 Robotic Arm System 33
        4.2 Graphical User Interface 33
    Chapter 5 Experimental Results 36
        5.1 Training Dataset 36
        5.2 Neural Network Training 38
        5.3 Experimental Verification 40
    Chapter 6 Conclusions 65
        6.1 Conclusions 65
        6.2 Future Work 65
    References 67
    Autobiography 72
    Academic Achievements 74

    [1] K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological cybernetics, vol. 36, no. 4, pp. 193-202, Apr. 1980.
    [2] D. H. Hubel and T. N. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat's visual cortex,” The Journal of physiology, vol. 160, no. 1, pp. 106-154, Jan. 1962.
    [3] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-based learning applied to document recognition,” in Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
    [4] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504-507, Jul. 2006.
    [5] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems 25, pp. 1097-1105, Dec. 2012.
    [6] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273-297, Sep. 1995.
    [7] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
    [8] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, “Going deeper with convolutions,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, Jun. 7-12, 2015, pp. 1-9.
    [9] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 26-Jul. 1, 2016, pp. 770-778.
    [10] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proc. of the Seventh IEEE International Conference on Computer Vision (ICCV), Kerkyra, Greece, Sep. 20-25, 1999, pp. 1150-1157 vol. 2.
    [11] H. Bay, T. Tuytelaars, and L. V. Gool, “Surf: Speeded up robust features,” in 9th European Conference on Computer Vision (ECCV), Graz, Austria, May 7-13, 2006, pp. 404-417.
    [12] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, Jun. 23-28, 2014, pp. 580-587.
    [13] J. R. Uijlings, K. E. Van De Sande, T. Gevers, and A. W. Smeulders, “Selective search for object recognition,” International journal of computer vision, vol. 104, no. 2, pp. 154-171, Apr. 2013.
    [14] R. Girshick, “Fast R-CNN,” in Proc. of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec. 7-13, 2015, pp. 1440-1448.
    [15] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” Advances in Neural Information Processing Systems 28, pp. 91-99, Dec. 2015.
    [16] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, Oct. 11-14, 2016, pp. 21-37.
    [17] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, Jun. 26-Jul. 1, 2016, pp. 779-788.
    [18] J. Redmon, and A. Farhadi, “YOLO9000: better, faster, stronger,” in Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, USA, Jul. 21-26, 2017, pp. 7263-7271.
    [19] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
    [20] C. Choi and H. I. Christensen, “RGB-D object pose estimation in unstructured environments,” Robotics and Autonomous Systems, vol. 75, pp. 595-613, Jan. 2016.
    [21] 邱駿展, “Recognition and Pose Estimation of Three-Dimensional Objects,” Master's thesis, Graduate Institute of Automation Technology, National Taipei University of Technology, Jan. 2009.
    [22] I. Gordon and D. G. Lowe, “What and where: 3D object recognition with accurate pose,” Toward category-level object recognition, Springer, Berlin, Heidelberg, pp. 67-82, 2006.
    [23] B. Drost, M. Ulrich, N. Navab and S. Ilic, “Model Globally, Match Locally: Efficient and Robust 3D Object Recognition,” in Proc. of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, Jun. 13-18, 2010, pp. 998-1005.
    [24] C. Choi, Y. Taguchi, O. Tuzel, M. Liu and S. Ramalingam, “Voting-based pose estimation for robotic assembly using a 3D sensor,” in 2012 IEEE International Conference on Robotics and Automation (ICRA), Saint Paul, MN, USA, May 14-18, 2012, pp. 1724-1731.
    [25] E. Brachmann, A. Krull, F. Michel, S. Gumhold, J. Shotton, and C. Rother, “Learning 6d object pose estimation using 3D object coordinates,” in European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sep. 6-12, 2014, pp. 536-551.
    [26] E. Brachmann, F. Michel, A. Krull, M. Y. Yang, and S. Gumhold, “Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 26-Jul. 1, 2016, pp. 3364-3372.
    [27] A. Doumanoglou, R. Kouskouridas, S. Malassiotis, and T. K. Kim, “Recovering 6D object pose and predicting next-best-view in the crowd,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 26-Jul. 1, 2016, pp. 3583-3592.
    [28] A. Tejani, R. Kouskouridas, A. Doumanoglou, D. Tang and T. Kim, “Latent-Class Hough Forests for 6 DoF Object Pose Estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 1, pp. 119-132, Jan. 2018.
    [29] A. Krull, E. Brachmann, F. Michel, M. Y. Yang, S. Gumhold and C. Rother, “Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images,” in Proc. of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec. 7-13, 2015, pp. 954-962.
    [30] P. Wohlhart and V. Lepetit, “Learning descriptors for object recognition and 3D pose estimation,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, Jun. 7-12, 2015, pp. 3109-3118.
    [31] M. Rad and V. Lepetit, “BB8: A Scalable, Accurate, Robust to Partial Occlusion Method for Predicting the 3D Poses of Challenging Objects without Using Depth,” in Proc. of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, Oct. 22-29, 2017, pp. 3848-3856.
    [32] M. Schwarz, H. Schulz and S. Behnke, “RGB-D object recognition and pose estimation based on pre-trained convolutional neural network features,” in 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, May 26-30, 2015, pp. 1329-1335.
    [33] W. Kehl, F. Milletari, F. Tombari, S. Ilic, and N. Navab, “Deep learning of local RGB-D patches for 3D object detection and 6D pose estimation,” in European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, Oct. 11-14, 2016, pp. 205-220.
    [34] W. Kehl, F. Manhardt, F. Tombari, S. Ilic and N. Navab, “SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again,” in Proc. of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, Oct. 22-29, 2017, pp. 1530-1538.
    [35] Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, “PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes,” arXiv preprint arXiv:1711.00199, 2017.
    [36] B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel and A. M. Dollar, “The YCB object and Model set: Towards common benchmarks for manipulation research,” in 2015 International Conference on Advanced Robotics (ICAR), Istanbul, Turkey, Jul. 27-31, 2015, pp. 510-517.
    [37] M. Rad, M. Oberweger, and V. Lepetit, “Feature mapping for learning fast and accurate 3d pose inference from synthetic images,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, Jun. 18-23, 2018, pp. 4663-4672
    [38] J. Tremblay, T. To, B. Sundaralingam, Y. Xiang, D. Fox, and S. Birchfield, “Deep object pose estimation for semantic robotic grasping of household objects,” arXiv preprint arXiv:1809.10790, 2018.
    [39] S. E. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, “Convolutional pose machines,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 26-Jul. 1, 2016, pp. 4724-4732.
    [40] Z. Cao, T. Simon, S. E. Wei, and Y. Sheikh, “Realtime multi-person 2D pose estimation using part affinity fields,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, Jul. 21-26, 2017, pp. 7291-7299.
    [41] P. J. Hwang, C. C. Hsu, and W. Y. Wang, “Development of a mimic robot: Learning from human demonstration to manipulate a coffee maker as an example,” in 2019 IEEE 23rd International Symposium on Consumer Technology (ISCT), Melbourne, Australia, May 29-Jun. 1, 2019, pp. 124-127.
    [42] J. H. Chen, G. Y. Lu, Y. Y. Chien, H. H. Chiang, W. Y. Wang, and C. C. Hsu, “Toward the flexible automation for robot learning from human demonstration using multimodal perception approach,” in Proc. of the 2019 International Conference on System Science and Engineering (ICSSE), Dong Hoi City, Vietnam, Jul. 19-21, 2019, pp. 148-153.
    [43] P. J. Hwang, W. Y. Wang, and C. C. Hsu, “Development of a mimic robot-learning from demonstration incorporating object detection and multiaction recognition,” IEEE Consumer Electronics Magazine, vol. 9, no. 3, pp. 79-87, May 2020.
    [44] J. Tremblay, T. To and S. Birchfield, “Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation,” in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, Jun. 18-22, 2018, pp. 2119-21193.
    [45] 李昀融, “Deep-Learning-Based Monocular Object Pose Estimation for Object Picking by a Mobile Robotic Arm,” Master's thesis, Department of Mechanical and Electro-Mechanical Engineering, Tamkang University, Jul. 2019.
    [46] T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft coco: Common objects in context,” in European conference on computer vision (ECCV), Zurich, Switzerland, Sep. 6-12, 2014, pp. 740-755.
    [47] T. To, J. Tremblay, D. McKay, Y. Yamaguchi, K. Leung, A. Balanon, J. Cheng, and S. Birchfield, NDDS: NVIDIA Deep Learning Dataset Synthesizer. (2018) [Online]. Available: https://github.com/NVIDIA/Dataset_Synthesizer
    [48] 高吾凱, “Assembly Planning for Texture-less Models Based on a Deep Pose Estimation Network,” Master's thesis, Robotics Engineering Master's Program, Department of Electrical Engineering, Tamkang University, Jul. 2019.
    [49] 郭佳文, “Robot Integration of 3D Object Recognition and Grasping Systems for Factory Automation,” Master's thesis, Graduate Institute of Electrical Engineering, National Taiwan University, Jul. 2017.
    [50] 吳柏辰, “Accurate Estimation and Tracking of Six-Degree-of-Freedom Object Pose,” Ph.D. dissertation, Graduate Institute of Electronics Engineering, National Taiwan University, May 2018.
    [51] 蘇健霖, “Research and Implementation of 3D Multi-Object Scanning and Recognition Algorithms,” Master's thesis, Department of Information Technology and Management, Shih Chien University, Jun. 2017.
    [52] 洪文斌, “Development of an Automated Production System,” Master's thesis, Department of Mechanical Engineering, National Chung Cheng University, Jul. 2017.
    [53] https://medium.com/zylapp/review-of-deep-learning-algorithms-for-object-detection-c1f3d437b852
    [54] https://research.nvidia.com/publication/2018-09_Deep-Object-Pose
    [55] https://robots.ieee.org/robots/unimate/?gallery=photo1
    [56] https://technews.tw/2016/08/11/xerox-alto/
    [57] https://pangoly.com/en/review/intel-core-i7-8700-oem
    [58] https://www.gigabyte.com/tw/Graphics-Card/GV-N108TAORUSX-W-11GD-rev-10-11#kf
    [59] https://www.logitech.com/zh-tw/product/c922-pro-stream-webcam
    [60] KINOVA, JACO Assistive Robot User Guide.
    [61] https://www.python.org
    [62] https://pytorch.org
    [63] https://www.ros.org
    [64] https://zh.wikipedia.org/wiki/PyQt

    Electronic full text: release delayed until 2025/08/31.