
Graduate student: Lin, Cheng-You (林政佑)
Thesis title: Cyber-Physical Systems Integration for Data Reuse and Tasks Replacements (基於網宇實體系統整合的資料重用與工作替換)
Advisor: Wang, Chao (王超)
Oral defense committee: Ssu, Kuo-Feng (斯國峰); Lin, Chun-Han (林均翰)
Oral defense date: 2021/08/05
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 47
Keywords: AIoT, Heterogeneous Computing, GPU, Response Time, Task Replacement, Data Reuse, Cyber-Physical Systems
DOI URL: http://doi.org/10.6345/NTNU202101059
Thesis type: Academic thesis
    A multi-tier system requires the integration of servers and IoT devices. Integrating AI applications into such a system is advantageous today, and the term AIoT refers to integrating GPUs into IoT devices for AI computing. This integration raises several issues. First, the small scale of AIoT devices limits their computing capability. Most AI algorithms perform an enormous number of repetitive floating-point computations to reach convergent, probability-based approximate solutions, and GPUs suit this workload because they can run large volumes of floating-point computations in parallel. Second, AIoT devices must account for the latency requirements of their tasks, so that the system actuators they control are not delayed during critical events, such as braking in an autonomous vehicle. Third, a multi-tier system can be built by further integrating edge servers, AIoT devices, and existing infrastructure, which makes data reuse feasible and efficient. With heterogeneous resources, multi-tier AIoT systems become more versatile.
    The contribution of this thesis is to combine IoT platform computing and AIoT applications through CPS concepts, and then to explore the computing limitations of AIoT devices, and ways to resolve them, within the integrated architecture. Appropriate resource-usage configurations for AIoT devices running AI applications are investigated empirically. In a multi-tier system, different computing resources yield different response times, and these response times must meet the latency requirements. The central concept of this thesis is data reuse: data from the existing infrastructure is reused by newly added AIoT devices. The limitations of contemporary GPU-equipped AIoT devices for AI algorithms are also measured through empirical evaluation. To overcome these limitations, the thesis establishes the feasibility of task replacement and of offloading tasks to edge servers.
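
The trade-off described in the abstract (different resources give different response times, with task replacement and edge offloading as fallbacks) can be illustrated with a minimal sketch. This is not the thesis's method: the `ExecutionOption` type, the `choose_option` function, the option names, and every number below are hypothetical and chosen only for illustration. The sketch assumes each option has a known response time and a relative accuracy, and picks the most accurate option that still meets the deadline.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ExecutionOption:
    """One way to run a task. All fields are hypothetical, for illustration."""
    name: str                # e.g. "device-gpu", "device-cpu-replacement", "edge-offload"
    response_time_ms: float  # assumed end-to-end response time
    accuracy: float          # assumed relative result quality in [0, 1]


def choose_option(options: Sequence[ExecutionOption],
                  deadline_ms: float) -> Optional[ExecutionOption]:
    """Return the most accurate option that meets the deadline, or None."""
    feasible = [o for o in options if o.response_time_ms <= deadline_ms]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties with the shorter response time.
    return max(feasible, key=lambda o: (o.accuracy, -o.response_time_ms))


# Made-up values, not measurements from the thesis.
options = [
    ExecutionOption("device-gpu", response_time_ms=120.0, accuracy=0.90),
    ExecutionOption("device-cpu-replacement", response_time_ms=40.0, accuracy=0.70),
    ExecutionOption("edge-offload", response_time_ms=80.0, accuracy=0.90),
]

selected = choose_option(options, deadline_ms=100.0)
print(selected.name if selected else "no feasible option")
```

With a tighter deadline, only the cheaper replacement task remains feasible, which is the situation in which trading accuracy for response time becomes worthwhile.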

    Chapter 1 Introduction
        Section 1.1 Reuse Data and Infrastructure
        Section 1.2 GPU Characteristic and Task Replacement
        Section 1.3 Thesis Contributions and Organization
    Chapter 2 Related Work
        Section 2.1 Reuse Data and Infrastructure
            Section 2.1.1 Object Detection
        Section 2.2 Latency in IoT Systems
        Section 2.3 Edge Computing and Task Allocation
    Chapter 3 System Model
        Section 3.1 GPU Scheduler
        Section 3.2 Performance of Executing Multiple GPU Tasks
        Section 3.3 Accuracy Metrics of Task Replacement
    Chapter 4 Multi-Tier CPS Architecture
        Section 4.1 Architecture
        Section 4.2 Proposed Design
    Chapter 5 Empirical Evaluation
        Section 5.1 Experiment Setup
        Section 5.2 Data Reuse
        Section 5.3 Multi-Tier System Integration
            Section 5.3.1 Benchmark of Platform-Tier Computing and Edge-Tier Computing
            Section 5.3.2 Edge-Tier GPU Computing with Competing Task
            Section 5.3.3 Platform-Tier Computing with Competing Task
        Section 5.4 Task Replacement
            Section 5.4.1 Empirical Motivation
            Section 5.4.2 Feasibility of GPU/CPU Task Replacement
    Chapter 6 Conclusion and Future Work
        Section 6.1 Future Work
    References

