Graduate student: |
蔡欣翰 Tsai, Hsin-Han |
---|---|
Thesis title: |
A Scalable and Ultrafast Eigensolver for Three Dimensional Photonic Crystals on GPU |
Advisor: |
黃聰明 Huang, Tsung-Ming |
Degree: |
Master |
Department: |
Department of Mathematics |
Year of publication: | 2017 |
Academic year of graduation: | 105 (ROC calendar) |
Language: | English |
Number of pages: | 39 |
Keywords: | Maxwell equation, band structure, face-centered cubic lattice, GPU, CUDA, MPI, cuBLAS, cuFFT |
DOI URL: | https://doi.org/10.6345/NTNU202203517 |
Thesis type: | Academic thesis |
No Chinese abstract is available.
This research uses CUDA-based parallel computation on GPUs to solve the three-dimensional Maxwell's equations for photonic crystals with a face-centered cubic (FCC) lattice. We focus on solving the resulting eigenvalue problem more efficiently. Because the problem is Hermitian and positive definite, the solver is based on the inverse Lanczos method for the eigenvalue problem, with the associated linear systems solved by the conjugate gradient method. By using cuBLAS and cuFFT, combining kernels, transposing multiple matrices simultaneously, and applying several further optimizations, we reduce the time spent on both computation and memory access. With all of these techniques integrated, each eigenvalue problem in a set of problems of dimension 5.184 million can be solved for its 10 smallest positive eigenvalues within 44 to 63 seconds. The solver also scales well across multiple GPU cards using MPI. All results were computed on two clusters: one is equipped with two NVIDIA Tesla K40c GPU cards and was used for most of the computations; the other is equipped with many M2070 GPU cards and was used for the MPI experiments.
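The pairing of inverse Lanczos with conjugate gradient described in the abstract can be illustrated on a toy problem. The sketch below is not the thesis's GPU implementation. It is a minimal NumPy version that assumes a small 1D Laplacian as a stand-in for the Hermitian positive definite matrix, and the names `conjugate_gradient` and `inverse_lanczos`, along with all parameter values, are hypothetical choices for illustration. Each Lanczos step applies the inverse of the matrix by running conjugate gradient, and the largest Ritz values of the inverse are inverted to estimate the smallest eigenvalues.

```python
import numpy as np


def conjugate_gradient(apply_a, b, tol=1e-12, max_iter=1000):
    """Solve A x = b for Hermitian positive definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()                      # residual b - A x for the zero initial guess
    p = r.copy()                      # first search direction
    rs_old = np.vdot(r, r).real
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break                     # relative residual is small enough
        ap = apply_a(p)
        alpha = rs_old / np.vdot(p, ap).real
        x = x + alpha * p
        r = r - alpha * ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def inverse_lanczos(apply_a, n, num_eigs, num_steps, seed=0):
    """Estimate the num_eigs smallest eigenvalues of A via Lanczos on A^{-1}."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    basis = [q / np.linalg.norm(q)]
    alphas, betas = [], []
    for j in range(num_steps):
        # Apply A^{-1} to the newest basis vector with an inner CG solve.
        w = conjugate_gradient(apply_a, basis[j])
        alphas.append(np.vdot(basis[j], w).real)
        # Full reorthogonalization against every previous Lanczos vector.
        for v in basis:
            w = w - np.vdot(v, w) * v
        beta = np.linalg.norm(w)
        if j == num_steps - 1 or beta < 1e-12:
            break
        betas.append(beta)
        basis.append(w / beta)
    # Tridiagonal projection of A^{-1} onto the Krylov subspace.
    t = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    theta = np.linalg.eigvalsh(t)     # Ritz values of A^{-1}, ascending
    return np.sort(1.0 / theta[-num_eigs:])


# Assumed toy problem: 1D Laplacian, symmetric positive definite.
n = 50
laplacian = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
smallest = inverse_lanczos(lambda v: laplacian @ v, n, num_eigs=3, num_steps=30)
print(smallest)
```

The matrix enters only through the `apply_a` callback, so the same outer loop works with any fast matrix-vector product. Full reorthogonalization is used here for simplicity on a small problem; at large scale it adds memory and compute cost that a production solver would have to weigh.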