Kyushu University Academic Staff Educational and Research Activities Database
List of Presentations
Satoshi Ohshima (Last modified date: 2024.04.10)

Associate Professor / Section of Advanced Computational Science / Research Institute for Information Technology


Presentations
1. Satoshi Ohshima, Considering multi process calculations on current GPU, ATAT in HPSC 2024, 2024.03.
2. Satoshi Ohshima, QR Factorization of Block Low-rank Matrices on Multiple-/Multi-Instance GPUs, ATAT in HPSC 2023, 2023.03.
3. Satoshi Ohshima, Akihiro Ida, Rio Yokota, Ichitaro Yamazaki, QR Factorization of Block Low-Rank Matrices on Multi-Instance GPU, The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'22), 2022.12.
4. Naruya Kitai, Daisuke Takahashi, Franz Franchetti, Takahiro Katagiri, Satoshi Ohshima, Toru Nagai, Adaptation of A64 Scalable Vector Extension for Spiral, IPSJ SIG Technical Report (HPC-178), 2021.03.
5. 3. Progress in Simulation Study of Fundamental Physics and Visualization Technology: 3.2: Visualization Technology.
6. Satoshi Ohshima, Soichiro Suzuki, Tatsuya Sakashita, Masao Ogino, Takahiro Katagiri, Yoshimichi Andoh, Performance evaluation of the MODYLAS application on modern multi-core and many-core environments, The Fourteenth International Workshop on Automatic Performance Tuning (iWAPT2019, IPDPS2019 Workshop), 2019.05.
7. Satoshi Ohshima, Trying to accelerate many small BLAS calculations on GPU, ATAT in HPSC (2019 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing), 2019.02.
8. Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra, Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters, 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, 2018.08. HACApK is a software package for solving dense linear systems of equations and is used in other software packages, such as ppohBEM, for solving boundary integral equations. To enable the solution of large-scale boundary value problems, HACApK hierarchically compresses the coefficient matrix and uses the BiConjugate Gradient Stabilized (BiCGStab) method to solve the linear system. To extend HACApK's capability, this paper outlines how we ported the HACApK linear solver onto GPU clusters. Though the potential of GPUs has been widely accepted in high-performance computing, it is still a challenge to utilize GPUs for a solver, like HACApK, that requires fine-grained irregular computation and global communication. To utilize the GPUs, we integrated the variable-size batched GPU kernel that was recently released in the MAGMA software package. This is the first time the variable-size batched kernels have been used in a solver or application code. We discuss several techniques to improve the performance of the batched kernel and demonstrate the effects of these techniques on two state-of-the-art GPU clusters. For instance, with two 14-core Intel Xeon CPUs and four NVIDIA P100 GPUs per node, the GPU kernel obtained a solver speedup of 8× on one node and 4× on eight nodes. We also show that when the inter-GPU communication becomes significant, the solution time can be further reduced by a factor of 2 by carefully designing the communication layer with the underlying node architecture in mind.
9. Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota, Optimization of Hierarchical matrix computation on GPU, SC-Asia 2018, 2018.03.
10. Satoshi Ohshima, Auto-tuning of directives: tuning directives of OpenMP and OpenACC, Second International Workshop on Deepening Performance Models for Automatic Tuning (DPMAT), 2017.08.
11. Takahiro Katagiri, Satoshi Ohshima, Masaharu Matsumoto, Auto-tuning on NUMA and Many-core Environments with an FDM code, The Twelfth International Workshop on Automatic Performance Tuning (iWAPT2017) (In Conjunction with the IEEE IPDPS2017), 2017.06.
12. Takahiro Katagiri, Masaharu Matsumoto, Satoshi Ohshima, Auto-Tuning of Hierarchical Computations with ppOpen-AT, SIAM Conference on Parallel Processing for Scientific Computing (PP16), MS55 Auto-Tuning for the Post Moore's Era - Part I of II, 2016.04.
13. IT Talents Who Sprang Out of the Mitoh-Youth : Field for Exploration, Growing, and Connection.
14. SIAM AN10 (Conference Reports).