Kyushu University Researcher Information
List of Presentations
Takeshi Nanri (南里 豪志) Last updated: 2024.04.26

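Associate Professor / Research Institute for Information Technology, Section of Advanced Computational Science; Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology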


Conference Presentations
1. Y. Miyake, Y. Sunada, Y. Tanaka, K. Nakazawa, T. Nanri, K. Fukazawa and Y. Katoh, Implementation of Coupled Numerical Analysis of Magnetospheric Dynamics and Spacecraft Charging Phenomena via Code-To-Code Adapter (CoToCoA) Framework, ICCS 2023, 2023.06, [URL].
2. 南里豪志, 松山和広, 田代皓嗣, 原田浩睦, A Survey of Mutually Complementary Use of a Hybrid Computing Environment Combining the Kyushu University Supercomputer and AWS Cloud Services, AXIES (大学ICT推進協議会) Annual Conference 2022, 2022.12.
3. Yuto Katoh, Keiichiro Fukazawa, Takeshi Nanri, Yohei Miyake, Cross-reference simulation by Code-To-Code Adapter (CoToCoA) library for the study of multi-scale physics in planetary magnetospheres, 2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW), 2021.12.
4. 南里 豪志, 大江 和一, 吉田 英司, 大辻 弘貴, 林 英里香, Prototype of a Message Queuing System Using RDMA on Non-Volatile Memory Installed in DIMM Slots, AXIES (大学ICT推進協議会) Annual Conference 2020, 2020.12, [URL].
5. 南里豪志, The Challenge of Improving MPI Communication Performance in Practical Applications Using SHARP, a Key Switch Technology, and Expectations for Future Switch Technologies, GPU TECHNOLOGY CONFERENCE, 2020.10, [URL].
6. Kenji Ono, Toshihiro Kato, Satoshi Ohshima, Takeshi Nanri, Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures, International Conference on High Performance Computing in Asia-Pacific Region, 2019.12.
7. Keiichiro Fukazawa, Yuto Katoh, Takeshi Nanri, Yohei Miyake, Application of cross-reference framework CoToCoA to Macro- and micro-scale simulations of planetary magnetospheres, 7th International Symposium on Computing and Networking Workshops, CANDARW 2019, 2019.11, [URL], In this study, we have introduced the Code-to-Code Adapter (CoToCoA) library to couple the magnetohydrodynamic (MHD) simulation and the Electron Hybrid (EH) simulation of planetary magnetospheres. CoToCoA has been newly developed to connect different codes easily. The concept of CoToCoA is that we modify each code as little as possible apart from adding data transfer functions, and that we do not need to know anything about the referred code apart from its data format. With CoToCoA, we have been developing the cross-reference simulation of macro (MHD) and micro (EH) scales in the magnetosphere. Then, we have evaluated the performance of the cross-reference simulation using CoToCoA on a massively parallel computer system.
8. Kazuichi Oe, Takeshi Nanri, Hybrid Storage System to Achieve Efficient Use of Fast Memory Area, 7th International Symposium on Computing and Networking, CANDAR 2019, 2019.11, [URL], Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes much NVM capacity upon migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is drastically lower.
9. Praphan Pavarangkoon, Ken T. Murata, Kazunori Yamamoto, Kazuya Muranaga, Takamichi Mizuhara, Keiichiro Fukazawa, Ryusuke Egawa, Takahiro Katagiri, Masao Ogino, Takeshi Nanri, Performance improvement of high-speed file transfer over JHPCN, 17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019, 2019.08, [URL], This paper proposes a novel file transfer tool to improve file transfer performance over Japan high performance computing and networking (JHPCN). We first develop a high-performance and flexible protocol (HpFP) for inter-datacenter transport networks. The original HpFP was designed for specific networks and puts more emphasis on latency and packet loss tolerance than on fairness and friendliness, while an enhanced HpFP is more suitable for real network environments. Then, based on the enhanced HpFP, we implement a file transfer tool called high-performance copy (HCP). The performance of our file transfer tool is evaluated between datacenters of JHPCN using real datasets collected from supercomputer resources. The results show that HCP achieves higher throughput than a traditional tool for file transfer over JHPCN.
10. Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji, Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets, International Conference on High Performance Computing, ISC High Performance 2018, 2018.01, [URL], This paper presents an in situ framework focused on time-varying simulations, and uses a novel temporal buffer for storing simulation results sampled at user-defined intervals. This framework has been designed to provide flexible data processing and visualization capabilities in modern HPC operational environments composed of powerful front-end systems, for pre- and post-processing purposes, along with traditional back-end HPC systems. The temporal buffer is implemented using the functionalities provided by Open Address Space (OpAS) library, which enables asynchronous one-sided communication from outside processes to any exposed memory region on the simulator side. This buffer can store time-varying simulation results, and can be processed via in situ approaches with different proximities. We present a prototype of our framework, and code integration process with a target simulation code. The proposed in situ framework utilizes separate files to describe the initialization and execution codes, which are in the form of Python scripts. This framework also enables the runtime modification of these Python-based files, thus providing greater flexibility to the users, not only for data processing, such as visualization and analysis, but also for the simulation steering.
11. Kazuichi Oe, Takeshi Nanri, Non-volatile memory driver for applying automated tiered storage with fast memory and slow flash storage, 6th International Symposium on Computing and Networking Workshops, CANDARW 2018, 2018.12, [URL], Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system located between non-volatile memories (NVMs) and solid state drives (SSDs). ATSMF aims to reduce average response time for input-output (IO) accesses by migrating concentrated IO access areas from SSD to NVM. However, the current ATSMF implementation cannot reduce average response time sufficiently because of the bottleneck caused by the Linux brd driver, which is used for the NVM access driver. The response time of the brd driver is more than ten times larger than memory access speed. To reduce the average response time sufficiently, we developed a block-level driver for NVM called a 'two-mode (2M) memory driver.' The 2M memory driver has both the map IO access mode and direct IO access mode to reduce the response time while maintaining compatibility with the Linux device-mapper framework. The direct IO access mode has a drastically lower response time than the Linux brd driver because the ATSMF driver can execute the IO access function of 2M memory driver directly. Experimental results also indicate that ATSMF using the 2M memory driver reduces the IO access response time to less than that of ATSMF using the Linux brd driver in most cases.
12. Kenji Ono, Jorji Nonaka, Yoshiyuki Morie, Takeshi Nanri, Tomohiro Kawanabe, Design of an In Transit Framework with Staging Buffer for Flexible Data Processing and Visualization of Time-Varying Data, ISC WORKSHOP ON IN SITU VISUALIZATION 2018, 2018.06.
13. Kazuichi Oe, Mitsuru Sato, Takeshi Nanri, Automated Tiered Storage System Consisting of Memory and Flash Storage to Improve Response Time with Input-Output (IO) Concentration Workloads, 5th International Symposium on Computing and Networking, CANDAR 2017, 2018.04, [URL], The response time of solid state drives (SSDs) has dramatically reduced according to the spread of non-volatile memory express (NVMe) devices. These devices have response times of less than 100 microseconds on average. The response time of all-flash-array systems has also drastically reduced through the use of NVMe SSDs. However, there are applications, particularly, virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response time. Their workloads were found to contain many input-output (IO) concentrations. We define IO concentration by using a declarative style. Input-output (IO) concentrations are aggregations of IO accesses. They appear in narrow regions of the storage volume and continue for periods of up to about an hour. These narrow regions occupy a few percent of the logical unit number capacity, include most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response time of these workloads, we developed an automated tiered storage system called 'automated tiered storage with fast memory and slow flash storage' (ATSMF). The memory component of ATSMF is a memory with a non-volatile feature. The system predicts the remaining duration of IO concentration, calculates the response-time increase during migration and response-time decrease after migration, and migrates the IO concentrations if the response-time decrease after migration surpasses the response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and its memory access ratio is more than 50%.
14. Takeshi Nanri, Proposal of Interface for Runtime Memory Manipulation of Applications via PGAS-based Communication Library, Workshop on PGAS programming models: Experiences and Implementations (PGAS-EI), 2018.01.
15. Tetsuya Nakatoh, Sachio Hirokawa, Toshiro Minami, Takeshi Nanri, Miho Funamori, Attribute-based quality classification of academic papers, 2017.11, Investigating the relevant literature is very important for research activities. However, it is difficult to select the most appropriate and important academic papers from the enormous number of papers published annually. Researchers search paper databases by combining keywords, and then select papers to read using some evaluation measure, often citation count. However, the citation count of recently published papers tends to be very small because citation count measures accumulated importance. This paper focuses on the possibility of classifying high-quality papers superficially using attributes such as publication year, publisher, and words in the abstract. To examine this idea, we construct classifiers by applying machine-learning algorithms and evaluate these classifiers using cross-validation. The results show that our approach effectively finds high-quality papers.
16. Satoshi Ohshima, Takeshi Nanri, Yoshitaka Watanabe, Hirofumi Amano, Kenji Ono, Performance Evaluation of the Supercomputer System ITO, 2017.12.
17. Takeshi Nanri, Satoshi Ohshima, Kenji Ono, A Study of the Communication-Hiding Effect of Non-Blocking Collective Communication, 2017.12.
18. Tetsuya Nakatoh, Kenta Nagatani, Toshiro Minami, Sachio Hirokawa, Takeshi Nanri, Miho Funamori, Analysis of the quality of academic papers by the words in abstracts, Thematic track on Human Interface and the Management of Information, held as part of the 19th International Conference on Human–Computer Interaction, HCI International 2017, 2017.01, The investigation of related research is very important for research activities. However, it is not easy to choose an appropriate and important academic paper from among the huge number of possible papers. The researcher searches by combining keywords and then selects a paper to be checked because it uses an index that can be evaluated. The citation count is commonly used as this index, but information about recently published papers cannot be obtained. This research attempted to identify good papers using only the words included in the abstract. We constructed a classifier by machine learning and evaluated it using cross validation. As a result, it was found that a certain degree of discrimination is possible.
19. Kazuichi Oe, Takeshi Nanri, Koji Okamura, Feasibility study for building hybrid storage system consisting of non-volatile DIMM and SSD, 4th International Symposium on Computing and Networking, CANDAR 2016, 2017.01, Various vendors develop a byte accessible Nonvolatile Dual-Inline Memory Module (NVDIMM). The performance of the NVDIMM drastically surpasses that of the Solid State Drive (SSD), which is connected by PCI express. However, the cost of the NVDIMM is much higher than that of the SSD. Therefore, a hybrid storage system between the NVDIMM and SSD is an effective technique for improving cost-performance. If a system uses the NVDIMM less while maintaining performance, its cost-performance should be improved. Our previous work involves on-the-fly automated storage tiering (OTF-AST). OTF-AST is a hybrid storage system consisting of an SSD and HDD. It aims to reduce the average response time of IO accesses by migrating only the IO concentration area to the SSD when IO concentration happens. Therefore, we construct OTF-AST with both the DIMM and SSD and evaluate it in order to understand how to build a cost-effective hybrid storage system with these devices. We use a DIMM instead of a byte accessible NVDIMM, which is difficult to obtain. As a result, we found that the original OTF-AST is suitable for a hybrid storage system consisting of the DIMM and SSD. Moreover, we can improve the performance of OTF-AST if we replace its migration algorithm with a more positive migration algorithm. This is because the IO access response time barely increases when the data migration between the DIMM and SSD is done. We will build a more positive migration algorithm in the near future.
20. Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri, The design of advanced communication to reduce memory usage for exa-scale systems, 12th International Conference on High Performance Computing for Computational Science, VECPAR 2016, 2017.01, Current MPI (Message Passing Interface) communication libraries require larger memories in proportion to the number of processes, and cannot be used for exa-scale systems. This paper proposes a global memory based communication design to reduce memory usage for exa-scale communication. To realize exa-scale communication, we propose true global memory based communication primitives called Advanced Communication Primitives (ACPs). ACPs provide global address, which is able to use remote atomic memory operations on the global memory, RDMA (Remote Direct Memory Access) based remote memory copy operation, global heap allocator and global data libraries. ACPs are different from the other communication libraries because ACPs are global memory based so that house keeping memories can be distributed to other processes and programmers explicitly consider memory usage by using ACPs. The preliminary result of memory usage by ACPs is 70 MB on one million processes.
21. Keiichiro Fukazawa, Yoshiyuki Morie, Toshiya Takami, Takeshi Nanri, Takeshi Soga, Effective calculation with halo communication using halo functions, 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016.09, The issue of halo communication is the decrease of parallel scalability. To overcome the issues, we have introduced "Halo thread" to our simulation code. However, we have not solved the issue basically in the strong scaling. In this study, we have developed the Halo functions which perform the halo communication effectively. Then we can perform the calculation and communication in a pipeline and obtained good performance.
22. Yoshiyuki Morie, Hiroaki Honda, Takeshi Nanri, Taizo Kobayashi, Hidetomo Shibamura, Ryutaro Susukita, Yuichiro Ajima, Memory Efficient One-Sided Communication Library "aCP" in Globary Memory on Raspberry Pi 2, 36th IEEE International Conference on Distributed Computing Systems, ICDCS 2016, 2016.08, Previously, communications in parallel programs for High Performance Computing (HPC) and Distributed Computing (DC) are mostly written with two-sided communication interfaces that are based on a pair of operations, Send and Receive. Since such interface requires explicit synchronization between both sides of the communication, techniques for communication optimization such as overlapping are not efficiently described in many cases. On the other hand, one-sided communication interface is becoming important as a method to describe asynchronous communications to enable highly overlapped communication with computation. As one of such interface, in this demonstration, Advanced Communication Primitives (ACP) is introduced. ACP is a portable interface that supports UDP, IBverbs of InfiniBand and Tofu library of K Computer. In addition to that, it is designed to be memory efficient. For example, with 10 thousand processes, the memory consumption of ACP over UDP is estimated to be less than 1MB. Since the number of computational elements is increasing more rapidly than the amount of available memory, this memory efficiency is becoming one of the keys for parallel programs in HPC and DC. To show this characteristic, we run ACP library on Raspberry Pi 2, and examine its performance and memory consumption.
23. Takeshi Nanri, Keiichiro Fukazawa, Effect of Overlapping Halo Exchange with One-Sided Communication, 5th JSST Annual Conference International Conference on Simulation Technology, 2016.10.
24. Keiichiro Fukazawa, Takayuki Umeda, Takeshi Nanri, Performance Evaluation of MHD Simulation Code with X86 CPUs and Manycore Systems, 5th JSST Annual Conference International Conference on Simulation Technology, 2016.10.
25. Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri, Efficient communications of particle data in particle-based simulations, 5th JSST Annual Conference International Conference on Simulation Technology, 2016.10.
26. Hiroaki Honda, Yoshiyuki Morie, Takeshi Nanri, Development of A Memory Efficient Communication Method for Connecting MPI Programs by using ACP Library, 5th JSST Annual Conference International Conference on Simulation Technology, 2016.10.
27. Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri, The Design of Advanced Communication to Reduce Memory Usage for Exa-scale Systems, 12th International Meeting On High Performance Computing for Computational Science, 2016.09.
28. Takeshi Nanri, Runtime Algorithm Selection of Collective Communication with RMA-based Monitoring Mechanism, 4th Annual MVAPICH Users Group Meeting, 2016.08.
29. Keiichiro Fukazawa, Toshiya Takami, Takeshi Soga, Yoshiyuki Morie, Takeshi Nanri, Effective Calculation with Halo communication using Halo Functions, 23rd European MPI Users' Group Meeting, 2016.09.
30. Seiji Fujino, Takeshi Nanri, Improvement of Eisenstat-SSOR preconditioning using tolerance value, 5th IMA Conference on Numerical Linear Algebra and Optimization, 2016.09.
31. Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri, NSIM-ACE: An Interconnection Network Simulator for Evaluating Remote Direct Memory Access, International Conference on Simulation and Modeling Methodologies, Technologies and Applications, 2016.07.
32. Kazuichi Oe, Takeshi Nanri, Koji Okamura, Analysis of Storage Workloads of Input-Output Access Locality and Designing of Hybrid Storage System, 1st International Conference on Enterprise Architecture and Information Systems, 2016.01.
33. Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki, Implementation of PVM-based Distributed Shared Memory System, International Conference on Parallel and Distributed Processing Techniques and Applications, 1998.07.
34. Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki, Effects of Scheduling Attributes on Multithread-Based Software DSM System, Workshop on Scheduling Algorithms for Parallel/Distributed Computing, 1999.07.
35. Takeshi Nanri, Yoshitaka Watanabe, Hiroyuki Sato and Masaaki Shimasaki, Preliminary Investigation of Distributed Shared Memory System on a Cluster of High Performance Clusters, European Congress on Computational Methods in Applied Sciences and Engineering, 2000.09.
36. Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki, Design and Implementation of an Adaptive Distributed Shared Memory System, International Conference of Parallel and Distributed Computing and Systems, 2001.08.
37. T. Nanri, Y. Watanabe, H. Sato, Performance comparison of vector-calculations between Itanium2 and other processors, International Workshop on Innovative Architecture, 2005.01.
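38. 南里 豪志, Performance Evaluation of MPI Alltoall Communication Algorithms Toward Larger-Scale Parallel Computers, 10th Symposium of the Kan-Setouchi Applied Mathematics Research Group (環瀬戸内応用数理研究部会), 2006.07.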
39. Feng Long Gu, Takeshi Nanri and Kazuaki Murakami, Implementation of GAMESS on Parallel Computers: TCP/IP versus MPI, International Conference of Computational Methods in Sciences and Engineering, 2006.10.
40. Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami, Collective Communication Costs Analysis over Gigabit Ethernet and InfiniBand, High Performance Computing - HiPC 2006, 2006.12.
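41. 森江 善之, 末安 直樹, 松本 透, 南里 豪志, 石畑 宏明, 井上 弘士, 村上 和彰, MPI Rank Placement Optimization Technique Considering Communication Timing, HOKKE2007, 2007.03.
42. 森江 善之, 末安 直樹, 松本 透, 南里 豪志, 石畑 宏明, 井上 弘士, 村上 和彰, MPI Rank Placement Optimization Technique for Reducing Contention Considering Communication Timing, Symposium on Advanced Computing Systems and Infrastructures (SACSIS2007), 2007.05.
43. 栗原 康志, Hyacinthe Nzigou Mamadou, 南里 豪志, 末安 直樹, 松本透, 井上 弘士, 村上 和彰, A Study on Dynamic Optimization of MPI Broadcast Communication Considering Load Imbalance, SWoPP2007, 2007.08.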
44. Hyacinthe Nzigou Mamadou, Feng Long Gu, Takeshi Nanri, Kazuaki Murakami, A Study of All-to-all Collective Communication Algorithms on Modern High Performance System Architectures, High Performance Computing International Conference (HPC Asia) 2007, 2007.09.
45. Takeshi Nanri, Takeshi Soga, Koji Kurihara, Feng Long Gu, Hiroaki Ishihata and Kazuaki Murakami, Evaluation of the Performance of Parallel Sparse-Matrix Multiplication and the Effect of Dynamic Load-Balancing, International Conference on Computational Methods in Science and Engineering 2007, 2007.09.
46. Feng Long Gu, Hyacinthe Nzigou Mamadou, Guilherme Domingues, Takeshi Nanri and Kazuaki Murakami, Investigating the Performance of Collective Communications on SMP Clusters: A Case for MPI_Allgather, International Conference on Computational Methods in Science and Engineering 2007, 2007.09.
47. Guilherme Domingues, Yoshiyuki Morie, Feng Long Gu, Takeshi Nanri and Kazuaki Murakami, SMMH - A Parallel Heuristic for Combinatorial Optimization Problems, International Conference on Computational Methods in Science and Engineering 2007, 2007.09.
48. Takeshi Soga, Kouji Kurihara, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami, Dynamic Optimization of Load Balance in MPI Broadcast, Euro PVM/MPI 2007, 2007.10.
49. Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami, Performance Analysis and Linear Optimization Modeling of All-to-all Collective Communication Algorithms, SBAC-PAD 2007, 2007.10.
50. Takeshi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami, Effect of Reordering Internal Messages in MPI Broadcast According to the Load Imbalance, IWIA '08, 2008.01.
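51. 馬場慎也, 南里豪志, 藤野清次, On the Dependence of the Computation Time of the Hybrid-Parallel IDR(s) Method on the Combination of Process and Thread Counts, IPSJ SIG on High Performance Computing, 2008.05.
52. 馬場慎也, 南里豪志, 藤野清次, 染原一仁, Performance Analysis of the Parallel CG Method with PAGME, IPSJ SIG on High Performance Computing, 2008.12.
53. 南里豪志, Hyacinthe Nzigou Mamadou, Feng Long Gu, 村上和彰, Evaluation of a Dynamic Alltoall Algorithm Selection Technique Combined with Performance-Model-Based Prediction, IPSJ SIG on High Performance Computing, 2008.12.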
54. Takeshi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami, Profiling Technique for Dynamic Optimization According to Waiting Time, HPC Asia, 2009.03.
55. Shinya Baba, Yusuke Onoue, Takeshi Nanri and Seiji Fujino, Dependence on loop distribution of performance in hybrid-parallel IDR(s) method, HPC Asia, 2009.03.
56. Hyacinthe Nzigou Mamadou, Feng Long Gu, Vivien Oddou, Takeshi Nanri, Kazuaki Murakami, A Dynamic Solution for Efficient MPI Collective Communications, International Workshop on HPC and Grid Applications, 2009.04.
57. 馬場 慎也, 南里 豪志, 藤野 清次, 染原 一仁, Implementation and Performance Analysis of the CG Method with PAGME for Hierarchical Parallel Computers, Conference on Computational Engineering and Science (計算工学講演会), 2009.05.
58. Hyacinthe Nzigou Mamadou, Takeshi Nanri, and Kazuaki Murakami, A Robust Dynamic Optimization for MPI Alltoall Operation, 18th International Heterogeneity in Computing Workshop, 2009.05.
59. Kenichiro Kusaba, Takeshi Nanri and Seiji Fujino, Runtime Load-balancing Technique for Sparse Matrix-Vector Multiplication, International Workshop on Innovative Architecture, 2010.03.
60. 草場健一郎, 南里豪志, 藤野清次, Dynamic Load-Balancing Technique for Parallel Sparse Matrix-Vector Multiplication Considering Communication and Computation Loads, 2010 Summer United Workshops on Parallel, Distributed and Cooperative Processing in Kanazawa, 2010.08.
61. Yoshiyuki Morie, Takeshi Nanri, and Motoyoshi Kurokawa, Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communications, The Tenth IASTED International Conference on Parallel and Distributed Computing and Networks, 2011.02.
62. Yoshiyuki Morie, Takeshi Nanri, Ryutaro Susukita and Koji Inoue, A Method for Predicting a Penalty of Contentions by Considering Priorities of Routing among Packets on Direct Interconnection Network, International Joint Conference on Computational Sciences and Optimization 2011, 2011.04.
63. Takeshi Nanri and Motoyoshi Kurokawa, Effect of Dynamic Algorithm Selection of All-to-All Communication on Environments with Unstable Network Speed, International Conference on High Performance Computing & Simulation, 2011.07.
64. 松本幸, 安達知也, 田中稔, 住元真司, 曽我武史, 南里豪志, Implementation and Evaluation of MPI Allreduce on the K computer, 19th Hokkaido Workshop on the Evaluation of High Performance Computing and Architecture, 2011.11.
65. 南里 豪志, Runtime Auto-Tuning Techniques in Communication Libraries, 3rd Symposium on the Current Status and Applications of Auto-Tuning Technology, 2011.12.
66. 南里 豪志, Scalable Implementation Techniques for Communication Libraries, 8th Workshop on Strategic Development of High Performance Computing Systems, 2012.02.
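67. 稲富雄一, 眞木 淳, 高見利也, 本田宏明, 小林泰三, 南里豪志, 青柳睦, 南一生, Performance Optimization of the Parallel FMO Program OpenFMO, 133rd Meeting of the SIG on High Performance Computing, 2012.03.
68. 南里豪志, 黒川原佳, Proposal of a Dynamic Collective Communication Algorithm Selection Technique Based on Rank Placement, 133rd Meeting of the SIG on High Performance Computing, 2012.03.
69. 松本 幸, 安達 知也, 住元 真司, 曽我 武史, 南里 豪志, 宇野 篤也, 黒川 原佳, 庄司 文由, 横川 三津夫, Implementation and Evaluation of MPI Allreduce on the K computer, Symposium on Advanced Computing Systems and Infrastructures (SACSIS2012), 2012.05.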
70. Yoshiyuki Morie, Takeshi Nanri, Task Allocation Optimization for Neighboring Communication on Fat Tree, 14th IEEE International Conference on High Performance Computing and Communication, 2012.06.
71. Keiichiro Fukazawa, Takeshi Nanri, Performance of Large Scale MHD Simulation of Global Planetary Magnetosphere with Massively Parallel Scalar Type Supercomputer Including Post Processing, 14th IEEE International Conference on High Performance Computing and Communication, 2012.06.
72. Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji Fujino, Impact of GPU Memory Access Patterns on FDTD, IEEE Antennas and Propagation Society International Symposium (APSURSI), 2012.07.
73. Seiji Fujino, Takeshi Nanri, Kenichirou Kusaba, Balancing Communication and Execution Technique for Parallelized Sparse Matrix-Vector Multiplication, 4th International Conference on Future Computational Technologies and Applications, 2012.07.
74. Keiichiro Fukazawa, Takeshi Nanri, Effective Performance of Large-Scale MHD Simulation for Planetary Magnetosphere with Massively Parallel Computer, JSST2012 International Conference on Simulation Technology, 2012.07.
75. Takeshi Nanri, Motoyoshi Kurokawa, Efficient Runtime Algorithm Selection of Collective Communication with Topology-Based Performance Models, International Conference on Parallel and Distributed Processing Techniques and Applications, 2012.07.
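76. 南里 豪志, Performance Analysis of Collective Communication Algorithms by Process Placement Shape on the Tofu Network, High Performance Computing Research Meeting, 2012.10, As supercomputers grow in scale, low-cost multi-dimensional mesh/torus topologies are increasingly adopted for inter-node interconnection networks. On a multi-dimensional mesh/torus, performance varies greatly with the shape of the group of nodes on which processes are placed, even when the number of nodes used is the same. This study targets the Tofu interconnect network used in the K computer and its compatible machine, the Fujitsu PRIMEHPC FX10, and measures the effect of process placement shape on the performance of collective communication algorithms. Comparing the measured performance with the transfer waiting time caused by communication contention, obtained with the performance analysis tool for the Tofu interconnect, showed that both vary with process placement shape in almost the same way. These results show that performance estimates that take the process placement shape into account are important when selecting collective communication algorithms.
77. 深沢 圭一郎, 南里 豪志, 高見 利也, Performance Evaluation of Different Scalar Architectures (x86, SPARC64) Using an MHD Code, High Performance Computing Research Meeting, 2012.10.
78. Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji Fujino, An Alternative Domain Decomposition Technique for CUDA-based 3D FDTD Methods, 9th European Radar Conference, 2012.11.
79. 森江 善之, 南里 豪志, Effect of Communication Timing Prediction Methods on Task Allocation Optimization for Reducing Communication Contention, Joint Meeting of the 194th Computer Architecture and 137th High Performance Computing SIGs (HOKKE-20), 2012.12.
80. 森江 善之, 南里 豪志, Task Allocation Optimization Technique Considering Communication Contention on Multi-Dimensional Mesh/Torus, Symposium on High Performance Computing and Computational Science, 2013.01.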
81. Takeshi Nanri, Introduction of ACE(Advanced Communication library for Exa) Project, International workshop on HPC, Krylov Subspace method and its application, 2013.01.
82. Yoshiyuki Morie, Takeshi Nanri, Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communication, International workshop on HPC, Krylov Subspace method and its application, 2013.01.
83. Tsuyoshi Okuma, Takeshi Nanri, Evaluation of Implementation Methods for Non-Blocking Collective Communications in Overlapping Communication and Computation, International workshop on HPC, Krylov Subspace method and its application, 2013.01.
84. Hironobu Sugiyama, Takeshi Nanri, Performance Prediction Technology for Collective Communication Algorithm on Multi-Dimensional Mesh/Torus, International workshop on HPC, Krylov Subspace method and its application, 2013.01.
85. 南里 豪志, 杉山 裕宣, 森江 善之, Proposal of a Collective Communication Algorithm Selection Technique Based on Process Placement on Multi-Dimensional Mesh/Torus, 138th Meeting of the SIG on High Performance Computing, 2013.02.
86. Takeshi Nanri, Hironobu Sugiyama, Keiichiro Fukazawa, A Cost-Efficient Approach for Automatic Algorithm Selection of Collective Communications, SIAM Conference on Computational Science and Engineering, 2013.03.
87. 南里 豪志, Proposal of a Hint API to Support Auto-Tuning of Communication Libraries, 141st Meeting of the SIG on High Performance Computing, 2013.10.
88. Takeshi Nanri, What Communication Library Can do with a Little Hint from Programmers?, MVAPICH User Group Meeting, 2013.08.
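89. 児玉 大器, 南里 豪志, Collective Communication Algorithm Selection Combining Performance Prediction and Measurement, Workshop on the Future of HPC (Fundamental Technologies and Applications), 2013.12.
90. 杉山 裕宣, 南里 豪志, On a Dynamic Optimization Technique for Communication Libraries Using Hint Information from Programs, Workshop on the Future of HPC (Fundamental Technologies and Applications), 2013.12.
91. 南里 豪志, Evaluation of an Interface for Providing Optimization Information in MPI, Workshop on the Future of HPC (Fundamental Technologies and Applications), 2013.12.
92. Yoshiyuki Morie, Takeshi Nanri, A neighbor communication algorithm with making an effective use of NICs on multidimensional-mesh/torus, International Conference on Simulation Technology, 2013.09.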
93. Tsuyoshi Okuma, Takeshi Nanri, Performance Study of Non-blocking Collective Communication Implementations Toward Adaptive Selection, Networking, Computing, Systems and Software, 2013.12.
94. Hironobu Sugiyama, Takeshi Nanri, Topology Aware Performance Prediction of Collective Communication Algorithms on Multi-Dimensional Mesh/Torus, Networking, Computing, Systems and Software, 2013.12.
95. Takeshi Nanri, Proposal of HINT Interface for Runtime Tuning of Communication Links, 22nd Euromicro International Conference on Parallel, Distributed and network-based Processing, 2014.02.
96. Takeshi Nanri, Design and Implementation of Channel Interface as a Memory Efficient Communication Model, Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015, 2015.01.
97. Takeshi Nanri, Channel Interface: a Primitive Model for Memory Efficient Communication, 23rd Euromicro International Conference on Parallel, Distributed and network-based Processing, 2015.03.
98. Hiroaki Honda, Takeshi Nanri, Yoshiyuki Morie, Performance and memory usage evaluations for channel interface of Advanced Communication Primitives library, 1st Pan-American Congress on Computational Mechanics (PANACM 2015), 2015.04.
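99. 森江 善之, 南里 豪志, Proposal of a Neighbor Communication Algorithm That Makes Effective Use of Multiple Communication Devices on Direct Networks, 2015 Symposium on High Performance Computing and Computational Science, 2015.05.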
100. Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri, Hidetomo Shibamura, Performance Evaluation of RDMA Communication Patterns by Means of Simulations, 2015 Joint International Mechanical, Electronic and Information Technology Conference, 2015.12.
101. Takeshi Nanri, Evaluation of On-Demand Message-Passing Module over RDMA Network, ACSI2016, 2016.01.
102. Kazuichi Oe, Takeshi Nanri, Koji Okamura, On-The-Fly Automated Storage Tiering with Caching and both Proactive and Observational Migration, Workshop on Computer Systems and Architectures (CSA'15), 2015.12.
103. Shinji Sumimoto, Yuichiro Ajima, Takafumi Nose, Kazushige Saga, Naoyuki Shida, Takeshi Nanri, Parallel Application Experiences Using Advanced Communication Primitives, 25th Euromicro International Conference on Parallel, Distributed and network-based Processing, 2017.03.
