Updated on 2025/04/21

Information

 

 
NANRI TAKESHI
 
Organization
Research Institute for Information Technology Section of Advanced Computational Science Associate Professor
Graduate School of Information Science and Electrical Engineering Department of Information Science and Technology (Concurrent)
Joint Graduate School of Mathematics for Innovation (Concurrent)
Title
Associate Professor
Contact information
Email address
Profile
Research: My major research topic is the development of technologies for runtime systems of parallel programs. Parallel computers have become a popular platform, from PC servers to large-scale supercomputers. To utilize the computational power of those platforms, users must prepare parallel programs. When a parallel program executes on a parallel computer, a runtime system translates the operations written in the program into specific behavior of the computer. Therefore, runtime system technologies are key to the efficient use of parallel computers. In particular, as the size of the computer grows, achieving better performance requires taking into account information available at runtime, such as the allocation of processes, the balance of load among processors, and contention for resources among jobs. In our research, we are developing technologies for dynamic optimization of runtime systems on large-scale parallel computers.
Research Areas

  • Informatics / High performance computing

Degree

  • Ph.D

Research History

  • Kyushu University Research Institute for Information Technology Section of Advanced Computational Science  Associate Professor 

    2007.3 - Present

Research Interests・Research Keywords

  • Research theme: Fundamental techniques to enable highly scalable parallel computations

    Keyword: scalability, parallel computation, high-performance computation

    Research period: 2011.9

  • Research theme: Technologies for dynamic optimization of communication libraries on large-scale parallel computers

    Keyword: Parallel Computing, Runtime Optimization

    Research period: 2005.4

  • Research theme: Programming environment for hierarchical parallel environment

    Keyword: Hierarchical parallel computer, distributed shared memory, communication optimization

    Research period: 2003.4

Awards

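  • AXIES 2020 Annual Conference Outstanding Paper Award

    2021.4   Academic eXchange for Information Environment and Strategy (AXIES)   Prototype of a message queuing system using RDMA on DIMM-slot-mounted non-volatile memory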

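  • Yamashita SIG Research Award

    2013.7   Information Processing Society of Japan (IPSJ)   Awarded for the presentation "Performance Analysis of Collective Communication Algorithms by Process Placement Shape on the Tofu Network" at the 136th meeting of the IPSJ Special Interest Group on High Performance Computing.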

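    As supercomputers grow larger, more systems adopt low-cost multi-dimensional mesh/torus topologies for the inter-node interconnect network. On a multi-dimensional mesh/torus, performance varies greatly with the shape of the group of nodes on which processes are placed, even when the number of nodes used is the same. This study targeted the Tofu interconnect network used in the K computer and its compatible machine, the Fujitsu PRIMEHPC FX10, and measured the effect of the process placement shape on the performance of collective communication algorithms. Comparing the measured performance with the transfer wait times caused by communication contention, obtained with a performance analysis tool for the Tofu interconnect, showed that both vary with the process placement shape in nearly the same way. These results show that performance estimation that takes the process placement shape into account is important when selecting collective communication algorithms.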

Papers

  • Supercomputers for the post-"K" era: 5. Supercomputers using many Xeon processors Invited

    @南里 豪志

    情報処理   60 ( 12 )   1198 - 1203   2019.11

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Static control of a software-built cache system on a distributed shared memory system Reviewed

    南里 豪志, 佐藤周行, 島崎眞昭

    情報処理学会論文誌   1997.9

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Portability in Implementing Distributed Shared Memory System on the Workstation Cluster Environment Reviewed

    Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

    Research Reports on Information Science and Electrical Engineering of Kyushu University   1997.3

    Language:English   Publishing type:Research paper (scientific journal)  

  • Optical properties of rutile TiO2 with Zr, Mo, Zn, Cd impurities

    Ohno K., Sahara R., Nanri T., Kawazoe Y.

    Computational Condensed Matter   41   2024.12   ISSN:2352-2143

    Publisher:Computational Condensed Matter  

    To explore quasiparticle (QP) energy gaps and photoabsorption spectra of rutile TiO2 with nonmagnetic transition metal (Zr, Mo, Zn, Cd) impurities, we conducted a Γ-point only GW + Bethe–Salpeter equation (BSE) calculation on a 72 (or 71) atom supercell. Our findings reveal that Zn and Cd impurities must coexist, at least partly, with oxygen vacancies to maintain charge neutrality. Among the systems considered, Mo, Zn, or Cd doped rutile TiO2 may exhibit optical absorption and catalytic activity under visible light. The resulting QP energy gaps (ΔɛQP) and photoabsorption energies (PAEs) are in fairly good agreement with both experimental and theoretical data currently available. The necessary conditions for the applicability of the Γ-point only approach in the GW + BSE framework were found to be: (1) The Γ-point only GW calculation should reproduce a reasonable band gap. (2) The “superficial” exciton binding energy (the diagonal element of W_{vc;vc} − 2X_{vc;vc} between v = VBM and c = CBM, where W and X are the direct and exchange terms of the BSE matrix elements, respectively) must be positive or marginally negative. (3) The “real” exciton binding energy (ΔɛQP − the lowest PAE) should be positive, even if it is exceptionally small.

    DOI: 10.1016/j.cocom.2024.e00977

  • Forward and backward multi-particle dispersion in homogeneous isotropic turbulence Invited Reviewed

    ARAKAWA Ryunosuke, KITAMURA Takuya, SONOBE Yohei, SAIMOTO Akihide, NANRI Takeshi

    Transactions of the JSME (in Japanese)   90 ( 929 )   1 - 9   2024.1   eISSN:21879761

    Language:Japanese   Publishing type:Research paper (scientific journal)   Publisher:The Japan Society of Mechanical Engineers  

    Turbulent diffusion in homogeneous isotropic turbulence is numerically investigated using the direct numerical simulation (DNS). The four-dimensional turbulence database allows us to track fluid particles not only in the forward direction of time but also in the backward direction. Two-particle dispersion has been studied in previous studies and it is known that backward diffusion is faster than forward diffusion. However, little is known about multi-particle dispersion due to the difficulty of observing it experimentally. Studies on backward diffusion are also limited. In this study, multi-particle dispersion is numerically investigated and its properties are discussed, e.g., direction of time and geometry of a tetrahedron. The results show that forward and backward diffusions of multi-particles behave differently at the beginning and evolve similarly after the transient time, but the coefficients of the backward direction are larger than those of the forward direction.

    DOI: 10.1299/transjsme.23-00281

  • Implementation of Coupled Numerical Analysis of Magnetospheric Dynamics and Spacecraft Charging Phenomena via Code-To-Code Adapter (CoToCoA) Framework

    Miyake Y., Sunada Y., Tanaka Y., Nakazawa K., Nanri T., Fukazawa K., Katoh Y.

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   14074 LNCS   438 - 452   2023   ISSN:03029743 ISBN:9783031360206

    Publisher:Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)  

    This paper addresses the implementation of a coupled numerical analysis of the Earth’s magnetospheric dynamics and spacecraft charging (SC) processes based on our in-house Code-To-Code Adapter (CoToCoA). The basic idea is that the magnetohydrodynamic (MHD) simulation reproduces the global dynamics of the magnetospheric plasma, and its pressure and density data at local spacecraft positions are provided and used for the SC calculations. This allows us to predict spacecraft charging that reflects the dynamic changes of the space environment. CoToCoA defines three types of independent programs: Requester, Worker, and Coupler, which are executed simultaneously in the analysis. Since the MHD side takes the role of invoking the SC analysis, Requester and Worker positions are assigned to the MHD and SC calculations, respectively. Coupler then supervises necessary coordination between them. Physical data exchange between the models is implemented using MPI remote memory access functions. The developed program has been tested to ensure that it works properly as a coupled physical model. The numerical experiments also confirmed that the addition of the SC calculations has a rather small impact on the MHD simulation performance with up to about 500-process executions.

    DOI: 10.1007/978-3-031-36021-3_46

  • Numerical approach for aerodynamics around two tone holes of woodwind instruments

    Takanami S., Tabata R., Iwagami S., Ohno T., Nanri T., Kobayashi T., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    In this paper, we discuss the numerical reproducibility of the compressible fluid behavior around two tone holes of woodwind instruments by using compressible Large Eddy Simulation (LES). In particular, we focus on the situation in which the tone holes are opened and closed with moving pads above the tone holes, which is regarded as a moving boundary problem with topology change, and reproduce the change of pitch when opening and closing the tone holes. Our two-dimensional model of a "recorder" has two tone holes. To reproduce the opening and closing of the tone holes, the pads are moved continuously; that is, the position of the pads was changed continuously in the order "open - close". Our numerical results are consistent with Keefe's experimental results. We solved the moving boundary problem with topology change in the setting of acoustic fluid-structure interaction, and reproduced the pitch change caused by opening and closing the tone holes of the recorder-like woodwind instrument model.

  • Compressible fluid analysis on basic properties of a thermoacoustic equipment

    Tashima Y., Ohno T., Nanri T., Kobayashi T., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    Two- and three-dimensional models of a test-tube thermoacoustic engine were numerically analyzed using compressible Large Eddy Simulation (LES) to investigate the initial transient behavior, i.e., the generation mechanism of thermoacoustic waves in the initial state. In the model used in this study, a stack is placed near the bottom of the test tube and a temperature gradient is applied between the two ends of the stack. The model has an external region connected to the opening of the test tube for the radiation of sound waves and heat. As a result, a fluid flow was observed inside the stack, a strong pressure oscillation, i.e., acoustic resonance, was observed inside the test tube, and sound radiation from the open end was also observed. Furthermore, the frequency of the sound vibration was almost the same as the theoretical estimate of the resonance frequency of the test tube. Thus, we successfully reproduced the basic properties of the thermoacoustic engine in the initial state. However, noise components increased over time and stationary oscillations have not yet been attained, so we need to improve our numerical method. We are also planning to analyze the generation mechanism of the sound waves by taking aeroacoustic theory into account, e.g., Lighthill's acoustic analogy.

  • Aeroacoustic analysis of port noise by using a three-dimensional numerical model of a bass reflex speaker system

    Uryuu K., Tabata R., Ohno T., Nanri T., Kobayashi T., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    We numerically study the port noise observed in a bass reflex speaker system with a compressible fluid solver, Large Eddy Simulation (LES), to explore the noise generation mechanism from the viewpoint of aeroacoustics. The port noise is considered an aerodynamic sound generated by vortices that are created by the interaction between the acoustic field and the port opening. However, the details of the sound generation mechanism are still an open problem. Using a 3D model, the port noise is well reproduced by compressible LES when the speaker system is acoustically driven at its resonance frequency, the Helmholtz resonance frequency of the bass reflex speaker system. Vortices are created near the edges of the port and generate broadband noise. However, the noise is enhanced by resonance in some bands, which correspond to the acoustic resonance frequencies of the port and the enclosure themselves. We are also planning to investigate the noise generation mechanism by using Howe's energy corollary, which allows us to estimate the energy transfer between fluid dynamics and acoustics; namely, we consider where the aeroacoustic noise is generated and how much energy is transferred from the vortex motions to the acoustic field.

  • Aeroacoustic analysis of oboe reeds with compressible direct numerical simulation

    Nakahara Y., Sumita R., Tabata R., Iwagami S., Nanri T., Kobayashi T., Hattori Y., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    A two-dimensional model of an oboe reed is studied numerically with a direct numerical simulation (DNS) of the compressible Navier-Stokes equations to investigate the sound generation mechanism from the viewpoint of aeroacoustics. The numerical tool is extremely accurate owing to a smallest mesh size on the order of micrometers and successfully reproduces the details of fluid motion and acoustic vibrations inside and outside the reed. Particular attention is paid to the effect of reed vibration on the sound generation mechanism. When the reeds are fixed and a periodically varying flow is injected through the fixed reed slit, the aerodynamic sound created inside the reeds is nearly monotone, with only a few overtones. On the other hand, when a flow is injected through periodically vibrating reeds from an oral cavity, more overtone components are observed and the pressure waveforms are similar to those observed in the experiment. This indicates that the richness of the overtones of the double-reed instrument is mainly attributed to the aerodynamic sound created by the flow injected through the vibrating reeds, while the bore, a linear resonator, merely enhances characteristics of the instrument, e.g., the formant.

  • Numerical study of a French horn mouthpiece accompanied by vibrating lips and an oral cavity with compressible direct numerical simulation

    Sumita R., Tabata R., Iwagami S., Nakahara Y., Nanri T., Kobayashi T., Hattori Y., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    A two-dimensional model of a French horn mouthpiece is studied numerically with a 2D direct numerical simulation (DNS) of the compressible Navier-Stokes equations to investigate the sound generation mechanism from the viewpoint of aeroacoustics. That is, we consider the sounding mechanism of buzzing, when the mouthpiece is played without a bore. Our numerical tool is highly accurate owing to a minimum mesh size on the order of micrometers, and the details of fluid motion and acoustic oscillation inside and near the mouthpiece are successfully reproduced. In particular, we focus on the roles of vibrating lips and an oral cavity in the sound generation mechanism. When the mouthpiece without lips and an oral cavity is driven by a periodic flow with a certain frequency, a single tone without overtones is observed. On the other hand, when the mouthpiece is driven by vibrating lips with an oral cavity, the generated sound includes rich overtones and its waveform is similar to that observed experimentally. Since the bore is a linear element and cannot generate overtones from a single tone by itself, the overtone-rich sound of a horn is generated by the mouthpiece together with the vibrating lips and oral cavity.

  • Numerical study of the feedback mechanism of the edge tone

    Onomata T., Iwagami S., Tabata R., Ohno T., Nanri T., Kobayashi T., Takahashi K.

    Proceedings of the International Congress on Acoustics   2022   ISSN:22267808

    Publisher:Proceedings of the International Congress on Acoustics  

    We numerically investigate fundamental problems of the edge tone with compressible Large Eddy Simulation (LES) together with the acoustic solver FDTD and incompressible LES. Jet oscillation and the edge tone in the first mode are successfully reproduced by a 3D model with compressible LES. Namely, the acoustic intensity changes with the jet velocity, closely following the sixth power law, which is evidence of the reliability of our numerical method. Next, we estimate the intensity of acoustic feedback in the following way. According to Kaykayoglu and Rockwell, effective pressure sources are considered to be located slightly downstream of the edge tip on both sides of the edge. Indeed, such a pair of positive and negative pressure spots periodically appears in our numerical calculation. Then, we set the pressure spots on both sides of the edge and reproduce the acoustic waves radiated from them by FDTD. The acoustic particle velocity of the reproduced acoustic field at the nozzle outlet is regarded as acoustic feedback. Even though such acoustic feedback may contribute to driving the jet, we can consider that the fluid feedback is still dominant in a low-Reynolds number regime, as pointed out by Paál et al. from the results of incompressible LES.

  • Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets Reviewed

    Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

    Lecture Notes in Computer Science   11203   243 - 257   2019.1

    Language:English   Publishing type:Research paper (scientific journal)  

  • Hybrid storage system consisting of cache drive and multi-tier SSD for improved IO access when IO is concentrated Reviewed

    Kazuichi Oe, Takeshi Nanri, Koji Okamura

    IEICE Transactions on Information and Systems   E102D ( 9 )   1715 - 1730   2019.1

    Language:English   Publishing type:Research paper (scientific journal)  

    In previous studies, we determined that workloads often contain many input-output (IO) concentrations. Such concentrations are aggregations of IO accesses. They appear in narrow regions of a storage volume and continue for durations of up to about an hour. These narrow regions occupy a small percentage of the logical unit number capacity, include most IO accesses, and appear at unpredictable logical block addresses. We investigated these workloads by focusing on page-level regularity and found that they often include few regularities. This means that simple caching may not reduce the response time for these workloads sufficiently because the cache migration algorithm uses page-level regularity. We previously developed an on-the-fly automated storage tiering (OTFAST) system consisting of an SSD and an HDD. The migration algorithm identifies IO concentrations with moderately long durations and migrates them from the HDD to the SSD. This means that there is little or no reduction in the response time when the workload includes few such concentrations. We have now developed a hybrid storage system consisting of a cache drive with an SSD and HDD and a multi-tier SSD that uses OTFAST, called "OTF-AST with caching." The OTF-AST scheme handles the IO accesses that produce moderately long duration IO concentrations while the caching scheme handles the remaining IO accesses. Experiments showed that the average response time for our system was 45% that of Facebook FlashCache on a Microsoft Research Cambridge workload.

    DOI: 10.1587/transinf.2018EDP7253

  • ATSMF: Automated tiered storage with fast memory and slow flash storage to improve response time with concentrated input-output (IO) workloads Reviewed

    Kazuichi Oe, Mitsuru Sato, Takeshi Nanri

    IEICE Transactions on Information and Systems   E101D ( 12 )   2889 - 2901   2018.12

    Language:English   Publishing type:Research paper (scientific journal)  

    The response times of solid state drives (SSDs) have decreased dramatically due to the growing use of non-volatile memory express (NVMe) devices. Such devices have response times of less than 100 microseconds on average. The response times of all-flash-array systems have also decreased dramatically through the use of NVMe SSDs. However, there are applications, particularly virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response times. Their workloads tend to contain many input-output (IO) concentrations, which are aggregations of IO accesses. They target narrow regions of the storage volume and can continue for up to an hour. These narrow regions occupy a few percent of the logical unit number capacity, are the target of most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response times for such workloads, we developed an automated tiered storage system called “automated tiered storage with fast memory and slow flash storage” (ATSMF) in which the data in targeted regions are migrated between storage devices depending on the predicted remaining duration of the concentration. The assumed environment is a server with non-volatile memory and directly attached SSDs, with the user applications executed on the server as this reduces the average response time. Our system predicts the effect of migration by using the previously monitored values of the increase in response time during migration and the change in response time after migration. These values are consistent for each type of workload if the system is built using both non-volatile memory and SSDs. In particular, the system predicts the remaining duration of an IO concentration, calculates the expected response-time increase during migration and the expected response-time decrease after migration, and migrates the data in the targeted regions if the sum of response-time decrease after migration exceeds the sum of response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and that its memory access ratio is more than 50%.

    DOI: 10.1587/transinf.2018PAP0005

  • Approaches for memory-efficient communication library and runtime communication optimization

    Takeshi Nanri

    Advanced Software Technologies for Post-Peta Scale Computing The Japanese Post-Peta CREST Research Project   121 - 138   2018.12

    Language:English  

    This article summarizes the work carried out in the Advanced Communication for Exa (ACE) project. The main motivation of this project was the strong demand for scalable communication toward exascale computation. Therefore, in the project, we built a PGAS-based communication library, Advanced Communication Primitives (ACP). Its fundamental communication model is one-sided, based on the PGAS model, so that its internal memory footprint is as small as possible. Based on this model, several applications, including simulations of magnetohydrodynamics, molecular orbitals, and particles, were tuned to achieve higher scalability. In addition, several communication optimization techniques were investigated, in particular tuning methods for collective communication such as message ordering, algorithm selection, and overlapping. The project also developed a network simulator, NSIM-ACE, which simulates the behavior of packets in one-sided communication to study the effects of congestion on interconnects.

    DOI: 10.1007/978-981-13-1924-2_7

  • Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets. Reviewed

    Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

    High Performance Computing - ISC High Performance 2018 International Workshops, Frankfurt/Main, Germany, June 28, 2018, Revised Selected Papers   243 - 257   2018.6

    Language:English   Publishing type:Research paper (other academic)  

    DOI: 10.1007/978-3-030-02465-9_17

  • Performance Evaluation and Optimization of MagnetoHydroDynamic Simulation for Planetary Magnetosphere with Xeon Phi KNL

    Keiichiro Fukazawa, Takeshi Soga, Takayuki Umeda, Takeshi Nanri

    Parallel Computing is Everywhere   178 - 187   2018.1

    Language:English  

    Magnetohydrodynamic (MHD) simulation is often applied to study the global dynamics and configuration of a planetary magnetosphere for space weather. In this paper, the computational performance of an MHD code is evaluated on 128 Xeon Phi KNL nodes of a Cray XC40. The results show that the 2D and 3D domain decompositions with SoA (structure of arrays) perform well, whereas AoS (array of structures) and hybrid parallel computation perform poorly. By adding performance optimizations for Xeon Phi to our MHD simulation code, we obtained a 2.4 % increase in overall execution efficiency and achieved a 3 TFlops performance gain using 128 nodes.

    DOI: 10.3233/978-1-61499-843-3-178

  • Analysis of the Quality of Academic Papers by the Words in Abstracts Invited Reviewed International journal

    Tetsuya Nakatoh, Kenta Nagatani, Toshiro Minami, Sachio Hirokawa, Takeshi Nanri, Miho Funamori

    HIMI 2017, Part II, LNCS 10274, Proc. of the 19th International Conference on Human-Computer Interaction (HCI International 2017)   2017.7

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Trends in communication libraries for HPC Reviewed

    南里 豪志

    シミュレーション   36 ( 2 )   79 - 84   2017.6

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Assessing the Significance of Scholarly Articles using their Attributes Invited Reviewed International journal

    Tetsuya Nakatoh, Sachio Hirokawa, Toshiro Minami, Takeshi Nanri, Miho Funamori

    Proc. of the 22nd International Symposium on Artificial Life and Robotics (AROB2017)   742 - 746   2017.1

    Language:English   Publishing type:Research paper (scientific journal)  

  • Reducing computation time by using the same kind of compiler and execution on a different machine model Reviewed

    藤野 清次, 小玉 捷平, 南里 豪志, 岩里 洸介

    日本シミュレーション学会論文誌   8 ( 1 )   21 - 24   2016.1

    Language:Japanese   Publishing type:Research paper (scientific journal)  

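  • Proposal and evaluation of a tiered storage system that improves performance by automatically extracting IO-concentrated regions whose duration and IO access count are sufficient to expect a performance gain and migrating them to an SSD Reviewed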

    大江 和一, 岩田 聡, 南里 豪志, 岡村 耕二

    情報処理学会論文誌   9 ( 1 )   1 - 16   2016.1

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • A Neighboring Communication Algorithm Using Effective Multiple Communication Devices on Direct Connection Network Reviewed

    森江善之, 南里豪志

    情報処理学会論文誌トランザクション コンピューティングシステム(Web)   8 ( 4 )   26-35 (WEB ONLY)   2015.11

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • A study on the implementation of the reduction directive in parallel computing Reviewed

    岩里 洸介, 南里 豪志, 藤野 清次

    日本シミュレーション学会論文誌   7 ( 4 )   109 - 113   2015.7

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Performance Measurements of MHD Simulation for Planetary Magnetosphere on Peta-Scale Computer FX10 Reviewed

    FUKAZAWA Keiichiro, Takeshi Nanri, Takayuki Umeda

    Advances in Parallel Computing   2014.3

    Language:English  

    DOI: 10.3233/978-1-61499-381-0-387

  • Performance evaluation of magnetohydrodynamics simulation for magnetosphere on K computer Reviewed

    FUKAZAWA Keiichiro, Takeshi Nanri, Takayuki Umeda

    Communications in Computer and Information Science   2013.12

    Language:English  

    DOI: 10.1007/978-3-642-45037-2_61

  • Implementation of Neighbor Communication Algorithm Using Multi-NICs Effectively by Extended RDMA Interface Reviewed

    Yoshiyuki Morie, Takeshi Nanri

    SC13 Technical Posters   1 - 2   2013.11

    Language:Others  

  • Task Allocation Technique for Avoiding Contentions on Multi-dimensional Mesh/Torus Reviewed

    森江 善之, 南里 豪志

    情報処理学会   6 ( 3 )   12 - 21   2013.9

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Task Allocation Technique for Avoiding Contentions on Multi-dimensional Mesh/Torus Reviewed

    森江善之, 南里豪志

    情報処理学会論文誌トランザクション コンピューティングシステム(Web)   6 ( 3 )   12-21 (WEB ONLY)   2013.9

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • A Neighbor Communication Algorithm with Making an Effective Use of NICs on Multidimensional-Mesh/torus Reviewed

    Yoshiyuki Morie, Takeshi Nanri

    International Conference on Simulation Technology (JSST2013)   JSST2013   1 - 2   2013.9

    Language:Others  

  • Development of a CUDA Implementation of the 3D FDTD Method Reviewed International journal

    Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji FUJINO

    IEEE Antennas and Propagation Magazine   54 ( 5 )   186 - 195   2012.10

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1109/MAP.2012.6348145

  • Implementation and evaluation of MPI_Allreduce on the K computer Reviewed

    松本 幸,安達 知也,住元 真司,曽我 武史,南里 豪志,宇野 篤也,黒川 原佳,庄司 文由,横川 三津夫

    情報処理学会 ACS論文誌   ( 40 )   2012.9

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Task Allocation Optimization for Neighboring Communication on Fat Tree Reviewed

    Yoshiyuki Morie, Takeshi Nanri

    4th IEEE International Conference on High Performance Computing and Communication 9th IEEE International Conference on Embedded Software and Systems, HPCC-ICESS 2012   1219 - 1225   2012.1

    Language:Others  

    DOI: 10.1109/HPCC.2012.179

  • A Method for Predicting a Penalty of Contentions by Considering Priorities of Routing among Packets on Direct Interconnection Network Reviewed

    Yoshiyuki Morie, Takeshi Nanri, Ryutaro Susukita

    2011 Fourth International Joint Conference on Computational Sciences and Optimization   263 - 267   2011.4

    Language:Others  

    DOI: 10.1109/CSO.2011.35

  • Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communication Reviewed

    Yoshiyuki Morie, Takeshi Nanri, Motoyoshi Kurokawa

    The Tenth IASTED International Conference on Parallel and Distributed Computing and Networks   62 - 69   2011.2

    Language:Others  

    DOI: 10.2316/P.2011.719-025

  • 負荷バランスの動的最適化によるMPIブロードキャスト性能改善 Reviewed

    曽我 武史, 栗原 康志, 南里 豪志, 黒川 原佳, 村上 和彰

    情報処理学会論文誌 コンピュータシステム   2008.12

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    Dynamic Optimization of Load Balance in MPI Broadcast

  • Performance Models for MPI Collective Communications with Network Contention Reviewed

    Hyacinthe Nzigou Mamadou, Takeshi Nanri, Kazuaki Murakami

    IEICE Transactions on Communications   2008.5

    Language:English   Publishing type:Research paper (scientific journal)  

  • 衝突削減のためのタスク配置最適化に関する研究 [A Study on Task Allocation Optimization for Reducing Contention] Reviewed

    森江 善之, 末安 直樹, 松本 透, 南里 豪志, 石畑 宏明, 井上 弘士, 村上 和彰

    次世代スーパーコンピューティング・シンポジウム2007   2007   2007.10

    Language:Others  

  • 通信タイミングを考慮した衝突削減のためのMPIランク配置最適化技術 Reviewed

    森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

    情報処理学会論文誌   48 ( SIG13(ACS19) )   192 - 202   2007.8

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    Optimization of MPI Rank Allocation Considering Communication Timing for Reducing Contention

Presentations

  • DIMMスロット装着型不揮発性メモリ上のRDMAによるメッセージキューイングシステムの試作 [Prototype of a Message Queuing System Using RDMA on DIMM-Slot-Mounted Non-Volatile Memory]

    南里 豪志、大江 和一、吉田 英司、大辻 弘貴、林 英里香

    大学ICT推進協議会2020年度年次大会  2020.12 

    Event date: 2020.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Online   Country:Japan

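  • 高スループット非同期集団通信の性能モデル化に向けた予備評価 [Preliminary Evaluation Toward Performance Modeling of High-Throughput Asynchronous Collective Communication]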

    Yoshiyuki Morie, Yasutaka Wada, Ryohei Kobayashi, Ryuichi Sakamoto, Takeshi Nanri

    第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会  2025.3 

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Sapporo  

  • Implementation of Task Scheduler for Quantum-Classic Hybrid Environments

    Takumi Tsuda, Taizo Kobayashi, Kin'ya Takahashi, Takeshi Nanri

    第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会  2025.3 

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Sapporo  

  • Implementation of a Benchmark That Enables Comparison of Overlapping Effects among Different Non-Blocking Collective Implementations

    Takeru Narumi, Takeshi Nanri

    第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会  2025.3 

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Sapporo  

  • Implementation and Performance Evaluation of Discontinuous Data Transfer in Halo Communication with Tofu Interconnect

    Rennma Arisako, Takeshi Nanri

    2024年並列/分散/協調処理に関するサマー・ワークショップ  2024.8 

    Event date: 2024.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokushima

  • Implementation of Coupled Numerical Analysis of Magnetospheric Dynamics and Spacecraft Charging Phenomena via Code-To-Code Adapter (CoToCoA) Framework International conference

    Y. Miyake, Y. Sunada, Y. Tanaka, K. Nakazawa, T. Nanri, K. Fukazawa, Y. Katoh

    ICCS 2023  2023.6 

    Event date: 2023.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Prague   Country:Czech Republic  

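  • 九州大学スーパーコンピュータとAWSクラウドサービスによるハイブリッド計算環境の相互補完的利用方法に関する調査 [Survey on Mutually Complementary Use of a Hybrid Computing Environment Combining the Kyushu University Supercomputer and AWS Cloud Services]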

    南里豪志, 松山和広, 田代皓嗣, 原田浩睦

    大学ICT推進協議会 2022年度 年次大会  2022.12 

    Event date: 2022.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Sendai International Center   Country:Japan

  • Cross-reference simulation by Code-To-Code Adapter (CoToCoA) library for the study of multi-scale physics in planetary magnetospheres International conference

    Yuto Katoh, Keiichiro Fukazawa, Takeshi Nanri, Yohei Miyake

    2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW)  2021.12 

    Event date: 2021.12

    Language:English   Presentation type:Oral presentation (general)  

    Country:Japan  

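  • 実用アプリケーションのスイッチのキーテクノロジーであるSHARPを使用したMPI通信パフォーマンス向上の挑戦と、将来のスイッチテクノロジーへの期待 [The Challenge of Improving MPI Communication Performance in Practical Applications Using SHARP, a Key Switch Technology, and Expectations for Future Switch Technologies] Invited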

    南里豪志

    GPU TECHNOLOGY CONFERENCE  2020.10 

    Event date: 2020.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Online   Country:Japan

  • Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures

    Kenji Ono, Toshihiro Kato, Satoshi Ohshima, Takeshi Nanri

    International Conference on High Performance Computing in Asia-Pacific Region  2019.12 

    Event date: 2020.1

    Language:English  

    Venue:Fukuoka   Country:Japan  

  • Application of cross-reference framework CoToCoA to Macro- and micro-scale simulations of planetary magnetospheres

    Keiichiro Fukazawa, Yuto Katoh, Takeshi Nanri, Yohei Miyake

    7th International Symposium on Computing and Networking Workshops, CANDARW 2019  2019.11 

    Event date: 2019.11

    Language:English  

    Venue:Nagasaki   Country:Japan  

    In this study, we introduce the Code-to-Code Adapter (CoToCoA) library to couple the magnetohydrodynamic (MHD) simulation and the Electron Hybrid (EH) simulation of planetary magnetospheres. CoToCoA was newly developed to connect different codes easily. Its concept is that each code is modified as little as possible, apart from adding data transfer functions, and that a code needs to know nothing about the code it refers to beyond the data format. With CoToCoA, we have been developing a cross-reference simulation of the macro (MHD) and micro (EH) scales in the magnetosphere. We then evaluated the performance of the cross-reference simulation using CoToCoA on a massively parallel computer system.

  • Hybrid Storage System to Achieve Efficient Use of Fast Memory Area

    Kazuichi Oe, Takeshi Nanri

    7th International Symposium on Computing and Networking, CANDAR 2019  2019.11 

    Event date: 2019.11

    Language:English  

    Venue:Nagasaki   Country:Japan  

    Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes much NVM capacity upon migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is drastically lower.

  • Performance improvement of high-speed file transfer over JHPCN

    Praphan Pavarangkoon, Ken T. Murata, Kazunori Yamamoto, Kazuya Muranaga, Takamichi Mizuhara, Keiichiro Fukazawa, Ryusuke Egawa, Takahiro Katagiri, Masao Ogino, Takeshi Nanri

    17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019  2019.8 

    Event date: 2019.8

    Language:English  

    Venue:Fukuoka   Country:Japan  

    This paper proposes a novel file transfer tool to improve file transfer performance over Japan high performance computing and networking (JHPCN). We first develop a high-performance and flexible protocol (HpFP) for inter-datacenter transport networks. The original HpFP was designed for specified networks and puts more emphasis on latency and packet loss tolerance than on fairness and friendliness, while the enhanced HpFP is more suitable for real network environments. Then, based on the enhanced HpFP, we implement a file transfer tool called high-performance copy (HCP). The performance of our file transfer tool is evaluated between datacenters of JHPCN using real datasets collected from supercomputer resources. The results show that HCP achieves higher throughput than traditional tools for file transfer over JHPCN.

  • Non-volatile memory driver for applying automated tiered storage with fast memory and slow flash storage

    Kazuichi Oe, Takeshi Nanri

    6th International Symposium on Computing and Networking Workshops, CANDARW 2018  2018.12 

    Event date: 2018.11

    Language:English  

    Venue:Takayama   Country:Japan  

    Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system located between non-volatile memories (NVMs) and solid state drives (SSDs). ATSMF aims to reduce average response time for input-output (IO) accesses by migrating concentrated IO access areas from SSD to NVM. However, the current ATSMF implementation cannot reduce average response time sufficiently because of the bottleneck caused by the Linux brd driver, which is used for the NVM access driver. The response time of the brd driver is more than ten times larger than memory access speed. To reduce the average response time sufficiently, we developed a block-level driver for NVM called a 'two-mode (2M) memory driver.' The 2M memory driver has both the map IO access mode and direct IO access mode to reduce the response time while maintaining compatibility with the Linux device-mapper framework. The direct IO access mode has a drastically lower response time than the Linux brd driver because the ATSMF driver can execute the IO access function of 2M memory driver directly. Experimental results also indicate that ATSMF using the 2M memory driver reduces the IO access response time to less than that of ATSMF using the Linux brd driver in most cases.

  • Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets

    Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

    International Conference on High Performance Computing, ISC High Performance 2018  2018.1 

    Event date: 2018.6

    Language:English  

    Venue:Frankfurt   Country:Germany  

    This paper presents an in situ framework focused on time-varying simulations, and uses a novel temporal buffer for storing simulation results sampled at user-defined intervals. This framework has been designed to provide flexible data processing and visualization capabilities in modern HPC operational environments composed of powerful front-end systems, for pre- and post-processing purposes, along with traditional back-end HPC systems. The temporal buffer is implemented using the functionalities provided by the Open Address Space (OpAS) library, which enables asynchronous one-sided communication from outside processes to any exposed memory region on the simulator side. This buffer can store time-varying simulation results, which can be processed via in situ approaches with different proximities. We present a prototype of our framework and the code integration process with a target simulation code. The proposed in situ framework utilizes separate files to describe the initialization and execution codes, which are in the form of Python scripts. This framework also enables the runtime modification of these Python-based files, thus providing greater flexibility to the users, not only for data processing, such as visualization and analysis, but also for simulation steering.

  • Design of an In Transit Framework with Staging Buffer for Flexible Data Processing and Visualization of Time-Varying Data

    Kenji Ono, Jorji Nonaka, Yoshiyuki Morie, Takeshi Nanri, Tomohiro Kawanabe

    ISC WORKSHOP ON IN SITU VISUALIZATION 2018  2018.6 

    Event date: 2018.6

    Language:English  

    Venue:Frankfurt   Country:Germany  

  • Proposal of Interface for Runtime Memory Manipulation of Applications via PGAS-based Communication Library Invited International conference

    Takeshi Nanri

    Workshop on PGAS programming models: Experiences and Implementations (PGAS-EI)  2018.1 

    Event date: 2018.1

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Tokyo   Country:Japan  

  • Automated Tiered Storage System Consisting of Memory and Flash Storage to Improve Response Time with Input-Output (IO) Concentration Workloads

    Kazuichi Oe, Mitsuru Sato, Takeshi Nanri

    5th International Symposium on Computing and Networking, CANDAR 2017  2018.4 

    Event date: 2017.11

    Language:English  

    Venue:Aomori   Country:Japan  

    The response time of solid state drives (SSDs) has fallen dramatically with the spread of non-volatile memory express (NVMe) devices. These devices have response times of less than 100 microseconds on average. The response time of all-flash-array systems has also dropped drastically through the use of NVMe SSDs. However, there are applications, particularly virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response times. Their workloads were found to contain many input-output (IO) concentrations. We define IO concentration declaratively: IO concentrations are aggregations of IO accesses that appear in narrow regions of the storage volume and continue for periods of up to about an hour. These narrow regions occupy a few percent of the logical unit number capacity, include most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response time of these workloads, we developed an automated tiered storage system called 'automated tiered storage with fast memory and slow flash storage' (ATSMF). The memory component of ATSMF is non-volatile memory. The system predicts the remaining duration of an IO concentration, calculates the response-time increase during migration and the response-time decrease after migration, and migrates the IO concentration if the response-time decrease after migration surpasses the response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and its memory access ratio is more than 50%.

  • Analysis of the quality of academic papers by the words in abstracts

    Tetsuya Nakatoh, Kenta Nagatani, Toshiro Minami, Sachio Hirokawa, Takeshi Nanri, Miho Funamori

    Thematic track on Human Interface and the Management of Information, held as part of the 19th International Conference on Human–Computer Interaction, HCI International 2017  2017.1 

    Event date: 2017.7

    Language:English  

    Venue:Vancouver   Country:Canada  

    The investigation of related research is very important for research activities. However, it is not easy to choose an appropriate and important academic paper from among the huge number of possible papers. A researcher searches by combining keywords and then selects a paper to check using an index that can be evaluated. The citation count is commonly used as this index, but it gives no information about recently published papers. This research attempted to identify good papers using only the words included in the abstract. We constructed a classifier by machine learning and evaluated it using cross validation. As a result, it was found that a certain degree of discrimination is possible.

  • Parallel Application Experiences Using Advanced Communication Primitives International conference

    Shinji Sumimoto, Yuichiro Ajima, Takafumi Nose, Kazushige Saga, Naoyuki Shida, Takeshi Nanri

    25th Euromicro International Conference on Parallel, Distributed and network-based Processing  2017.3 

    Event date: 2017.3

    Language:English   Presentation type:Oral presentation (general)  

    Country:Russian Federation  

  • Feasibility study for building hybrid storage system consisting of non-volatile DIMM and SSD

    Kazuichi Oe, Takeshi Nanri, Koji Okamura

    4th International Symposium on Computing and Networking, CANDAR 2016  2017.1 

    Event date: 2016.11

    Language:English  

    Venue:Hiroshima   Country:Japan  

    Various vendors are developing byte-accessible Nonvolatile Dual-Inline Memory Modules (NVDIMMs). The performance of the NVDIMM drastically surpasses that of the Solid State Drive (SSD), which is connected by PCI Express. However, the cost of the NVDIMM is much higher than that of the SSD. Therefore, a hybrid storage system combining the NVDIMM and SSD is an effective technique for improving cost-performance. If a system uses the NVDIMM less while maintaining performance, its cost-performance should be improved. Our previous work involves on-the-fly automated storage tiering (OTF-AST). OTF-AST is a hybrid storage system consisting of an SSD and HDD. It aims to reduce the average response time of IO accesses by migrating only the IO concentration area to the SSD when IO concentration happens. Therefore, we construct OTF-AST with both the DIMM and SSD and evaluate it in order to understand how to build a cost-effective hybrid storage system with these devices. We use a DIMM instead of a byte-accessible NVDIMM, which is difficult to obtain. As a result, we found that the original OTF-AST is suitable for a hybrid storage system consisting of the DIMM and SSD. Moreover, we can improve the performance of OTF-AST if we replace its migration algorithm with a more positive migration algorithm. This is because the IO access response time barely increases when the data migration between the DIMM and SSD is done. We will build a more positive migration algorithm in the near future.

  • Effect of Overlapping Halo Exchange with One-Sided Communication International conference

    Takeshi Nanri, Keiichiro Fukazawa

    5th JSST Annual Conference International Conference on Simulation Technology  2016.10 

    Event date: 2016.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kyoto   Country:Japan  

  • Development of A Memory Efficient Communication Method for Connecting MPI Programs by using ACP Library International conference

    Hiroaki Honda, Yoshiyuki Morie, Takeshi Nanri

    5th JSST Annual Conference International Conference on Simulation Technology  2016.10 

    Event date: 2016.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kyoto   Country:Japan  

  • Efficient communications of particle data in particle-based simulations International conference

    Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri

    5th JSST Annual Conference International Conference on Simulation Technology  2016.10 

    Event date: 2016.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kyoto   Country:Japan  

  • Performance Evaluation of MHD Simulation Code with X86 CPUs and Manycore Systems International conference

    Keiichiro Fukazawa, Takayuki Umeda, Takeshi Nanri

    5th JSST Annual Conference International Conference on Simulation Technology  2016.10 

    Event date: 2016.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kyoto   Country:Japan  

  • Effective calculation with halo communication using halo functions

    Keiichiro Fukazawa, Yoshiyuki Morie, Toshiya Takami, Takeshi Nanri, Takeshi Soga

    23rd European MPI Users' Group Meeting, EuroMPI 2016  2016.9 

    Event date: 2016.9

    Language:English  

    Venue:Edinburgh   Country:United Kingdom  

    The problem with halo communication is that it reduces parallel scalability. To address this, we previously introduced a "Halo thread" into our simulation code. However, this did not fundamentally solve the problem under strong scaling. In this study, we developed Halo functions that perform halo communication effectively. With them, calculation and communication can be pipelined, and we obtained good performance.

  • The Design of Advanced Communication to Reduce Memory Usage for Exa-scale Systems International conference

    Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri

    12th International Meeting On High Performance Computing for Computational Science  2016.9 

    Event date: 2016.9

    Language:English   Presentation type:Oral presentation (general)  

    Country:Portugal  

  • Improvement of Eisenstat-SSOR preconditioning using tolerance value International conference

    Seiji Fujino, Takeshi Nanri

    5th IMA Conference on Numerical Linear Algebra and Optimization  2016.9 

    Event date: 2016.9

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Birmingham   Country:United Kingdom  

  • Effective Calculation with Halo communication using Halo Functions International conference

    Keiichiro Fukazawa, Toshiya Takami, Takeshi Soga, Yoshiyuki Morie, Takeshi Nanri

    23rd European MPI Users' Group Meeting  2016.9 

    Event date: 2016.9

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Edinburgh   Country:United Kingdom  

  • Runtime Algorithm Selection of Collective Communication with RMA-based Monitoring Mechanism International conference

    Takeshi Nanri

    4th Annual MVAPICH Users Group Meeting  2016.8 

    Event date: 2016.8

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Columbus, Ohio   Country:United States  

  • NSIM-ACE: An Interconnection Network Simulator for Evaluating Remote Direct Memory Access International conference

    Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri

    International Conference on Simulation and Modeling Methodologies, Technologies and Applications  2016.7 

    Event date: 2016.7

    Language:English   Presentation type:Oral presentation (general)  

    Country:Portugal  

  • The design of advanced communication to reduce memory usage for exa-scale systems

    Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri

    12th International Conference on High Performance Computing for Computational Science, VECPAR 2016  2017.1 

    Event date: 2016.6

    Language:English  

    Venue:Porto   Country:Portugal  

    Current MPI (Message Passing Interface) communication libraries require memory in proportion to the number of processes and cannot be used for exa-scale systems. This paper proposes a global memory based communication design to reduce memory usage for exa-scale communication. To realize exa-scale communication, we propose true global memory based communication primitives called Advanced Communication Primitives (ACPs). ACPs provide global addresses, remote atomic memory operations on the global memory, an RDMA (Remote Direct Memory Access) based remote memory copy operation, a global heap allocator, and global data libraries. ACPs differ from other communication libraries in that they are global memory based, so housekeeping memory can be distributed to other processes and programmers can explicitly consider memory usage. The preliminary result of memory usage by ACPs is 70 MB on one million processes.

  • Memory Efficient One-Sided Communication Library "ACP" in Globary Memory on Raspberry Pi 2

    Yoshiyuki Morie, Hiroaki Honda, Takeshi Nanri, Taizo Kobayashi, Hidetomo Shibamura, Ryutaro Susukita, Yuichiro Ajima

    36th IEEE International Conference on Distributed Computing Systems, ICDCS 2016  2016.8 

    Event date: 2016.6

    Language:English  

    Venue:Nara   Country:Japan  

    Previously, communications in parallel programs for High Performance Computing (HPC) and Distributed Computing (DC) are mostly written with two-sided communication interfaces that are based on a pair of operations, Send and Receive. Since such interface requires explicit synchronization between both sides of the communication, techniques for communication optimization such as overlapping are not efficiently described in many cases. On the other hand, one-sided communication interface is becoming important as a method to describe asynchronous communications to enable highly overlapped communication with computation. As one of such interface, in this demonstration, Advanced Communication Primitives (ACP) is introduced. ACP is a portable interface that supports UDP, IBverbs of InfiniBand and Tofu library of K Computer. In addition to that, it is designed to be memory efficient. For example, with 10 thousand processes, the memory consumption of ACP over UDP is estimated to be less than 1MB. Since the number of computational elements is increasing more rapidly than the amount of available memory, this memory efficiency is becoming one of the keys for parallel programs in HPC and DC. To show this characteristics, we run ACP library on Raspberry Pi 2, and examine its performance and memory consumption.

  • Evaluation of On-Demand Message-Passing Module over RDMA Network

    Takeshi Nanri

    ACSI2016  2016.1 

    Event date: 2016.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Fukuoka   Country:Japan  

  • Analysis of Storage Workloads of Input-Output Access Locality and Designing of Hybrid Storage System International conference

    Kazuichi Oe, Takeshi Nanri, Koji Okamura

    1st International Conference on Enterprise Architecture and Information Systems  2016.1 

    Event date: 2016.1

    Language:English   Presentation type:Oral presentation (general)  

    Country:Japan  

  • Performance Evaluation of RDMA Communication Patterns by Means of Simulations International conference

    Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri, Hidetomo Shibamura

    2015 Joint International Mechanical, Electronic and Information Technology Conference  2015.12 

    Event date: 2015.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Chongqing   Country:China

  • On-The-Fly Automated Storage Tiering with Caching and both Proactive and Observational Migration International conference

    Kazuichi Oe, Takeshi Nanri, Koji Okamura

    Workshop on Computer Systems and Architectures (CSA'15)  2015.12 

    Event date: 2015.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Sapporo   Country:Japan  

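  • 直接網において複数の通信デバイスを有効に使用する隣接通信アルゴリズムの提案 [Proposal of a Neighbor Communication Algorithm That Makes Effective Use of Multiple Communication Devices on Direct Networks]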

    森江 善之, 南里 豪志

    2015 ハイパフォーマンスコンピューティングと計算科学シンポジウム  2015.5 

    Event date: 2015.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    Country:Japan  

  • Performance and memory usage evaluations for channel interface of Advanced Communication Primitives library International conference

    Hiroaki Honda, Takeshi Nanri, Yoshiyuki Morie

    1st Pan-American Congress on Computational Mechanics (PANACM 2015)  2015.4 

    Event date: 2015.4

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Buenos Aires   Country:Argentina  

  • Channel Interface: a Primitive Model for Memory Efficient Communication International conference

    Takeshi Nanri

    23rd Euromicro International Conference on Parallel, Distributed and network-based Processing  2015.3 

    Event date: 2015.3

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Turku   Country:Finland  

  • Design and Implementation of Channel Interface as a Memory Efficient Communication Model International conference

    Takeshi Nanri

    Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015  2015.1 

    Event date: 2015.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Tsukuba   Country:Japan  

  • Proposal of HINT Interface for Runtime Tuning of Communication Links International conference

    Takeshi Nanri

    22nd Euromicro International Conference on Parallel, Distributed and network-based Processing  2014.2 

    Event date: 2014.2

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Turin   Country:Italy  

  • 性能予測と実測を併用した集団通信アルゴリズム選択 [Collective Communication Algorithm Selection Combining Performance Prediction and Measurement]

    児玉 大器, 南里 豪志

    今後のHPC(基盤技術と応用)に関するワークショップ  2013.12 

    Event date: 2013.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Nagasaki   Country:Japan

  • MPI における最適化情報提供のためのインターフェイスに関する評価 [Evaluation of an Interface for Providing Optimization Information in MPI]

    南里 豪志

    今後のHPC(基盤技術と応用)に関するワークショップ  2013.12 

    Event date: 2013.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Nagasaki   Country:Japan

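  • プログラムのヒント情報を用いた通信ライブラリ動的最適化技術について [On Dynamic Optimization Techniques for Communication Libraries Using Hint Information from Programs]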

    杉山 裕宣, 南里 豪志

    今後のHPC(基盤技術と応用)に関するワークショップ  2013.12 

    Event date: 2013.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Nagasaki   Country:Japan

  • Performance Study of Non-blocking Collective Communication Implementations Toward Adaptive Selection International conference

    Tsuyoshi Okuma, Takeshi Nanri

    Networking, Computing, Systems and Software  2013.12 

    Event date: 2013.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Matsuyama   Country:Japan  

  • Topology Aware Performance Prediction of Collective Communication Algorithms on Multi-Dimensional Mesh/Torus International conference

    Hironobu Sugiyama, Takeshi Nanri

    Networking, Computing, Systems and Software  2013.12 

    Event date: 2013.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Matsuyama   Country:Japan  

  • 通信ライブラリの自動チューニングを支援する Hint API の提案 [Proposal of a Hint API to Support Auto-Tuning of Communication Libraries]

    南里 豪志

    第141回ハイパフォーマンスコンピューティング研究会  2013.10 

    Event date: 2013.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Naha   Country:Japan

  • A neighbor communication algorithm with making an effective use of NICs on multidimensional-mesh/torus International conference

    Yoshiyuki Morie, Takeshi Nanri

    International Conference on Simulation Technology  2013.9 

     More details

    Event date: 2013.9

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Tokyo   Country:Japan  

  • What Communication Library Can do with a Little Hint from Programmers? International conference

    Takeshi Nanri

    MVAPICH User Group Meeting  2013.8 

     More details

    Event date: 2013.8

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Columbus   Country:United States  

  • A Cost-Efficient Approach for Automatic Algorithm Selection of Collective Communications Invited International conference

    Takeshi Nanri, Hironobu Sugiyama, FUKAZAWA Keiichiro

    SIAM Conference on Computational Science and Engineering  2013.3 

     More details

    Event date: 2013.2 - 2013.3

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Boston   Country:United States  

  • 多次元メッシュ/トーラスにおけるプロセス配置に応じた集団通信アルゴリズム選択技術の提案

    南里 豪志, 杉山 裕宣, 森江 善之

    第138回ハイパフォーマンスコンピューティング研究会  2013.2 

     More details

    Event date: 2013.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:あわら市   Country:Japan  

  • 多次元メッシュ/トーラスにおける通信衝突を考慮したタスク配置最適化技術

    森江 善之, 南里 豪志

    ハイパフォーマンスコンピューティングと計算科学シンポジウム  2013.1 

     More details

    Event date: 2013.1

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:東京   Country:Japan  

  • Performance Prediction Technology for Collective Communication Algorithm on Multi-Dimensional Mesh/Torus International conference

    Hironobu Sugiyama, Takeshi Nanri

    International workshop on HPC, Krylov Subspace method and its application  2013.1 

     More details

    Event date: 2013.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Beppu   Country:Japan  

  • Evaluation of Implementation Methods for Non-Blocking Collective Communications in Overlapping Communication and Computation International conference

    Tsuyoshi Okuma, Takeshi Nanri

    International workshop on HPC, Krylov Subspace method and its application  2013.1 

     More details

    Event date: 2013.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Beppu   Country:Japan  

  • Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communication International conference

    Yoshiyuki Morie, Takeshi Nanri

    International workshop on HPC, Krylov Subspace method and its application  2013.1 

     More details

    Event date: 2013.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Beppu   Country:Japan  

  • Introduction of ACE (Advanced Communication library for Exa) Project Invited International conference

    Takeshi Nanri

    International workshop on HPC, Krylov Subspace method and its application  2013.1 

     More details

    Event date: 2013.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Beppu   Country:Japan  

  • 通信衝突を削減するタスク配置最適化における通信タイミングの予測方式の影響

    森江 善之, 南里 豪志

    第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会(HOKKE-20)  2012.12 

     More details

    Event date: 2012.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:札幌市   Country:Japan  

  • An Alternative Domain Decomposition Technique for CUDA-based 3D FDTD Methods International conference

    Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji FUJINO

    9th European Radar Conference  2012.11 

     More details

    Event date: 2012.10 - 2012.11

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Amsterdam   Country:Netherlands  

  • Tofu ネットワークにおけるプロセス配置形状による集団通信アルゴリズムの性能解析

    南里 豪志

    ハイパフォーマンスコンピューティング研究発表会  2012.10 

     More details

    Event date: 2012.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:那覇市   Country:Japan  

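    As supercomputers grow in scale, more systems adopt low-cost multi-dimensional mesh/torus topologies for the inter-node interconnect. On a multi-dimensional mesh/torus, performance varies greatly with the shape of the group of nodes on which processes are placed, even when the number of nodes used is the same. This study targets the Tofu interconnect used in the K computer and its compatible system, the Fujitsu PRIMEHPC FX10, and measures how the shape of the process placement affects the performance of collective communication algorithms. Comparing the measured performance with the transfer wait time caused by communication contention, obtained with a performance analysis tool for the Tofu interconnect, showed that both vary with the placement shape in almost the same way. These results indicate that performance estimation that accounts for the shape of the process placement is important when selecting collective communication algorithms.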

  • 異なるスカラアーキテクチャ(x86、SPARC64)の電磁流体コードによる性能評価

    深沢 圭一郎, 南里 豪志, 高見 利也

    ハイパフォーマンスコンピューティング研究発表会  2012.10 

     More details

    Event date: 2012.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:那覇市   Country:Japan  

  • Impact of GPU Memory Access Patterns on FDTD International conference

    Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji FUJINO

    IEEE Antennas and Propagation Society International Symposium (APSURSI)  2012.7 

     More details

    Event date: 2012.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Chicago   Country:United States  

  • Efficient Runtime Algorithm Selection of Collective Communication with Topology-Based Performance Models International conference

    Takeshi Nanri, Motoyoshi Kurokawa

    International Conference on Parallel and Distributed Processing Techniques and Applications  2012.7 

     More details

    Event date: 2012.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Las Vegas   Country:United States  

  • Effective Performance of Large-Scale MHD Simulation for Planetary Magnetosphere with Massively Parallel Computer International conference

    FUKAZAWA Keiichiro, Takeshi Nanri

    JSST2012 International Conference on Simulation Technology  2012.7 

     More details

    Event date: 2012.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kobe   Country:Japan  

  • Balancing Communication and Execution Technique for Parallelized Sparse Matrix-Vector Multiplication International conference

    Seiji FUJINO, Takeshi Nanri, Kenichirou Kusaba

    4th International Conference on Future Computational Technologies and Applications  2012.7 

     More details

    Event date: 2012.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Nice   Country:France  

  • Task Allocation Optimization for Neighboring Communication on Fat Tree International conference

    Yoshiyuki Morie, Takeshi Nanri

    14th IEEE International Conference on High Performance Computing and Communication  2012.6 

     More details

    Event date: 2012.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Liverpool   Country:United Kingdom  

  • Performance of Large Scale MHD Simulation of Global Planetary Magnetosphere with Massively Parallel Scalar Type Supercomputer Including Post Processing International conference

    FUKAZAWA Keiichiro, Takeshi Nanri

    14th IEEE International Conference on High Performance Computing and Communication  2012.6 

     More details

    Event date: 2012.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Liverpool   Country:United Kingdom  

  • MPI Allreduce の「京」上での実装と評価

    松本 幸,安達 知也,住元 真司,曽我 武史,南里 豪志,宇野 篤也,黒川 原佳,庄司 文由,横川 三津夫

    先進的計算基盤システムシンポジウム(SACSIS2012)  2012.5 

     More details

    Event date: 2012.5

    Presentation type:Oral presentation (general)  

    Venue:神戸   Country:Japan  

  • 並列FMOプログラムOpenFMOの性能最適化

    稲富雄一、眞木 淳、高見利也、本田宏明、小林泰三、南里豪志、青柳睦、南一生

    第133回ハイパフォーマンスコンピューティング研究会  2012.3 

     More details

    Event date: 2012.3

    Presentation type:Oral presentation (general)  

    Venue:神戸   Country:Japan  

  • ランク配置に応じた集団通信アルゴリズム動的選択技術の提案

    南里豪志、黒川原佳

    第133回ハイパフォーマンスコンピューティング研究会  2012.3 

     More details

    Event date: 2012.3

    Presentation type:Oral presentation (general)  

    Country:Japan  

  • スケーラブルな通信ライブラリ実装技術

    南里 豪志

    第8回戦略的高性能計算システム開発に関するワークショップ  2012.2 

     More details

    Event date: 2012.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:東京   Country:Japan  

  • 通信ライブラリにおける実行時自動チューニング技術 Invited

    南里 豪志

    第3回自動チューニング技術の現状と応用に関するシンポジウム  2011.12 

     More details

    Event date: 2011.12

    Presentation type:Oral presentation (general)  

    Venue:東京大学   Country:Japan  

  • MPI Allreduce の「京」上での実装と評価

    松本幸,安達知也,田中稔,住元真司,曽我武史,南里豪志

    第19回ハイパフォーマンスコンピューティングとアーキテクチャの評価に関する北海道ワークショップ  2011.11 

     More details

    Event date: 2011.11

    Presentation type:Oral presentation (general)  

    Country:Japan  

  • Effect of Dynamic Algorithm Selection of All-to-All Communication on Environments with Unstable Network Speed International conference

    Takeshi Nanri and Motoyoshi Kurokawa

    International Conference on High Performance Computing & Simulation  2011.7 

     More details

    Event date: 2011.7

    Presentation type:Oral presentation (general)  

    Venue:Istanbul   Country:Turkey  

  • A Method for Predicting a Penalty of Contentions by Considering Priorities of Routing among Packets on Direct Interconnection Network International conference

    Yoshiyuki Morie, Takeshi Nanri, Ryutaro Susukita and Koji Inoue

    International Joint Conference on Computational Sciences and Optimization 2011  2011.4 

     More details

    Event date: 2011.4

    Presentation type:Oral presentation (general)  

    Venue:Kunming   Country:China  

  • Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communications International conference

    Yoshiyuki Morie, Takeshi Nanri, and Motoyoshi Kurokawa

    The Tenth IASTED International Conference on Parallel and Distributed Computing and Networks  2011.2 

     More details

    Event date: 2011.2

    Presentation type:Oral presentation (general)  

    Venue:Innsbruck   Country:Austria  

  • 通信と計算の負荷を考慮した並列疎行列ベクトル積の動的負荷分散技術

    草場健一郎,南里豪志,藤野清次

    2010年並列/分散/協調処理に関する『金沢』サマー・ワークショップ  2010.8 

     More details

    Event date: 2010.8

    Presentation type:Oral presentation (general)  

    Venue:金沢   Country:Japan  

  • Runtime Load-balancing Technique for Sparse Matrix-Vector Multiplication International conference

    Kenichiro Kusaba, Takeshi Nanri and Seiji Fujino

    International Workshop on Innovative Architecture  2010.3 

     More details

    Event date: 2010.3

    Presentation type:Oral presentation (general)  

    Venue:Kona   Country:United States  

  • A Robust Dynamic Optimization for MPI Alltoall Operation International conference

    Hyacinthe Nzigou Mamadou, Takeshi Nanri, and Kazuaki Murakami

    18th International Heterogeneity in Computing Workshop  2009.5 

     More details

    Event date: 2009.5

    Presentation type:Oral presentation (general)  

    Venue:Rome   Country:Italy  

  • 階層型並列計算機向けPAGMEつきCG法の実装と性能解析

    馬場 慎也, 南里 豪志, 藤野 清次, 染原 一仁

    計算工学講演会  2009.5 

     More details

    Event date: 2009.5

    Presentation type:Oral presentation (general)  

    Venue:東京   Country:Japan  

    Implementation and Performance Evaluation of Parallelized CG method with PAGME - Preconditioning Method on Hierarchical Parallel Computers

  • A Dynamic Solution for Efficient MPI Collective Communications International conference

    Hyacinthe Nzigou Mamadou, Feng Long Gu, Vivien Oddou, Takeshi Nanri, Kazuaki Murakami

    International Workshop on HPC and Grid Applications  2009.4 

     More details

    Event date: 2009.4

    Presentation type:Oral presentation (general)  

    Venue:Sanya, Hainan   Country:China  

  • Profiling Technique for Dynamic Optimization According to Waiting Time International conference

    Takeshi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

    HPC Asia  2009.3 

     More details

    Event date: 2009.3

    Presentation type:Oral presentation (general)  

    Venue:Kaohsiung   Country:Taiwan, Province of China  

  • Dependence on loop distribution of performance in hybrid-parallel IDR(s) method International conference

    Shinya Baba, Yusuke Onoue, Takeshi Nanri and Seiji Fujino

    HPC Asia  2009.3 

     More details

    Event date: 2009.3

    Presentation type:Oral presentation (general)  

    Venue:Kaohsiung   Country:Taiwan, Province of China  

  • 並列版 PAGME つき CG 法の性能解析

    馬場慎也, 南里豪志, 藤野清次, 染原一仁

    情報処理学会ハイパフォーマンスコンピューティング研究会  2008.12 

     More details

    Event date: 2008.12

    Presentation type:Oral presentation (general)  

    Venue:福岡   Country:Japan  

    Performance analysis of the CG method with parallelized PAGME

  • 性能モデルによる予測を併用した Alltoallアルゴリズム動的選択技術の評価

    南里豪志, Hyacinthe Nzigou Mamadou, Feng Long Gu, 村上和彰

    情報処理学会ハイパフォーマンスコンピューティング研究会  2008.12 

     More details

    Event date: 2008.12

    Presentation type:Oral presentation (general)  

    Venue:福岡   Country:Japan  

    Evaluation of Dynamic Algorithm Selection with Performance Prediction Models on Alltoall Operation

  • ハイブリッド並列化したIDR(s)法の計算時間に対するプロセス数とスレッド数の組み合わせ依存性について

    馬場慎也、南里豪志、藤野清次

    情報処理学会ハイパフォーマンスコンピューティング研究会  2008.5 

     More details

    Event date: 2008.5

    Presentation type:Oral presentation (general)  

    Venue:東京都   Country:Japan  

    Dependence on combination with number of processes and threads for computation times of hybrid-parallel version of IDR(s) Method

  • Effect of Reordering Internal Messages in MPI Broadcast According to the Load Imbalance International conference

    Takeshi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

    IWIA '08  2008.1 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Hilo   Country:United States  

  • Performance Analysis and Linear Optimization Modeling of All-to-all Collective Communication Algorithms International conference

    Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami

    SBAC-PAD 2007  2007.10 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Gramado   Country:Brazil  

  • Dynamic Optimization of Load Balance in MPI Broadcast International conference

    Takeshi Soga, Kouji Kurihara, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

    Euro PVM/MPI 2007  2007.10 

     More details

    Venue:Paris   Country:France  

  • SMMH - A Parallel Heuristic for Combinatorial Optimization Problems International conference

    Guilherme Domingues, Yoshiyuki Morie, Feng Long Gu, Takeshi Nanri and Kazuaki Murakami

    International Conference on Computational Methods in Science and Engineering 2007  2007.9 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Corfu   Country:Greece  

  • Investigating the Performance of Collective Communications on SMP Clusters: A Case for MPI_Allgather International conference

    Feng Long Gu, Hyacinthe Nzigou Mamadou, Guilherme Domingues, Takeshi Nanri and Kazuaki Murakami

    International Conference on Computational Methods in Science and Engineering 2007  2007.9 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Corfu   Country:Greece  

  • Evaluation of the Performance of Parallel Sparse-Matrix Multiplication and the Effect of Dynamic Load-Balancing International conference

    Takeshi Nanri, Takeshi Soga, Koji Kurihara, Feng Long Gu, Hiroaki Ishihata and Kazuaki Murakami

    International Conference on Computational Methods in Science and Engineering 2007  2007.9 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Corfu   Country:Greece  

  • A Study of All-to-all Collective Communication Algorithms on Modern High Performance System Architectures International conference

    Hyacinthe Nzigou Mamadou, Feng Long Gu, Takeshi Nanri, Kazuaki Murakami

    High Performance Computing International Conference (HPC Asia) 2007  2007.9 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Seoul   Country:Korea, Republic of  

  • 負荷ばらつきを考慮したMPIブロードキャスト通信の動的最適化に関する研究

    栗原 康志,Hyacinthe Nzigou Mamadou,南里 豪志,末安 直樹,松本透,井上 弘士,村上 和彰

    SWoPP2007  2007.8 

     More details

    Presentation type:Oral presentation (general)  

    Venue:旭川市   Country:Japan  

  • 通信タイミングを考慮した衝突削減のためのMPIランク配置最適化技術

    森江 善之, 末安 直樹, 松本 透, 南里 豪志, 石畑 宏明, 井上 弘士, 村上 和彰

    先進的計算基盤システムシンポジウム (SACSIS2007)  2007.5 

     More details

    Presentation type:Oral presentation (general)  

    Venue:東京   Country:Japan  

  • 通信タイミングを考慮したMPI ランク配置最適化技術

    森江 善之, 末安 直樹, 松本 透, 南里 豪志, 石畑 宏明, 井上 弘士, 村上 和彰

    HOKKE2007  2007.3 

     More details

    Presentation type:Oral presentation (general)  

    Venue:札幌市   Country:Japan  

  • Collective Communication Costs Analysis over Gigabit Ethernet and InfiniBand International conference

    Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami

    High Performance Computing - HiPC 2006  2006.12 

     More details

    Presentation type:Oral presentation (general)  

    Country:India  

  • Implementation of GAMESS on Parallel Computers: TCP/IP versus MPI International conference

    Feng Long Gu, Takeshi Nanri and Kazuaki Murakami

    International Conference of Computational Methods in Sciences and Engineering  2006.10 

     More details

    Presentation type:Oral presentation (general)  

    Country:Greece  

  • 並列計算機の大規模化に向けた MPI の Alltoall通信アルゴリズムの性能評価

    南里 豪志

    第10回環瀬戸内応用数理研究部会シンポジウム  2006.7 

     More details

    Presentation type:Oral presentation (general)  

    Venue:沖縄県   Country:Japan  

  • Performance comparison of vector-calculations between Itanium2 and other processors International conference

    T. Nanri, Y. Watanabe, H. Sato

    International Workshop on Innovative Architecture  2005.1 

     More details

    Presentation type:Oral presentation (general)  

    Venue:Hawaii   Country:United States  

  • Design and Implementation of an Adaptive Distributed Shared Memory System International conference

    Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

    International Conference of Parallel and Distributed Computing and Systems  2001.8 

     More details

    Venue:Anaheim   Country:United States  

  • Preliminary Investigation of Distributed Shared Memory System on a Cluster of High Performance Clusters International conference

    Takeshi Nanri, Yoshitaka Watanabe, Hiroyuki Sato and Masaaki Shimasaki

    European Congress on Computational Methods in Applied Sciences and Engineering  2000.9 

     More details

    Venue:Barcelona   Country:Spain  

  • Effects of Scheduling Attributes on Multithread-Based Software DSM System International conference

    Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

    Workshop on Scheduling Algorithms for Parallel/Distributed Computing  1999.7 

     More details

    Venue:Rhodes   Country:Greece  

  • Implementation of PVM-based Distributed Shared Memory System International conference

    Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

    International Conference on Parallel and Distributed Processing Techniques and Applications  1998.7 

     More details

    Venue:Las Vegas   Country:United States  

  • 非ブロッキング集団通信の通信隠蔽効果に関する調査

    Takeshi Nanri, Satoshi Ohshima, Kenji Ono

    2017.12 

     More details

    Language:Japanese  

    Country:Other  

  • スーパーコンピュータシステムITOの性能評価

    Satoshi Ohshima, Takeshi Nanri, Yoshitaka Watanabe, Hirofumi Amano, Kenji Ono

    2017.12 

     More details

    Language:Japanese  

    Country:Other  

  • Attribute-based quality classification of academic papers

    Tetsuya Nakatoh, Sachio Hirokawa, Toshiro Minami, Takeshi Nanri, Miho Funamori

    2017.11 

     More details

    Language:English  

    Country:Other  

    Investigating the relevant literature is very important for research activities. However, it is difficult to select the most appropriate and important academic papers from the enormous number of papers published annually. Researchers search paper databases by combining keywords, and then select papers to read using some evaluation measure—often, citation count. However, the citation count of recently published papers tends to be very small because citation count measures accumulated importance. This paper focuses on the possibility of classifying high-quality papers superficially using attributes such as publication year, publisher, and words in the abstract. To examine this idea, we construct classifiers by applying machine-learning algorithms and evaluate these classifiers using cross-validation. The results show that our approach effectively finds high-quality papers.


MISC

  • スーパーコンピュータ玄界の性能評価

    大島聡史, 南里豪志, 美添一樹

    情報処理学会研究報告(Web)   2024 ( HPC-196 )   2024

  • Implementation and Performance Evaluation of Discontinuous Data Transfer in Halo Communication with Tofu Interconnect

    有迫廉真, 南里豪志

    情報処理学会研究報告(Web)   2024 ( HPC-195 )   2024

  • Introduction of multi-site sharing experiment of ultra-high-resolution meteorological satellite images using “JHPCN Wide Area Distributed Cloud” and tiled displays

    川鍋友宏, 村田健史, 山本和憲, 村永和哉, 樋口篤志, 豊嶋紘一, 深沢圭一郎, 小野謙二, 南里豪志

    日本地球惑星科学連合大会予稿集(Web)   2022   2022

  • FX100における永続型集団通信関数のプロトタイプ実装と評価

    森江善之, 畑中正行, 高木将通, 堀敦史, 石川裕, 南里豪志

    情報処理学会研究報告(Web)   2018.2

     More details

    Language:Japanese  


  • スーパーコンピュータシステムITOの性能評価

    大島聡史, 南里豪志, 渡部善隆, 天野浩文, 小野謙二

    情報処理学会研究報告(Web)   2017.12

     More details

    Language:Japanese  


  • ACPライブラリの通信性能およびメモリ使用量の評価

    森江善之, 本田宏明, 南里豪志

    情報処理学会研究報告(Web)   2016.2

     More details

    Language:Japanese  


  • ACPライブラリによるMPI_Comm_spawnの置き換えとOpenFMOへの適用

    本田宏明, 森江善之, 南里豪志, 稲富雄一, 高見利也

    情報処理学会研究報告(Web)   2016.2

     More details

    Language:Japanese  


  • ステンシル計算における効率的なHalo通信・計算モデルの開発

    深沢圭一郎, 森江善之, 曽我武史, 高見利也, 南里豪志

    情報処理学会研究報告(Web)   2016.2

     More details

    Language:Japanese  

    Development of Effective Halo Communication and Calculation Model on Stencil Computation

  • ACP通信ライブラリを用いたOpenFMOプログラムの実装

    本田宏明, 森江善之, 南里豪志, 稲富雄一, 高見利也

    日本コンピュータ化学会年会講演予稿集   2015.10

     More details

    Language:Japanese  


  • エクサスケールコンピューティングに向けたHaloスレッドの電磁流体シミュレーションに対する効果

    深沢圭一郎, 森江善之, 曽我武史, 高見利也, 南里豪志

    情報処理学会研究報告(Web)   2015.9

     More details

    Language:Japanese  

    Effects of Halo Thread to the Magnetohydrodynamic Simulation toward Exascale Computing

  • RDMAにおける同期通信のインターコネクトシミュレーション

    薄田竜太郎, 森江善之, 南里豪志, 柴村英智

    電子情報通信学会技術研究報告   2015.7

     More details

    Language:Japanese  

    Interconnection Network Simulation of Synchronization Communication in RDMA

  • InfiniBandによるACP基本層の実装と評価

    森江善之, 南里豪志, 安島雄一郎, 本田宏明, 曽我武史, 小林泰三, 住元真司

    情報処理学会研究報告(Web)   2015.2

     More details

    Language:Japanese  

    Implementation and Evaluation of ACP Basic layer

  • ACPライブラリの集団通信インターフェース

    本田宏明, 山田博厚, 森江善之, 南里豪志, 高見利也

    情報処理学会研究報告(Web)   2015.2

     More details

    Language:Japanese  


  • RDMA評価のための大規模インターコネクトシミュレータ「NSIM‐ACE」

    薄田竜太郎, 森江善之, 南里豪志, 柴村英智

    情報処理学会研究報告(Web)   2014.12

     More details

    Language:Japanese  

    NSIM-ACE: A Simulator for Evaluating RDMA on Large-Scale Interconnection Networks

  • 多次元メッシュ/トーラスにおけるプロセス配置に応じた集団通信アルゴリズム選択技術の提案

    南里豪志, 杉山裕宣, 森江善之

    情報処理学会研究報告(CD-ROM)   2013.4

     More details

    Language:Japanese  

    Proposal of a Method for Selecting Algorithm of Collective Communications on Multi-Dimensional Mesh/Torus

  • 通信衝突を削減するタスク配置最適化における通信タイミングの予測方式の影響

    森江善之, 南里豪志

    情報処理学会研究報告(CD-ROM)   2013.2

     More details

    Language:Japanese  


  • 通信衝突削減のためのタスク配置最適化の評価

    森江善之, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

    情報処理学会研究報告   2008.3

     More details

    Language:Japanese  

    Evaluation of optimization of task allocation for reducing contentions

  • OpenMP入門(4)

    南里 豪志

    計算工学   2007.7

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

  • OpenMP入門(3)

    南里 豪志

    計算工学   2007.4

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

  • 通信タイミングを考慮したランク配置最適化技術

    森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

    情報処理学会研究報告   2007.3

     More details

    Language:Japanese  

    Optimization of rank allocation considering communication timing

  • OpenMP入門(2)

    南里 豪志

    計算工学   2007.1

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

  • OpenMP入門(1)

    南里 豪志

    計算工学   2006.10

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

  • MPIによる並列プログラミング入門

    南里 豪志

    プラズマ・核融合学会誌   2003.8

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  


Industrial property rights

Patent   Number of applications: 1   Number of registrations: 1
Utility model   Number of applications: 0   Number of registrations: 0
Design   Number of applications: 0   Number of registrations: 0
Trademark   Number of applications: 0   Number of registrations: 0

Professional Memberships

  • 情報処理学会

  • IEEE

Committee Memberships

  • 電子情報通信学会九州支部   Secretary for General Affairs   Domestic

    2019.4 - 2021.3   

  • IEEE Fukuoka Section   Secretary   Domestic

    2019.4 - 2021.3   

  • 情報処理学会ハイパフォーマンスコンピューティング研究会   Steering committee member   Domestic

    2016.4 - 2020.3   

Academic Activities

  • Program Chair International contribution

    7th International Workshop on Large-scale HPC Application Modernization  ( Japan ) 2020.11

     More details

    Type:Competition, symposium, etc. 

  • Track chair International contribution

    HPC Asia 2020  ( Fukuoka Japan ) 2020.1

     More details

    Type:Competition, symposium, etc. 

  • Executive Committee Vice Chair

    AXIES2019  ( Japan ) 2019.12

     More details

    Type:Competition, symposium, etc. 

  • Planning Committee Member

    男女共同参画シンポジウム  ( Japan ) 2019.9

     More details

    Type:Competition, symposium, etc. 

  • 電子情報通信学会誌

    2019.4 - 2021.3

     More details

    Type:Academic society, research group, etc. 

  • Local Arrangement

    ACSI2016  ( Fukuoka Japan ) 2016.1

     More details

    Type:Competition, symposium, etc. 

    Number of participants:110

  • Committee International contribution

    International Workshop LENS (Language, Network and System Software) 2015  ( Tokyo Japan ) 2015.10

     More details

    Type:Competition, symposium, etc. 

    Number of participants:50

  • Session Chair

    2009年並列/分散/協調処理に関する 『仙台』サマー・ワークショップ  ( Japan ) 2009.8

     More details

    Type:Competition, symposium, etc. 

  • Executive Committee Member

    SWoPP2007  ( Japan ) 2007.8 - Present

     More details

    Type:Competition, symposium, etc. 

    Number of participants:200

  • Executive Committee Chair

    SWoPP2006  ( Japan ) 2006.7 - Present

     More details

    Type:Competition, symposium, etc. 

    Number of participants:200


Research Projects

  • 次世代計算基盤に係る調査研究(文部科学省)

    2022.7 - 2024.3

    理化学研究所 

      More details

    Authorship:Coinvestigator(s) 

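    Next-generation computing infrastructure is expected to serve as a platform for solving problems on the way to achieving the SDGs and Society 5.0. As a foundation for advanced digital twins that bring "research DX" (digital transformation of research) to future science, this project aims at a highly versatile computing infrastructure that makes full use of a wide range of computational and simulation methods and large-scale data, and that can execute entire workflows while these elements cooperate closely. To that end, it investigates the desirable architecture and the system software and library technologies through co-design with applications.
    In particular, as a basic principle of system design, it pursues "FLOPS to Byte"-oriented system construction, which secures the required computational performance while taking arithmetic precision into account and makes data movement more advanced and efficient under power constraints, in everything from architecture development to algorithm design and application technologies.
    Under an all-Japan framework, it surveys and examines system configurations and component technologies for next-generation computing infrastructure that improve effective performance, and develops those component technologies, through co-design of architecture, system software, and applications.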

  • Implementation of Efficient Asynchronously Coupled Computation with Timed Buffer on NVDIMM

    Grant number:22K12049  2022 - 2024

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    南里 豪志, 深沢 圭一郎, 加藤 雄人

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

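    Coupled computation, which connects programs that each solve one phenomenon, is attracting attention as a way to computationally solve problems that involve multiple phenomena. One challenge in coupled computation is the synchronization wait caused by differences in the speed at which the programs progress. This research implements a buffer that stores data in time series on NVDIMM, an inexpensive, large-capacity non-volatile memory, and thereby aims to realize asynchronous coupled computation with little synchronization wait. A team of researchers specializing in fundamental system software for computers and researchers who develop various simulation programs works on the project together to develop highly practical technology. The results will be released on GitHub and similar platforms so that they can be widely used.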

    CiNii Research

  • システムソフトウェア・ライブラリ調査研究

    2022 - 2023

    文部科学省 次世代計算基盤に係る調査研究事業

      More details

    Authorship:Coinvestigator(s)  Grant type:Contract research

  • 不揮発性メモリへ高効率にRDMAする技術の研究・開発

    2020.10 - 2021.3

    Joint research

      More details

    Authorship:Principal investigator  Grant type:Other funds from industry-academia collaboration

  • 量子計算及びイジング計算システムの統合型研究開発(NEDO)

    2020.4 - 2027.3

    産業技術総合研究所 

      More details

    Authorship:Coinvestigator(s) 

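    To realize a super-smart society, digitalization is advancing in diverse industrial fields such as advanced mobility services, smart factories, finance, and drug discovery, and the accompanying social demand for high-performance next-generation computing is growing rapidly. This project integrates three NEDO projects as of April 2021: "Research and development of quantum annealing technology using superconducting parametron devices" (from FY2018), "Research and development of a common software platform for Ising machines" (from FY2018), and "Development of an integrated quantum computing system combining superconductor and semiconductor technologies" (from FY2020). It carries out full-stack integrated research and development based on industry-academia-government collaboration.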

  • 量子計算及びイジング計算システムの統合型研究開発

    2020 - 2027

    NEDO 高効率・高速処理を可能とするAIチップ・次世代コンピューティングの技術開発

      More details

    Authorship:Coinvestigator(s)  Grant type:Contract research

  • 不揮発性メモリへ高効率にRDMAする技術の研究・開発

    2019.9 - 2020.3

    Joint research

      More details

    Authorship:Principal investigator  Grant type:Other funds from industry-academia collaboration

  • NVDIMM上の通信バッファによるスケーラブルな非同期通信レイヤの開発

    Grant number:19K11991  2019 - 2021

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    南里 豪志

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

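    NVDIMM, a non-volatile memory that can be installed in DIMM slots, is attracting attention as a memory device that consumes less power and is cheaper than DRAM and whose capacity is easy to increase. This research develops a communication layer that uses NVDIMM as the buffer area inside a communication library. This enables communication hiding with non-blocking point-to-point communication on large-scale parallel computers, so the scalability of parallel applications is expected to improve. In addition, by switching between buffers on DRAM and buffers on NVDIMM according to runtime conditions such as communication frequency, it aims to reduce the performance impact of NVDIMM latency, which is expected to be 1 to 10 microseconds.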

    CiNii Research

  • エクサスケールスパコンの省エネ化に向けたシステム電力管理戦略の研究

    2018 - 2020

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

      More details

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

  • 超並列において高スケーラビリティを実現するステンシル計算・通信モデルの開発

    2018 - 2020

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

      More details

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

  • Development of Time-Reversal Method for Detecting Multiple Moving Targets Behind the Wall International coauthorship

    2017.4 - 2018.3

    JHPCN (Japan) 

      More details

    Authorship:Principal investigator 

    Many imaging systems exist for seeing through walls or detecting cancer, such as MRI for medical imaging. However, the current technologies are neither cheap nor available everywhere. One cheap alternative to such expensive systems is microwave imaging using the Time Reversal (TR) method, which was first introduced in acoustics. TR has found applications in disciplines ranging from non-destructive testing and underwater communications to medicine. TR has also been studied for Ground Penetrating Radar (GPR) as well as Through the Wall Imaging (TWI). The TR method with super-resolution techniques such as Decomposition Of the Time-Reversal Operator (DORT in its French acronym) or MUltiple SIgnal Classification (MUSIC) requires more than 150 Fast Fourier Transforms and more than 20000 singular value decompositions even for a very small imaging system consisting of 13 antenna elements. The current approach is therefore far from real time because of its long computational time. Furthermore, there is high demand for the detection of multiple moving targets, but work in this field is scarce. Detecting multiple moving targets behind a wall is one of the most challenging scenarios in through-the-wall microwave imaging. So far, Fumie Costen at the University of Manchester has developed spatio-temporal windowing for the differential MDM (multi-static data matrix) for the time reversal algorithm to detect multiple moving objects in a simple canonical case. This project will develop and verify an algorithm to detect multiple moving targets with high computational efficiency.

  • Creation of Next-Generation Coupled Simulation Techniques for Planetary Magnetospheres Using a Scalable Communication Library

    2017 - 2019

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Challenging Research(Exploratory)

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

  • Research and Development of a Persistent Collective Communication Interface for MPI

    2015 - 2017

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Authorship:Principal investigator  Grant type:Scientific research funding

  • Research and Development of Communication-Hiding Techniques for Programs in the Parallel Language CAF

    Grant number:24500068  2012 - 2014

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Authorship:Principal investigator  Grant type:Scientific research funding

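  • Development of a Scalable Communication Library with Memory-Saving and Dynamic Optimization Techniques (JST CREST research area "Development of System Software Technologies for Post-Peta Scale High Performance Computing")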

    2011.10 - 2017.3

    Kyushu University (Japan)

    Authorship:Principal investigator 

    Within a decade, the number of processing cores on supercomputers is predicted to be more than 100 million. This project researches technologies for memory saving and runtime optimizations to implement a scalable communication library that will be required on such large scale computers. In addition to that, the project also develops methods for building scalable applications by utilizing facilities of the communication library.

  • Development of a Scalable Communication Library with Memory-Saving and Dynamic Optimization Techniques

    2011 - 2016

    Grants-in-Aid for Scientific Research  Strategic Basic Research Programs

    Authorship:Principal investigator  Grant type:Competitive funding other than Grants-in-Aid for Scientific Research

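  • Research on Communication and Numerical Libraries That Withstand Large-Scale Parallel Computing Environments with More Than 100 Million Cores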

    2011

    Education and Research Program / Research Hub Formation Project (Special Category: Additional Selection)

    Authorship:Principal investigator  Grant type:On-campus funds, funds, etc.

  • Development of Dynamic Communication Optimization Techniques for the Parallel Language CAF

    Grant number:21700036  2009 - 2011

    Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Authorship:Principal investigator  Grant type:Scientific research funding

  • Development of an OpenMP Processing Environment on Hierarchical Clusters Using IPv6 and Myrinet

    Grant number:18700065  2006 - 2008

    Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Authorship:Principal investigator  Grant type:Scientific research funding

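  • Development of Petascale System Interconnect Technologies (MEXT "Research and Development for Building Next-Generation IT Infrastructure", R&D area "Research and Development of Elemental Technologies for Future Supercomputing" (FY2005 - FY2007))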

    2005.4 - 2008.3

    Kyushu University (Japan)

    Authorship:Coinvestigator(s) 

    PSI is one of the national projects on elemental technologies for peta-scale computing systems. The project works intensively on the topics of system interconnection networks: Optical switches, Intelligent interconnects and Performance prediction.

  • Research on the Development of a Compilation and Execution Environment for OpenMP Programs on Hierarchical Cluster Systems

    Grant number:15700033  2003 - 2005

    Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Authorship:Principal investigator  Grant type:Scientific research funding

  • Development of a Stencil Computation and Communication Model Achieving High Scalability in Massively Parallel Environments

    Grant number:18K11336 

    Keiichiro Fukazawa (深沢 圭一郎), Takeshi Nanri (南里 豪志)

    Grant type:Scientific research funding

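    This research aimed to develop a stencil computation and communication model that shows no loss of scalability in exascale environments, and to develop the Halo communication functions used in that model.
    First, for stencil simulations, we developed a model that divides threads between "computation" and "computation that requires communication, together with the communication itself". This removes the need for synchronization to detect that communication has finished, and so avoids degradation of parallel performance. Next, we packaged the communication model used there into a set of functions (Halo functions) so that other applications can use it easily. We measured the performance of both in an environment using 2,000 nodes and confirmed high scalability.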

Educational Activities

  • For graduate students: teaching classes on high-performance parallel computing and networks.
    For undergraduate students: teaching classes on programming and networks.

Class subject

  • Advanced Topics in High-Performance Parallel Computing II

    2024.6 - 2024.8   Summer quarter

  • High-Performance Parallel Computing II

    2024.6 - 2024.8   Summer quarter

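  • [Full year] Research in Information Science and Technology I

    2024.4 - 2025.3   Full year

  • [Full year] Advanced Seminar in Information Science and Technology

    2024.4 - 2025.3   Full year

  • [Full year] Exercises in Information Science and Technology

    2024.4 - 2025.3   Full year

  • [Master's] Advanced Topics in High-Performance Parallel Computing

    2024.4 - 2024.9   First semester

  • Discussion in Information Science and Technology I

    2024.4 - 2024.9   First semester

  • Writing in Information Science and Technology I

    2024.4 - 2024.9   First semester

  • Reading in Information Science and Technology

    2024.4 - 2024.9   First semester

  • Advanced Topics in High-Performance Parallel Computing I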

    2024.4 - 2024.6   Spring quarter

  • High-Performance Parallel Computing I

    2024.4 - 2024.6   Spring quarter

  • (IUPE)Int. to Information Processing II

    2023.12 - 2024.2   Winter quarter

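  • Communication Networks II

    2023.12 - 2024.2   Winter quarter

  • Advanced Topics in Information Networks

    2023.12 - 2024.2   Winter quarter

  • Communication Networks B

    2023.12 - 2024.2   Winter quarter

  • (Second semester) Communication Networks

    2023.10 - 2024.3   Second semester

  • Discussion in Information Science and Technology II

    2023.10 - 2024.3   Second semester

  • Writing in Information Science and Technology II

    2023.10 - 2024.3   Second semester

  • Presentation in Information Science and Technology

    2023.10 - 2024.3   Second semester

  • Communication Networks I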

    2023.10 - 2023.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2023.10 - 2023.12   Fall quarter

  • Communication Networks A

    2023.10 - 2023.12   Fall quarter

  • Advanced Topics in High-Performance Parallel Computing II

    2023.6 - 2023.8   Summer quarter

  • High-Performance Parallel Computing II

    2023.6 - 2023.8   Summer quarter

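  • [Full year] Research in Information Science and Technology I

    2023.4 - 2024.3   Full year

  • [Full year] Advanced Seminar in Information Science and Technology

    2023.4 - 2024.3   Full year

  • [Full year] Exercises in Information Science and Technology

    2023.4 - 2024.3   Full year

  • [Master's] Advanced Topics in High-Performance Parallel Computing

    2023.4 - 2023.9   First semester

  • Discussion in Information Science and Technology I

    2023.4 - 2023.9   First semester

  • Writing in Information Science and Technology I

    2023.4 - 2023.9   First semester

  • Reading in Information Science and Technology

    2023.4 - 2023.9   First semester

  • Introduction to Electrical Engineering and Computer Science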

    2023.4 - 2023.6   Spring quarter

  • High-Performance Parallel Computing I

    2023.4 - 2023.6   Spring quarter

  • Advanced Topics in High-Performance Parallel Computing I

    2023.4 - 2023.6   Spring quarter

  • Fundamentals of Cybersecurity

    2023.4 - 2023.6   Spring quarter

  • Fundamentals of Cybersecurity

    2023.4 - 2023.6   Spring quarter

  • (IUPE)Int. to Information Processing II

    2022.12 - 2023.2   Winter quarter

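  • Advanced Topics in Information Networks

    2022.12 - 2023.2   Winter quarter

  • Communication Networks B

    2022.12 - 2023.2   Winter quarter

  • (Second semester) Communication Networks

    2022.10 - 2023.3   Second semester

  • Discussion in Information Science and Technology II

    2022.10 - 2023.3   Second semester

  • Writing in Information Science and Technology II

    2022.10 - 2023.3   Second semester

  • Presentation in Information Science and Technology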

    2022.10 - 2023.3   Second semester

  • (IUPE)Int. to Information Processing I

    2022.10 - 2022.12   Fall quarter

  • Communication Networks A

    2022.10 - 2022.12   Fall quarter

  • High-Performance Parallel Computing II

    2022.6 - 2022.8   Summer quarter

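  • Advanced Topics in High-Performance Parallel Computing II

    2022.6 - 2022.8   Summer quarter

  • Advanced Seminar in Information Science and Technology

    2022.4 - 2023.3   Full year

  • Research in Information Science and Technology I

    2022.4 - 2023.3   Full year

  • Exercises in Information Science and Technology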

    2022.4 - 2023.3   Full year

  • High-Performance Parallel Computing

    2022.4 - 2022.9   First semester

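  • [Master's] Advanced Topics in High-Performance Parallel Computing

    2022.4 - 2022.9   First semester

  • Reading in Information Science and Technology

    2022.4 - 2022.9   First semester

  • Writing in Information Science and Technology I

    2022.4 - 2022.9   First semester

  • Discussion in Information Science and Technology I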

    2022.4 - 2022.9   First semester

  • High-Performance Parallel Computing I

    2022.4 - 2022.6   Spring quarter

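  • Fundamentals of Cybersecurity

    2022.4 - 2022.6   Spring quarter

  • Fundamentals of Cybersecurity

    2022.4 - 2022.6   Spring quarter

  • Advanced Topics in High-Performance Parallel Computing I

    2022.4 - 2022.6   Spring quarter

  • Advanced Topics in Information Networks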

    2021.12 - 2022.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2021.12 - 2022.2   Winter quarter

  • Advanced Topics in Information Networks

    2021.12 - 2022.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2021.12 - 2022.2   Winter quarter

  • (IUPE)Int. to Information Processing I

    2021.10 - 2021.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2021.10 - 2021.12   Fall quarter

  • (IUPE)Int. to Information Processing II

    2020.12 - 2021.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2020.12 - 2021.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2020.12 - 2021.2   Winter quarter

  • Advanced Topics in Information Networks

    2020.10 - 2021.3   Second semester

  • (IUPE)Int. to Information Processing I

    2020.10 - 2020.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2020.10 - 2020.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2020.10 - 2020.12   Fall quarter

  • (IUPE)Int. to Information Processing II

    2019.12 - 2020.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2019.12 - 2020.2   Winter quarter

  • Advanced Topics in Information Networks

    2019.10 - 2020.3   Second semester

  • Advanced Topics in Information Networks

    2019.10 - 2020.3   Second semester

  • (IUPE)Int. to Information Processing I

    2019.10 - 2019.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2019.10 - 2019.12   Fall quarter

  • Introduction to Information Processing

    2019.4 - 2019.6   Spring quarter

  • (IUPE) Introduction to Information Processing

    2019.4 - 2019.6   Spring quarter

  • (IUPE) Introduction to Information Processing

    2019.4 - 2019.6   Spring quarter

  • Advanced Topics in Information Networks

    2018.10 - 2019.3   Second semester

  • Introduction to Information Processing

    2018.4 - 2018.6   Spring quarter

  • Introduction to Information Processing

    2018.4 - 2018.6   Spring quarter

  • Advanced Topics in Information Networks

    2017.10 - 2018.3   Second semester

  • Introduction to Information Processing

    2017.4 - 2017.9   First semester

  • Introduction to Information Processing

    2017.4 - 2017.6   Spring quarter

  • Introduction to Information Processing

    2017.4 - 2017.6   Spring quarter

  • Advanced Topics in Information Networks

    2016.10 - 2017.3   Second semester

  • Introduction to Information Processing

    2016.4 - 2016.9   First semester

  • Advanced Topics in Information Networks

    2015.10 - 2016.3   Second semester

  • Introduction to Information Processing

    2015.4 - 2015.9   First semester

  • Advanced Topics in Information Networks

    2014.10 - 2015.3   Second semester

  • Introduction to Information Processing

    2014.4 - 2014.9   First semester

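  • Advanced Topics in Information Networks

    2013.10 - 2014.3   Second semester

  • Advanced Topics in Information Networks

    2012.10 - 2013.3   Second semester

  • Advanced Topics in Information Networks

    2011.10 - 2012.3   Second semester

  • Advanced Topics in Information Networks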

    2010.10 - 2011.3   Second semester

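  • Introduction to Information Processing

    2010.4 - 2010.9   First semester

  • Introduction to Information Processing

    2009.4 - 2009.9   First semester

  • Introduction to Information Processing

    2008.4 - 2008.9   First semester

  • Introduction to Information Processing

    2007.10 - 2008.3   Second semester

  • Introduction to Information Processing

    2007.4 - 2007.9   First semester

  • Introduction to Information Processing

    2006.4 - 2006.9   First semester

  • Introduction to Information Processing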

    2005.4 - 2005.9   First semester

  • KIKAN Education Seminar

    2025.6 - 2025.8   Summer quarter

  • [G]High-Performance Parallel Computing II

    2025.6 - 2025.8   Summer quarter

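  • Advanced Topics in High-Performance Parallel Computing II

    2025.6 - 2025.8   Summer quarter

  • [Full year] Advanced Seminar in Information Science and Technology

    2025.4 - 2026.3   Full year

  • [Full year] Research in Information Science and Technology I

    2025.4 - 2026.3   Full year

  • [Full year] Exercises in Information Science and Technology

    2025.4 - 2026.3   Full year

  • Writing in Information Science and Technology I

    2025.4 - 2025.9   First semester

  • Discussion in Information Science and Technology I

    2025.4 - 2025.9   First semester

  • Reading in Information Science and Technology

    2025.4 - 2025.9   First semester

  • Advanced Topics in High-Performance Parallel Computing I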

    2025.4 - 2025.6   Spring quarter

  • [G]High-Performance Parallel Computing I

    2025.4 - 2025.6   Spring quarter

  • Communication Networks II

    2024.12 - 2025.2   Winter quarter

  • Communication Networks B

    2024.12 - 2025.2   Winter quarter

  • Advanced Topics in Information Networks

    2024.12 - 2025.2   Winter quarter

  • (IUPE)Int. to Information Processing II

    2024.12 - 2025.2   Winter quarter

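  • (Second semester) Communication Networks

    2024.10 - 2025.3   Second semester

  • Writing in Information Science and Technology II

    2024.10 - 2025.3   Second semester

  • Discussion in Information Science and Technology II

    2024.10 - 2025.3   Second semester

  • Presentation in Information Science and Technology

    2024.10 - 2025.3   Second semester

  • Communication Networks I

    2024.10 - 2024.12   Fall quarter

  • Communication Networks A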

    2024.10 - 2024.12   Fall quarter

  • (IUPE)Int. to Information Processing I

    2024.10 - 2024.12   Fall quarter

Visiting, concurrent, or part-time lecturers at other universities, institutions, etc.

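  • 2024  Kyushu Institute of Technology, School of Computer Science and Systems Engineering  Classification:Part-time lecturer  Domestic/International Classification:Japan 

  • 2023  Kyushu Institute of Technology, School of Computer Science and Systems Engineering  Classification:Part-time lecturer  Domestic/International Classification:Japan 

  • 2023  Okayama University, Faculty of Engineering  Classification:Part-time lecturer  Domestic/International Classification:Japan 

  • 2023  The Open University of Japan  Classification:Affiliate faculty 

  • 2022  The Open University of Japan  Classification:Affiliate faculty 

  • 2022  Kyushu Institute of Technology, School of Computer Science and Systems Engineering  Classification:Part-time lecturer  Domestic/International Classification:Japan 

  • 2021  The Open University of Japan  Classification:Part-time lecturer  Domestic/International Classification:Japan 

    Semester, Day Time or Duration:In charge of face-to-face classes (8 class sessions in total)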

Other educational activity and Special note

  • 2023  Class Teacher  Undergraduate school

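  • 2011  Special Affairs  Joined the Aoyagi Laboratory of the Faculty of Information Science and Electrical Engineering and provided substantive supervision of the graduation research of two undergraduate students. Also joined the Murakami Laboratory of the same faculty and provided substantive supervision of the research of one second-year master's student.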

Social Activities

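  • A Very First Introduction to Supercomputers

    Research Institute for Information Technology, Kyushu University  Research Institute for Information Technology, Kyushu University  2020.10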

    Audience:General, Scientific, Company, Civic organization, Governmental agency

    Type:Seminar, workshop

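    For people who know the word "supercomputer" but are not sure what one actually is, this seminar introduces the role of supercomputers and how they differ from personal computers.

  • Participated in the specification meetings for MPI (Message Passing Interface), the international standard for parallel programming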

    2016

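  • Practical Supercomputing School for Working Professionals  Today's PCs have specifications that far exceed those of what used to be called mainframe computers, and they have become easy to obtain.  In this seminar, participants use PCs of the kind found in an ordinary office to build an 8-node parallel computer (PC cluster), install software such as Linux and MPI, connect it to a network, run simulation code from their own PCs, and evaluate its performance.

    Foundation for Computational Science (FOCUS); Graduate School GP "Cultivating Cutting-Edge Human Resources in Computational Science through a University Consortium"  Kobe University BT Center, Kobe Port Island  2009.6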

    Audience:General, Scientific, Company, Civic organization, Governmental agency

    Type:Seminar, workshop

Educational Activities for Highly-Specialized Professionals in Other Countries

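  • 2020.2 - 2020.3   Japan Science and Technology Agency (JST) "Sakura Science Plan" science and technology training course "Graduate students in mathematics from Myanmar learn applications of mathematics to supercomputing"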

    Main countries of student/trainee affiliation:Myanmar