Researcher Details - Takeshi Nanri (南里豪志)

ナンリ　タケシ

南里豪志

NANRI TAKESHI

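Affiliation

Associate Professor, Advanced Computational Science Division, Research Institute for Information Technology
Department of Information Science and Technology, Graduate School of Information Science and Electrical Engineering (concurrent appointment)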

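Profile

Research: Since joining the Kyushu University Computer Center (now the Research Institute for Information Technology) in 1996, I have mainly studied execution environments for programs on parallel computers. Today, the mainstream machines for large-scale computation are distributed-memory parallel computers, in which multiple computers, each with its own memory, are connected by a network. In such environments it is important to optimize the communication library so that communication between computers is efficient. In large-scale parallel environments, however, baseline performance changes with the run-time placement of processes and with interference from other jobs, so optimization based only on information available before execution is expected to become difficult. Automatic optimization that adapts to run-time conditions is therefore needed. As one such approach, I study dynamic optimization techniques that tune the internal algorithms and parameters of the communication library using information gathered while the parallel program runs, such as process placement, load, and communication performance.

Education: At the Research Institute for Information Technology, I serve as an instructor for user training courses and teach graduate-level lectures on parallel programming.

University administration: I belong to the HPC Office (HPC事業室) of the Information Infrastructure Initiative (情報統括本部), where I am responsible for the procurement and operation of large-scale computers, including supercomputers, and for user training.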

Homepage

https://nanrilab-kyushu-u.notion.site/top
南里豪志

Research Fields

Informatics / High-performance computing

Degree

Doctor of Information Science

Career

Associate Professor, Advanced Computational Science Division, Research Institute for Information Technology

March 2007 - present

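Education

Master's program, Department of Information Engineering, Graduate School of Engineering, Kyushu University

- March 1995

Country: Japan

School of Engineering, Kyushu University

- March 1993

Country: Japan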

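Research Themes and Keywords

Research theme: Research and development of fundamental technologies for highly scalable parallel computing

Keywords: scalability, parallel computing, high-performance computing

Research period: September 2011

Research theme: Dynamic acceleration methods for communication libraries on large-scale parallel computers

Keywords: parallel computing, dynamic optimization

Research period: April 2005

Research theme: Program development environments for hierarchical cluster systems

Keywords: cluster systems, parallel computing, distributed shared memory, compilers

Research period: April 2003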

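Awards

Best Paper Award, 2020 Annual Conference of AXIES (大学ICT推進協議会)

April 2021, AXIES. Paper: DIMMスロット装着型不揮発性メモリ上のRDMAによるメッセージキューイングシステムの試作 [Prototype of a message queuing system using RDMA on DIMM-slot-attached non-volatile memory]

Yamashita SIG Research Award (山下記念研究賞)

July 2013, Information Processing Society of Japan (一般社団法人情報処理学会). Awarded for the presentation "Tofuネットワークにおけるプロセス配置形状による集団通信アルゴリズムの性能解析" [Performance analysis of collective communication algorithms by process placement shape on the Tofu network] at the 136th meeting of the SIG on High Performance Computing.

As supercomputers grow in scale, more systems adopt low-cost multidimensional mesh/torus topologies for the inter-node interconnect. On a multidimensional mesh/torus, performance varies greatly with the shape of the node group on which processes are placed, even when the number of nodes is the same. This study targets the Tofu interconnect used in the K computer and in its compatible system, the Fujitsu PRIMEHPC FX10, and measures how the shape of the process placement affects the performance of collective communication algorithms. Comparing the measured performance with the transfer wait time caused by communication contention, obtained with the Tofu interconnect's performance analysis tool, showed that both vary with placement shape in nearly the same way. These results show that performance estimates that account for the placement shape are important when selecting a collective communication algorithm.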

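Papers

「京」の後の時代を支えるスパコン：5．多数のXeonプロセッサを用いるスパコン [Supercomputers for the era after the K computer: 5. Supercomputers built from many Xeon processors] (invited)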

@南里豪志

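情報処理 [IPSJ Magazine] 60 ( 12 ) 1198 - 1203, November 2019

Language: Japanese; Publication type: Research paper (academic journal)

分散共有メモリシステム上にソフトウェアによって構築されたキャッシュシステムの静的制御 [Static control of a software-built cache system on a distributed shared memory system] (peer-reviewed)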

南里豪志, 佐藤周行, 島崎眞昭

情報処理学会論文誌 [IPSJ Journal], September 1997

Language: Japanese; Publication type: Research paper (academic journal)

Portability in Implementing Distributed Shared Memory System on the Workstation Cluster Environment (peer-reviewed)

Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

Research Reports on Information Science and Electrical Engineering of Kyushu University, March 1997

Language: English; Publication type: Research paper (academic journal)

Optimization of a GEMM Implementation using Intel AMX (peer-reviewed, international journal)

Endo Y., Ohshima S., Nanri T.

Proceedings of Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region (SCA/HPCAsia 2026) 81 - 90, January 2026 (ISBN: 9798400720673)

Language: English; Publication type: Research paper (international conference proceedings); Publisher: Proceedings of Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region (SCA/HPCAsia 2026)

In high-performance computing, general matrix multiplication (xGEMM) routines form the core of Level-3 BLAS kernels, which enable efficient matrix operations. Among these, low-precision GEMM such as BFloat16 has become indispensable in machine learning and deep learning because it reduces memory usage and power consumption. To meet this demand, recent hardware platforms are equipped with dedicated matrix computation units separate from the CPU, and research efforts have focused on maximizing their performance. Intel's Advanced Matrix Extensions (AMX) is one such hardware accelerator designed specifically for low-precision matrix operations. In this study, we implement and optimize matrix multiplication-accumulation using AMX by applying blocking and tile-register-level optimizations and evaluate its performance. Our results demonstrate a performance improvement in the range of 7.27-20.40% compared to that of the BFloat16 GEMM implementations provided by MKL and OpenBLAS.

DOI： 10.1145/3773656.3773660

On the normalised energy dissipation rate in homogeneous isotropic turbulence (peer-reviewed)

Kitamura, T; Nagata, K; Shimoyama, K; Nanri, T

Journal of Fluid Mechanics 1010, April 2025 (ISSN: 0022-1120, eISSN: 1469-7645)

Publisher: Journal of Fluid Mechanics

The Reynolds number dependence of the normalised energy dissipation rate C_ε = εL/u′³ is studied, where ε is the energy dissipation rate, L is the integral length scale and u′ is the root-mean-square velocity. We present the derivation of the exact relationship between the normalised energy dissipation rate and integrated form of the Kármán-Howarth equation in homogeneous isotropic turbulence. The present mathematical formulation is valid for both forced and decaying turbulence. The discussion of C_ε is developed under the assumption that the term resulting from the nonlinear energy transfer appearing in C_ε is constant at sufficiently high-Reynolds-number turbulence. The fact that the integrated term originating from nonlinear energy transfer is constant plays the role of a lower bound in C_ε, implying that the energy dissipation rate is finite in high-Reynolds-number turbulence. Furthermore, the origin of the non-equilibrium dissipation law could be the imbalance between and, the influence of external forces, or both. In decaying turbulence with forced turbulence as the initial condition, the imbalance between and causes the non-equilibrium dissipation law. The validity of the theoretical analysis is investigated using direct numerical simulations of the forced and decaying turbulence.

DOI： 10.1017/jfm.2025.316

新スーパーコンピュータシステムの導入について [On the introduction of the new supercomputer system]

平島智将, 菅尾貴彦, 原田浩睦, 南里豪志, 大島聡史

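大学ICT推進協議会年次大会論文集 [Proceedings of the AXIES Annual Conference] 2024 ( 0 ) 37 - 40, December 2024 (eISSN: 24349305)

Language: Japanese; Publication type: Research paper (workshop and symposium materials, etc.); Publisher: 一般社団法人大学ICT推進協議会 (AXIES)

The Research Institute for Information Technology, Kyushu University (hereafter "the Kyushu University center") began operating the supercomputer system Genkai (玄界) in October 2024. This paper describes the changes in usage policy relative to ITO and the usage programs offered on Genkai.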

DOI： 10.24669/axies.2024.0_37

Optical properties of rutile TiO2 with Zr, Mo, Zn, Cd impurities (peer-reviewed)

Ohno K., Sahara R., Nanri T., Kawazoe Y.

Computational Condensed Matter 41, December 2024 (ISSN: 2352-2143)

Publisher: Computational Condensed Matter

To explore quasiparticle (QP) energy gaps and photoabsorption spectra of rutile TiO2 with nonmagnetic transition metal (Zr, Mo, Zn, Cd) impurities, we conducted a Γ-point only GW + Bethe–Salpeter equation (BSE) calculation on a 72 (or 71) atom supercell. Our findings reveal that Zn and Cd impurities must coexist, at least partly, with oxygen vacancies to maintain charge neutrality. Among the systems considered, Mo, Zn, or Cd doped rutile TiO2 may exhibit optical absorption and catalytic activity under visible light. The resulting QP energy gaps (Δε_QP) and photoabsorption energies (PAEs) are in fairly good agreement with both experimental and theoretical data currently available. The necessary conditions for the applicability of the Γ-point only approach in the GW + BSE framework were found to be: (1) The Γ-point only GW calculation should reproduce a reasonable band gap. (2) The “superficial” exciton binding energy (the diagonal element of W_{vc;vc} − 2X_{vc;vc} between v = VBM and c = CBM, where W and X are the direct and exchange terms of the BSE matrix elements, respectively) must be positive or marginally negative. (3) The “real” exciton binding energy (Δε_QP − the lowest PAE) should be positive, even if it is exceptionally small.

DOI： 10.1016/j.cocom.2024.e00977

一様等方性乱流における前進および後退多粒子拡散 [Forward and backward multi-particle dispersion in homogeneous isotropic turbulence] (invited, peer-reviewed)

荒川隆之介，北村拓也，園部陽平，才本明秀，@南里豪志

日本機械学会論文集 [Transactions of the JSME] 90 ( 929 ) 1 - 9, January 2024 (eISSN: 21879761)

Language: Japanese; Publication type: Research paper (academic journal); Publisher: 一般社団法人日本機械学会 (The Japan Society of Mechanical Engineers)

Turbulent diffusion in homogeneous isotropic turbulence is numerically investigated using the direct numerical simulation (DNS). The four-dimensional turbulence database allows us to track fluid particles not only in the forward direction of time but also in the backward direction. Two-particle dispersion has been studied in previous studies and it is known that backward diffusion is faster than forward diffusion. However, little is known about multi-particle dispersion due to the difficulty of observing it experimentally. Studies on backward diffusion are also limited. In this study, multi-particle dispersion is numerically investigated and its properties are discussed, e.g., direction of time and geometry of a tetrahedron. The results show that forward and backward diffusions of multi-particles behave differently at the beginning and evolve similarly after the transient time, but the coefficients of the backward direction are larger than those of the forward direction.

DOI： 10.1299/transjsme.23-00281

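九州大学情報基盤研究開発センター新スーパーコンピュータシステムの紹介 [Introduction to the new supercomputer system of the Research Institute for Information Technology, Kyushu University]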

大島聡史, 南里豪志, 美添一樹, 平島智将, 原田浩睦, 池田嗣穂

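大学ICT推進協議会年次大会論文集 [Proceedings of the AXIES Annual Conference] 2023 ( 0 ) 89 - 96, December 2023 (eISSN: 24349305)

Language: Japanese; Publication type: Research paper (workshop and symposium materials, etc.); Publisher: 一般社団法人大学ICT推進協議会 (AXIES)

The Research Institute for Information Technology, Kyushu University, will begin operating a new supercomputer system in July 2024. This paper outlines the system, with attention to its improvements over the current system, ITO.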

DOI： 10.24669/axies.2023.0_89

Implementation of Coupled Numerical Analysis of Magnetospheric Dynamics and Spacecraft Charging Phenomena via Code-To-Code Adapter (CoToCoA) Framework (peer-reviewed)

Miyake Y., Sunada Y., Tanaka Y., Nakazawa K., Nanri T., Fukazawa K., Katoh Y.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 14074 LNCS 438 - 452, 2023 (ISSN: 03029743, ISBN: 9783031360206)

Publisher: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

This paper addresses the implementation of a coupled numerical analysis of the Earth’s magnetospheric dynamics and spacecraft charging (SC) processes based on our in-house Code-To-Code Adapter (CoToCoA). The basic idea is that the magnetohydrodynamic (MHD) simulation reproduces the global dynamics of the magnetospheric plasma, and its pressure and density data at local spacecraft positions are provided and used for the SC calculations. This allows us to predict spacecraft charging that reflects the dynamic changes of the space environment. CoToCoA defines three types of independent programs: Requester, Worker, and Coupler, which are executed simultaneously in the analysis. Since the MHD side takes the role of invoking the SC analysis, Requester and Worker positions are assigned to the MHD and SC calculations, respectively. Coupler then supervises necessary coordination between them. Physical data exchange between the models is implemented using MPI remote memory access functions. The developed program has been tested to ensure that it works properly as a coupled physical model. The numerical experiments also confirmed that the addition of the SC calculations has a rather small impact on the MHD simulation performance with up to about 500-process executions.

DOI： 10.1007/978-3-031-36021-3_46

Numerical approach for aerodynamics around two tone holes of woodwind instruments (peer-reviewed)

Takanami S., Tabata R., Iwagami S., Ohno T., Nanri T., Kobayashi T., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

In this paper, we discuss the numerical reproducibility of the compressible fluid behavior around two tone holes of woodwind instruments by using compressible Large Eddy Simulation (LES). In particular, we focus on the situation in which the tone holes are opened and closed with moving pads above the tone holes, which is regarded as a moving boundary problem with topology change, and reproduce the change of the pitch when opening and closing the tone holes. Our two-dimensional model of a "recorder" has two tone holes. To reproduce the opening and closing of the tone holes, the pads are moved continuously. That is, the position of the pads was continuously changed in the order of "open - close". Our numerical results are consistent with the Keef's experimental results. We solved the moving boundary problem with topology change under the situation of acoustics of fluid-structure interaction, and reproduced the pitch change in the opening and closing of the tone holes of the recorder-like woodwind instrument model.

Compressible fluid analysis on basic properties of a thermoacoustic equipment (peer-reviewed)

Tashima Y., Ohno T., Nanri T., Kobayashi T., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

Two and three-dimensional models of a test-tube thermoacoustic engine were numerically analyzed using compressible Large Eddy Simulation (LES) to investigate initial transient behavior, i.e., generation mechanism of thermoacoustic waves in an initial state. In the model used in this study, a stack is placed near the bottom of the test-tube and a temperature gradient is applied between both ends of the stack. The model has an external region connecting the opening of the test-tube for radiations of sound waves and heat. As a result, a fluid flow was observed inside the stack, a strong pressure oscillation, i.e., acoustic resonance, was observed inside the test-tube, and sound radiation from the open end was also observed. Furthermore, the frequency of the sound vibration was almost the same as the theoretical estimation of the resonance frequency of the test-tube. Thus, we successfully reproduced the basic properties of the thermoacoustic engine in the initial state. However, noise components increased in time evolution and stationary oscillations were not attained yet. Thus, we need to improve our numerical method. We are also planning to analyze sound waves' generation mechanism taking into account aeroacoustic theory, e.g., Lighthill's acoustic analogy.

Aeroacoustic analysis of port noise by using a three-dimensional numerical model of a bass reflex speaker system (peer-reviewed)

Uryuu K., Tabata R., Ohno T., Nanri T., Kobayashi T., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

We numerically study port noise observed for a bass reflex speaker system with a compressible fluid solver, Large Eddy Simulation (LES), to explore the noise generation mechanism from the viewpoint of aeroacoustics. The port noise is considered an aerodynamic sound generated by vortices, created by the interaction between the acoustic field and port opening. However, the details of the sound generation mechanism are still an open problem. By using a 3D model, the port noise is well reproduced by compressible LES when the speaker system is acoustically driven at its resonance frequency, the Helmholtz resonance frequency of the bass reflex speaker system. Vortices are created near the edges of the port and generate broadband noise. However, noises are enhanced due to resonance in some bands, which correspond to the acoustic resonance frequencies of the port and enclosure themselves. We are also planning to investigate the noise generation mechanism by using Howe's energy corollary, which allows us to estimate the energy transfer between fluid dynamics and acoustics; namely, we consider the problems of where the aeroacoustic noise is generated and how much energy is transferred from the vortex motions to the acoustic field.

Aeroacoustic analysis of oboe reeds with compressible direct numerical simulation (peer-reviewed)

Nakahara Y., Sumita R., Tabata R., Iwagami S., Nanri T., Kobayashi T., Hattori Y., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

A two-dimensional model of an oboe reed is studied numerically with a direct numerical simulation (DNS) of the compressible Navier-Stokes equations to investigate the sound generation mechanism from the viewpoint of aeroacoustics. The numerical tool is extremely accurate due to the smallest mesh size on the order of micrometers and successfully reproduces the details of fluid motion and acoustic vibrations inside and outside the reed. Particular attention is paid to the effect of reed vibration on the sound generation mechanism. When the reeds are fixed and a periodically varying flow is injected through the fixed reed slit, the aerodynamic sound created inside the reeds is nearly a single tone with a few overtones. On the other hand, when a flow is injected through periodically vibrating reeds from an oral cavity, more overtone components are observed and the pressure waveforms are similar to those observed in the experiment. This indicates that the richness of the overtones of the double-reed instrument is mainly attributed to the aerodynamic sound created by the flow injected through vibrating reeds, and that the bore, a linear resonator, just enhances characteristics of the instrument, e.g., formant.

Numerical study of a French horn mouthpiece accompanied by vibrating lips and an oral cavity with compressible direct numerical simulation (peer-reviewed)

Sumita R., Tabata R., Iwagami S., Nakahara Y., Nanri T., Kobayashi T., Hattori Y., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

A two-dimensional model of a French horn mouthpiece is numerically studied with a 2D direct numerical simulation (DNS) of the compressible Navier-Stokes equations to investigate the sound generation mechanism from the viewpoint of aeroacoustics. That is, we consider the sounding mechanism of buzzing, when the mouthpiece without a bore is played. Our numerical tool is highly accurate due to the minimum mesh size of the order of micrometers, and details of fluid motion and acoustic oscillation inside and near the mouthpiece are successfully reproduced. In particular, we focus on the roles of vibrating lips and an oral cavity in the sound generation mechanism. When the mouthpiece without lips and an oral cavity is driven by a periodic flow with a certain frequency, a single tone without overtones is observed. On the other hand, when the mouthpiece is driven by vibrating lips with an oral cavity, the generated sound includes rich overtones and its waveform is similar to that observed experimentally. Since the bore is a linear element and cannot generate overtones from a single tone by itself, the sound of a horn including rich overtones is generated by a mouthpiece with the vibrating lips and oral cavity.

Numerical study of the feedback mechanism of the edge tone (peer-reviewed)

Onomata T., Iwagami S., Tabata R., Ohno T., Nanri T., Kobayashi T., Takahashi K.

Proceedings of the International Congress on Acoustics, 2022 (ISSN: 22267808)

Publisher: Proceedings of the International Congress on Acoustics

We numerically investigate fundamental problems of the edge tone with compressible Large Eddy Simulation (LES) together with acoustic solver FDTD and incompressible LES. Jet oscillation and edge tone in the first mode are successfully reproduced by a 3D model with compressible LES. Namely, the acoustic intensity changes with the jet velocity, closely following the sixth power law, which is evidence of the reliability of our numerical method. Next, we estimate the intensity of acoustic feedback in the following way. According to Kaykayoglu and Rockwell, effective pressure sources are considered to be located slightly downstream of the edge tip on both sides of the edge. Indeed, such a pair of positive and negative pressure spots periodically appears in our numerical calculation. Then, we set the pressure spots on both sides of the edge and reproduce acoustic waves radiated from them by FDTD. The acoustic particle velocity of the reproduced acoustic field at the nozzle outlet is regarded as acoustic feedback. Even though such acoustic feedback may contribute to driving the jet, we can consider that the fluid feedback is still dominant in a low-Reynolds-number regime, as pointed out by Paál et al. from the results of incompressible LES.

Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets (peer-reviewed)

Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

Lecture Notes in Computer Science 11203 243 - 257, January 2019

Language: English; Publication type: Research paper (academic journal)

Hybrid storage system consisting of cache drive and multi-tier SSD for improved IO access when IO is concentrated (peer-reviewed)

Kazuichi Oe, Takeshi Nanri, Koji Okamura

IEICE Transactions on Information and Systems E102D ( 9 ) 1715 - 1730, January 2019

Language: English; Publication type: Research paper (academic journal)

In previous studies, we determined that workloads often contain many input-output (IO) concentrations. Such concentrations are aggregations of IO accesses. They appear in narrow regions of a storage volume and continue for durations of up to about an hour. These narrow regions occupy a small percentage of the logical unit number capacity, include most IO accesses, and appear at unpredictable logical block addresses. We investigated these workloads by focusing on page-level regularity and found that they often include few regularities. This means that simple caching may not reduce the response time for these workloads sufficiently because the cache migration algorithm uses page-level regularity. We previously developed an on-the-fly automated storage tiering (OTFAST) system consisting of an SSD and an HDD. The migration algorithm identifies IO concentrations with moderately long durations and migrates them from the HDD to the SSD. This means that there is little or no reduction in the response time when the workload includes few such concentrations. We have now developed a hybrid storage system consisting of a cache drive with an SSD and HDD and a multi-tier SSD that uses OTFAST, called "OTF-AST with caching." The OTF-AST scheme handles the IO accesses that produce moderately long duration IO concentrations while the caching scheme handles the remaining IO accesses. Experiments showed that the average response time for our system was 45% that of Facebook FlashCache on a Microsoft Research Cambridge workload.

DOI： 10.1587/transinf.2018EDP7253

ATSMF: Automated tiered storage with fast memory and slow flash storage to improve response time with concentrated input-output (IO) workloads (peer-reviewed)

Kazuichi Oe, Mitsuru Sato, Takeshi Nanri

IEICE Transactions on Information and Systems E101D ( 12 ) 2889 - 2901, December 2018

Language: English; Publication type: Research paper (academic journal)

The response times of solid state drives (SSDs) have decreased dramatically due to the growing use of non-volatile memory express (NVMe) devices. Such devices have response times of less than 100 microseconds on average. The response times of all-flash-array systems have also decreased dramatically through the use of NVMe SSDs. However, there are applications, particularly virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response times. Their workloads tend to contain many input-output (IO) concentrations, which are aggregations of IO accesses. They target narrow regions of the storage volume and can continue for up to an hour. These narrow regions occupy a few percent of the logical unit number capacity, are the target of most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response times for such workloads, we developed an automated tiered storage system called “automated tiered storage with fast memory and slow flash storage” (ATSMF) in which the data in targeted regions are migrated between storage devices depending on the predicted remaining duration of the concentration. The assumed environment is a server with non-volatile memory and directly attached SSDs, with the user applications executed on the server as this reduces the average response time. Our system predicts the effect of migration by using the previously monitored values of the increase in response time during migration and the change in response time after migration. These values are consistent for each type of workload if the system is built using both non-volatile memory and SSDs. In particular, the system predicts the remaining duration of an IO concentration, calculates the expected response-time increase during migration and the expected response-time decrease after migration, and migrates the data in the targeted regions if the sum of response-time decrease after migration exceeds the sum of response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and that its memory access ratio is more than 50%.

DOI： 10.1587/transinf.2018PAP0005

Approaches for memory-efficient communication library and runtime communication optimization

Takeshi Nanri

Advanced Software Technologies for Post-Peta Scale Computing The Japanese Post-Peta CREST Research Project 121 - 138, December 2018

Role: Lead author; Language: English

This article summarizes the work carried out in the Advanced Communication for Exa (ACE) project. The most important motivation of this project was the severe demand for scalable communication toward exa-scale computation. Therefore, in the project, we have built a PGAS-based communication library, Advanced Communication Primitives (ACP). Its fundamental communication model is one-sided, based on the PGAS model, so that its internal memory footprint is as small as possible. Based on this model, several applications, including simulations of magnetohydrodynamics, molecular orbitals, and particles, were tuned to achieve higher scalability. In addition, some communication optimization techniques have been investigated. In particular, tuning methods for collective communications, such as message ordering, algorithm selection, and overlapping, are studied. Also, in this project, a network simulator, NSIM-ACE, is developed. It simulates the behavior of packets for one-sided communications to study the effects of congestion on interconnects.

DOI： 10.1007/978-981-13-1924-2_7

Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets (peer-reviewed)

Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

High Performance Computing - ISC High Performance 2018 International Workshops, Frankfurt/Main, Germany, June 28, 2018, Revised Selected Papers 243 - 257, June 2018

Language: English; Publication type: Research paper (other academic conference materials)

DOI： 10.1007/978-3-030-02465-9_17

Performance Evaluation and Optimization of MagnetoHydroDynamic Simulation for Planetary Magnetosphere with Xeon Phi KNL (peer-reviewed)

Keiichiro Fukazawa, Takeshi Soga, Takayuki Umeda, Takeshi Nanri

Parallel Computing is Everywhere 178 - 187, January 2018

Language: English

The magnetohydrodynamic (MHD) simulation is often applied to study the global dynamics and configuration of a planetary magnetosphere for space weather. In this paper, the computational performance of an MHD code is evaluated on 128 Xeon Phi KNL nodes of a Cray XC40. The results show that 2D and 3D domain decompositions with the SoA (structure of arrays) layout perform well, whereas the AoS (array of structures) layout and hybrid parallel computation perform poorly. After adding performance optimizations for Xeon Phi to our MHD simulation code, we obtained a 2.4% increase in overall execution efficiency and a 3 TFlops performance gain using 128 nodes.

DOI： 10.3233/978-1-61499-843-3-178

Analysis of the Quality of Academic Papers by the Words in Abstracts (invited, peer-reviewed, international journal)

Tetsuya Nakatoh, Kenta Nagatani, Toshiro Minami, Sachio Hirokawa, Takeshi Nanri, Miho Funamori

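HIMI 2017, Part II, LNCS 10274, Proc. of the 19th International Conference on Human-Computer Interaction (HCI International 2017), July 2017

Language: English; Publication type: Research paper (international conference proceedings)

HPCにおける通信ライブラリの動向 [Trends in communication libraries for HPC] (peer-reviewed)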

南里豪志

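シミュレーション [Journal of the Japan Society for Simulation Technology] 36 ( 2 ) 79 - 84, June 2017

Language: Japanese; Publication type: Research paper (academic journal)

Assessing the Significance of Scholarly Articles using their Attributes (invited, peer-reviewed, international journal)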

Tetsuya Nakatoh, Sachio Hirokawa, Toshiro Minami, Takeshi Nanri, Miho Funamori

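Proc. of the 22nd International Symposium on Artificial Life and Robotics (AROB2017) 742 - 746, January 2017

Language: English; Publication type: Research paper (academic journal)

同種コンパイラーと他機種実行を利用した計算時間の短縮 [Reducing computation time by using the same kind of compiler and execution on a different machine] (peer-reviewed)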

藤野清次, 小玉捷平, 南里豪志, 岩里洸介

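日本シミュレーション学会論文誌 [Transactions of the Japan Society for Simulation Technology] 8 ( 1 ) 21 - 24, January 2016

Language: Japanese; Publication type: Research paper (academic journal)

性能向上を期待できる継続時間とIOアクセス数を満たしたIOアクセス集中領域を自動抽出してSSDに移動することで性能向上を図る階層型ストレージシステムの提案と評価 [Proposal and evaluation of a tiered storage system that improves performance by automatically extracting regions of concentrated IO access whose duration and IO access count are sufficient for a performance gain and migrating them to SSD] (peer-reviewed)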

大江和一, 岩田聡, 南里豪志, 岡村耕二

情報処理学会論文誌 [IPSJ Journal] 9 ( 1 ) 1 - 16, January 2016

Language: Japanese; Publication type: Research paper (academic journal)

直接網において複数の通信デバイスを有効に使用する隣接通信アルゴリズムの提案 (peer-reviewed)

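森江善之, 南里豪志

情報処理学会論文誌トランザクションコンピューティングシステム(Web) [IPSJ Transactions on Advanced Computing Systems (Web)] 8 ( 4 ) 26-35 (web only), November 2015

Language: Japanese; Publication type: Research paper (academic journal)

English title: A Neighboring Communication Algorithm Using Effective Multiple Communication Devices on Direct Connection Network

並列計算における reduction指示の実装に関する考察 [A study on implementing the reduction directive in parallel computing] (peer-reviewed)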

岩里洸介, 南里豪志, 藤野清次

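日本シミュレーション学会論文誌 [Transactions of the Japan Society for Simulation Technology] 7 ( 4 ) 109 - 113, July 2015

Language: Japanese; Publication type: Research paper (academic journal)

Performance Measurements of MHD Simulation for Planetary Magnetosphere on Peta-Scale Computer FX10 (peer-reviewed)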

FUKAZAWA Keiichiro, Takeshi Nanri, Takayuki Umeda

Advances in Parallel Computing, March 2014

Language: English

DOI： 10.3233/978-1-61499-381-0-387

Performance evaluation of magnetohydrodynamics simulation for magnetosphere on K computer (peer-reviewed)

FUKAZAWA Keiichiro, Takeshi Nanri, Takayuki Umeda

Communications in Computer and Information Science, December 2013

Language: English

DOI： 10.1007/978-3-642-45037-2_61

Implementation of Neighbor Communication Algorithm Using Multi-NICs Effectively by Extended RDMA Interface (peer-reviewed)

Yoshiyuki Morie, Takeshi Nanri

SC13 Technical Posters 1 - 2, November 2013

Language: Other

多次元メッシュ/トーラスにおける通信衝突を考慮したタスク配置最適化技術 [Task Allocation Technique for Avoiding Contentions on Multi-dimensional Mesh/Torus] (peer-reviewed)

森江善之, 南里豪志

情報処理学会 [IPSJ] 6 ( 3 ) 12 - 21, September 2013

Language: Japanese; Publication type: Research paper (academic journal)

多次元メッシュ/トーラスにおける通信衝突を考慮したタスク配置最適化技術 (peer-reviewed)

森江善之, 南里豪志

情報処理学会論文誌トランザクションコンピューティングシステム(Web) [IPSJ Transactions on Advanced Computing Systems (Web)] 6 ( 3 ) 12-21 (web only), September 2013

Language: Japanese; Publication type: Research paper (academic journal)

English title: Task Allocation Technique for Avoiding Contentions on Multi-dimensional Mesh/Torus

A Neighbor Communication Algorithm with Making an Effective Use of NICs on Multidimensional-Mesh/torus (peer-reviewed)

Yoshiyuki Morie, Takeshi Nanri

International Conference on Simulation Technology (JSST2013) 1 - 2, September 2013

Language: Other

Development of a CUDA Implementation of the 3D FDTD Method (peer-reviewed, international journal)

Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji FUJINO

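IEEE Antennas and Propagation Magazine 54 ( 5 ) 186 - 195, October 2012

Language: English; Publication type: Research paper (academic journal)

DOI： 10.1109/MAP.2012.6348145

MPI_Allreduceの「京」上での実装と評価 [Implementation and evaluation of MPI_Allreduce on the K computer] (peer-reviewed)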

松本幸，安達知也，住元真司，曽我武史，南里豪志，宇野篤也，黒川原佳，庄司文由，横川三津夫

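情報処理学会 ACS論文誌 [IPSJ Transactions on Advanced Computing Systems] ( 40 ), September 2012

Language: Japanese; Publication type: Research paper (academic journal)

Task Allocation Optimization for Neighboring Communication on Fat Tree (peer-reviewed)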

Yoshiyuki Morie, Takeshi Nanri

4th IEEE International Conference on High Performance Computing and Communication 9th IEEE International Conference on Embedded Software and Systems, HPCC-ICESS 2012 1219 - 1225, January 2012

Language: Other

DOI： 10.1109/HPCC.2012.179

A Method for Predicting a Penalty of Contentions by Considering Priorities of Routing among Packets on Direct Interconnection Network (peer-reviewed)

Yoshiyuki Morie, Takeshi Nanri, Ryutaro Susukita

2011 Fourth International Joint Conference on Computational Sciences and Optimization 263 - 267, April 2011

Language: Other

DOI： 10.1109/CSO.2011.35

Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communication (peer-reviewed)

Yoshiyuki Morie, Takeshi Nanri, Motoyoshi Kurokawa

The Tenth IASTED International Conference on Parallel and Distributed Computing and Networks 62 - 69 2011年2月

記述言語：その他

DOI： 10.2316/P.2011.719-025
負荷バランスの動的最適化によるMPIブロードキャスト性能改善査読

曽我武史, 栗原康志, 南里豪志, 黒川原佳, 村上和彰

情報処理学会論文誌　コンピュータシステム 2008年12月

記述言語：日本語掲載種別：研究論文（学術雑誌）

Dynamic Optimization of Load Balance in MPI Broadcast
Performance Models for MPI Collective Communications with Network Contention 査読

Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami

IEICE Transactions on Communications 2008年5月

記述言語：英語掲載種別：研究論文（学術雑誌）
衝突削減のためのタスク配置最適化に関する研究査読

森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

次世代スーパーコンピューティング・シンポジウム2007 2007 2007年10月

記述言語：その他
通信タイミングを考慮した衝突削減のためのMPIランク配置最適化技術査読

森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

情報処理学会論文誌 48 ( SIG13(ACS19) ) 192 - 202 2007年8月

記述言語：日本語掲載種別：研究論文（学術雑誌）

Optimization of MPI Rank Allocation Considering Communication Timing for Reducing Contention

講演・口頭発表等

DIMMスロット装着型不揮発性メモリ上のRDMAによるメッセージキューイングシステムの試作

@南里豪志、@大江和一、@吉田英司、@大辻弘貴、@林英里香

大学ICT推進協議会2020年度年次大会 2020年12月

開催年月日： 2020年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：オンライン国名：日本国
量子コンピュータと古典コンピュータのハイブリッド環境におけるタスクスケジューラの実装

津田拓実, 小林泰三, 高橋公也, @南里豪志

第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会 2025年3月

開催年月日： 2025年3月

記述言語：日本語会議種別：口頭発表（一般）

開催地：札幌市
高スループット非同期集団通信の性能モデル化に向けた予備評価

森江善之, 和田康孝, 小林諒平, 坂本龍一, @南里豪志

第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会 2025年3月

開催年月日： 2025年3月

記述言語：日本語会議種別：口頭発表（一般）

開催地：札幌市
異なる非ブロッキング集団通信の実装方法間で通信隠蔽効果の比較ができるベンチマークの実装

鳴海丈生, @南里豪志

第198回ハイパフォーマンスコンピューティング・第14回量子ソフトウェア合同研究発表会 2025年3月

開催年月日： 2025年3月

記述言語：日本語会議種別：口頭発表（一般）

開催地：札幌市
Halo通信における不連続データ転送のTofuインターコネクトによる実装と性能評価

有迫廉真, @南里豪志

2024年並列／分散／協調処理に関するサマー・ワークショップ 2024年8月

開催年月日： 2024年8月

記述言語：日本語会議種別：口頭発表（一般）

開催地：徳島市
Implementation of Coupled Numerical Analysis of Magnetospheric Dynamics and Spacecraft Charging Phenomena via Code-To-Code Adapter (CoToCoA) Framework 国際会議

Y. Miyake, Y. Sunada, Y. Tanaka, K. Nakazawa, @T. Nanri, K. Fukazawa and Y. Katoh

ICCS 2023 2023年6月

開催年月日： 2023年6月

記述言語：英語会議種別：口頭発表（一般）

開催地：Prague 国名：チェコ共和国
九州大学スーパーコンピュータとAWSクラウドサービスによるハイブリッド計算環境の相互補完的利用方法に関する調査

@南里豪志, 松山和広, 田代皓嗣, 原田浩睦

大学ICT推進協議会 2022年度年次大会 2022年12月

開催年月日： 2022年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：仙台国際センター国名：日本国
Cross-reference simulation by Code-To-Code Adapter (CoToCoA) library for the study of multi-scale physics in planetary magnetospheres 国際会議

Yuto Katoh, Keiichiro Fukazawa, @Takeshi Nanri, Yohei Miyake

2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW) 2021年12月

開催年月日： 2021年12月

記述言語：英語会議種別：口頭発表（一般）

国名：日本国
実用アプリケーションのスイッチのキーテクノロジーであるSHARPを使用したMPI通信パフォーマンス向上の挑戦と、将来のスイッチテクノロジーへの期待招待

南里豪志

GPU TECHNOLOGY CONFERENCE 2020年10月

開催年月日： 2020年10月

記述言語：日本語会議種別：口頭発表（一般）

開催地：オンライン国名：日本国
Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures

Kenji Ono, Toshihiro Kato, Satoshi Ohshima, Takeshi Nanri

International Conference on High Performance Computing in Asia-Pacific Region 2019年12月

開催年月日： 2020年1月

記述言語：英語

開催地：Fukuoka 国名：日本国
Application of cross-reference framework CoToCoA to Macro- and micro-scale simulations of planetary magnetospheres

Keiichiro Fukazawa, Yuto Katoh, Takeshi Nanri, Yohei Miyake

7th International Symposium on Computing and Networking Workshops, CANDARW 2019 2019年11月

開催年月日： 2019年11月

記述言語：英語

開催地：Nagasaki 国名：日本国

In this study, we introduce the Code-to-Code Adapter (CoToCoA) library to couple the magnetohydrodynamic (MHD) simulation and the Electron Hybrid (EH) simulation of planetary magnetospheres. CoToCoA was newly developed to make it easy to connect different codes. Its concept is that each code is modified as little as possible apart from the data transfer functions, and that nothing about the referenced code needs to be known apart from its data format. With CoToCoA, we have been developing a cross-reference simulation of the macro (MHD) and micro (EH) scales in the magnetosphere. We then evaluated the performance of the cross-reference simulation using CoToCoA on a massively parallel computer system.
Hybrid Storage System to Achieve Efficient Use of Fast Memory Area

Kazuichi Oe, Takeshi Nanri

7th International Symposium on Computing and Networking, CANDAR 2019 2019年11月

開催年月日： 2019年11月

記述言語：英語

開催地：Nagasaki 国名：日本国

Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes much NVM capacity upon migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is drastically lower.
Performance improvement of high-speed file transfer over JHPCN

Praphan Pavarangkoon, Ken T. Murata, Kazunori Yamamoto, Kazuya Muranaga, Takamichi Mizuhara, Keiichiro Fukazawa, Ryusuke Egawa, Takahiro Katagiri, Masao Ogino, Takeshi Nanri

17th IEEE International Conference on Dependable, Autonomic and Secure Computing, IEEE 17th International Conference on Pervasive Intelligence and Computing, IEEE 5th International Conference on Cloud and Big Data Computing, 4th Cyber Science and Technology Congress, DASC-PiCom-CBDCom-CyberSciTech 2019 2019年8月

開催年月日： 2019年8月

記述言語：英語

開催地：Fukuoka 国名：日本国

This paper proposes a novel file transfer tool to improve file transfer performance over Japan high performance computing and networking (JHPCN). We first develop a high-performance and flexible protocol (HpFP) for inter-datacenter transport networks. The original HpFP was designed for specified networks and puts more emphasis on latency and packet loss tolerances than on fairness and friendliness, while the enhanced HpFP is more suitable for real network environments. Then, based on the enhanced HpFP, we implement a file transfer tool called high-performance copy (HCP). The performance of our file transfer tool is evaluated between datacenters of JHPCN using real datasets collected from supercomputer resources. The results show that HCP achieves higher throughput than a traditional tool for file transfer over JHPCN.
Non-volatile memory driver for applying automated tiered storage with fast memory and slow flash storage

Kazuichi Oe, Takeshi Nanri

6th International Symposium on Computing and Networking Workshops, CANDARW 2018 2018年12月

開催年月日： 2018年11月

記述言語：英語

開催地：Takayama 国名：日本国

Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system located between non-volatile memories (NVMs) and solid state drives (SSDs). ATSMF aims to reduce average response time for input-output (IO) accesses by migrating concentrated IO access areas from SSD to NVM. However, the current ATSMF implementation cannot reduce average response time sufficiently because of the bottleneck caused by the Linux brd driver, which is used for the NVM access driver. The response time of the brd driver is more than ten times larger than memory access speed. To reduce the average response time sufficiently, we developed a block-level driver for NVM called a 'two-mode (2M) memory driver.' The 2M memory driver has both the map IO access mode and the direct IO access mode to reduce the response time while maintaining compatibility with the Linux device-mapper framework. The direct IO access mode has a drastically lower response time than the Linux brd driver because the ATSMF driver can execute the IO access function of the 2M memory driver directly. Experimental results also indicate that ATSMF using the 2M memory driver reduces the IO access response time to less than that of ATSMF using the Linux brd driver in most cases.
Design of a Flexible In Situ Framework with a Temporal Buffer for Data Processing and Visualization of Time-Varying Datasets

Kenji Ono, Jorji Nonaka, Hiroyuki Yoshikawa, Takeshi Nanri, Yoshiyuki Morie, Tomohiro Kawanabe, Fumiyoshi Shoji

International Conference on High Performance Computing, ISC High Performance 2018 2018年1月

開催年月日： 2018年6月

記述言語：英語

開催地：Frankfurt 国名：ドイツ連邦共和国

This paper presents an in situ framework focused on time-varying simulations, and uses a novel temporal buffer for storing simulation results sampled at user-defined intervals. This framework has been designed to provide flexible data processing and visualization capabilities in modern HPC operational environments composed of powerful front-end systems, for pre-and post-processing purposes, along with traditional back-end HPC systems. The temporal buffer is implemented using the functionalities provided by Open Address Space (OpAS) library, which enables asynchronous one-sided communication from outside processes to any exposed memory region on the simulator side. This buffer can store time-varying simulation results, and can be processed via in situ approaches with different proximities. We present a prototype of our framework, and code integration process with a target simulation code. The proposed in situ framework utilizes separate files to describe the initialization and execution codes, which are in the form of Python scripts. This framework also enables the runtime modification of these Python-based files, thus providing greater flexibility to the users, not only for data processing, such as visualization and analysis, but also for the simulation steering.
Design of an In Transit Framework with Staging Buffer for Flexible Data Processing and Visualization of Time-Varying Data

Kenji Ono, Jorji Nonaka, Yoshiyuki Morie, Takeshi Nanri, Tomohiro Kawanabe

ISC WORKSHOP ON IN SITU VISUALIZATION 2018 2018年6月

開催年月日： 2018年6月

記述言語：英語

開催地：Frankfurt 国名：ドイツ連邦共和国
Proposal of Interface for Runtime Memory Manipulation of Applications via PGAS-based Communication Library 招待国際会議

Takeshi Nanri

Workshop on PGAS programming models: Experiences and Implementations (PGAS-EI) 2018年1月

開催年月日： 2018年1月

記述言語：英語会議種別：口頭発表（招待・特別）

開催地：Tokyo 国名：日本国
Automated Tiered Storage System Consisting of Memory and Flash Storage to Improve Response Time with Input-Output (IO) Concentration Workloads

Kazuichi Oe, Mitsuru Sato, Takeshi Nanri

5th International Symposium on Computing and Networking, CANDAR 2017 2018年4月

開催年月日： 2017年11月

記述言語：英語

開催地：Aomori 国名：日本国

The response time of solid state drives (SSDs) has dropped dramatically with the spread of non-volatile memory express (NVMe) devices. These devices have response times of less than 100 microseconds on average. The response time of all-flash-array systems has also drastically reduced through the use of NVMe SSDs. However, there are applications, particularly virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response times. Their workloads were found to contain many input-output (IO) concentrations. We define IO concentration by using a declarative style. IO concentrations are aggregations of IO accesses. They appear in narrow regions of the storage volume and continue for periods of up to about an hour. These narrow regions occupy a few percent of the logical unit number capacity, include most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response time of these workloads, we developed an automated tiered storage system called 'automated tiered storage with fast memory and slow flash storage' (ATSMF). The memory component of ATSMF is a memory with a non-volatile feature. The system predicts the remaining duration of IO concentration, calculates the response-time increase during migration and response-time decrease after migration, and migrates the IO concentrations if the response-time decrease after migration surpasses the response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and its memory access ratio is more than 50%.
Analysis of the quality of academic papers by the words in abstracts

Tetsuya Nakatoh, Kenta Nagatani, Toshiro Minami, Sachio Hirokawa, Takeshi Nanri, Miho Funamori

Thematic track on Human Interface and the Management of Information, held as part of the 19th International Conference on Human–Computer Interaction, HCI International 2017 2017年1月

開催年月日： 2017年7月

記述言語：英語

開催地：Vancouver 国名：カナダ

The investigation of related research is very important for research activities. However, it is not easy to choose an appropriate and important academic paper from among the huge number of possible papers. The researcher searches by combining keywords and then selects a paper to be checked because it uses an index that can be evaluated. The citation count is commonly used as this index, but information about recently published papers cannot be obtained. This research attempted to identify good papers using only the words included in the abstract. We constructed a classifier by machine learning and evaluated it using cross validation. As a result, it was found that a certain degree of discrimination is possible.
Parallel Application Experiences Using Advanced Communication Primitives 国際会議

Shinji Sumimoto, Yuichiro Ajima, Takafumi Nose, Kazushige Saga, Naoyuki Shida, Takeshi Nanri

25th Euromicro International Conference on Parallel, Distributed and network-based Processing 2017年3月

開催年月日： 2017年3月

記述言語：英語会議種別：口頭発表（一般）

国名：ロシア連邦
Feasibility study for building hybrid storage system consisting of non-volatile DIMM and SSD

Kazuichi Oe, Takeshi Nanri, Koji Okamura

4th International Symposium on Computing and Networking, CANDAR 2016 2017年1月

開催年月日： 2016年11月

記述言語：英語

開催地：Hiroshima 国名：日本国

Various vendors develop a byte-accessible Nonvolatile Dual-Inline Memory Module (NVDIMM). The performance of the NVDIMM drastically surpasses that of the Solid State Drive (SSD), which is connected by PCI express. However, the cost of the NVDIMM is much higher than that of the SSD. Therefore, a hybrid storage system between the NVDIMM and SSD is an effective technique for improving cost-performance. If a system uses the NVDIMM less while maintaining performance, its cost-performance should be improved. Our previous work involves on-the-fly automated storage tiering (OTF-AST). OTF-AST is a hybrid storage system consisting of an SSD and HDD. It aims to reduce the average response time of IO accesses by migrating only the IO concentration area to the SSD when IO concentration happens. Therefore, we construct OTF-AST with both the DIMM and SSD and evaluate it in order to understand how to build a cost-effective hybrid storage system with these devices. We use a DIMM instead of a byte-accessible NVDIMM, which is difficult to obtain. As a result, we found that the original OTF-AST is suitable for a hybrid storage system consisting of the DIMM and SSD. Moreover, we can improve the performance of OTF-AST if we replace its migration algorithm with a more positive migration algorithm. This is because the IO access response time barely increases when the data migration between the DIMM and SSD is done. We will build a more positive migration algorithm in the near future.
Effect of Overlapping Halo Exchange with One-Sided Communication 国際会議

Takeshi Nanri, Keiichiro Fukazawa

5th JSST Annual Conference International Conference on Simulation Technology 2016年10月

開催年月日： 2016年10月

記述言語：英語会議種別：口頭発表（一般）

開催地：Kyoto 国名：日本国
Development of A Memory Efficient Communication Method for Connecting MPI Programs by using ACP Library 国際会議

Hiroaki Honda, Yoshiyuki Morie, Takeshi Nanri

5th JSST Annual Conference International Conference on Simulation Technology 2016年10月

開催年月日： 2016年10月

記述言語：英語会議種別：口頭発表（一般）

開催地：Kyoto 国名：日本国
Efficient communications of particle data in particle-based simulations 国際会議

Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri

5th JSST Annual Conference International Conference on Simulation Technology 2016年10月

開催年月日： 2016年10月

記述言語：英語会議種別：口頭発表（一般）

開催地：Kyoto 国名：日本国
Performance Evaluation of MHD Simulation Code with X86 CPUs and Manycore Systems 国際会議

Keiichiro Fukazawa, Takayuki Umeda, Takeshi Nanri

5th JSST Annual Conference International Conference on Simulation Technology 2016年10月

開催年月日： 2016年10月

記述言語：英語会議種別：口頭発表（一般）

開催地：Kyoto 国名：日本国
Effective calculation with halo communication using halo functions

Keiichiro Fukazawa, Yoshiyuki Morie, Toshiya Takami, Takeshi Nanri, Takeshi Soga

23rd European MPI Users' Group Meeting, EuroMPI 2016 2016年9月

開催年月日： 2016年9月

記述言語：英語

開催地：Edinburgh 国名：グレートブリテン・北アイルランド連合王国(英国)

A problem with halo communication is that it reduces parallel scalability. To overcome this issue, we introduced a "Halo thread" into our simulation code. However, this did not fundamentally solve the issue in strong scaling. In this study, we developed Halo functions that perform the halo communication effectively. With them, the calculation and communication can be performed in a pipeline, and we obtained good performance.
The Design of Advanced Communication to Reduce Memory Usage for Exa-scale Systems 国際会議

Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri

12th International Meeting On High Performance Computing for Computational Science 2016年9月

開催年月日： 2016年9月

記述言語：英語会議種別：口頭発表（一般）

国名：ポルトガル共和国
Improvement of Eisenstat-SSOR preconditioning using tolerance value 国際会議

Seiji Fujino, Takeshi Nanri

5th IMA Conference on Numerical Linear Algebra and Optimization 2016年9月

開催年月日： 2016年9月

記述言語：英語会議種別：口頭発表（一般）

開催地：Birmingham 国名：グレートブリテン・北アイルランド連合王国(英国)
Effective Calculation with Halo communication using Halo Functions 国際会議

Keiichiro Fukazawa, Toshiya Takami, Takeshi Soga, Yoshiyuki Morie, Takeshi Nanri

23rd European MPI Users' Group Meeting 2016年9月

開催年月日： 2016年9月

記述言語：英語会議種別：口頭発表（一般）

開催地：Edinburgh 国名：グレートブリテン・北アイルランド連合王国(英国)
Runtime Algorithm Selection of Collective Communication with RMA-based Monitoring Mechanism 国際会議

Takeshi Nanri

4th Annual MVAPICH Users Group Meeting 2016年8月

開催年月日： 2016年8月

記述言語：英語会議種別：口頭発表（一般）

開催地：Columbus, Ohio 国名：アメリカ合衆国
NSIM-ACE: An Interconnection Network Simulator for Evaluating Remote Direct Memory Access 国際会議

Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri

International Conference on Simulation and Modeling Methodologies, Technologies and Applications 2016年7月

開催年月日： 2016年7月

記述言語：英語会議種別：口頭発表（一般）

国名：ポルトガル共和国
The design of advanced communication to reduce memory usage for exa-scale systems

Shinji Sumimoto, Yuichiro Ajima, Kazushige Saga, Takafumi Nose, Naoyuki Shida, Takeshi Nanri

12th International Conference on High Performance Computing for Computational Science, VECPAR 2016 2017年1月

開催年月日： 2016年6月

記述言語：英語

開催地：Porto 国名：ポルトガル共和国

Current MPI (Message Passing Interface) communication libraries require memory in proportion to the number of processes, and cannot be used for exa-scale systems. This paper proposes a global memory based communication design to reduce memory usage for exa-scale communication. To realize exa-scale communication, we propose true global memory based communication primitives called Advanced Communication Primitives (ACPs). ACPs provide a global address space that supports remote atomic memory operations on the global memory, RDMA (Remote Direct Memory Access) based remote memory copy operations, a global heap allocator and global data libraries. ACPs differ from other communication libraries because they are global memory based, so that housekeeping memory can be distributed to other processes and programmers explicitly consider memory usage when using ACPs. The preliminary result of memory usage by ACPs is 70 MB on one million processes.
Memory Efficient One-Sided Communication Library "ACP" in Global Memory on Raspberry Pi 2

Yoshiyuki Morie, Hiroaki Honda, Takeshi Nanri, Taizo Kobayashi, Hidetomo Shibamura, Ryutaro Susukita, Yuichiro Ajima

36th IEEE International Conference on Distributed Computing Systems, ICDCS 2016 2016年8月

開催年月日： 2016年6月

記述言語：英語

開催地：Nara 国名：日本国

Previously, communications in parallel programs for High Performance Computing (HPC) and Distributed Computing (DC) are mostly written with two-sided communication interfaces that are based on a pair of operations, Send and Receive. Since such interface requires explicit synchronization between both sides of the communication, techniques for communication optimization such as overlapping are not efficiently described in many cases. On the other hand, one-sided communication interface is becoming important as a method to describe asynchronous communications to enable highly overlapped communication with computation. As one of such interface, in this demonstration, Advanced Communication Primitives (ACP) is introduced. ACP is a portable interface that supports UDP, IBverbs of InfiniBand and Tofu library of K Computer. In addition to that, it is designed to be memory efficient. For example, with 10 thousand processes, the memory consumption of ACP over UDP is estimated to be less than 1MB. Since the number of computational elements is increasing more rapidly than the amount of available memory, this memory efficiency is becoming one of the keys for parallel programs in HPC and DC. To show this characteristics, we run ACP library on Raspberry Pi 2, and examine its performance and memory consumption.
Evaluation of On-Demand Message-Passing Module over RDMA Network

Takeshi Nanri

ACSI2016 2016年1月

開催年月日： 2016年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Fukuoka 国名：日本国
Analysis of Storage Workloads of Input-Output Access Locality and Designing of Hybrid Storage System 国際会議

Kazuichi Oe, Takeshi Nanri, Koji Okamura

1st International Conference on Enterprise Architecture and Information Systems 2016年1月

開催年月日： 2016年1月

記述言語：英語会議種別：口頭発表（一般）

国名：日本国
Performance Evaluation of RDMA Communication Patterns by Means of Simulations 国際会議

Ryutaro Susukita, Yoshiyuki Morie, Takeshi Nanri, Hidetomo Shibamura

2015 Joint International Mechanical, Electronic and Information Technology Conference 2015年12月

開催年月日： 2015年12月

記述言語：英語会議種別：口頭発表（一般）

開催地：Chongqing 国名：中華人民共和国
On-The-Fly Automated Storage Tiering with Caching and both Proactive and Observational Migration 国際会議

Kazuichi Oe, Takeshi Nanri, Koji Okamura

Workshop on Computer Systems and Architectures (CSA'15) 2015年12月

開催年月日： 2015年12月

記述言語：英語会議種別：口頭発表（一般）

開催地：Sapporo 国名：日本国
直接網において複数の通信デバイスを有効に使用する隣接通信アルゴリズムの提案

森江善之, 南里豪志

2015 ハイパフォーマンスコンピューティングと計算科学シンポジウム 2015年5月

開催年月日： 2015年5月

記述言語：日本語会議種別：口頭発表（一般）

国名：日本国
Performance and memory usage evaluations for channel interface of Advanced Communication Primitives library 国際会議

Hiroaki Honda, Takeshi Nanri, Yoshiyuki Morie

1st Pan-American Congress on Computational Mechanics (PANACM 2015) 2015年4月

開催年月日： 2015年4月

記述言語：英語会議種別：口頭発表（一般）

開催地：Buenos Aires 国名：アルゼンチン共和国
Channel Interface: a Primitive Model for Memory Efficient Communication 国際会議

Takeshi Nanri

23rd Euromicro International Conference on Parallel, Distributed and network-based Processing 2015年3月

開催年月日： 2015年3月

記述言語：英語会議種別：口頭発表（一般）

開催地：Turku 国名：フィンランド共和国
Design and Implementation of Channel Interface as a Memory Efficient Communication Model 国際会議

Takeshi Nanri

Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015 2015年1月

開催年月日： 2015年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Tsukuba 国名：日本国
Proposal of HINT Interface for Runtime Tuning of Communication Links 国際会議

Takeshi Nanri

22nd Euromicro International Conference on Parallel, Distributed and network-based Processing 2014年2月

開催年月日： 2014年2月

記述言語：英語会議種別：口頭発表（一般）

開催地：Turin 国名：イタリア共和国
性能予測と実測を併用した集団通信アルゴリズム選択

児玉大器, 南里豪志

今後のHPC（基盤技術と応用）に関するワークショップ 2013年12月

開催年月日： 2013年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：長崎市国名：日本国
MPI における最適化情報提供のためのインターフェイスに関する評価

南里豪志

今後のHPC（基盤技術と応用）に関するワークショップ 2013年12月

開催年月日： 2013年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：長崎市国名：日本国
プログラムのヒント情報を用いた通信ライブラリ動的最適化技術について

杉山裕宣, 南里豪志

今後のHPC（基盤技術と応用）に関するワークショップ 2013年12月

開催年月日： 2013年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：長崎市国名：日本国
Performance Study of Non-blocking Collective Communication Implementations Toward Adaptive Selection 国際会議

Tsuyoshi Okuma, Takeshi Nanri

Networking, Computing, Systems and Software 2013年12月

開催年月日： 2013年12月

記述言語：英語会議種別：口頭発表（一般）

開催地：Matsuyama 国名：日本国
Topology Aware Performance Prediction of Collective Communication Algorithms on Multi-Dimensional Mesh/Torus 国際会議

Hironobu Sugiyama, Takeshi Nanri

Networking, Computing, Systems and Software 2013年12月

開催年月日： 2013年12月

記述言語：英語会議種別：口頭発表（一般）

開催地：Matsuyama 国名：日本国
通信ライブラリの自動チューニングを支援する Hint API の提案

南里豪志

第141回ハイパフォーマンスコンピューティング研究会 2013年10月

開催年月日： 2013年10月

記述言語：日本語会議種別：口頭発表（一般）

開催地：那覇市国名：日本国
A neighbor communication algorithm with making an effective use of NICs on multidimensional-mesh/torus 国際会議

Yoshiyuki Morie, Takeshi Nanri

International Conference on Simulation Technology 2013年9月

開催年月日： 2013年9月

記述言語：英語会議種別：口頭発表（一般）

開催地：Tokyo 国名：日本国
What Communication Library Can do with a Little Hint from Programmers? 国際会議

Takeshi Nanri

MVAPICH User Group Meeting 2013年8月

開催年月日： 2013年8月

記述言語：英語会議種別：口頭発表（一般）

開催地：Columbus 国名：アメリカ合衆国
A Cost-Efficient Approach for Automatic Algorithm Selection of Collective Communications 招待国際会議

Takeshi Nanri, Hironobu Sugiyama, Keiichiro Fukazawa

SIAM Conference on Computational Science and Engineering 2013年3月

開催年月日： 2013年2月 - 2013年3月

記述言語：英語会議種別：口頭発表（一般）

開催地：Boston 国名：アメリカ合衆国
多次元メッシュ/トーラスにおけるプロセス配置に応じた集団通信アルゴリズム選択技術の提案

南里豪志, 杉山裕宣, 森江善之

第138回ハイパフォーマンスコンピューティング研究会 2013年2月

開催年月日： 2013年2月

記述言語：日本語会議種別：口頭発表（一般）

開催地：あわら市国名：日本国
多次元メッシュ/トーラスにおける通信衝突を考慮したタスク配置最適化技術

森江善之, 南里豪志

ハイパフォーマンスコンピューティングと計算科学シンポジウム 2013年1月

開催年月日： 2013年1月

記述言語：日本語会議種別：口頭発表（一般）

開催地：東京国名：日本国
Evaluation of Implementation Methods for Non-Blocking Collective Communications in Overlapping Communication and Computation 国際会議

Tsuyoshi Okuma, Takeshi Nanri

International workshop on HPC, Krylov Subspace method and its application 2013年1月

開催年月日： 2013年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Beppu 国名：日本国
Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communication 国際会議

Yoshiyuki Morie, Takeshi Nanri

International workshop on HPC, Krylov Subspace method and its application 2013年1月

開催年月日： 2013年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Beppu 国名：日本国
Introduction of ACE(Advanced Communication library for Exa) Project 招待国際会議

Takeshi Nanri

International workshop on HPC, Krylov Subspace method and its application 2013年1月

開催年月日： 2013年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Beppu 国名：日本国
Performance Prediction Technology for Collective Communication Algorithm on Multi-Dimensional Mesh/Torus 国際会議

Hironobu Sugiyama, Takeshi Nanri

International workshop on HPC, Krylov Subspace method and its application 2013年1月

開催年月日： 2013年1月

記述言語：英語会議種別：口頭発表（一般）

開催地：Beppu 国名：日本国
通信衝突を削減するタスク配置最適化における通信タイミングの予測方式の影響

森江善之, 南里豪志

第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会（HOKKE-20） 2012年12月

開催年月日： 2012年12月

記述言語：日本語会議種別：口頭発表（一般）

開催地：札幌市国名：日本国
An Alternative Domain Decomposition Technique for CUDA-based 3D FDTD Methods 国際会議

Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji Fujino

9th European Radar Conference 2012年11月

開催年月日： 2012年10月 - 2012年11月

記述言語：英語会議種別：口頭発表（一般）

開催地：Amsterdam 国名：オランダ王国
Tofu ネットワークにおけるプロセス配置形状による集団通信アルゴリズムの性能解析

南里豪志

ハイパフォーマンスコンピューティング研究発表会 2012年10月

開催年月日： 2012年10月

記述言語：日本語会議種別：口頭発表（一般）

開催地：那覇市国名：日本国

スーパーコンピュータの大規模化に伴って，ノード間インターコネクトネットワークとして，コストの低い多次元メッシュ/トーラストポロジを採用したものを用いる事例が増えている．多次元メッシュ/トーラスは，使用するノード数が同じでも，プロセスが配置されるノード群の形状によって性能が大きく変動する．本研究では，京コンピュータや，その互換機である Fujitsu PRIMEHPC FX10で用いられている Tofuインターコネクトネットワークを対象として，プロセス配置の形状による集団通信アルゴリズムの性能への影響を計測した．得られた性能を，Tofuインターコネクトの性能解析ツールを用いて取得した通信衝突による転送待ち時間と比較したところ，プロセス配置形状による変動がどちらもほぼ同じ傾向を示すことを明らかにした．これらの結果から，集団通信アルゴリズムの選択において，プロセス配置の形状を考慮した性能見積もりが重要であることを示した．
異なるスカラアーキテクチャ（x86、SPARC64）の電磁流体コードによる性能評価

深沢圭一郎, 南里豪志, 高見利也

ハイパフォーマンスコンピューティング研究発表会 2012年10月

開催年月日： 2012年10月

記述言語：日本語会議種別：口頭発表（一般）

開催地：那覇市国名：日本国
Impact of GPU Memory Access Patterns on FDTD 国際会議

Matthew Livesey, James Francis Stack, Jr., Fumie Costen, Takeshi Nanri, Norimasa Nakashima, Seiji Fujino

IEEE Antennas and Propagation Society International Symposium (APSURSI) 2012年7月

開催年月日： 2012年7月

記述言語：英語会議種別：口頭発表（一般）

開催地：Chicago 国名：アメリカ合衆国
Efficient Runtime Algorithm Selection of Collective Communication with Topology-Based Performance Models 国際会議

Takeshi Nanri, Motoyoshi Kurokawa

International Conference on Parallel and Distributed Processing Techniques and Applications 2012年7月

開催年月日： 2012年7月

記述言語：英語会議種別：口頭発表（一般）

開催地：Las Vegas 国名：アメリカ合衆国
Effective Performance of Large-Scale MHD Simulation for Planetary Magnetosphere with Massively Parallel Computer 国際会議

Keiichiro Fukazawa, Takeshi Nanri

JSST2012 International Conference on Simulation Technology 2012年7月

開催年月日： 2012年7月

記述言語：英語会議種別：口頭発表（一般）

開催地：Kobe 国名：日本国
Balancing Communication and Execution Technique for Parallelized Sparse Matrix-Vector Multiplication 国際会議

Seiji Fujino, Takeshi Nanri, Kenichirou Kusaba

4th International Conference on Future Computational Technologies and Applications 2012年7月

開催年月日： 2012年7月

記述言語：英語会議種別：口頭発表（一般）

開催地：Nice 国名：フランス共和国
Task Allocation Optimization for Neighboring Communication on Fat Tree 国際会議

Yoshiyuki Morie, Takeshi Nanri

14th IEEE International Conference on High Performance Computing and Communication 2012年6月

開催年月日： 2012年6月

記述言語：英語会議種別：口頭発表（一般）

開催地：Liverpool 国名：グレートブリテン・北アイルランド連合王国(英国)
Performance of Large Scale MHD Simulation of Global Planetary Magnetosphere with Massively Parallel Scalar Type Supercomputer Including Post Processing 国際会議

Keiichiro Fukazawa, Takeshi Nanri

14th IEEE International Conference on High Performance Computing and Communication 2012年6月

開催年月日： 2012年6月

記述言語：英語会議種別：口頭発表（一般）

開催地：Liverpool 国名：グレートブリテン・北アイルランド連合王国(英国)
MPI Allreduce の「京」上での実装と評価

松本幸，安達知也，住元真司，曽我武史，南里豪志，宇野篤也，黒川原佳，庄司文由，横川三津夫

先進的計算基盤システムシンポジウム（SACSIS2012） 2012年5月

開催年月日： 2012年5月

会議種別：口頭発表（一般）

開催地：神戸国名：日本国
並列FMOプログラムOpenFMOの性能最適化

稲富雄一、眞木淳、高見利也、本田宏明、小林泰三、南里豪志、青柳睦、南一生

第133回ハイパフォーマンスコンピューティング研究会 2012年3月

開催年月日： 2012年3月

会議種別：口頭発表（一般）

開催地：神戸国名：日本国
ランク配置に応じた集団通信アルゴリズム動的選択技術の提案

南里豪志、黒川原佳

第133回ハイパフォーマンスコンピューティング研究会 2012年3月

開催年月日： 2012年3月

会議種別：口頭発表（一般）

国名：日本国
スケーラブルな通信ライブラリ実装技術

南里豪志

第8回戦略的高性能計算システム開発に関するワークショップ 2012年2月

開催年月日： 2012年2月

記述言語：日本語会議種別：口頭発表（一般）

開催地：東京国名：日本国
通信ライブラリにおける実行時自動チューニング技術招待

南里豪志

第3回自動チューニング技術の現状と応用に関するシンポジウム 2011年12月

開催年月日： 2011年12月

会議種別：口頭発表（一般）

開催地：東京大学国名：日本国
MPI Allreduce の「京」上での実装と評価

松本幸，安達知也，田中稔，住元真司，曽我武史，南里豪志

第19回ハイパフォーマンスコンピューティングとアーキテクチャの評価に関する北海道ワークショップ, November 2011

Date: November 2011

Presentation type: Oral presentation (general)

Country: Japan
Effect of Dynamic Algorithm Selection of All-to-All Communication on Environments with Unstable Network Speed [International conference]

Takeshi Nanri and Motoyoshi Kurokawa

International Conference on High Performance Computing & Simulation, July 2011

Date: July 2011

Presentation type: Oral presentation (general)

Venue: Istanbul; Country: Turkey
A Method for Predicting a Penalty of Contentions by Considering Priorities of Routing among Packets on Direct Interconnection Network [International conference]

Yoshiyuki Morie, Takeshi Nanri, Ryutaro Susukita and Koji Inoue

International Joint Conference on Computational Sciences and Optimization 2011, April 2011

Date: April 2011

Presentation type: Oral presentation (general)

Venue: Kunming; Country: China
Task Allocation Method for Avoiding Contentions by the Information of Concurrent Communications [International conference]

Yoshiyuki Morie, Takeshi Nanri, and Motoyoshi Kurokawa

The Tenth IASTED International Conference on Parallel and Distributed Computing and Networks, February 2011

Date: February 2011

Presentation type: Oral presentation (general)

Venue: Innsbruck; Country: Austria
通信と計算の負荷を考慮した並列疎行列ベクトル積の動的負荷分散技術

草場健一郎，南里豪志，藤野清次

2010年並列／分散／協調処理に関する『金沢』サマー・ワークショップ, August 2010

Date: August 2010

Presentation type: Oral presentation (general)

Venue: Kanazawa; Country: Japan
Runtime Load-balancing Technique for Sparse Matrix-Vector Multiplication [International conference]

Kenichiro Kusaba, Takeshi Nanri and Seiji Fujino

International Workshop on Innovative Architecture, March 2010

Date: March 2010

Presentation type: Oral presentation (general)

Venue: Kona; Country: United States
A Robust Dynamic Optimization for MPI Alltoall Operation [International conference]

Hyacinthe Nzigou Mamadou, Takeshi Nanri, and Kazuaki Murakami

18th International Heterogeneity in Computing Workshop, May 2009

Date: May 2009

Presentation type: Oral presentation (general)

Venue: Rome; Country: Italy
階層型並列計算機向けPAGMEつきCG法の実装と性能解析

馬場慎也，南里豪志，藤野清次，染原一仁

計算工学講演会, May 2009

Date: May 2009

Presentation type: Oral presentation (general)

Venue: Tokyo; Country: Japan

Implementation and Performance Evaluation of Parallelized CG method with PAGME - Preconditioning Method on Hierarchical Parallel Computers
A Dynamic Solution for Efficient MPI Collective Communications [International conference]

Hyacinthe Nzigou Mamadou, Feng Long Gu, Vivien Oddou, Takeshi Nanri, Kazuaki Murakami

International Workshop on HPC and Grid Applications, April 2009

Date: April 2009

Presentation type: Oral presentation (general)

Venue: Sanya, Hainan; Country: China
Profiling Technique for Dynamic Optimization According to Waiting Time [International conference]

Takeshi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

HPC Asia, March 2009

Date: March 2009

Presentation type: Oral presentation (general)

Venue: Kaohsiung; Country: Taiwan
Dependence on loop distribution of performance in hybrid-parallel IDR(s) method [International conference]

Shinya Baba, Yusuke Onoue, Takeshi Nanri and Seiji Fujino

HPC Asia, March 2009

Date: March 2009

Presentation type: Oral presentation (general)

Venue: Kaohsiung; Country: Taiwan
並列版 PAGME つき CG 法の性能解析

馬場慎也, 南里豪志, 藤野清次, 染原一仁

情報処理学会ハイパフォーマンスコンピューティング研究会, December 2008

Date: December 2008

Presentation type: Oral presentation (general)

Venue: Fukuoka; Country: Japan

Performance analysis of the CG method with parallelized PAGME
性能モデルによる予測を併用した Alltoallアルゴリズム動的選択技術の評価

南里豪志, Hyacinthe Nzigou Mamadou, Feng Long Gu, 村上和彰

情報処理学会ハイパフォーマンスコンピューティング研究会, December 2008

Date: December 2008

Presentation type: Oral presentation (general)

Venue: Fukuoka; Country: Japan

Evaluation of Dynamic Algorithm Selection with Performance Prediction Models on Alltoall Operation
ハイブリッド並列化したIDR(s)法の計算時間に対するプロセス数とスレッド数の組み合わせ依存性について

馬場慎也、南里豪志、藤野清次

情報処理学会ハイパフォーマンスコンピューティング研究会, May 2008

Date: May 2008

Presentation type: Oral presentation (general)

Venue: Tokyo; Country: Japan

Dependence on combination with number of processes and threads for computation times of hybrid-parallel version of IDR(s) Method
Effect of Reordering Internal Messages in MPI Broadcast According to the Load Imbalance [International conference]

Takesi Soga, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

IWIA '08, January 2008

Presentation type: Oral presentation (general)

Venue: Hilo; Country: United States
Performance Analysis and Linear Optimization Modeling of All-to-all Collective Communication Algorithms [International conference]

Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami

SBAC-PAD 2007, October 2007

Presentation type: Oral presentation (general)

Venue: Gramado; Country: Brazil
Dynamic Optimization of Load Balance in MPI Broadcast [International conference]

Takesi Soga, Kouji Kurihara, Takeshi Nanri, Motoyoshi Kurokawa and Kazuaki Murakami

Euro PVM/MPI 2007, October 2007

Venue: Paris; Country: France
SMMH - A Parallel Heuristic for Combinatorial Optimization Problems [International conference]

Guilherme Domingues, Yoshiyuki Morie, Feng Long Gu, Takeshi Nanri and Kazuaki Murakami

International Conference on Computational Methods in Science and Engineering 2007, September 2007

Presentation type: Oral presentation (general)

Venue: Corfu; Country: Greece
Investigating the Performance of Collective Communications on SMP Clusters: A Case for MPI_Allgather [International conference]

Feng Long Gu, Hyacinthe Nzigou Mamadou, Guilherme Domingues, Takeshi Nanri and Kazuaki Murakami

International Conference on Computational Methods in Science and Engineering 2007, September 2007

Presentation type: Oral presentation (general)

Venue: Corfu; Country: Greece
Evaluation of the Performance of Parallel Sparse-Matrix Multiplication and the Effect of Dynamic Load-Balancing [International conference]

Takeshi Nanri, Takeshi Soga, Koji Kurihara, Feng Long Gu, Hiroaki Ishihata and Kazuaki Murakami

International Conference on Computational Methods in Science and Engineering 2007, September 2007

Presentation type: Oral presentation (general)

Venue: Corfu; Country: Greece
A Study of All-to-all Collective Communication Algorithms on Modern High Performance System Architectures [International conference]

Hyacinthe Nzigou Mamadou, Feng Long Gu, Takeshi Nanri, Kazuaki Murakami

High Performance Computing International Conference (HPC Asia) 2007, September 2007

Presentation type: Oral presentation (general)

Venue: Seoul; Country: South Korea
負荷ばらつきを考慮したMPIブロードキャスト通信の動的最適化に関する研究

栗原康志，Hyacinthe Nzigou Mamadou，南里豪志，末安直樹，松本透，井上弘士，村上和彰

SWoPP2007, August 2007

Presentation type: Oral presentation (general)

Venue: Asahikawa; Country: Japan
通信タイミングを考慮した衝突削減のためのMPIランク配置最適化技術

森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

先進的計算基盤システムシンポジウム (SACSIS2007), May 2007

Presentation type: Oral presentation (general)

Venue: Tokyo; Country: Japan
通信タイミングを考慮したMPI ランク配置最適化技術

森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

HOKKE2007, March 2007

Presentation type: Oral presentation (general)

Venue: Sapporo; Country: Japan
Collective Communication Costs Analysis over Gigabit Ethernet and InfiniBand [International conference]

Hyacinthe Nzigou Mamadou, Takeshi Nanri and Kazuaki Murakami

High Performance Computing - HiPC 2006, December 2006

Presentation type: Oral presentation (general)

Country: India
Implementation of GAMESS on Parallel Computers: TCP/IP versus MPI [International conference]

Feng Long Gu, Takeshi Nanri and Kazuaki Murakami

International Conference of Computational Methods in Sciences and Engineering, October 2006

Presentation type: Oral presentation (general)

Country: Greece
並列計算機の大規模化に向けた MPI の Alltoall通信アルゴリズムの性能評価

南里豪志

第10回環瀬戸内応用数理研究部会シンポジウム, July 2006

Presentation type: Oral presentation (general)

Venue: Okinawa; Country: Japan
Performance comparison of vector-calculations between Itanium2 and other processors [International conference]

T. Nanri, Y. Watanabe, H. Sato

International Workshop on Innovative Architecture, January 2005

Presentation type: Oral presentation (general)

Venue: Hawaii; Country: United States
Design and Implementation of an Adaptive Distributed Shared Memory System [International conference]

Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

International Conference of Parallel and Distributed Computing and Systems, August 2001

Venue: Anaheim; Country: United States
Preliminary Investigation of Distributed Shared Memory System on a Cluster of High Performance Clusters [International conference]

Takeshi Nanri, Yoshitaka Watanabe, Hiroyuki Sato and Masaaki Shimasaki

European Congress on Computational Methods in Applied Sciences and Engineering, September 2000

Venue: Barcelona; Country: Spain
Effects of Scheduling Attributes on Multithread-Based Software DSM System [International conference]

Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

Workshop on Scheduling Algorithms for Parallel/Distributed Computing, July 1999

Venue: Rhodes; Country: Greece
Implementation of PVM-based Distributed Shared Memory System [International conference]

Takeshi Nanri, Hiroyuki Sato and Masaaki Shimasaki

International Conference on Parallel and Distributed Processing Techniques and Applications, July 1998

Venue: Las Vegas; Country: United States
非ブロッキング集団通信の通信隠蔽効果に関する調査

Takeshi Nanri, Satoshi Ohshima, Kenji Ono

December 2017

Language: Japanese

Country: Other
スーパーコンピュータシステムITOの性能評価

Satoshi Ohshima, Takeshi Nanri, Yoshitaka Watanabe, Hirofumi Amano, Kenji Ono

December 2017

Language: Japanese

Country: Other
Attribute-based quality classification of academic papers

Tetsuya Nakatoh, Sachio Hirokawa, Toshiro Minami, Takeshi Nanri, Miho Funamori

November 2017

Language: English

Country: Other

Investigating the relevant literature is very important for research activities. However, it is difficult to select the most appropriate and important academic papers from the enormous number of papers published annually. Researchers search paper databases by combining keywords, and then select papers to read using some evaluation measure—often, citation count. However, the citation count of recently published papers tends to be very small because citation count measures accumulated importance. This paper focuses on the possibility of classifying high-quality papers superficially using attributes such as publication year, publisher, and words in the abstract. To examine this idea, we construct classifiers by applying machine-learning algorithms and evaluate these classifiers using cross-validation. The results show that our approach effectively finds high-quality papers.

MISC

高スループット非同期集団通信の性能モデル化に向けた予備評価

森江善之, 和田康孝, 小林諒平, 坂本龍一, 南里豪志

情報処理学会研究報告(Web) 2025 (HPC-198), 2025
スーパーコンピュータ玄界の性能評価

大島聡史, 南里豪志, 美添一樹

情報処理学会研究報告(Web) 2024 (HPC-196), 2024
Halo通信における不連続データ転送のTofuインターコネクトによる実装と性能評価

有迫廉真, 南里豪志

情報処理学会研究報告(Web) 2024 (HPC-195), 2024
JHPCN広域分散クラウドとタイルドディスプレイを利用した超高解像度気象衛星画像の複数拠点共有実験の紹介

川鍋友宏, 村田健史, 山本和憲, 村永和哉, 樋口篤志, 豊嶋紘一, 深沢圭一郎, 小野謙二, 南里豪志

日本地球惑星科学連合大会予稿集(Web) 2022, 2022
FX100における永続型集団通信関数のプロトタイプ実装と評価

森江善之, 畑中正行, 高木将通, 堀敦史, 石川裕, 南里豪志

情報処理学会研究報告(Web), February 2018

Language: Japanese
スーパーコンピュータシステムITOの性能評価

大島聡史, 南里豪志, 渡部善隆, 天野浩文, 小野謙二

情報処理学会研究報告(Web), December 2017

Language: Japanese
ACPライブラリの通信性能およびメモリ使用量の評価

森江善之, 本田宏明, 南里豪志

情報処理学会研究報告(Web), February 2016

Language: Japanese
ACPライブラリによるMPI_Comm_spawnの置き換えとOpenFMOへの適用

本田宏明, 森江善之, 南里豪志, 稲富雄一, 高見利也

情報処理学会研究報告(Web), February 2016

Language: Japanese
ステンシル計算における効率的なHalo通信・計算モデルの開発

深沢圭一郎, 森江善之, 曽我武史, 高見利也, 南里豪志

情報処理学会研究報告(Web), February 2016

Language: Japanese

Development of Effective Halo Communication and Calculation Model on Stencil Computation
ACP通信ライブラリを用いたOpenFMOプログラムの実装

本田宏明, 森江善之, 南里豪志, 稲富雄一, 高見利也

日本コンピュータ化学会年会講演予稿集, October 2015

Language: Japanese
エクサスケールコンピューティングに向けたHaloスレッドの電磁流体シミュレーションに対する効果

深沢圭一郎, 森江善之, 曽我武史, 高見利也, 南里豪志

情報処理学会研究報告(Web), September 2015

Language: Japanese

Effects of Halo Thread to the Magnetohydrodynamic Simulation toward Exascale Computing
RDMAにおける同期通信のインターコネクトシミュレーション

薄田竜太郎, 森江善之, 南里豪志, 柴村英智

電子情報通信学会技術研究報告, July 2015

Language: Japanese

Interconnection Network Simulation of Synchronization Communication in RDMA
InfiniBandによるACP基本層の実装と評価

森江善之, 南里豪志, 安島雄一郎, 本田宏明, 曽我武史, 小林泰三, 住元真司

情報処理学会研究報告(Web), February 2015

Language: Japanese

Implementation and Evaluation of ACP Basic layer
ACPライブラリの集団通信インターフェース

本田宏明, 山田博厚, 森江善之, 南里豪志, 高見利也

情報処理学会研究報告(Web), February 2015

Language: Japanese
RDMA評価のための大規模インターコネクトシミュレータ「NSIM‐ACE」

薄田竜太郎, 森江善之, 南里豪志, 柴村英智

情報処理学会研究報告(Web), December 2014

Language: Japanese

NSIM-ACE: A Simulator for Evaluating RDMA on Large-Scale Interconnection Networks
多次元メッシュ/トーラスにおけるプロセス配置に応じた集団通信アルゴリズム選択技術の提案

南里豪志, 杉山裕宣, 森江善之

情報処理学会研究報告(CD-ROM), April 2013

Language: Japanese

Proposal of a Method for Selecting Algorithm of Collective Communications on Multi-Dimensional Mesh/Torus
通信衝突を削減するタスク配置最適化における通信タイミングの予測方式の影響

森江善之, 南里豪志

情報処理学会研究報告(CD-ROM), February 2013

Language: Japanese
通信衝突削減のためのタスク配置最適化の評価

森江善之, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

情報処理学会研究報告, March 2008

Language: Japanese

Evaluation of optimization of task allocation for reducing contentions
OpenMP入門(4)

南里豪志

計算工学, July 2007

Language: Japanese; Publication type: Article, review, commentary, or editorial (academic journal)
OpenMP入門(3)

南里豪志

計算工学, April 2007

Language: Japanese; Publication type: Article, review, commentary, or editorial (academic journal)
通信タイミングを考慮したランク配置最適化技術

森江善之, 末安直樹, 松本透, 南里豪志, 石畑宏明, 井上弘士, 村上和彰

情報処理学会研究報告, March 2007

Language: Japanese

Optimization of rank allocation considering communication timing
OpenMP入門(2)

南里　豪志

計算工学, January 2007

Language: Japanese; Publication type: Article, review, commentary, or editorial (academic journal)
OpenMP入門(1)

南里　豪志

計算工学, October 2006

Language: Japanese; Publication type: Article, review, commentary, or editorial (academic journal)
MPIによる並列プログラミング入門

南里　豪志

プラズマ・核融合学会誌, August 2003

Language: Japanese; Publication type: Article, review, commentary, or editorial (academic journal)

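Industrial property rights

Patents	Applications: 1	Registrations: 1
Utility models	Applications: 0	Registrations: 0
Design rights	Applications: 0	Registrations: 0
Trademarks	Applications: 0	Registrations: 0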

Professional memberships

Information Processing Society of Japan (情報処理学会)
IEEE

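Committee memberships

電子情報通信学会九州支部 (IEICE Kyushu Section), General Affairs Secretary [Domestic]

April 2019 - March 2021
IEEE福岡支部 (IEEE Fukuoka Section), Secretary [Domestic]

April 2019 - March 2021
情報処理学会ハイパフォーマンスコンピューティング研究会 (IPSJ SIG on High Performance Computing), Steering committee member [Domestic]

April 2016 - March 2020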

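Academic contribution activities

Program Chair [International academic contribution]

7th International Workshop on Large-scale HPC Application Modernization (Naha), November 2020

Type: Conference, symposium, etc.
Track chair [International academic contribution]

HPC Asia 2020 (Fukuoka, Japan), January 2020

Type: Conference, symposium, etc.
Vice Chair, Executive Committee

AXIES2019 (Fukuoka International Congress Center), December 2019

Type: Conference, symposium, etc.
Planning committee member

男女共同参画シンポジウム (Fukuoka), September 2019

Type: Conference, symposium, etc.
電子情報通信学会誌

April 2019 - March 2021

Type: Academic society, research group, etc.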
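Local Arrangement

ACSI2016 (Fukuoka, Japan), January 2016

Type: Conference, symposium, etc.

Number of participants: 110
Committee [International academic contribution]

International Workshop LENS (Language, Network and System Software) 2015 (Tokyo, Japan), October 2015

Type: Conference, symposium, etc.

Number of participants: 50
Session chair (Chairmanship)

2009年並列／分散／協調処理に関する『仙台』サマー・ワークショップ (Sendai), August 2009

Type: Conference, symposium, etc.
Executive committee member

SWoPP2007 (Asahikawa), August 2007 - present

Type: Conference, symposium, etc.

Number of participants: 200
Executive Committee Chair

SWoPP2006 (Kochi), July 2006 - present

Type: Conference, symposium, etc.

Number of participants: 200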

Joint research and competitive research funding projects

高スケーラブル並列計算の実現に向けた作業用コア共有型オーバラップ技術の開発

Grant number: 26K14845; April 2026 - March 2029

Grants-in-Aid for Scientific Research, Scientific Research (C)

南里豪志

Funding type: KAKENHI
数値シミュレーションと機械学習の効率的な連成計算手法の研究開発とPINNsへの応用

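Grant number: 25K15146; April 2025 - March 2028

Grants-in-Aid for Scientific Research, Scientific Research (C)

深沢圭一郎, 南里豪志

Funding type: KAKENHI

This project develops methods for in-memory coupled computation between numerical simulations and ML/AI processing that consumes large volumes of simulation output at high frequency. Using these methods, it couples physical simulations with physics-informed neural networks (PINNs), aiming to build highly accurate surrogate models that satisfy physical laws. High-frequency output data is difficult to handle with I/O-based coupling, and existing in-memory coupling frameworks offer no method for it either. In addition, it is hard to make PINN-based surrogate models accurate for physical simulations that are high-dimensional and change dynamically. This project researches and develops methods to solve these problems.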
次世代計算基盤に係る調査研究（文部科学省）

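July 2022 - March 2024

RIKEN

Role: Co-investigator

The next-generation computing infrastructure is expected to serve as a platform for solving problems on the way to realizing the SDGs and Society 5.0. As a foundation for advanced digital twins that bring "research DX" to future science, this study aims at a highly versatile computing infrastructure that can execute entire workflows while drawing on a wide range of computational methods, simulation techniques, and large-scale data in close coordination. It investigates the architecture and the system software and library technologies such an infrastructure should have, through co-design with applications.
In particular, as a basic principle of system design, the study pursues "FLOPS to Byte"-oriented system construction, which secures the required computational performance while taking arithmetic precision into account and makes data movement more sophisticated and efficient under power constraints, from architecture development through algorithm design to application technology.
Under an All-Japan framework, the study investigates system configurations and elemental technologies that improve the effective performance of a next-generation computing infrastructure, and develops those elemental technologies, through co-design between architecture and system software on one side and applications on the other.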
NVDIMM上の時系列バッファ実装による効率的な非同期連成計算の実現

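Grant number: 22K12049; 2022 - 2024

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (C)

南里豪志, 深沢圭一郎, 加藤雄人

Role: Principal investigator; Funding type: KAKENHI

Coupled computation, which connects programs that each solve one phenomenon, is attracting attention as a way to computationally solve problems involving multiple phenomena. One challenge in coupled computation is synchronization waiting caused by differences in the progress rates of the programs. This project implements a buffer that stores data in time-series order on NVDIMM, an inexpensive, large-capacity non-volatile memory, and thereby aims to realize asynchronous coupled computation with little synchronization waiting. The work is carried out by a team of researchers specializing in system software and researchers who develop a variety of simulation programs, with the goal of developing highly practical technology. The results will be released on GitHub and similar platforms so that they can be widely used.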
システムソフトウェア・ライブラリ調査研究

2022 - 2023

MEXT (Ministry of Education, Culture, Sports, Science and Technology), 次世代計算基盤に係る調査研究事業

Role: Co-investigator; Funding type: Contract research
不揮発性メモリへ高効率にＲＤＭＡする技術の研究・開発

October 2020 - March 2021

Joint research

Role: Principal investigator; Funding type: Other industry-academia collaboration funding
量子計算及びイジング計算システムの統合型研究開発（NEDO）

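April 2020 - March 2027

National Institute of Advanced Industrial Science and Technology (AIST)

Role: Co-investigator

Toward a super-smart society, digitalization is advancing in diverse industrial fields such as advanced mobility services, smart factories, finance, and drug discovery, and the accompanying social demand for high-performance next-generation computing is growing rapidly. This project integrates, as of April 2021, three NEDO projects, 「超伝導パラメトロン素子を用いた量子アニーリング技術の研究開発」 (from FY2018), 「イジングマシン共通ソフトウェア基盤の研究開発」 (from FY2018), and 「超伝導体・半導体技術を融合した集積量子計算システムの開発」 (from FY2020), and carries out full-stack integrated research and development through industry-academia-government collaboration.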
量子計算及びイジング計算システムの統合型研究開発

2020 - 2027

NEDO 高効率・高速処理を可能とするAIチップ・次世代コンピューティングの技術開発

Role: Co-investigator; Funding type: Contract research
不揮発性メモリへ高効率にＲＤＭＡする技術の研究・開発

September 2019 - March 2020

Joint research

Role: Principal investigator; Funding type: Other industry-academia collaboration funding
NVDIMM上の通信バッファによるスケーラブルな非同期通信レイヤの開発

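Grant number: 19K11991; 2019 - 2021

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (C)

南里豪志

Role: Principal investigator; Funding type: KAKENHI

NVDIMM, a non-volatile memory that can be installed in DIMM slots, is attracting attention as a memory device that consumes less power than DRAM, is cheaper, and scales more easily to large capacities. This project develops a communication layer that uses NVDIMM as the buffer area inside a communication library. This makes it possible to hide communication with non-blocking point-to-point communication on large-scale parallel computers, which is expected to improve the scalability of parallel applications. In addition, by switching between buffers on DRAM and buffers on NVDIMM according to runtime conditions such as communication frequency, the project seeks to reduce the performance impact of NVDIMM latency, which is expected to be 1 to 10 microseconds.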
超並列において高スケーラビリティを実現するステンシル計算・通信モデルの開発

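2018 - 2020

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (C)

Role: Co-investigator; Funding type: KAKENHI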
エクサスケールスパコンの省エネ化に向けたシステム電力管理戦略の研究

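2018 - 2020

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (B)

Role: Co-investigator; Funding type: KAKENHI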
Development of Time-Reversal Method for Detecting Multiple Moving Targets Behind the Wall [International joint work]

April 2017 - March 2018

JHPCN (Japan)

Role: Principal investigator

There are many imaging systems in the world for seeing through walls or for cancer detection, such as MRI for medical imaging. However, the current technologies are neither cheap nor available everywhere. One cheap alternative to such expensive systems is microwave imaging using the Time Reversal (TR) method, which was first introduced in acoustics. TR has found applications in various disciplines, including non-destructive testing, underwater communications, and medicine. TR has also been studied for Ground Penetrating Radar (GPR) as well as Through the Wall Imaging (TWI). The TR method with super-resolution techniques such as Decomposition Of the Time-Reversal Operator (DORT in its French acronym) or MUltiple SIgnal Classification (MUSIC) requires more than 150 Fast Fourier Transforms and more than 20000 singular value decompositions for a very small imaging system consisting of 13 antenna elements. The current approach is therefore far from real-time because of its long computation time. Furthermore, there is high demand for the detection of multiple moving targets, but work in this field is scarce. Detecting multiple moving targets behind a wall is one of the most challenging scenarios in through-the-wall microwave imaging. So far, Fumie Costen at the University of Manchester has developed spatio-temporal windowing for the differential MDM (multi-static data matrix) for the time reversal algorithm to detect multiple moving objects in a simple canonical case. This project will develop and verify an algorithm to detect multiple moving targets with high computational efficiency.
スケーラブル通信ライブラリを用いた次世代惑星電磁圏連成計算技術の創出

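2017 - 2019

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Challenging Research (Exploratory)

Role: Co-investigator; Funding type: KAKENHI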
MPI向け準備型集団通信インタフェースの研究開発

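2015 - 2017

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (C)

Role: Principal investigator; Funding type: KAKENHI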
並列言語ＣＡＦプログラム向け通信隠蔽技術の研究開発

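Grant number: 24500068; 2012 - 2014

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Scientific Research (C)

Role: Principal investigator; Funding type: KAKENHI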
省メモリ技術と動的最適化技術によるスケーラブル通信ライブラリの開発（JST CREST 研究領域「ポストペタスケール高性能計算に資するシステムソフトウェア技術の創出」）

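October 2011 - March 2017

Kyushu University (Japan)

Role: Principal investigator

Computers are expected to grow even larger in scale to raise supercomputer performance. However, communication libraries, which handle communication inside such computers, will lose much of their practicality if their current design is kept, because memory usage grows and tuning becomes more complex as the scale increases. This project therefore aims to sustain performance gains in real applications up to tens of millions to hundreds of millions of processes while limiting the communication buffer space used by the communication library, and to that end researches and develops communication library implementation techniques and techniques for writing scalable applications. For communication library implementation, it researches and develops memory-saving and dynamic optimization techniques targeting each layer: the communication interface, the basic communication protocol, and communication path control.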
省メモリ技術と動的最適化技術によるスケーラブル通信ライブラリの開発

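2011 - 2016

Grants-in-Aid for Scientific Research; Strategic Basic Research Programs

Role: Principal investigator; Funding type: Competitive funding other than KAKENHI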
1億コア超の大規模並列計算環境に耐える通信ライブラリおよび数値計算ライブラリの研究

2011

教育研究プログラム・研究拠点形成プロジェクト (special quota: additional selection)

Role: Principal investigator; Funding type: Internal university funds
並列言語ＣＡＦ向け動的通信最適化技術の開発

Grant number: 21700036; 2009 - 2011

Grants-in-Aid for Scientific Research, Young Scientists (B)

Role: Principal investigator; Funding type: KAKENHI
IPv6とMyrinetによる階層型クラスタ上のOpenMP処理環境の開発

Grant number: 18700065; 2006 - 2008

Grants-in-Aid for Scientific Research, Young Scientists (B)

Role: Principal investigator; Funding type: KAKENHI
ペタスケール・システムインターコネクト技術の開発（文部科学省「次世代IT基盤構築のための研究開発」、研究開発領域「将来のスーパーコンピューティングのための要素技術の研究開発」（平成１７年度〜１９年度））

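April 2005 - March 2008

Kyushu University (Japan)

Role: Co-investigator

This project targets system interconnect technology that links thousands to hundreds of thousands of high-speed compute nodes in supercomputer systems beyond the petaflops class. Aiming for an order-of-magnitude better cost-performance ratio than current systems, it develops three elemental technologies for simultaneously achieving higher performance, richer functionality, and lower cost: (1) optical packet switches and ultra-compact optical link technology, (2) MPI acceleration through dynamic communication optimization, and (3) comprehensive performance evaluation technology for system interconnects.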
階層型クラスタシステム上のOpenMPプログラム翻訳実行環境の開発に関する研究

Grant number: 15700033; 2003 - 2005

Grants-in-Aid for Scientific Research, Young Scientists (B)

Role: Principal investigator; Funding type: KAKENHI
超並列において高スケーラビリティを実現するステンシル計算・通信モデルの開発

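Grant number: 18K11336

深沢圭一郎, 南里豪志

Funding type: KAKENHI

This project aimed to develop a stencil computation and communication model whose scalability does not degrade in exascale environments, together with the Halo communication functions used in it.
First, for stencil simulations, we developed a model that divides threads between "computation" and "computation that requires communication, plus the communication itself". This removes the need for synchronization to learn that communication has finished and thereby avoids degradation of parallel performance. Next, we packaged the communication model into a set of functions (Halo functions) so that other applications can use it easily. We measured their performance in an environment using 2000 nodes and confirmed high scalability.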

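Overview of educational activities

Teaches courses on high-performance parallel computing and networks for graduate students.
Teaches courses on programming and networks for undergraduate students.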

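Courses taught

通信ネットワークB

December 2025 - February 2026, Winter quarter
通信ネットワークA

October 2025 - December 2025, Fall quarter
高性能並列計算法特論Ⅱ

June 2024 - August 2024, Summer quarter
High-Performance Parallel Computing II

June 2024 - August 2024, Summer quarter
【通年】情報理工学研究Ⅰ

April 2024 - March 2025, Full year
【通年】情報理工学講究

April 2024 - March 2025, Full year
【通年】情報理工学演習

April 2024 - March 2025, Full year
【修士】高性能並列計算法特論

April 2024 - September 2024, First semester
情報理工学論議Ⅰ

April 2024 - September 2024, First semester
情報理工学論述Ⅰ

April 2024 - September 2024, First semester
情報理工学読解

April 2024 - September 2024, First semester
高性能並列計算法特論Ⅰ

April 2024 - June 2024, Spring quarter
High-Performance Parallel Computing I

April 2024 - June 2024, Spring quarter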
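(IUPE)Int. to Information Processing II

December 2023 - February 2024, Winter quarter
情報ネットワーク特論

December 2023 - February 2024, Winter quarter
通信ネットワークB

December 2023 - February 2024, Winter quarter
通信ネットワークⅡ

December 2023 - February 2024, Winter quarter
（後期）通信ネットワーク

October 2023 - March 2024, Second semester
情報理工学論議Ⅱ

October 2023 - March 2024, Second semester
情報理工学論述Ⅱ

October 2023 - March 2024, Second semester
情報理工学演示

October 2023 - March 2024, Second semester
(IUPE)Int. to Information Processing I

October 2023 - December 2023, Fall quarter
通信ネットワークA

October 2023 - December 2023, Fall quarter
通信ネットワークⅠ

October 2023 - December 2023, Fall quarter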
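High-Performance Parallel Computing II

June 2023 - August 2023, Summer quarter
高性能並列計算法特論Ⅱ

June 2023 - August 2023, Summer quarter
【通年】情報理工学講究

April 2023 - March 2024, Full year
【通年】情報理工学演習

April 2023 - March 2024, Full year
【通年】情報理工学研究Ⅰ

April 2023 - March 2024, Full year
情報理工学論議Ⅰ

April 2023 - September 2023, First semester
情報理工学論述Ⅰ

April 2023 - September 2023, First semester
情報理工学読解

April 2023 - September 2023, First semester
【修士】高性能並列計算法特論

April 2023 - September 2023, First semester
High-Performance Parallel Computing I

April 2023 - June 2023, Spring quarter
高性能並列計算法特論Ⅰ

April 2023 - June 2023, Spring quarter
サイバーセキュリティ基礎論

April 2023 - June 2023, Spring quarter
サイバーセキュリティ基礎論

April 2023 - June 2023, Spring quarter
電気情報工学入門

April 2023 - June 2023, Spring quarter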
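(IUPE)Int. to Information Processing II

December 2022 - February 2023, Winter quarter
情報ネットワーク特論

December 2022 - February 2023, Winter quarter
通信ネットワークB

December 2022 - February 2023, Winter quarter
情報理工学論議Ⅱ

October 2022 - March 2023, Second semester
情報理工学論述Ⅱ

October 2022 - March 2023, Second semester
情報理工学演示

October 2022 - March 2023, Second semester
（後期）通信ネットワーク

October 2022 - March 2023, Second semester
(IUPE)Int. to Information Processing I

October 2022 - December 2022, Fall quarter
通信ネットワークA

October 2022 - December 2022, Fall quarter
高性能並列計算法特論Ⅱ

June 2022 - August 2022, Summer quarter
High-Performance Parallel Computing II

June 2022 - August 2022, Summer quarter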
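情報理工学研究Ⅰ

April 2022 - March 2023, Full year
情報理工学演習

April 2022 - March 2023, Full year
情報理工学講究

April 2022 - March 2023, Full year
【修士】高性能並列計算法特論

April 2022 - September 2022, First semester
情報理工学読解

April 2022 - September 2022, First semester
情報理工学論述Ⅰ

April 2022 - September 2022, First semester
情報理工学論議Ⅰ

April 2022 - September 2022, First semester
High-Performance Parallel Computing

April 2022 - September 2022, First semester
サイバーセキュリティ基礎論

April 2022 - June 2022, Spring quarter
サイバーセキュリティ基礎論

April 2022 - June 2022, Spring quarter
高性能並列計算法特論Ⅰ

April 2022 - June 2022, Spring quarter
High-Performance Parallel Computing I

April 2022 - June 2022, Spring quarter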
情報ネットワーク特論

December 2021 - February 2022, Winter quarter
(IUPE)Int. to Information Processing II

December 2021 - February 2022, Winter quarter
情報ネットワーク特論

December 2021 - February 2022, Winter quarter
(IUPE)Int. to Information Processing II

December 2021 - February 2022, Winter quarter
(IUPE)Int. to Information Processing I

October 2021 - December 2021, Fall quarter
(IUPE)Int. to Information Processing I

October 2021 - December 2021, Fall quarter
(IUPE)Int. to Information Processing II

December 2020 - February 2021, Winter quarter
(IUPE)Int. to Information Processing II

December 2020 - February 2021, Winter quarter
(IUPE)Int. to Information Processing II

December 2020 - February 2021, Winter quarter
情報ネットワーク特論

October 2020 - March 2021, Second semester
(IUPE)Int. to Information Processing I

October 2020 - December 2020, Fall quarter
(IUPE)Int. to Information Processing I

October 2020 - December 2020, Fall quarter
(IUPE)Int. to Information Processing I

October 2020 - December 2020, Fall quarter
(IUPE)Int. to Information Processing II

December 2019 - February 2020, Winter quarter
(IUPE)Int. to Information Processing II

December 2019 - February 2020, Winter quarter
情報ネットワーク特論

October 2019 - March 2020, Second semester
情報ネットワーク特論

October 2019 - March 2020, Second semester
(IUPE)Int. to Information Processing I

October 2019 - December 2019, Fall quarter
(IUPE)Int. to Information Processing I

October 2019 - December 2019, Fall quarter
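Introduction to Information Processing

April 2019 - June 2019, Spring quarter
(IUPE) Introduction to Information Processing

April 2019 - June 2019, Spring quarter
(IUPE) Introduction to Information Processing

April 2019 - June 2019, Spring quarter
情報ネットワーク特論

October 2018 - March 2019, Second semester
Introduction to Information Processing

April 2018 - June 2018, Spring quarter
Introduction to Information Processing

April 2018 - June 2018, Spring quarter
情報ネットワーク特論

October 2017 - March 2018, Second semester
Introduction to Information Processing

April 2017 - September 2017, First semester
Introduction to Information Processing

April 2017 - June 2017, Spring quarter
Introduction to Information Processing

April 2017 - June 2017, Spring quarter
情報ネットワーク特論

October 2016 - March 2017, Second semester
Introduction to Information Processing

April 2016 - September 2016, First semester
情報ネットワーク特論

October 2015 - March 2016, Second semester
Introduction to Information Processing

April 2015 - September 2015, First semester
情報ネットワーク特論

October 2014 - March 2015, Second semester
Introduction to Information Processing

April 2014 - September 2014, First semester
情報ネットワーク特論

October 2013 - March 2014, Second semester
情報ネットワーク特論

October 2012 - March 2013, Second semester
情報ネットワーク特論

October 2011 - March 2012, Second semester
情報ネットワーク特論

October 2010 - March 2011, Second semester
情報処理概論

April 2010 - September 2010, First semester
情報処理概論

April 2009 - September 2009, First semester
情報処理概論

April 2008 - September 2008, First semester
情報処理概論

October 2007 - March 2008, Second semester
情報処理概論

April 2007 - September 2007, First semester
情報処理概論

April 2006 - September 2006, First semester
情報処理概論

April 2005 - September 2005, First semester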
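基幹教育セミナー

June 2025 - August 2025, Summer quarter
高性能並列計算法特論Ⅱ

June 2025 - August 2025, Summer quarter
[G]High-Performance Parallel Computing II

June 2025 - August 2025, Summer quarter
【通年】情報理工学演習

April 2025 - March 2026, Full year
【通年】情報理工学研究Ⅰ

April 2025 - March 2026, Full year
【通年】情報理工学講究

April 2025 - March 2026, Full year
情報理工学論述Ⅰ

April 2025 - September 2025, First semester
情報理工学読解

April 2025 - September 2025, First semester
情報理工学論議Ⅰ

April 2025 - September 2025, First semester
高性能並列計算法特論Ⅰ

April 2025 - June 2025, Spring quarter
[G]High-Performance Parallel Computing I

April 2025 - June 2025, Spring quarter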
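(IUPE)Int. to Information Processing II

December 2024 - February 2025, Winter quarter
情報ネットワーク特論

December 2024 - February 2025, Winter quarter
通信ネットワークB

December 2024 - February 2025, Winter quarter
通信ネットワークⅡ

December 2024 - February 2025, Winter quarter
情報理工学演示

October 2024 - March 2025, Second semester
情報理工学論議Ⅱ

October 2024 - March 2025, Second semester
情報理工学論述Ⅱ

October 2024 - March 2025, Second semester
（後期）通信ネットワーク

October 2024 - March 2025, Second semester
(IUPE)Int. to Information Processing I

October 2024 - December 2024, Fall quarter
通信ネットワークA

October 2024 - December 2024, Fall quarter
通信ネットワークⅠ

October 2024 - December 2024, Fall quarter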
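High-Performance Parallel Computing II

June 2024 - August 2024, Summer quarter
高性能並列計算法特論Ⅱ

June 2024 - August 2024, Summer quarter
【通年】情報理工学演習

April 2024 - March 2025, Full year
【通年】情報理工学研究Ⅰ

April 2024 - March 2025, Full year
【通年】情報理工学講究

April 2024 - March 2025, Full year
【修士】高性能並列計算法特論

April 2024 - September 2024, First semester
情報理工学読解

April 2024 - September 2024, First semester
情報理工学論議Ⅰ

April 2024 - September 2024, First semester
情報理工学論述Ⅰ

April 2024 - September 2024, First semester
High-Performance Parallel Computing I

April 2024 - June 2024, Spring quarter
高性能並列計算法特論Ⅰ

April 2024 - June 2024, Spring quarter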

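Visiting, concurrent, and part-time lecturer positions at other universities and institutions

2024: 九州工業大学情報工学部 (Kyushu Institute of Technology); Category: Part-time lecturer; Domestic/overseas: Domestic
2023: 九州工業大学情報工学部 (Kyushu Institute of Technology); Category: Part-time lecturer; Domestic/overseas: Domestic
2023: 岡山大学工学部 (Okayama University, Faculty of Engineering); Category: Part-time lecturer; Domestic/overseas: Domestic
2023: 放送大学 (The Open University of Japan); Category: Visiting faculty
2022: 放送大学 (The Open University of Japan); Category: Visiting faculty
2022: 九州工業大学情報工学部 (Kyushu Institute of Technology); Category: Part-time lecturer; Domestic/overseas: Domestic
2021: 放送大学 (The Open University of Japan); Category: Part-time lecturer; Domestic/overseas: Domestic

Term, day and period, or duration: In charge of face-to-face classes (8 sessions in total)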

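Other educational activities and notes

2023: Class advisor; Undergraduate
2011: Other notes: Joined the Aoyagi Laboratory of the Faculty of Information Science and Electrical Engineering and provided the de facto supervision of the graduation research of two undergraduate students. Also joined the Murakami Laboratory of the same faculty and provided the de facto supervision of the graduation research of one second-year master's student.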

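University-wide committees and positions

April 2017 - present: 男女共同参画推進室 (Office for the Promotion of Gender Equality)
April 2008 - March 2012: Public relations committee member
April 2007 - March 2012: 百年誌編集委員会 (Centennial history editorial committee)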

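Social contribution activities

スーパーコンピュータ超入門

九州大学情報基盤研究開発センター, October 2020

Audience: Working adults and general public, academic organizations, companies, civic groups, government agencies

Type: Seminar or workshop

Aimed at people who know the word "supercomputer" but are not sure what one is, this seminar introduces the role of supercomputers and how they differ from personal computers.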
Participated in the specification meetings for MPI (Message Passing Interface), the international standard for parallel programming

2016
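社会人向けスパコン実践スクール (hands-on supercomputer school for working adults): Today's PCs far exceed the specifications of what used to be called mainframe computers and have become easy to obtain. In this seminar, participants build an 8-node parallel computer (PC cluster) from PCs of the kind used in offices, install software such as Linux and MPI, connect the cluster to a network, run simulation codes from their own PCs, and evaluate the performance.

財団法人計算科学振興財団; 大学院GP「大学連合による計算科学の最先端人材育成」; 神戸大学BTセンター (Kobe Port Island), June 2009

Audience: Working adults and general public, academic organizations, companies, civic groups, government agencies

Type: Seminar or workshop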

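Advanced professional education activities for other countries

February 2020 - March 2020: Japan Science and Technology Agency (JST) 「さくらサイエンスプラン」 (Sakura Science Plan), science and technology training course 「ミャンマーの数学科の大学院生が数学のスーパーコンピューティングへの応用を学ぶ」 (graduate students in mathematics from Myanmar learn applications of mathematics to supercomputing)

Main home country of students/trainees: Myanmar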