Kyushu University Researcher Information
List of Presentations
Koji Inoue (井上 弘士) Data last updated: 2024.04.07

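Professor / Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology, Advanced Information and Communication Systems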


Conference Presentations, etc.
1. Koki Ishida, Ilkwon Byun, Ikki Nagaoka, Kousuke Fukumitsu, Masamitsu Tanaka, Satoshi Kawakami, Teruo Tanimoto, Takatsugu Ono, Jangwoo Kim, and Koji Inoue, SuperNPU: An Extremely Fast Neural Processing Unit Using Superconducting Logic Devices, IEEE/ACM International Symposium on Microarchitecture (MICRO), 2020.10, Superconductor single-flux-quantum (SFQ) logic family has been recognized as a highly promising solution for the post-Moore's era, thanks to its ultra-fast and low-power switching characteristics. Therefore, researchers have made a tremendous amount of effort in various aspects to promote the technology and automate its circuit design process (e.g., low-cost fabrication, design tool development). However, there has been no progress in designing a convincing SFQ-based architectural unit due to the architects' lack of understanding of the technology's potentials and limitations at the architecture level. In this paper, we present how to architect an SFQ-based architectural unit by providing design principles with an extreme-performance neural processing unit (NPU). To achieve the goal, we first implement an architecture-level simulator to model an SFQ-based NPU accurately. We validate this model using our die-level prototypes, design tools, and logic cell library. This simulator accurately measures the NPU's performance, power consumption, area, and cooling overheads. Next, driven by the modeling, we identify key architectural challenges for designing a performance-effective SFQ-based NPU (e.g., expensive on-chip data movements and buffering). Lastly, we present SuperNPU, our example SFQ-based NPU architecture, which effectively resolves the challenges. Our evaluation shows that the proposed design outperforms a conventional state-of-the-art NPU by 23 times. With free cooling provided as done in quantum computing, the performance per chip power increases up to 490 times. 
Our methodology can also be applied to other architecture designs with SFQ-friendly characteristics.
2. Teruo Tanimoto, Shuhei Matsuo, Satoshi Kawakami, Yutaka Tabuchi, Masao Hirokawa, and Koji Inoue, How many trials do we need for reliable NISQ computing?, The First International Workshop on Quantum Computing: Circuits Systems Automation and Applications, 2020.07.
3. Teruo Tanimoto, Shuhei Matsuo, Satoshi Kawakami, Yutaka Tabuchi, Masao Hirokawa, and Koji Inoue, Practical error modeling toward realistic NISQ simulation, The First International Workshop on Quantum Computing: Circuits Systems Automation and Applications, 2020.07.
4. Koki Ishida, Masamitsu Tanaka, Ikki Nagaoka, Takatsugu Ono, Satoshi Kawakami, Teruo Tanimoto, Akira Fujimaki, Koji Inoue, 32 GHz 6.5 mW Gate-Level-Pipelined 4-bit Processor using Superconductor Single-Flux-Quantum Logic, 2020 Symposia on VLSI Technology and Circuits, 2020.06.
5. G Georgakoudis, N Jain, T Ono, K Inoue, S Miwa, A Bhatele, Evaluating the Impact of Energy Efficient Networks on HPC Workloads, 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 2020.01.
6. Keitaro Oka, Satoshi Kawakami, Teruo Tanimoto, Takatsugu Ono, Koji Inoue, Enhancing a manycore-oriented compressed cache for GPGPU, International Conference on High Performance Computing in Asia-Pacific Region, 2020.01.
7. Susumu Mashimo, Ryota Shioya, Koji Inoue, Energy Efficient Runahead Execution on a Tightly Coupled Heterogeneous Core, International Conference on High Performance Computing in Asia-Pacific Region, 2020.01.
10. Susumu Mashimo, Akifumi Fujita, Reoma Matsuo, Seiya Akaki, Akifumi Fukuda, Toru Koizumi, Junichiro Kadomoto, Hidetsugu Irie, Masahiro Goshima, Koji Inoue, Ryota Shioya, An Open Source FPGA-Optimized Out-of-Order RISC-V Soft Processor, IEEE International Conference on Field Programmable Technology, 2019.12.
11. Ikki Nagaoka, Masamitsu Tanaka, Koji Inoue, Akira Fujimaki, A 48GHz 5.6mW gate-level-pipelined multiplier using single-flux quantum logic, IEEE International Solid-State Circuits Conference (ISSCC 2019), 2019.02.
12. Takatsugu Ono, Zhe Chen and Koji Inoue, Improving Lifetime in MLC Phase Change Memory using Slow Writes, International Japan-Africa Conference on Electronics, Communication and Computations, 2018.12.
13. Yusuke Inoue, Takatsugu Ono and Koji Inoue, Situation-Based Dynamic Frame-Rate Control for On-Line Object Tracking, International Japan-Africa Conference on Electronics, Communication and Computations, 2018.12.
14. Masamitsu Tanaka, Yuki Hatanaka, Yuichi Matsui, Ikki Nagaoka, Koki Ishida, Kyosuke Sano, Taro Yamashita, Takatsugu Ono, Koji Inoue, Akira Fujimaki, 30-GHz Operation of Datapath for Bit-Parallel, Gate-Level-Pipelined Rapid Single-Flux-Quantum Microprocessors, Applied Superconductivity Conference, 2018.10.
15. Omar M. Saad, K. Inoue, Ahmed Shalaby, Lotf Samy, and Mohammed S. Sayed, Autoencoder based Features Extraction for Automatic Classification of Earthquakes and Explosions, the 17th IEEE/ACIS International Conference on Computer and Information Science, 2018.06.
16. Ryuichi Sakamoto, Tapasya Patki, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Daniel Ellsworth, Barry Rountree, Martin Schulz, Analyzing Resource Trade-offs in Hardware-overprovisioned Supercomputers, the 32nd International Parallel and Distributed Processing Symposium, 2018.05.
17. Mihiro Sonoyama, Takatsugu Ono, Osamu Muta, Haruichi Kanaya, Koji Inoue, Wireless Spoofing-Attack Prevention Using Radio-Propagation Characteristics, IEEE International Conference on Dependable, Autonomic and Secure Computing, 2017.11.
18. Teruo Tanimoto, Takatsugu Ono, Koji Inoue, CPCI Stack: Metric for Accurate Bottleneck Analysis on OoO Microprocessors, International Symposium on Computing and Networking, 2017.11.
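19. Yuki Hatanaka, Yuichi Matsui, Masamitsu Tanaka, Kyosuke Sano, Akira Fujimaki, Koki Ishida, Takatsugu Ono, Koji Inoue, Component Circuit Design for Single-Flux-Quantum Gate-Level-Pipelined Microprocessors (Superconductive Electronics), IEICE Technical Report, 2017.08.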
20. Masamitsu Tanaka, Ryo Sato, Yuki Hatanaka, Yuichi Matsui, Hiroyuki Akaike, Akira Fujimaki, Koki Ishida, Takatsugu Ono, Koji Inoue, High-Throughput Bit-Parallel Arithmetic Logic Unit Using Rapid Single-Flux-Quantum Logic, International Superconductive Electronics Conference, 2017.06.
21. Ryuichi Sakamoto, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Tapasya Patki, Daniel Ellsworth, Barry Rountree, and Martin Schulz, Production Hardware Overprovisioning: Real-world Performance Optimization using an Extensible Power-aware Resource Management Framework, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2017), 2017.05.
22. Satoshi Imamura, Keitaro Oka, Yuichiro Yasui, Yuichi Inadomi, Katsuki Fujisawa, Toshio Endo, Koji Ueno, Keiichiro Fukazawa, Nozomi Hata, Yuta Kakibuka, Koji Inoue, Takatsugu Ono, Evaluating the Impacts of Code-Level Performance Tunings on Power Efficiency, IEEE International Conference on Big Data, 2016.12.
23. Satoshi Imamura, Yuichiro Yasui, Koji Inoue, Takatsugu Ono, Hiroshi Sasaki, Katsuki Fujisawa, Power-Efficient Breadth-First Search with DRAM Row Buffer Locality-Aware Address Mapping, the 1st High Performance Graph Data Management and Processing workshop, 2016.11.
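24. Koki Ishida, Masamitsu Tanaka, Takatsugu Ono, Koji Inoue, Proposal of a Shift-Register-Based Cache Memory Architecture Using Single-Flux-Quantum Circuits (Electronic Components and Materials) -- (Design Gaia 2016: New Field of VLSI Design), IEICE Technical Report, 2016.11.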
25. Koki Ishida, Masamitsu Tanaka, Takatsugu Ono, Koji Inoue, Single-Flux-Quantum Cache Memory Architecture, International SoC Design Conference, 2016.10.
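26. Tatsuya Fujii, Takatsugu Ono, Haruichi Kanaya, Koji Inoue, Formulation of Attack-Feasibility Conditions in a Device Authentication Scheme Using Received Signal Strength (Computer Systems), IEICE Technical Report, 2016.08.
27. Tohru Ishihara, Akihiko Shinya, Koji Inoue, Kengo Nozaki, Masaya Notomi, Proposal of a Parallel Adder Circuit Based on Optical Pass-Gate Logic and Its Functional Verification Using an Optoelectronic Hybrid Circuit Simulator (Circuits and Systems), IEICE Technical Report, 2016.06.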
28. 藤井 卓, Takatsugu Ono, Koji Inoue, Control Performance Evaluation of a Speculative Execution Technique for Many-Core Processors Targeting Model Predictive Control (VLSI Design Technologies), IEICE Technical Report, 2016.01.
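29. Yusuke Inoue, Takatsugu Ono, Koji Inoue, A Dynamic Frame-Rate Control Technique for Reducing the Energy Consumption of Object Tracking Systems (Electronic Components and Materials), IEICE Technical Report, 2015.12.
30. Yusuke Inoue, Takatsugu Ono, Koji Inoue, A Dynamic Frame-Rate Control Technique for Reducing the Energy Consumption of Object Tracking Systems (Integrated Circuits), IEICE Technical Report, 2015.12.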
31. Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi, Analyzing and Mitigating the Impact of Manufacturing Variability in Power-Constrained Supercomputing, The International Conference for High Performance Computing, Networking, Storage and Analysis, 2015.11.
32. Takeshi Soga, Hiroshi Sasaki, Tomoya Hirao, Masaaki Kondo, Koji Inoue, A flexible hardware barrier mechanism for many-core processors, Asia and South Pacific Design Automation Conference, 2015.01.
33. Satoshi Imamura, Hiroshi Sasaki, Koji Inoue, Dimitrios S. Nikolopoulos, Power-capped DVFS and thread allocation with ANN models on modern NUMA systems, IEEE International Conference on Computer Design, 2014.10.
34. Yuki Abe, Hiroshi Sasaki, Shinpei Kato, Koji Inoue, Masato Edahiro, Martin Peres, Power and Performance Characterization and Modeling of GPU-accelerated Systems, the 28th IEEE International Parallel & Distributed Processing Symposium, 2014.05.
35. Keiichiro Fukazawa, Tomonori Tsuhata, Kyohei Yoshida, Masakazu Kuze, Masatsugu Ueda, Yuichi Inadomi, Koji Inoue, Performance and Power Consumption Evaluation of MHD Simulation for Magnetosphere on Parallel Computer System with CPU Power Capping, Extreme Green & Energy Efficiency in Large Scale Distributed Systems, 2014.05.
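36. Serina Egawa, Koji Inoue, Proposal of a Low-Energy Object Tracking System Based on Dynamic Frame-Rate Optimization (Integrated Circuits, Design Gaia 2013: New Field of VLSI Design), IEICE Technical Report, 2013.11, Online object tracking, which estimates the position coordinates of a specified target object in each frame of a video, is widely applied to automotive safety technologies such as obstacle tracking and drowsiness detection, and has become an important technology. Its applications to battery-powered mobile platforms have recently been expanding, so it must not only improve tracking accuracy but also achieve low energy consumption at the same time. This paper therefore proposes a dynamic frame-rate optimization scheme aimed at reducing the energy consumption of object tracking systems. The scheme dynamically switches to the optimal frame rate based on the energy consumption of the entire object tracking system, thereby cutting the energy spent on acquiring and processing more frames than necessary. Implementation and evaluation of the scheme using an energy consumption model showed that it reduces energy consumption by more than 70% while keeping tracking accuracy comparable to that of the conventional scheme.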
37. Hiroshi Sasaki, Satoshi Imamura, Koji Inoue, Coordinated Power-Performance Optimization in Manycores, the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013.09.
38. Satoshi Kawakami, Akihito Iwanaga, Koji Inoue, Many-core Acceleration for Model Predictive Control Systems, International Workshop on Manycore Embedded Systems, 2013.06.
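39. Satoshi Kawakami, Akihito Iwanaga, Koji Inoue, A Speculative Execution Technique for Real-Time Model Predictive Control on Many-Core Processors, Proceedings of the Symposium on Advanced Computing Systems and Infrastructures (SACSIS), 2013.05.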
40. Keitaro Oka, Hiroshi Sasaki, Koji Inoue, Line Sharing Cache: Exploring Cache Capacity with Frequent Line Value Locality, Asia and South Pacific Design Automation Conference, 2013.01.
41. Koji Inoue, SMYLE Project: Toward High-Performance, Low-Power Computing on Manycore-Processor SoCs, Asia and South Pacific Design Automation Conference (ASP-DAC), 2013.01.
42. Masaaki Kondo, Son Truong Nguyen, Takeshi Soga, Tomoya Hirao, Hiroshi Sasaki, Koji Inoue, SMYLEref: A Reference Architecture for Manycore-Processor SoCs, Asia and South Pacific Design Automation Conference (ASP-DAC), 2013.01.
43. Junya Kaida, Takuji Hieda, Ittetsu Taniguchi, Hiroyuki Tomiyama, Yuko Hara-Azumi, Koji Inoue, Task Mapping Techniques for Embedded Many-core SoCs, International SoC Design Conference, 2012.11.
44. Yuki Abe, Hiroshi Sasaki, Martin Peres, Koji Inoue, Kazuaki Murakami, Shinpei Kato, Power and Performance Analysis of GPU-Accelerated Systems, Workshop on Power-Aware Computing and Systems, 2012.10.
45. Hiroshi Sasaki, Teruo Tanimoto, Koji Inoue, and Hiroshi Nakamura, Scalability-Based Manycore Partitioning, International Conference on Parallel Architectures and Compilation Techniques, 2012.09.
46. Farhad Mehdipour, Krishna Chaitanya Nunna, Koji Inoue, Kazuaki Murakami, A Three-Dimensional Integrated Accelerator, Euromicro Conference on Digital System Design, 2012.09.
47. Koji Inoue and Masaaki Kondo, SMYLE: Scalable Many-core for Low-Energy computing (Invited), 12th International Forum on Embedded MPSoC and Multicore, 2012.07.
48. Yuki Abe, Hiroshi Sasaki, Koji Inoue, Kazuaki Murakami, Shinpei Kato, On the Power and Performance Analysis of GPU-Accelerated Systems, Poster session, 2012 USENIX Annual Technical Conference, 2012.06.
49. Satoshi Imamura, Hiroshi Sasaki, Naoto Fukumoto, Koji Inoue, and Kazuaki Murakami, Optimizing Power-Performance Trade-off for Parallel Applications through Dynamic Core-count and Frequency Scaling, 2nd Workshop on Runtime Environments/Systems, Layering, and Virtualized Environments (RESoLVE '12), 2012.03.
50. Lovic Gauthier, Farhad Mehdipour, Koji Inoue, Shinya Ueno, Hiroshi Sasaki, Efficient Barrier Synchronization for 2D Meshed NoC-based Many-core Processors, The 17th Workshop on Synthesis And System Integration of Mixed Information technologies, 2012.03.
51. F. Mehdipour, K. C. Nunna, L. Gauthier, K. Inoue and K. Murakami, A Thermal-Aware Mapping Algorithm for Reducing Peak Temperature of an Accelerator Deployed in a 3D Stack, International 3D System Integration Conference, 2012.01.
52. T. Hanada, H. Sasaki, K. Inoue and K. Murakami, Performance Evaluation of 3D Stacked Multi-Core Processors with Temperature Consideration, International 3D System Integration Conference, 2012.01.
53. Hiroaki Honda, Farhad Mehdipour, Hiroshi Kataoka, Koji Inoue, Kazuaki J. Murakami, Performance evaluations of finite difference applications realized on a single flux quantum circuits-based reconfigurable accelerator, Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011, 2011.12, Hardware accelerators integrating to general purpose processors are increasingly employed to achieve lower power consumption and higher processing speed, however, energy consumption of high performance accelerators has become a great issue on large scale parallel computer system. We have investigated the applicability of Single-Flux-Quantum (SFQ) circuits as a part of superconductivity technology in high-performance computing systems. Although it is possible to develop extraordinary low power processor by SFQ devices, conditional branch and loop back controls are difficult to be implemented by current SFQ technology. Therefore, we have proposed Reconfigurable Data-Path (RDP) accelerator which is avoiding those limitations of SFQ technology, while trying to get benefits of these circuits. In this research, we have implemented two-dimensional Heat (2D-Heat) and Finite Difference Time Domain (2D-FDTD) applications for investigating efficiency of using SFQ-RDP accelerator. According to performance evaluation results for above applications, execution times are 50.6 and 79.0 times smaller than those of the general purpose processor, and comparable with ones reported for GPU (Graphics Processing Units).
54. Koji Inoue, Adaptive Execution on 3D Microprocessors, 11th International Forum on Embedded MPSoC and Multicore, 2011.07.
56. Koji Inoue, 3D memory architecture, D43D: 3rd Design for 3D Silicon Integration Workshop, 2011.06.
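57. Hidetomo Shibamura, Hideki Miwa, Ryutaro Susukita, Tomoya Hirao, Yuichiro Ajima, Ikuo Miyoshi, Toshiyuki Shimizu, Hiroaki Ishihata, Koji Inoue, Optimization of All-to-All Communication by Packet Pacing and Its Simulation-Based Evaluation, High Performance Computing and Computational Science Symposium, 2011.01.
58. Naoto Fukumoto, Koji Inoue, Kazuaki Murakami, An On-Chip Memory Lending Technique for Multicore Processors Considering the Balance Between Computation and Memory Performance, High Performance Computing and Computational Science Symposium, 2011.01.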
59. H. Kataoka, H. Honda, F. Mehdipour, K. Inoue, and K. Murakami, Reducing Preprocessing Overhead Times in a Reconfigurable Accelerator of Finite Difference Applications, Symposium on Application Accelerators in High Performance Computing (SAAHPC'10), 2010.07.
60. Farhad Mehdipour, Hamid Noori, Bahman Javadi, Hiroaki Honda, Koji Inoue, Kazuaki Murakami, A Combined Analytical and Simulation-Based Model for Performance Evaluation of a Reconfigurable Instruction Set Processor, The 14th Asia and South-Pacific Design Automation Conference (ASP-DAC 2009), 2009.01.
61. R. Susukita, H. Ando, M. Aoyagi, H. Honda, Y. Inadomi, K. Inoue, S. Ishizuki, Y. Kimura, H. Komatsu, M. Kurokawa, K. Murakami, H. Shibamura, S. Yamamura, Y. Yu, Performance Prediction of Large-scale Parallel System and Application using Macro-level Simulation, the International Conference for High Performance Computing, Networking, Storage and Analysis (SC08), 2008.11.
62. N. Fukumoto, T. Mihara, K. Inoue, and K. Murakami, Analyzing the Impact of Data Prefetching on Chip MultiProcessors, IEEE Asia-Pacific Computer Systems Architecture Conference (ACSAC'08), 2008.08.
63. H. Noori, F. Mehdipour, K. Inoue, and K. Murakami, Enhancing Energy Efficiency of Processor-Based Embedded Systems through Post-Fabrication ISA Extension, International Symposium on Low Power Electronics and Design (ISLPED'08), 2008.08.
64. H. Noori, M. Goudarzi, K. Inoue, and K. Murakami, Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection, International Symposium on VLSI (ISVLSI'08), 2008.08.
65. F. Mehdipour, H. Noori, M. S. Zamani, K. Inoue, and K. Murakami, Design Space Exploration for a Coarse Grain Accelerator, Asia and South Pacific Design Automation Conference (ASPDAC'08), 2008.01.
66. J. Zushi, G. Zeng, H. Tomiyama, H. Takada, and K. Inoue, Improved Policies for Drowsy Caches in Embedded Processors, International Symposium on Electronics Design, Test & Applications, 2008.01.
69. H. Noori, F. Mehdipour, M. Goudarzi, S. Yamaguchi, K. Inoue, and K. Murakami, Energy Consumption Evaluation of an Adaptive Extensible Processor, Reconfigurable and Adaptive Architecture Workshop, 2007.12.
70. T. Mihara, K. Inoue, and K. Murakami, Adaptive Management of Cache Block Replication for High-Performance CMP, Workshop on Chip MultiProcessor: Processor Architecture and Memory Hierarchy related Issues, 2007.09.
71. H. Honda, T. Hayashi, Y. Inadomi, K. Inoue, and K. Murakami, Implementation and Evaluation of Fock Matrix Calculation Program on the Cell Processor, International Conference of Computational Methods in Sciences and Engineering, 2007.09.
72. T. Takami, J. Maki, J. Ooba, Y. Inadomi, H. Honda, R. Susukita, K. Inoue, T. Kobayashi, R. Nogita, and M. Aoyagi, Multi-physics Extension of OpenFMO Framework, International Conference of Computational Methods in Sciences and Engineering, 2007.09.
73. J. Maki, Y. Inadomi, T. Takami, R. Susukita, H. Honda, J. Ooba, T. Kobayashi, R. Nogita, K. Inoue and M. Aoyagi, One-sided Communication Implementation in FMO Method, International Conference on High Performance Computing, Grid and e-Science in Asia Pacific Region, 2007.09.
74. H. Noori, M. Goudarzi, K. Inoue, and K. Murakami, The Effect of Nanometer-Scale Technologies on the Cache Size Selection for Low Energy Embedded Systems, International Conference on Embedded Systems and Applications, 2007.06.
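75. Takatsugu Ono, Koji Inoue, Kazuaki Murakami, A Fast and Accurate Memory Architecture Simulation Technique Exploiting Memory Access Characteristics, Symposium on Advanced Computing Systems and Infrastructures (SACSIS), 2007.05.
76. Yoshiyuki Morie, Naoki Sueyasu, Toru Matsumoto, Takeshi Nanri, Hiroaki Ishihata, Koji Inoue, Kazuaki Murakami, An MPI Rank Placement Optimization Technique for Reducing Contention in Consideration of Communication Timing, Symposium on Advanced Computing Systems and Infrastructures (SACSIS), 2007.05.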
77. R. Komiya, K. Inoue, and K. Murakami, Dynamic Management Technique to Mitigate Performance Degradation for Low-Leakage Caches, The 10th IEEE Symposium on Low-Power and High-Speed Chips, 2007.04.
78. H. Noori, F. Mehdipour, K. Murakami, K. Inoue, and M. Goudarzi, Generating and Executing Multi-Exit Custom Instructions for an Adaptive Extensible Processor, The European Event for Electronic System Design & Test (DATE'07), 2007.04.
79. K. Inoue, H. Tanaka, V. Moshnyaga, K. Murakami, A Low Power I-Cache Design with Tag-Comparison Reuse, The International Symposium on System-On-Chip, 2004.11.
80. Koji Inoue, Energy-Security Tradeoff in a Secure Cache Architecture Against Buffer Overflow Attacks, Workshop on Architectural Support for Security and Anti-Virus (WASSA), 2004.10.
81. R. Komiya, K. Inoue, V. Moshnyaga, K. Murakami, Quantitative Evaluation of Leakage Reduction Algorithm for L1 Data Caches, The International SoC Design Conference (ISOCC), 2004.10.
82. Vasily G. Moshnyaga, Koji Inoue, Mizuka Fukagawa, Reducing energy consumption of video memory by bit-width compression, Proceedings of the 2002 International Symposium on Low Power Electronics and Design, 2002.01, [URL], A new architectural technique to reduce energy dissipation of video memory is proposed. Unlike existing approaches, the technique exploits the pixel correlation in video sequences, dynamically adjusting the memory bit-width to the number of bits changed per pixel. Instead of treating the data bits independently, we group the most significant bits together, activating the corresponding group of bit-lines adaptively to data variation. The method is neither restricted to specific bit-patterns nor dependent on the storage phase. It works equally well on read and write accesses, as well as during precharging. Simulation results show that using this method we can reduce the total energy consumption of video memory by 20% without affecting the picture quality.
83. Koji Inoue, Koji Kai, Kazuaki Murakami, Dynamically variable line-size cache exploiting high on-chip memory bandwidth of merged DRAM/logic LSIs, Proceedings of the 1999 5th International Symposium on High-Performance Computer Architecture, HPCA, 1999.01, This paper proposes a novel cache architecture suitable for merged DRAM/logic LSIs, which is called "dynamically variable line-size cache (D-VLS cache)". The D-VLS cache can optimize its line-size according to the characteristic of programs, and attempts to improve the performance by exploiting the high on-chip memory bandwidth. In our evaluation, it is observed that the performance improvement achieved by a direct-mapped D-VLS cache is about 27%, compared to a conventional direct-mapped cache with fixed 32-byte lines.
84. Koji Inoue, V. G. Moshnyaga, K. Murakami, A history-based i-cache for low-energy multimedia applications, Proceedings of the 2002 International Symposium on Low Power Electronics and Design, [URL], This paper proposes a history-based tag-comparison scheme for reducing energy consumption of direct-mapped instruction caches. The proposed cache efficiently exploits program-execution footprints recorded in the Branch Target Buffer (BTB), and attempts to detect and eliminate unnecessary tag checks at run time. Simulation results show that our approach can eliminate up to 95% of tag checks, saving the cache energy by 17%, while affecting the processor performance by only 0.2%.
85. Koji Inoue, Tohru Ishihara, Kazuaki Murakami, Way-predicting set-associative cache for high performance and low energy consumption, Proceedings of the 1999 International Symposium on Low Power Electronics and Design (ISLPED), [URL], This paper proposes a new approach using way prediction for achieving high performance and low energy consumption of set-associative caches. By accessing only a single cache way predicted, instead of accessing all the ways in a set, the energy consumption can be reduced. This paper shows that the way-predicting set-associative cache improves the ED (energy-delay) product by 60-70% compared to a conventional set-associative cache.
