Kyushu University Researcher Information
List of Presentations
IWANA BRIAN KENJI (Iwana Brian Kenji)  Data last updated: 2024.04.01

Associate Professor / Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology


Conference Presentations
1. Brian Kenji Iwana, Using Motif-Based Features to Improve Signal Classification with Temporal Neural Networks, Asian Conference on Pattern Recognition (ACPR), 2023.11.
2. Brian Kenji Iwana, Vision Conformer: Incorporating Convolutions into Vision Transformer Layers, International Conference on Document Analysis and Recognition (ICDAR), 2023.08.
3. Brian Kenji Iwana, On Mini-Batch Training with Varying Length Time Series, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022.05.
4. Brian Kenji Iwana, Font Style that Fits an Image -- Font Generation Based on Image Context, International Conference on Document Analysis and Recognition (ICDAR), 2021.09.
5. Brian Kenji Iwana, Time Series Data Augmentation for Neural Networks by Time Warping with a Discriminative Teacher, International Conference on Pattern Recognition (ICPR), 2021.01.
6. Brian Kenji Iwana, What is the Reward for Handwriting? --- Handwriting Generation by Imitation Learning, International Conference on Frontiers in Handwriting Recognition (ICFHR), 2020.08, Analyzing the handwriting generation process is an important issue and has been tackled by various generative models, such as kinematics-based models and stochastic models. In this study, we use a reinforcement learning (RL) framework to realize handwriting generation with careful future planning. Human handwriting likewise relies on future planning; for example, planning is necessary to generate a closed trajectory like '0', which no shortsighted model, such as a Markovian model, can produce. For the algorithm, we employ generative adversarial imitation learning (GAIL). Typical RL algorithms require the manual definition of a reward function, which is crucial for controlling the generation process. In contrast, GAIL trains the reward function along with the other modules of the framework. In other words, through GAIL, we can learn the reward of the handwriting generation process from handwriting examples. Our experimental results qualitatively and quantitatively show that the learned reward captures the trends in handwriting generation and that GAIL is thus well suited for the acquisition of handwriting behavior.
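As a minimal sketch of the GAIL idea described above: a discriminator is trained to separate expert pen-trajectory steps from generated ones, and its output doubles as the learned reward, so no reward function has to be defined by hand. The two-dimensional state/action layout, layer sizes, and names below are hypothetical, not the paper's implementation.

    import torch
    import torch.nn as nn

    class RewardDiscriminator(nn.Module):
        """Scores (state, action) pairs; here state = pen position and
        action = pen displacement (a hypothetical layout)."""
        def __init__(self, state_dim=2, action_dim=2, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 1), nn.Sigmoid(),
            )

        def forward(self, state, action):
            # Probability that a (state, action) step came from an expert.
            return self.net(torch.cat([state, action], dim=-1))

        def reward(self, state, action):
            # Standard GAIL surrogate reward: high when the discriminator
            # mistakes a generated step for an expert one.
            d = self.forward(state, action).clamp(1e-6, 1 - 1e-6)
            return -torch.log(1.0 - d)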
7. Brian Kenji Iwana, Effect of Text Color on Word Embeddings, International Workshop on Document Analysis Systems (DAS), 2020.06, In natural scenes and documents, we can find a correlation between a text and its color. For instance, the word "hot" is often printed in red, while "cold" is often in blue. This correlation can be thought of as a feature that represents the semantic difference between words. Based on this observation, we propose the idea of using text color for word embeddings. While text-only word embeddings (e.g., word2vec) have been extremely successful, they often represent antonyms as similar since antonyms are often interchangeable in sentences. In this paper, we try two tasks to verify the usefulness of text color in understanding the meanings of words, especially in identifying synonyms and antonyms. First, we quantify the color distribution of words from book cover images and analyze the correlation between the color and the meaning of a word. Second, we retrain word embeddings with the color distribution of words as a constraint. By observing the changes in the word embeddings of synonyms and antonyms before and after retraining, we aim to understand which kinds of words are affected positively or negatively when text color information is incorporated into their embeddings.
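A rough sketch (an assumption, not the paper's exact objective) of retraining embeddings under a color constraint: keep vectors near their original word2vec-style values while encouraging pairwise embedding similarity to track the similarity of the words' color histograms. All names are hypothetical.

    import torch
    import torch.nn.functional as F

    def color_constrained_loss(emb, emb_orig, color_hist, alpha=0.1):
        # emb, emb_orig: (V, d) embeddings; color_hist: (V, k) color distributions.
        anchor = ((emb - emb_orig) ** 2).sum()            # stay near the original embeddings
        c = F.normalize(color_hist, dim=1)
        color_sim = c @ c.t()                             # (V, V) color similarity
        e = F.normalize(emb, dim=1)
        emb_sim = e @ e.t()                               # (V, V) embedding similarity
        constraint = ((emb_sim - color_sim) ** 2).mean()  # match the two similarity structures
        return anchor + alpha * constraint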
8. Brian Kenji Iwana, Explaining Convolutional Neural Networks using Softmax Gradient Layer-wise Relevance Propagation, International Conference on Computer Vision Workshops, 2019.10, [URL], Convolutional Neural Networks (CNNs) have become state of the art in the field of image classification. However, not everything is understood about their inner representations. This paper tackles the interpretability and explainability of the predictions of CNNs for multi-class classification problems. Specifically, we propose a novel visualization method of pixel-wise input attribution called Softmax-Gradient Layer-wise Relevance Propagation (SGLRP). The proposed model is a class-discriminative extension of Deep Taylor Decomposition (DTD) that uses the gradient of the softmax to backpropagate the relevance of the output probability to the input image. Through qualitative and quantitative analysis, we demonstrate that SGLRP can successfully localize and attribute the regions of input images that contribute to a target object's classification. We show that the proposed method excels at discriminating the target object's class from the other possible objects in the images. We confirm that SGLRP performs better than existing Layer-wise Relevance Propagation (LRP) based methods and can help in understanding the decision process of CNNs.
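To illustrate the softmax-gradient step that gives SGLRP its name, the sketch below computes the gradient of the target class probability with respect to the logits, p_t (1[k = t] - p_k), which is positive for the target class and negative for competing classes; in SGLRP this serves as the relevance that is then backpropagated LRP-style. The function name and tensor layout are assumptions.

    import torch

    def softmax_gradient_relevance(logits, target):
        # logits: (batch, classes); target: (batch,) class indices.
        p = torch.softmax(logits, dim=1)
        onehot = torch.zeros_like(p)
        onehot[torch.arange(p.size(0)), target] = 1.0
        # d p_t / d z_k = p_t * (1[k = t] - p_k)
        return p[torch.arange(p.size(0)), target].unsqueeze(1) * (onehot - p)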
9. Brian Kenji Iwana, Selective Super-Resolution for Scene Text Images, International Conference on Document Analysis and Recognition (ICDAR), 2019.09, [URL], In this paper, we enhance super-resolution for images containing scene text. Specifically, this paper proposes the use of Super-Resolution Convolutional Neural Networks (SRCNN) constructed to tackle issues associated with characters and text. We demonstrate that standard SRCNNs trained for general object super-resolution are not sufficient and that the proposed method is viable for creating a robust model for text. To do so, we analyze the characteristics of SRCNNs through quantitative and qualitative evaluations on scene text data. In addition, we analyze the correlation between layers using Singular Vector Canonical Correlation Analysis (SVCCA) and compare the filters of each SRCNN using t-SNE. Furthermore, to create a unified super-resolution model specialized for both text and objects, we use SRCNNs trained on the different data types combined with Content-wise Network Fusion (CNF). We integrate the SRCNN trained on character images with the SRCNN trained on general object images and verify the accuracy improvement on scene images that include text. We also examine how each SRCNN affects the super-resolution images after fusion.
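A rough sketch of the fusion idea: two SRCNN branches, one trained on character images and one on general objects, merged by a learned 1x1 convolution. The three-layer 9-1-5 SRCNN shape follows the original SRCNN design; the fusion details here are an assumption, not the paper's exact CNF configuration.

    import torch
    import torch.nn as nn

    def srcnn():
        # Classic 9-1-5 SRCNN: patch extraction, non-linear mapping, reconstruction.
        return nn.Sequential(
            nn.Conv2d(1, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 1), nn.ReLU(),
            nn.Conv2d(32, 1, 5, padding=2),
        )

    class FusedSRCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.text_branch = srcnn()      # would be pretrained on character images
            self.object_branch = srcnn()    # would be pretrained on general objects
            self.fuse = nn.Conv2d(2, 1, 1)  # learned fusion of the two outputs

        def forward(self, x):
            both = torch.cat([self.text_branch(x), self.object_branch(x)], dim=1)
            return self.fuse(both)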
10. Brian Kenji Iwana, Modality Conversion of Handwritten Patterns by Cross Variational Autoencoders, International Conference on Document Analysis and Recognition (ICDAR), 2019.09, [URL], This research attempts to construct a network that can convert online and offline handwritten characters to each other. The proposed network consists of two Variational Autoencoders (VAEs) with a shared latent space. The VAEs are trained to generate online and offline handwritten Latin characters simultaneously. In this way, we create a cross-modal VAE (Cross-VAE). During training, the proposed Cross-VAE is trained to minimize the reconstruction loss of the two modalities, the distribution loss of the two VAEs, and a novel third loss called the space sharing loss. This third loss, the space sharing loss, encourages the modalities to share the same latent space by penalizing the distance between the latent variables. Through the proposed method, mutual conversion of online and offline handwritten characters becomes possible. In this paper, we demonstrate the performance of the Cross-VAE through qualitative and quantitative analysis.
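A minimal sketch of the three-part objective described above: reconstruction terms for both modalities, a KL (distribution) term per VAE, and a space sharing term that penalizes the distance between the paired latent codes. The weighting and the choice of mean-squared distances are assumptions.

    import torch
    import torch.nn.functional as F

    def cross_vae_loss(x_on, x_off, recon_on, recon_off,
                       mu_on, logvar_on, mu_off, logvar_off, lam=1.0):
        # Reconstruction loss for the online and offline modalities.
        recon = F.mse_loss(recon_on, x_on) + F.mse_loss(recon_off, x_off)
        # Distribution (KL) loss for each VAE.
        kl_on = -0.5 * torch.mean(1 + logvar_on - mu_on.pow(2) - logvar_on.exp())
        kl_off = -0.5 * torch.mean(1 + logvar_off - mu_off.pow(2) - logvar_off.exp())
        # Space sharing loss: pull the two latent codes together.
        sharing = F.mse_loss(mu_on, mu_off)
        return recon + kl_on + kl_off + lam * sharing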
11. Brian Kenji Iwana, Dynamic Weight Alignment for Temporal Convolutional Neural Networks, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019.05, [URL], In this paper, we propose a method of improving temporal Convolutional Neural Networks (CNNs) by determining the optimal alignment of weights and inputs using dynamic programming. Conventional CNN convolutions linearly match the shared weights to a window of the input. However, a better alignment of the weights may exist. Thus, we propose the use of Dynamic Time Warping (DTW) to dynamically align the weights to the input of the convolutional layer. Specifically, the dynamic alignment overcomes issues such as temporal distortion by finding the minimal-distance matching of the weights and the inputs under constraints. We demonstrate the effectiveness of the proposed architecture on the Unipen online handwritten digit and character datasets, the UCI Spoken Arabic Digit dataset, and the UCI Activities of Daily Life dataset.
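A simplified NumPy sketch of the alignment idea for a single one-dimensional kernel: instead of multiplying weights and the input window position by position, first align them with DTW and accumulate the products along the warping path. Warping-path constraints and the full layer are omitted, and the names are hypothetical.

    import numpy as np

    def dtw_aligned_response(w, x):
        # w: (k,) kernel weights; x: (k,) input window; returns a scalar response.
        k = len(w)
        D = np.full((k + 1, k + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, k + 1):
            for j in range(1, k + 1):
                cost = abs(w[i - 1] - x[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        # Backtrack the minimal-distance path, summing w * x along it.
        i, j, resp = k, k, 0.0
        while i > 0 and j > 0:
            resp += w[i - 1] * x[j - 1]
            step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return resp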
12. Brian Kenji Iwana, Capturing Micro Deformations from Pooling Layers for Offline Signature Verification, International Conference on Document Analysis and Recognition (ICDAR), 2018.08, In this paper, we propose a novel Convolutional Neural Network (CNN) based method that extracts the location information (displacement features) of the maxima in the max-pooling operation and fuses it with the pooling features to capture the micro deformations between genuine signatures and skilled forgeries as a feature extraction procedure. After the feature extraction procedure, we apply support vector machines (SVMs) as writer-dependent classifiers for each user to build the signature verification system. Extensive experimental results on the GPDS-150, GPDS-300, GPDS-1000, GPDS-2000, and GPDS-5000 datasets demonstrate that the proposed method can discriminate well between genuine signatures and their corresponding skilled forgeries and achieves state-of-the-art results on these datasets.
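A rough sketch of the displacement-feature idea: keep not only the pooled maxima but also where each maximum sat inside its pooling window, and stack both as channels for the next layer. The exact fusion used in the paper is not reproduced; names and layout are assumptions.

    import torch
    import torch.nn.functional as F

    def pool_with_displacement(feat, k=2):
        # feat: (N, C, H, W); returns (N, 3C, H/k, W/k).
        pooled, idx = F.max_pool2d(feat, k, stride=k, return_indices=True)
        w = feat.size(3)
        rows, cols = idx // w, idx % w   # absolute coordinates of each maximum
        drow = (rows % k).float()        # row offset inside the pooling window
        dcol = (cols % k).float()        # column offset inside the pooling window
        return torch.cat([pooled, drow, dcol], dim=1)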
13. Brian Kenji Iwana, Introducing Local Distance-based Features to Temporal Convolutional Neural Networks, International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018.08, In this paper, we propose the use of local distance-based features determined by Dynamic Time Warping (DTW) for temporal Convolutional Neural Networks (CNNs). Traditionally, DTW is used as a robust distance measure for time series patterns. However, this traditional use of DTW only utilizes the scalar distance and discards the local distances between the dynamically matched sequence elements. This paper proposes recovering these local distances, or DTW features, and utilizing them as the input of a CNN. We demonstrate that these features can provide additional information for the classification of isolated handwritten digits and characters. Furthermore, we demonstrate that the DTW features can be combined with the spatial coordinate features in multi-modal fusion networks to achieve state-of-the-art accuracy on the Unipen online handwritten character datasets.
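A simplified sketch of recovering DTW features: match a sequence against a reference with DTW, then keep the per-element local distances along the warping path, rather than only the final scalar distance, as an extra input channel for a temporal CNN. Matching against a single reference sequence is an assumption made here for illustration.

    import numpy as np

    def dtw_local_distances(x, ref):
        # x: (n,) query; ref: (m,) reference; returns (n,) local distances.
        n, m = len(x), len(ref)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(x[i - 1] - ref[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        # Backtrack; record each query element's matched local distance.
        feats = np.zeros(n)
        i, j = n, m
        while i > 0 and j > 0:
            feats[i - 1] = abs(x[i - 1] - ref[j - 1])
            step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return feats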
