Updated on 2025/06/09

Information

 

ZHAO YUTING
Organization
Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology, Assistant Professor
School of Engineering, Department of Electrical Engineering and Computer Science (Concurrent)
Graduate School of Information Science and Electrical Engineering, Department of Information Science and Technology (Concurrent)
Title
Assistant Professor
Contact information
Email address
Tel
092-802-3668
Research Areas

  • Informatics / Intelligent informatics

Degree

  • Doctor of Philosophy

Research History

  • Kyushu University, Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology, Assistant Professor

    2023.4 - Present

Education

  • Tokyo Metropolitan University    

    2020.4 - 2023.3

Research Interests / Research Keywords

  • Research theme: Machine Learning, Artificial Intelligence

    Keyword: Natural Language Processing

    Research period: 2023.4 - Present

Awards

  • The 10th AAMT NAGAO AWARD Student Award

    2023.6  

Papers

  • Multimodal Neural Machine Translation based on Image‑Text Semantic Correspondence

    Yuting Zhao

    Journal of the Japanese Society for Artificial Intelligence   ( 39 )   55 - 55   2024.1

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Multimodal Neural Machine Translation based on Image‑Text Semantic Correspondence Invited

    Yuting Zhao

    Asia‑Pacific Association for Machine Translation Journal   ( 79 )   10 - 15   2023.12

    Language:Japanese   Publishing type:Research paper (scientific journal)  

  • Multimodal Robustness for Neural Machine Translation Reviewed

    Yuting Zhao, Ioan Calapodescu

    Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022   8505 - 8516   2022.12

    Language:Others   Publishing type:Research paper (other academic)  

    In this paper, we look at the case of a generic text-to-text NMT model that has to deal with data coming from various modalities, like speech, images, or noisy text extracted from the web. We propose a two-step method, based on composable adapters, to deal with this problem of multimodal robustness. In the first step, we separately learn domain adapters and modality-specific adapters to deal with noisy input coming from various sources: ASR, OCR, or noisy text (UGC). In the second step, we combine these components at runtime via dynamic routing or, when the source of noise is unknown, via two new transfer learning mechanisms (Fast Fusion and Multi Fusion). We show that our method provides a flexible, state-of-the-art architecture able to deal with noisy multimodal inputs.
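
    The two-step adapter scheme described above can be sketched in miniature; the layer sizes, the randomly initialized stand-in adapters, and the routing-weight values below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck = 8, 4  # toy sizes, not the paper's

def make_adapter():
    """A bottleneck adapter: down-project, ReLU, up-project, residual add."""
    w_down = rng.normal(scale=0.1, size=(d_model, d_bottleneck))
    w_up = rng.normal(scale=0.1, size=(d_bottleneck, d_model))
    return lambda h: h + np.maximum(h @ w_down, 0.0) @ w_up

# Step 1: separately learned modality adapters (random stand-ins here).
adapters = {"asr": make_adapter(), "ocr": make_adapter(), "ugc": make_adapter()}

def dynamic_route(h, weights):
    """Step 2: combine adapter outputs at runtime with convex routing weights."""
    total = sum(weights.values())
    w = {name: value / total for name, value in weights.items()}
    return sum(w[name] * adapters[name](h) for name in adapters)

h = rng.normal(size=(1, d_model))  # one encoder hidden state
out = dynamic_route(h, {"asr": 0.6, "ocr": 0.3, "ugc": 0.1})
```

    With the router collapsed onto a single adapter, the mixture reduces to that adapter's output, which is the degenerate case of routing when the noise source is known.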

  • Region-attentive multimodal neural machine translation Reviewed

    Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu

    Neurocomputing   476   1 - 13   2022.3

    Language:Others   Publishing type:Research paper (scientific journal)  

    We propose a multimodal neural machine translation (MNMT) method with semantic image regions called region-attentive multimodal neural machine translation (RA-NMT). Existing studies on MNMT have mainly focused on employing global visual features or equally sized grid local visual features extracted by convolutional neural networks (CNNs) to improve translation performance. However, they neglect the effect of semantic information captured inside the visual features. This study utilizes semantic image regions extracted by object detection for MNMT and integrates visual and textual features using two modality-dependent attention mechanisms. The proposed method was implemented and verified on two neural architectures of neural machine translation (NMT): recurrent neural network (RNN) and self-attention network (SAN). Experimental results on different language pairs of Multi30k dataset show that our proposed method improves over baselines and outperforms most of the state-of-the-art MNMT methods. Further analysis demonstrates that the proposed method can achieve better translation performance because of its better visual feature use.

    DOI: 10.1016/j.neucom.2021.12.076
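
    The core idea, a decoder state attending over object-detector region features, can be sketched as follows; the shapes and the single-query formulation are simplifying assumptions, not the RA-NMT architecture itself.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def region_attention(query, regions):
    """Scaled dot-product attention over semantic image regions: the
    decoder query selects a convex combination of region features."""
    d = query.shape[-1]
    scores = regions @ query / np.sqrt(d)  # one score per region
    weights = softmax(scores)
    context = weights @ regions            # weighted sum of region features
    return context, weights

rng = np.random.default_rng(1)
regions = rng.normal(size=(5, 16))  # 5 object-detector region features
query = rng.normal(size=16)         # decoder hidden state at one time step
context, weights = region_attention(query, regions)
```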

  • Word-Region Alignment-Guided Multimodal Neural Machine Translation Reviewed

    Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu

    IEEE/ACM Transactions on Audio Speech and Language Processing   30   244 - 259   2022.1

    Language:Others   Publishing type:Research paper (scientific journal)  

    We propose word-region alignment-guided multimodal neural machine translation (MNMT), a novel model for MNMT that links the semantic correlation between textual and visual modalities using word-region alignment (WRA). Existing studies on MNMT have mainly focused on the effect of integrating visual and textual modalities. However, they do not leverage the semantic relevance between the two modalities. We advance the semantic correlation between textual and visual modalities in MNMT by incorporating WRA as a bridge. This proposal has been implemented on two mainstream architectures of neural machine translation (NMT): the recurrent neural network (RNN) and the transformer. Experiments on two public benchmarks, English-German and English-French translation tasks using the Multi30k dataset and English-Japanese translation tasks using the Flickr30kEnt-JP dataset prove that our model has a significant improvement with respect to the competitive baselines across different evaluation metrics and outperforms most of the existing MNMT models. For example, 1.0 BLEU scores are improved for the English-German task and 1.1 BLEU scores are improved for the English-French task on the Multi30k test2016 set; and 0.7 BLEU scores are improved for the English-Japanese task on the Flickr30kEnt-JP test set. Further analysis demonstrates that our model can achieve better translation performance by integrating WRA, leading to better visual information use.

    DOI: 10.1109/TASLP.2021.3138719
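
    A soft word-region alignment matrix of the kind used as a bridge between the modalities can be sketched minimally; the cosine-similarity scoring and the toy dimensions are assumptions for illustration only.

```python
import numpy as np

def word_region_alignment(word_emb, region_emb):
    """Soft WRA: cosine similarity between every word and every region,
    row-softmaxed so each word distributes attention over the regions."""
    w = word_emb / np.linalg.norm(word_emb, axis=1, keepdims=True)
    r = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    sim = w @ r.T                                  # (n_words, n_regions)
    e = np.exp(sim - sim.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
A = word_region_alignment(rng.normal(size=(6, 16)),   # 6 word embeddings
                          rng.normal(size=(4, 16)))   # 4 region embeddings
```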

  • TMEKU System for the WAT2021 Multimodal Translation Task

    Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu

    WAT 2021 - 8th Workshop on Asian Translation, Proceedings of the Workshop   174 - 180   2021.8

    Language:Others   Publishing type:Research paper (other academic)  

    We introduce our TMEKU system submitted to the English→Japanese Multimodal Translation Task for WAT 2021. We participated in the Flickr30kEnt-JP task and the Ambiguous MSCOCO Multimodal task under the constrained condition, using only the officially provided datasets. Our proposed system employs soft word-region alignment for multimodal neural machine translation (MNMT). The experimental results, evaluated on the BLEU metric provided by the WAT 2021 evaluation site, show that the TMEKU system achieved the best performance among all participating systems. Further analysis of the case study demonstrates that leveraging word-region alignment between the textual and visual modalities is the key to performance enhancement in our TMEKU system, which leads to better visual information use.

  • Neural machine translation with semantically relevant image regions

    Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu

    Proceedings of the Twenty-seventh Annual Meeting of the Association for Natural Language Processing   2021.4

    Language:Others  

  • Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions Reviewed

    Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu

    Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, EAMT 2020   105 - 114   2020.11

    Language:Others   Publishing type:Research paper (other academic)  

    Existing studies on multimodal neural machine translation (MNMT) have mainly focused on the effect of combining visual and textual modalities to improve translations. However, it has been suggested that the visual modality is only marginally beneficial. Conventional visual attention mechanisms have been used to select the visual features from equally sized grids generated by convolutional neural networks (CNNs), and may have had modest effects on aligning the visual concepts associated with textual objects, because the grid visual features do not capture semantic information. In contrast, we propose the application of semantic image regions for MNMT by integrating visual and textual features using two individual attention mechanisms (double attention). We conducted experiments on the Multi30k dataset and achieved an improvement of 0.5 and 0.9 BLEU points for the English→German and English→French translation tasks, compared with the MNMT with grid visual features. We also demonstrated concrete improvements in translation performance resulting from the use of semantic image regions.
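
    A minimal sketch of double attention, assuming a shared hidden size and simple concatenation of the two contexts (both assumptions for illustration):

```python
import numpy as np

def attend(query, keys):
    """Scaled dot-product attention of one query over a set of key vectors."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    e = np.exp(scores - scores.max())
    w = e / e.sum()
    return w @ keys

def double_attention(dec_state, text_states, region_feats):
    """Two modality-specific attentions; their contexts are concatenated
    before feeding the decoder (one common double-attention formulation)."""
    c_text = attend(dec_state, text_states)    # textual context
    c_img = attend(dec_state, region_feats)    # visual (region) context
    return np.concatenate([c_text, c_img])

rng = np.random.default_rng(3)
ctx = double_attention(rng.normal(size=16),        # decoder state
                       rng.normal(size=(7, 16)),   # 7 encoder states
                       rng.normal(size=(5, 16)))   # 5 semantic image regions
```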

  • Application of Unsupervised NMT Technique to Japanese--Chinese Machine Translation

    Yuting Zhao, Longtu Zhang, Mamoru Komachi

    Proceedings of the 33rd Annual Conference of JSAI   2019.6

    Language:Others  

  • TMU Japanese-Chinese unsupervised NMT system for WAT 2018 translation task

    Longtu Zhang, Yuting Zhao, Mamoru Komachi

    Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation   981 - 987   2018.12

    Language:Others   Publishing type:Research paper (other academic)  

    This paper describes the unsupervised neural machine translation system of Tokyo Metropolitan University for the WAT 2018 translation task, focusing on Chinese-Japanese translation. Neural machine translation (NMT) has recently achieved impressive performance on some language pairs, although the lack of large parallel corpora poses a major practical problem for its training. In this work, only monolingual data are used to train the NMT system through an unsupervised approach. This system creates synthetic parallel data through back-translation and leverages language models trained on both source and target domains. To further enhance the shared information in the bilingual word embeddings, a decomposed ideograph and stroke dataset for the ASPEC Chinese-Japanese language pair was also created. BLEU scores of 32.99 for ZH-JA and 26.39 for JA-ZH translation were recorded, respectively (both using stroke data).
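
    The back-translation step at the heart of this unsupervised setup can be sketched as follows; the token-reversing toy translator and the sample sentences are purely illustrative stand-ins for real models and corpora.

```python
def back_translate(mono_tgt, tgt2src):
    """Build synthetic parallel data: pair each real target-language
    sentence with its machine translation into the source language."""
    return [(tgt2src(t), t) for t in mono_tgt]

# Toy stand-in for a target-to-source model: reverse the tokens.
def toy_tgt2src(sentence):
    return " ".join(reversed(sentence.split()))

mono_ja = ["図書館 で 本 を 読む", "猫 が 好き"]
synthetic = back_translate(mono_ja, toy_tgt2src)
# The (synthetic source, real target) pairs then train the source-to-target
# model; alternating directions yields the iterative unsupervised loop.
```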

Presentations

  • A Short Introduction to Multimodal Machine Translation Invited International conference

    Yuting Zhao

    Dalian University of Technology  2023.9 

    Event date: 2023.9

    Language:Others   Presentation type:Oral presentation (general)  

    Country:Other  

  • Multimodal NMT based on Image-Text Semantic Correspondence Invited

    Yuting Zhao

    FY2023 3rd Meeting of the AAMT/Japio Patent Translation Study Group  2023.7 

    Event date: 2023.7

    Language:Others   Presentation type:Oral presentation (general)  

    Country:Other  

  • Multimodal Neural Machine Translation based on Image-Text Semantic Correspondence Invited

    Yuting Zhao

    Commemorative lecture for the 18th AAMT Nagao Award / 10th AAMT Nagao Award Student Award  2023.6 

    Event date: 2023.6

    Language:Others   Presentation type:Oral presentation (invited, special)  

    Country:Other  

    Other Link: https://aamt.info/event/generalmeeting2023-seminar

  • Multimodal Robustness for Neural Machine Translation

    Yuting Zhao

    The 2022 Conference on Empirical Methods in Natural Language Processing  2022.12 

    Event date: 2022.12

    Language:English  

    Country:Other  

  • TMEKU System for the WAT2021 Multimodal Translation Task

    Yuting Zhao

    The 8th Workshop on Asian Translation  2021.8 

    Event date: 2021.8

    Language:English  

    Country:Other  

  • Neural Machine Translation with Semantically Relevant Image Regions

    Yuting Zhao

    The 27th Annual Meeting of the Language Processing Society of Japan  2021.3 

    Event date: 2021.3

    Language:English  

    Country:Other  

  • Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions

    Yuting Zhao

    The 22nd Annual Conference of the European Association for Machine Translation  2020.11 

    Event date: 2020.11

    Language:English  

    Country:Other  

  • Application of Unsupervised NMT Technique to Japanese‑Chinese Machine Translation

    Yuting Zhao

    The 33rd Annual Conference of the Japanese Society for Artificial Intelligence  2019.6 

    Event date: 2019.6

    Language:English  

    Country:Other  

Industrial property rights

Patent   Number of applications: 0   Number of registrations: 0
Utility model   Number of applications: 1   Number of registrations: 0
Design   Number of applications: 0   Number of registrations: 0
Trademark   Number of applications: 0   Number of registrations: 0

Educational Activities

  • Assistant Professor

Award for Educational Activities

  • The 10th AAMT NAGAO AWARD Student Award

       

Visiting, concurrent, or part-time lecturers at other universities, institutions, etc.

  • 2024  Tokyo Metropolitan University  Classification:Affiliate faculty  Domestic/International Classification:Japan 

  • 2023  Tokyo Metropolitan University  Classification:Affiliate faculty  Domestic/International Classification:Japan 

Other educational activity and Special note

  • 2024  Class Teacher  Undergraduate