Tomoyuki Tsuchiya Last modified date:2021.06.25

Associate Professor / Department of Multicultural Society / Faculty of Languages and Cultures

1. Tomoyuki Tsuchiya, Extracting and Analyzing English Multi-word Expressions with Slots: A Case Study of 'take', Proceedings of the 27th Annual Meeting of the Association for Natural Language Processing, 1134-1137, 2021.03.
2. Formulaic Language for Language Creativity.
3. Prediction of intonation markers from linguistic and acoustic features, Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014, 311-315, 2014.01, Because of the tremendous effort required for recording and transcription, large-scale spoken language corpora have hardly been developed in Japanese, with the notable exception of the Corpus of Spontaneous Japanese (CSJ). Various research groups have individually developed conversation corpora in Japanese, but these corpora are transcribed according to different conventions and have few annotations in common, and some of them lack fundamental annotations that are prerequisites for conversation research. To remedy this situation by sharing existing conversation corpora that cover diverse styles and settings, we have tried to automatically transform a transcription made under one convention into one made under another. Using a conversation corpus transcribed in both the Conversation-Analysis style (CA-style) and the CSJ-style, we analyzed the correspondence between CA's 'intonation markers' and CSJ's 'tone labels,' and constructed a statistical model that converts tone labels into intonation markers with reference to linguistic and acoustic features of the speech. The results showed that there is considerable variance in intonation marking even between trained transcribers. The model predicted the presence of the intonation markers with 85% accuracy and classified the types of the markers with 72% accuracy.
4. Survey of Conversational Behavior.
5. Towards the design of a balanced corpus of everyday Japanese conversation, Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, 4434-4439, 2016.01, In 2016, we set about building a large-scale corpus of everyday Japanese conversation, a collection of conversations embedded in naturally occurring activities in daily life. We will collect more than 200 hours of recordings over six years, publishing the corpus in 2022. To construct such a huge corpus, we have conducted a pilot project, one of whose purposes is to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner. For this purpose, we conducted a survey of everyday conversational behavior among about 250 adults, in order to reveal how diverse everyday conversational behavior is and to build an empirical foundation for corpus design. The questionnaire asked when, where, for how long, with whom, and in what kind of activity informants were engaged in conversations. We found that ordinary conversations show the following tendencies: i) they mainly consist of chats, business talks, and consultations; ii) in general, the number of participants is small and the duration of the conversation is short; iii) many conversations are conducted in private places such as homes, as well as in public places such as offices and schools; and iv) some questionnaire items are related to each other. This paper gives an overview of this survey study and then discusses how to design a large-scale corpus of everyday Japanese conversation on this basis.
6. Organizing a University Online English Course without Classes.
7. Media-dependent Characteristics of Formulaic Expressions.
8. Embodiment through Media.