Kyushu University Researcher Information
Yoichi Tomiura (冨浦 洋一, とみうら よういち) Last updated: 2023.11.27

Professor / Faculty of Information Science and Electrical Engineering, Department of Informatics, Intelligence Science


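Main Research Themes
Research on the quantification of odors
Keywords: odorants, odor code, embedding, SMILES, olfactory bulb
2021.03.
Research on semi-automating feedback on essays written by English learners
Keywords: large language models, BERT, unsupervised learning
2023.02.
Research on searching for research data
Keywords: research data
2022.04.
Comprehensive research on building a large-scale information infrastructure for historical materials preserved on paper
Keywords: historical materials, text conversion, full-text search, information infrastructure
2018.08~2022.03.
Olfactory information processing using images of glomerular activation patterns on the olfactory bulb
Keywords: olfaction, odor information primitives, clustering of glomeruli
2014.10.
Extraction of latent research clusters using institutional repositories
Keywords: Author Topic Model, topic analysis, collaborative research, research administrators
2013.04~2016.03.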
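Automatic generation of questions and answers about the content of arbitrary English documents using language processing technology
Keywords: automatic generation of questions and answers, extensive reading, learning support, natural language processing
2012.04~2016.03.
Automatic estimation of the values underlying opinion texts
Keywords: human values, opinion texts, SVM, latent variables, Gibbs Sampling
2012.02~2021.03.
Organizing research papers to support scholarly information retrieval
Keywords: clustering, k-means, latent variables, statistical language models, distributional similarity, Gibbs Sampling
2011.04~2016.03.
Research on organizing documents on the WWW
Keywords: document classification, topic estimation, estimation of inter-cluster relations, latent classes, statistical language models, BIC
2010.10~2012.03.
Analysis of an Internet survey method based on probabilistic conversion
Keywords: Internet surveys, anonymity, probabilistic conversion
2007.07~2009.12.
Building, releasing, and using a corpus of English papers by native and non-native speakers collected from the Web
Keywords: Web documents, nativeness judgment, learner corpus, English education, English composition support system, natural language processing
2007.10~2012.03.
Research on methods for presenting alternatives to unnatural co-occurrence expressions in Japanese compositions written by international students
Keywords: similarity of occurrence contexts, natural co-occurrence expressions, Japanese composition support system, natural language processing
2008.06~2011.03.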
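Acquisition of semantic knowledge from large-scale language corpora
Keywords: semantic categories, case frames, causal relations, self-organization, probabilistic language models, natural language processing
2003.08~2010.09.
Estimating the nativeness of English documents based on probabilistic language models
Keywords: Web documents, nativeness judgment, probabilistic language models, hypothesis testing, natural language processing
2003.06~2009.09.
Research on dialogue systems using documents on the WWW
Keywords: dialogue, open-domain, surface cohesion of dialogue, semantic cohesion of dialogue
2005.09~2009.08.
Estimating word co-occurrence from language corpora
Keywords: word co-occurrence, syntactic disambiguation, multiple regression model, natural language processing
1999.06~2004.03.
Arrangement of nouns in a multidimensional space based on co-occurrence data
Keywords: word vectors, co-occurrence, example-based approach, natural language processing
2002.07~2003.10.
Research on language identification methods robust to similar languages and small target documents
Keywords: language identification, similar languages, WWW, information retrieval
2004.08~2009.09.
Translation support for Japanese collocations based on word co-occurrence on the Web
Keywords: translation, Web documents, co-occurrence, Word Sense Disambiguation
2003.08~2005.09.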
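Current Project Research
Computational Social Science Research
2013.04, Principal investigator: Douglas W. Oard, University of Maryland
Research that supports analysis in the social sciences through informatics approaches such as computer-based natural language processing.
Research Achievements
Major Original Papers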
1. Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi, Investigation of the structure-odor relationship using a Transformer model, Journal of Cheminformatics, https://doi.org/10.1186/s13321-022-00671-y, 2022.12, The relationships between molecular structures and their properties are subtle and complex, and the properties of odor are no exception. Molecules with similar structures, such as a molecule and its optical isomer, may have completely different odors, whereas molecules with completely distinct structures may have similar odors. Many works have attempted to explain the molecular structure-odor relationship from chemical and data-driven perspectives. The Transformer model is widely used in natural language processing and computer vision, and the attention mechanism included in the Transformer model can identify relationships between inputs and outputs. In this paper, we describe the construction of a Transformer model for predicting molecular properties and interpreting the prediction results. The SMILES data of 100,000 molecules are collected and used to predict the existence of molecular substructures, and our proposed model achieves an F1 value of 0.98. The attention matrix is visualized to investigate the substructure annotation performance of the attention mechanism, and we find that certain atoms in the target substructures are accurately annotated. Finally, we collect 4462 molecules and their odor descriptors and use the proposed model to infer 98 odor descriptors, obtaining an average F1 value of 0.33. For the 19 odor descriptors that achieved F1 values greater than 0.45, we also attempt to summarize the relationship between the molecular substructures and odor quality through the attention matrix.
2. Yasuko Hagiwara, Emi Ishita, Yukiko Watanabe, Yoichi Tomiura, Identifying Scholarly Search Skills Based on Resource and Document Selection Behavior among Researchers and Master’s Students in Engineering, College & Research Libraries, https://doi.org/10.5860/crl.83.4.610, 83, 4, 610-630, 2022.07.
3. Satoshi Fukuda, Emi Ishita, Yoichi Tomiura, Douglas W. Oard, Automating the Choice Between Single or Dual Annotation for Classifier Training, The 23rd International Conference on Asia-Pacific Digital Libraries (ICADL 2021), 10.1007/978-3-030-91669-5_19, 233-248, 2021.12, Many emerging digital library applications rely on automated classifiers that are trained using manually assigned labels. Accurately labeling training data for text classification requires either highly trained coders or multiple annotations, either of which can be costly. Previous studies have shown that there is a quality-quantity trade-off for this labeling process, and the optimal balance between quality and quantity varies depending on the annotation task. In this paper, we present a method that learns to choose between higher-quality annotation that results from dual annotation and higher-quantity annotation that results from the use of a single annotator per item. We demonstrate the effectiveness of this approach through an experiment in which a binary classifier is constructed for assigning human value categories to sentences in newspaper editorials.
4. Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi, Takaaki Soeda, Profile-Decomposing Output of Multi-Channel Odor Sensor Array, ECS Meeting Abstracts, MA2020-01, 2020.05.
5. Keiya Maekawa, Yoichi Tomiura, Satoshi Fukuda, Emi Ishita, Hideaki Uchiyama, Improving OCR for Historical Documents by Modeling Image Distortion, Lecture Notes in Computer Science, 10.1007/978-3-030-34058-2_31, 11853, 312-316, 2019.11.
6. Satoshi Fukuda, Yoichi Tomiura, Emi Ishita, Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model, Lecture Notes in Computer Science, 10.1007/978-3-030-27618-8_5, 11707, 65-75, 2019.08.
7. Emi Ishita, Satoshi Fukuda, Toru Oga, Douglas W. Oard, Kenneth R. Fleischmann, Yoichi Tomiura, An Shou Cheng, Toward Three-Stage Automation of Annotation for Human Values, iConference 2019, 2019.03, Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.
8. Shinjiro Okaku, Yoichi Tomiura, Emi Ishita, Shosaku Tanaka, Towards Generating Multiple-Choice Tests for Supporting Extensive Reading, Proc. the Seventh International Conference on Mobile, Hybrid, and On-line Learning (eLmL 2015), 2015.02, We propose a method for generating multiple-choice test for an English text selected by a learner and its answer, that are used to make a self-assessment whether the learner comprehends the text after reading it. In our method, the system extracts several important sentences from the text, and replaces one word in each of these sentences with its synonym (if possible). One of these sentences is then selected as a correct optional sentence, while further changes to the polarities or nouns in the remaining sentences are carried out to generate distractor optional sentences for the multiple-choice test. Our method has potential to make extensive reading in English more effective.
9. Yasuhiro Takayama, Yoichi Tomiura, Emi Ishita, Douglas W. Oard, Kenneth R. Fleischmann, An-Shou Cheng, A Word-Scale Probabilistic Latent Variable Model for Detecting Human Values, Proc. 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014), 1-10, 2014.12, This paper describes a probabilistic latent variable model that is designed to detect human values such as justice or freedom that a writer has sought to reflect or appeal to when participating in a public debate. The proposed model treats the words in a sentence as having been chosen based on specific values; values reflected by each sentence are then estimated by aggregating values associated with each word. The model can determine the human values for the word in light of the influence of the previous word. This design choice was motivated by syntactic structures such as noun+noun, adjective+noun, and verb+adjective. The classifier based on the model was evaluated on a test collection containing 102 manually annotated documents focusing on one contentious political issue --- Net neutrality, achieving the highest reported classification effectiveness for this task. We also compared our proposed classifier with human second annotator. As a result, the proposed classifier effectiveness is statistically comparable with human annotators.
10. Toshiaki Funatsu, Yoichi Tomiura, Emi Ishita, Kosuke Furusawa, Extracting Representative Words of a Topic Determined by Latent Dirichlet Allocation, Proc. The Sixth International Conference on Information, Process, and Knowledge Management (eKNOW 2014), 2014.03, Determining the topic of a document is necessary to understand the content of the document efficiently. Latent Dirichlet Allocation (LDA) is a method of analyzing topics. In LDA, a topic is treated as an unobservable variable to establish a probabilistic distribution of words. We can interpret the topic with a list of words that appear with high probability in the topic. This method works well when determining a topic included in many documents having a variety of contents. However, it is difficult to interpret the topic just using conventional LDA when determining the topic in a set of article abstracts found by a keyword search, because their contents are limited and similar. We propose a method to estimate representative words of each topic from an LDA result. Experimental results show that our method provides better information for interpreting a topic than LDA does.
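11. 田中省作,安東奈穂子,冨浦洋一, コーパス構築と著作権 ― Web を源とした質情報付き英語科学論文コーパス, 英語コーパス研究, 19, pp.31--41, 2012.06, Using the project described in paper No. 69 as a case study, this paper reports a discussion of how the construction and use of corpora built from Web documents are treated under the revised Copyright Act (revised in 2009, in force from 2010).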
12. 田中省作,柴田雅博,冨浦洋一, Webを源とした質情報付き英語科学論文コーパスの構築法, 英語コーパス研究, 18, 61-71, 2011.06.
13. 中野てい子,冨浦洋一, 日本語学習者の動詞選択における誤用と正用の関係:作文支援のための基礎研究, 自然言語処理, 第18巻,第1号,pp.3--29, 2011.01.
14. 中野てい子,冨浦洋一, 日本語作文支援における共起を利用した代替候補提示システム, 日本教育工学会論文誌, 第34巻,第3号,pp.181--189, 2010.12.
15. Teiko NAKANO, Yoichi TOMIURA, Providing Appropriate Alternative Co-occurrence Candidates; Towards a Japanese Composition Support System, Proc. of the Ninth IASTED International Conference on Web-Based Education, pp. 173--179, 2010.03.
16. 柴田雅博,冨浦洋一,西口友美, 雑談自由対話を実現するためのWWW上の文書からの妥当な候補文選択手法 (2009年), 人工知能学会論文誌, 第24巻,第6号,pp.508--520, 2009.11.
17. Masahiro Shibata, Tomomi Nishiguchi, Yoichi Tomiura, Dialog System for Open-ended Conversation Using Web Documents, Informatica, Vol.33, No.3, pp.277-284, 2009.10.
18. M. Shibata, Y. Tomiura, T. Mizuta, Identification among Similar Languages Using Statistical Hypothesis Testing, Proc. of Pacific Association for Computational Linguistics (PACLING'09), pp.47--52, 2009.09.
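19. 田上敦士,佐々木力,長谷川輝之,阿野茂浩,冨浦洋一, 確率的変換に基づくインターネット調査手法の解析, 電子情報通信学会論文誌, Vol.J92-B,No.4,pp.729--740, 2009.04, As a questionnaire survey method that guarantees anonymity on a network, we proposed a method in which each respondent sends a probabilistically converted value of their answer (yes/no), and the collector estimates the proportion of yes/no answers as the sample mean of the received values. We showed that, given the number of respondents and the tolerable error and confidence level of the estimate, the upper bound on the variance of the probabilistic conversion that satisfies these conditions is uniquely determined. We also defined a degree of anonymity using the error rate when an answer is inferred from the sent value, and showed how to design the survey method by finding the probabilistic conversion that is best in terms of anonymity under the given conditions (number of respondents, estimation error, and confidence level). Note: responsible for deriving the relationship among the number of respondents, the tolerable error and confidence level of the estimate, and the variance of the probabilistic conversion, and for deriving the probabilistic conversion that is best in terms of anonymity.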
20. 冨浦 洋一,青木 さやか,柴田 雅博,行野 顕正, 仮説検定に基づく英文書の母語話者性の判別, 自然言語処理, Vol.16, No.1, pp.23-46, 2009.01.
21. Masahiro Shibata, Tomomi Nishiguchi, Yoichi Tomiura, A Method for Automatically Generating Proper Responses to User's Utterances in Open-ended Conversation by Retrieving Documents on the Web, Proc. of 2008 IEEE International Conference on Information Reuse and Integration (IEEE IRI'08), pp.268-279, 2008.07.
22. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Optimization of Answering Method with Probability Conversion, Proc. of 2008 International Symposium on Applications and the Internet (SAINT'08), pp.249-252, 2008.07.
23. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Analysis of Answering Method with Probability Conversion for Internet Research, Fifth IEEE Consumer Communications & Networking Conference (CCNC'08), pp.110-111, 2008.01.
24. 行野 顕正,田中 省作,冨浦 洋一,柴田 雅博, 統計的アプローチによる英語スラッシュ・リーディング教材の自動生成, 情報処理学会論文誌, 第48巻,第1号, 2007.01.
25. 青木 さやか,冨浦 洋一,行野 顕正,谷川 龍司, 言語識別技術を応用した英語における母語話者文書・非母語話者文書の判別, 情報科学技術レターズ, 第5巻,pp.85--88, 2006.09.
26. 本木 実,冨浦 洋一,高橋 直人, 記号列を入出力とするニューラルネットの学習法, 情報処理学会論文誌, 第47巻,第8号,pp.2279--2791, 2006.08.
27. 行野 顕正,田中 省作,冨浦 洋一,松本 英樹, 低頻度 byte 列を活用した言語識別, 情報処理学会論文誌, 第47巻,第4号,pp.1287--1294, 2006.04.
28. 田中 省作,藤井 宏,冨浦 洋一,徳見 道夫, NS/NNS論文分類モデルに基づく日本人英語科学論文の特徴抽出, 英語コーパス研究, 第13号,pp.75--87, 2006.01.
29. Y. Tomiura, S. Tanaka, T. Hitaka, Estimating Satisfactoriness of Selectional Restriction from Corpus without Thesaurus, ACM Transactions on Asian Language Information Processing, Vol.4, No.4, pp.400--416, 2005.12.
30. 藤井 宏,冨浦洋一,田中省作, Skew Divergence に基づく文書の母語話者性の推定, 自然言語処理(言語処理学会論文誌), Vol. 12, No. 4, pp.79-96, 2005.08.
31. K. YUKINO, S. TANAKA, Y. TOMIURA, H. MATSUMOTO, Robust Language Identification for Similar Languages and short texts using Low-Frequent Byte Strings, Pacific Association for Computational Linguistics 2005 (Pacling 2005), pp.368-373, 2005.08.
32. 柴田雅博,冨浦洋一,田中省作, Web上の語の共起性に基づいたコロケーションの翻訳支援, 情報処理学会論文誌, 第46巻,第6号,pp.1480-1491, 2005.06.
33. M. Motoki, Y. Tomiura, N. Takahashi, Problems of FGREP Module and Their Solution, 3rd IEEE International Conference on Cognitive Informatics (ICCI2004), 10.1109/COGINF.2004.1327479, pp.220-227, 2004.08.
34. Masahiro SHIBATA, Yoichi TOMIURA, Shosaku TANAKA, A Method for Retrieving Translations of Collocation in Web Data, Asian Symposium on Natural Language Processing to Overcome Language Barriers (in conjunction with IJCNLP-04), 2004.03.
35. 冨浦洋一,田中省作,日高 達, 共起データに基づく名詞の多次元空間への配置, 人工知能学会論文誌, 19巻,1号A, pp.1-9, 2004.01.
36. 冨浦洋一,日高 達, 言語コーパスからの語の共起性の推定, 情報処理学会論文誌, 第45巻,第1号,pp.324-332, 2004.01.
37. TAKAHASHI Naoto, MOTOKI Minoru, SHIMAZU Yoshio, TOMIURA Yoichi, HITAKA Toru, PP-attachment Ambiguity Resolution Using a Neural Network with Modified FGREP Method, the 2nd Workshop on Natural Language Processing and Neural Networks (post-conference workshop of NLPRS2001), pp.1-7, 2001.11.
38. 田辺利文,冨浦洋一,日高達, 係り受け文脈自由文法とその日本語への適用, 情報処理学会論文誌, 第41巻, 第1号, pp.36 - 45, 2000.01.
39. 冨浦洋一,日高達, スパ−スな学習デ−タにおける確率係り受け文脈自由文法の確率パラメタの推定法, 情報処理学会論文誌, 第40巻, 第11号, pp.4055 - 4063, 1999.11.
40. 田中省作,冨浦洋一,日高達, 意味範疇の散らばりに基づいた名詞の統語範疇の分類, 情報処理学会論文誌, 第40巻, 第9号, pp.3387 - 3396, 1999.09.
41. 冨浦洋一,日高達, k-NN 推定法に基づく統語的あいまいさ解消法, 電子情報通信学会論文誌 D-II, Vol.J80-D-II, No.9, pp.2475 - 2481, 1997.09.
42. 冨浦洋一 中村貞吾 日高達, 名詞句「NPのNP」の意味構造, 情報処理学会論文誌, 第36巻第6号 pp.1441 - 1448, 1995.06.
43. 冨浦洋一 市丸夏樹 日高達, 常識推論における推論の選択と文脈処理への応用, 情報処理学会論文誌, Vol.35, No.11, pp.2239 - 2248, 1994.11.
44. 冨浦洋一,中村貞吾,日高達, 最左部分語検索向き辞書データ構造:Prefix-Closed B-tree, 情報処理学会論文誌, 第35巻第5号 pp.779-789, 1994.05.
45. T. NAKAMURA, Y. TOMIURA, T. HITAKA, Semantic Validity of Japanese Noun Phrases with Adnominal Particles, Proc. of the 2nd Pacific Rim International Conference on Artificial Intelligence, Vol.1, No.2, pp.433--437, 1992.09.
46. Y. TOMIURA, T. NAKAMURA, T. HITAKA, S. YOSHIDA, Logical Form of Hierarchical Relation on Verbs and Extracting it from Definition Sentences in a Japanese Dictionary, Proc. of the 14th International Conference on Computational Linguistics (Coling-92), Vol.2, No.14, pp.574-580, 1992.07.
47. 冨浦洋一 日高達 吉田将, 語義文からの動詞間の上位-下位関係の抽出, 情報処理学会論文誌, Vol.32, No.1, pp.42 - 49, 1991.01.
Major Reviews, Commentaries, Explanatory Articles, Book Reviews, Reports, etc.
Major Conference Presentations
1. Satoshi Fukuda, Emi Ishita, Yoichi Tomiura, Douglas W. Oard, Automating the Choice Between Single or Dual Annotation for Classifier Training, The 23rd International Conference on Asia-Pacific Digital Libraries (ICADL 2021), 2021.12, Many emerging digital library applications rely on automated classifiers that are trained using manually assigned labels. Accurately labeling training data for text classification requires either highly trained coders or multiple annotations, either of which can be costly. Previous studies have shown that there is a quality-quantity trade-off for this labeling process, and the optimal balance between quality and quantity varies depending on the annotation task. In this paper, we present a method that learns to choose between higher-quality annotation that results from dual annotation and higher-quantity annotation that results from the use of a single annotator per item. We demonstrate the effectiveness of this approach through an experiment in which a binary classifier is constructed for assigning human value categories to sentences in newspaper editorials.
2. Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi, Takaaki Soeda, Profile-Decomposing Output of Multi-Channel Odor Sensor Array, IMCS 2020, 2020.05.
3. Keiya Maekawa, Yoichi Tomiura, Satoshi Fukuda, Emi Ishita, Hideaki Uchiyama, Improving OCR for Historical Documents by Modeling Image Distortion, 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019), 2019.11, Archives hold printed historical documents, many of which have deteriorated. It is difficult to extract text from such images without errors using optical character recognition (OCR). This problem reduces the accuracy of information retrieval. Therefore, it is necessary to improve the performance of OCR for images of deteriorated documents. One approach is to convert images of deteriorated documents to clear images, to make it easier for an OCR system to recognize text. To perform this conversion using a neural network, data is needed to train it. It is hard to prepare training data consisting of pairs of a deteriorated image and an image from which deterioration has been removed; however, it is easy to prepare training data consisting of pairs of a clear image and an image created by adding noise to it. In this study, PDFs of historical documents were collected and converted to text and JPEG images. Noise was added to the JPEG images to create a dataset in which the images had noise similar to that of the actual printed documents. U-Net, a type of neural network, was trained using this dataset. The performance of OCR for an image with noise in the test data was compared with the performance of OCR for an image generated from it by the trained U-Net. An improvement in the OCR recognition rate was confirmed.
4. Satoshi Fukuda, Yoichi Tomiura, Emi Ishita, Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model, 30th International Conference on Database and Expert Systems Applications (DEXA 2019), 2019.08.
5. Kohei Omori, Yoichi Tomiura, Kenshi Hayashi, Statistical analysis for clustering of areas on the olfactory bulb and estimation of the physico-chemical properties detected by glomeruli in each area, ISOT 2016, 2016.06.
6. Shinjiro Okaku, Yoichi Tomiura, Emi Ishita, Shosaku Tanaka, Towards Generating Multiple-Choice Tests for Supporting Extensive Reading, The Seventh International Conference on Mobile, Hybrid, and On-line Learning (eLmL 2015), 2015.02, We propose a method for generating multiple-choice test for an English text selected by a learner and its answer, that are used to make a self-assessment whether the learner comprehends the text after reading it. In our method, the system extracts several important sentences from the text, and replaces one word in each of these sentences with its synonym (if possible). One of these sentences is then selected as a correct optional sentence, while further changes to the polarities or nouns in the remaining sentences are carried out to generate distractor optional sentences for the multiple-choice test. Our method has potential to make extensive reading in English more effective.
7. Yasuhiro Takayama, Yoichi Tomiura, Emi Ishita, Douglas W. Oard, Kenneth R. Fleischmann, An-Shou Cheng, A Word-Scale Probabilistic Latent Variable Model for Detecting Human Values, ACM International Conference on Information and Knowledge Management (CIKM2014), 2014.12, This paper describes a probabilistic latent variable model that is designed to detect human values such as justice or freedom that a writer has sought to reflect or appeal to when participating in a public debate. The proposed model treats the words in a sentence as having been chosen based on specific values; values reflected by each sentence are then estimated by aggregating values associated with each word. The model can determine the human values for the word in light of the influence of the previous word. This design choice was motivated by syntactic structures such as noun+noun, adjective+noun, and verb+adjective. The classifier based on the model was evaluated on a test collection containing 102 manually annotated documents focusing on one contentious political issue --- Net neutrality, achieving the highest reported classification effectiveness for this task. We also compared our proposed classifier with human second annotator. As a result, the proposed classifier effectiveness is statistically comparable with human annotators.
8. Toshiaki Funatsu, Yoichi Tomiura, Emi Ishita, Kosuke Furusawa, Extracting Representative Words of a Topic Determined by Latent Dirichlet Allocation, eKNOW 2014 (Digital World 2014), 2014.03, Determining the topic of a document is necessary to understand the content of the document efficiently. Latent Dirichlet Allocation (LDA) is a method of analyzing topics. In LDA, a topic is treated as an unobservable variable to establish a probabilistic distribution of words. We can interpret the topic with a list of words that appear with high probability in the topic. This method works well when determining a topic included in many documents having a variety of contents. However, it is difficult to interpret the topic just using conventional LDA when determining the topic in a set of article abstracts found by a keyword search, because their contents are limited and similar. We propose a method to estimate representative words of each topic from an LDA result. Experimental results show that our method provides better information for interpreting a topic than LDA does.
9. M. Shibata, T. Funatsu, Y. Tomiura, Extraction of Alternative Candidates for Unnatural Adjective-Noun Co-occurrence Construction of English, Pacific Association for Computational Linguistics (PACLING'11), 2011.07.
10. Teiko NAKANO, Yoichi TOMIURA, Evaluation of a Japanese Composition Support System, IADIS International Conference e-Society 2010, 2010.03.
11. Teiko NAKANO, Yoichi TOMIURA, Providing Appropriate Alternative Co-occurrence Candidates; Towards a Japanese Composition Support System, The Ninth IASTED International Conference on Web-Based Education (WBE2010), 2010.03.
12. 田中省作,冨浦洋一,安東奈穂子,柴田雅博, Webを源とした英語科学技術論文コーパスの構築 -技術的方法論と法的観点からの検討―, 英語コーパス学会第34回大会, 2009.10.
13. Masahiro SHIBATA, Yoichi TOMIURA, Takaaki MIZUTA, Identification among Similar Languages Using Statistical Hypothesis Testing, Pacific Association for Computational Linguistics (PACLING'09), 2009.09.
14. Masahiro Shibata, Tomomi Nishiguchi, Yoichi Tomiura, A Method for Automatically Generating Proper Responses to User's Utterances in Open-ended Conversation by Retrieving Documents on the Web, 2008 IEEE International Conference on Information Reuse and Integration (IEEE IRI'08), 2008.07.
15. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Optimization of Answering Method with Probability Conversion, 2008 International Symposium on Applications and the Internet (SAINT'08), 2008.07.
16. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Analysis of Answering Method with Probability Conversion for Internet Research, Fifth IEEE Consumer Communications & Networking Conference (CCNC'08), 2008.01.
17. Masahiro Shibata, Youichi Tomiura, Hideki Matsumoto, Tomomi Nishiguchi, Kensei Yukino, Akihiro Hino, Developing a Dialog System for New Idea Generation Support, 21st International Conference on the Computer Processing of Oriental Languages, 2006.12.
18. 青木 さやか,冨浦 洋一,行野 顕正,谷川 龍司, 言語識別技術を応用した英語における母語話者文書・非母語話者文書の判別, FIT2006, 2006.09.
19. K. Yukino, S. Tanaka, Y. Tomiura, H. Matsumoto, Robust Language Identification for Similar Languages and short texts using Low-Frequent Byte Strings, Pacific Association for Computational Linguistics 2005 (Pacling 2005), 2005.08.
20. M. Motoki, Y. Tomiura, N. Takahashi, Problems of FGREP Module and Their Solution, 3rd IEEE International Conference on Cognitive Informatics (ICCI2004), 2004.08.
21. M. Shibata, Y. Tomiura, S. Tanaka, A Method for Retrieving Translations of Collocation in Web Data, IJCNLP-04 Satellite Symposium, 2004.03.
22. 冨浦洋一,田中省作,日高達, 言語コーパスからの語の共起性の推定, 言語処理学会第8回年次大会, 2002.03.
23. 田中省作,飯田健二,冨浦洋一,日高 達, 名詞句「NP の NP」の意味関係とその統計的性質, 言語処理学会第4回年次大会, 1998.03.
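Academic Society Activities
Society Memberships
Information Processing Society of Japan (IPSJ)
Japanese Society for Artificial Intelligence (JSAI)
Association for Natural Language Processing (ANLP)
Offices Held in Academic Societies
2019.04~2021.03, Information Processing Society of Japan, Branch committee member.
2017.04~2019.03, Information Processing Society of Japan, Branch chair.
2016.04~2017.03, Information Processing Society of Japan (Kyushu Branch), Branch committee member.
2010.04~2014.03, Association for Natural Language Processing, Councilor.
2002.05~2004.05, Information Processing Society of Japan, Kyushu Branch, Secretary.
1998.04~2002.03, Association for Natural Language Processing, Councilor.
1995.05~2000.04, Institute of Electronics, Information and Communication Engineers (IEICE), Secretary of the Technical Committee on Language Understanding and Communication.
1995.04~2000.03, Information Processing Society of Japan, Research liaison for the Special Interest Group on Natural Language Processing.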
Roles in Conferences, Meetings, and Symposia
2022.02.28~2022.03.04, iConference2022, Conference chairs, a member of host meeting, session chair.
2018.12.15~2018.12.16, 10th Asia Library and Information Research Group (ALIRG) Workshop, General Chair.
2018.11.22~2018.11.22, 20th International Conference on Asia-Pacific Digital Library (ICADL 2018), Workshop organizer.
2018.09.27~2018.09.28, 71st Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Conference chair.
2015.09.26~2015.09.27, 68th Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Session chair.
2015.07.12~2015.07.16, ESKM2015, Special Session (Library Science) Organizer.
2009.09.28~2009.09.29, 62nd Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Session chair.
2006.09.06~2006.09.06, FIT2006, Session chair.
2005.03.13~2005.03.16, 11th Annual Meeting of the Association for Natural Language Processing, Program committee member.
2003.10.01~2003.10.03, IPSJ Kyushu Branch Young Researchers' Seminar, Moderator.
2003.09.26~2003.09.27, 56th Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Session chair.
2002.08.06~2002.08.08, IPSJ Kyushu Branch Young Researchers' Seminar, Moderator.
2003.03.18~2003.03.20, 9th Annual Meeting of the Association for Natural Language Processing, Presentation award selection committee member.
2002.03.18~2002.03.20, 8th Annual Meeting of the Association for Natural Language Processing, Presentation award selection committee member.
2015.09.25~2015.09.27, Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Session chair.
2015.07.12~2015.07.16, 6th IIAI International Conference on e-Services and Knowledge Management (IIAI ESKM 2015), Special Session Organizer & Session Chair.
2014.08.31~2014.09.04, 5th IIAI International Conference on e-Services and Knowledge Management (IIAI ESKM 2014), Special Session Organizer.
2014.03.18~2014.03.20, 20th Annual Meeting of the Association for Natural Language Processing, Conference award selection committee member.
2014.05.26~2014.05.31, The 9th International Conference on Language Resources and Evaluation (LREC2014), Scientific Committee, member.
2012.09.20~2012.09.22, 3rd IIAI International Conference on e-Services and Knowledge Management (IIAI ESKM 2012), Program Committee, member.
2012.05.21~2012.05.27, The 8th International Conference on Language Resources and Evaluation (LREC2012), Scientific Committee, member.
2010.08.23~2010.08.27, The 23rd International Conference on Computational Linguistics (COLING2010), Program Committee, member.
2010.05.17~2010.05.23, The 7th International Conference on Language Resources and Evaluation (LREC2010), Scientific Committee, member.
2005.03.14~2005.03.18, 11th Annual Meeting of the Association for Natural Language Processing, Program committee member.
2003.09.26~2003.09.27, 56th Joint Conference of Electrical, Electronics and Information Engineers in Kyushu, Program committee chair.
1997.07~1997.07.18, Pacific Association for Computational Linguistics (PACLING'97), Conference Committee, member.
1999.08~1999.08, Pacific Association for Computational Linguistics (PACLING'99), Conference Committee, member.
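Editorial Activities for Society Journals, Magazines, and Books
2008.04~2012.06, Journal of the Japanese Society for Artificial Intelligence, Domestic, Senior editorial board member.
2006.06~2008.03, Journal of the Japanese Society for Artificial Intelligence, Domestic, Editorial board member.
2001.06~2005.06, Journal of the Japanese Society for Artificial Intelligence, Domestic, Editorial board member.
1998.04~2002.03, Journal of the Information Processing Society of Japan, Domestic, Editorial board member.
Peer Review of Academic Papers
Fiscal year, Foreign-language journal papers reviewed, Japanese-language journal papers reviewed, International conference proceedings papers reviewed, Domestic conference proceedings papers reviewed, Total
FY2016
FY2015
FY2014
FY2013
FY2012
FY2011
FY2010
FY2009
FY2008
FY2007
FY2006
FY2005
FY2004
FY2003
FY2002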
Awards
Best Poster Award, The 21st International Conference on Asian Digital Libraries, 2019.11.
Excellent Paper Award, IEICE Communications Society, 2010.09.
Best Paper Award, Pacific Association for Computational Linguistics, 2009.09.
FIT2006 Paper Award, Forum on Information Technology Promotion Committee, 2006.09.
Best Paper Award, Pacific Association for Computational Linguistics, 2005.08.
Research Award, Information Processing Society of Japan, 1991.10.
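Research Funding
Grants-in-Aid for Scientific Research Awarded (MEXT, JSPS)
FY2022~FY2026, Grant-in-Aid for Scientific Research (S), Co-investigator, Person search based on spatiotemporal fluctuation information of odors.
FY2021~FY2023, Challenging Research (Exploratory), Principal investigator, Quantification of the odor information of substances based on activation pattern images of the olfactory bulb glomerular layer and molecular parameters.
FY2018~FY2020, Grant-in-Aid for Scientific Research (A), Co-investigator, Decomposition of odor trace images from an odor image sensor into elemental odor profiles.
FY2018~FY2021, Grant-in-Aid for Scientific Research (B), Co-investigator, Comprehensive research on content analysis of human values identifiable from text and on methods for its semi-automation.
FY2015~FY2017, Grant-in-Aid for Scientific Research (A), Co-investigator, Visualization sensing of odor quality and space.
FY2015~FY2018, Grant-in-Aid for Scientific Research (A), Principal investigator, Comprehensive research on advanced support for scholarly paper search from the user's perspective.
FY2013~FY2016, Grant-in-Aid for Scientific Research (B), Co-investigator, Basic research on estimating people's values latent in large-scale text collections consisting of opinion texts.
FY2013~FY2015, Challenging Exploratory Research, Co-investigator, Creation of latent research clusters using institutional repositories.
FY2012~FY2014, Challenging Exploratory Research, Principal investigator, Automatic generation of questions and answers about the content of arbitrary English documents using language processing technology.
FY2012~FY2014, Grant-in-Aid for Scientific Research (C), Co-investigator, A method for custom-made creation of university-specific productive vocabulary lists using institutional repositories.
FY2008~FY2011, Grant-in-Aid for Scientific Research (B), Principal investigator, Building, releasing, and using a corpus of English papers by native and non-native speakers collected from the Web.
FY2008~FY2011, Grant-in-Aid for Scientific Research (C), Co-investigator, Construction of an English grammar corpus and its applications.
FY2005~FY2007, Grant-in-Aid for Scientific Research (C), Co-investigator, Reconsidering the "quality of extensive reading materials" and building an Extensive Slash Reading learning system.
FY2001~FY2003, Grant-in-Aid for Scientific Research (C), Principal investigator, Quantification Theory Type II based on incomplete data and its application to judging word co-occurrence.
Competitive Funding Awarded (Including Commissioned Research)
FY2017~FY2021, Secom Science and Technology Foundation General Research Grant, Co-investigator, Odor trace identification system using a molecule-recognizing two-dimensional plasmonic gas sensor array.
FY2001~FY2002, Okawa Foundation for Information and Telecommunications Research Grant, Principal investigator, Research on estimating word co-occurrence.
Joint and Commissioned Research Accepted (Excluding Competitive Funding)
2005.06~2007.03, Principal investigator, Research on an utterance management system for characters in games.
2006.11~2007.03, Principal investigator, Development of and experiments with a program for automatically rewriting ungrammatical sentences in blogs and similar texts.
2007.07~2008.03, Principal investigator, A network-based survey scheme using probability functions.
Internal University Funds and Grants Awarded
FY2016~FY2018, Education Quality Improvement Support Program (EEP), Principal investigator, Building a learning support environment for the internationalization of education.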
