Kyushu University Researcher Information
List of Papers
Yoichi Tomiura (冨浦 洋一) Last updated: 2024.04.25

Professor / Faculty of Information Science and Electrical Engineering, Department of Informatics, Intelligence Science


Original Papers
1. Tokinori Suzuki, Douglas W Oard, Emi Ishita, Yoichi Tomiura, Automatically Detecting References from the Scholarly Literature to Records in Archives, Proceedings of International Conference on Asian Digital Libraries 2023, 100-107, 2023.12, Scholars use references in books and articles to materials found in archives as one way of finding those materials, but present systems for archival access do not exploit that information. To change that, the first step is to find archival references in the scholarly literature; that is the focus of this paper. Several classifier designs are compared using a few thousand manually annotated footnotes and endnotes assembled from a large set of open access papers on history. The results indicate that fairly high recall and precision can be achieved.
2. Motokazu Yamasaki, Yoichi Tomiura, Toshiyuki Shimizu, Investigation of ChatGPT Use in Research Data Retrieval, Proceedings of International Conference on Asian Digital Libraries 2023, 36-40, 2023.12, In recent years, huge amounts of research data have been generated, and it has become important to search them efficiently and accurately in order to make use of research data. Existing search engines and keyword-based search methods require users to enter appropriate keywords or phrases, and it is difficult to obtain satisfactory results if users do not have detailed information about the desired data. In this study, we investigated whether ChatGPT could be used to reach the desired research data by users who are not familiar with them. Specifically, we investigated whether users could find the research data cited in a research paper by entering the abstract of the paper into ChatGPT and then asking for the data necessary to write the research paper. The results showed that research data could be found in 65% of the cases, confirming that the use of ChatGPT increases the discoverability of research data.
3. Xiaofan Zheng, Masato Matsuoka, Kenshi Hayashi, Yoichi Tomiura, Extract spatial distribution of a specific gas from mixed gas data measured by the LSPR gas sensor, 10.1109/SENSORS56945.2023.10324923, 1-4, 2023.10, Visualizing invisible gas molecules can be a great help to our lives. At present, gas sensors can already visualize the spatial distribution of a gas mixture; however, the visualization of a specific gas requires further analysis of the measurement data. In this study, matrix decomposition is used to analyze the measurement data of a localized surface plasmon resonance (LSPR) gas sensor. To satisfy the linear relationship between the concentration of gas and the output of the device required for applying matrix decomposition, we formulated a procedure for processing the measurement data instead of using them directly. To obtain the diffusion trace of a specific gas, we designed a method to obtain the characteristic output of the specific gas; then, by using the characteristic output as known information, the corresponding diffusion trace can be estimated better through the matrix decomposition algorithm. We used the designed method to analyze the measurement data, and the results show that our method can obtain the spatial distribution of some gas.
4. Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi, Investigation of the structure-odor relationship using a Transformer model, Journal of Cheminformatics, https://doi.org/10.1186/s13321-022-00671-y, 2022.12, The relationships between molecular structures and their properties are subtle and complex, and the properties of odor are no exception. Molecules with similar structures, such as a molecule and its optical isomer, may have completely different odors, whereas molecules with completely distinct structures may have similar odors. Many works have attempted to explain the molecular structure-odor relationship from chemical and data-driven perspectives. The Transformer model is widely used in natural language processing and computer vision, and the attention mechanism included in the Transformer model can identify relationships between inputs and outputs. In this paper, we describe the construction of a Transformer model for predicting molecular properties and interpreting the prediction results. The SMILES data of 100,000 molecules are collected and used to predict the existence of molecular substructures, and our proposed model achieves an F1 value of 0.98. The attention matrix is visualized to investigate the substructure annotation performance of the attention mechanism, and we find that certain atoms in the target substructures are accurately annotated. Finally, we collect 4462 molecules and their odor descriptors and use the proposed model to infer 98 odor descriptors, obtaining an average F1 value of 0.33. For the 19 odor descriptors that achieved F1 values greater than 0.45, we also attempt to summarize the relationship between the molecular substructures and odor quality through the attention matrix.
5. Yasuko Hagiwara, Emi Ishita, Yukiko Watanabe, Yoichi Tomiura, Identifying Scholarly Search Skills Based on Resource and Document Selection Behavior among Researchers and Master’s Students in Engineering, College & Research Libraries, https://doi.org/10.5860/crl.83.4.610, 83, 4, 610-630, 2022.07.
6. Tokinori Suzuki, Shintaro Deguchi, Yoichi Tomiura, Using the Scatter of Opinions to Predict Responses to Tweets, Proceedings of 2022 12th International Congress on Advanced Applied Informatics, IIAI-AAI 2022, 39-42, 2022.07.
7. Satoshi Fukuda, Emi Ishita, Yoichi Tomiura, Douglas W. Oard, Automating the Choice Between Single or Dual Annotation for Classifier Training, Proceedings of the 23rd International Conference on Asia-Pacific Digital Libraries (ICADL 2021), 10.1007/978-3-030-91669-5_19, 233-248, 2021.12, Many emerging digital library applications rely on automated classifiers that are trained using manually assigned labels. Accurately labeling training data for text classification requires either highly trained coders or multiple annotations, either of which can be costly. Previous studies have shown that there is a quality-quantity trade-off for this labeling process, and the optimal balance between quality and quantity varies depending on the annotation task. In this paper, we present a method that learns to choose between higher-quality annotation that results from dual annotation and higher-quantity annotation that results from the use of a single annotator per item. We demonstrate the effectiveness of this approach through an experiment in which a binary classifier is constructed for assigning human value categories to sentences in newspaper editorials.
8. Mei Kodama, Emi Ishita, Yukiko Watanabe, Yoichi Tomiura, Usage of E-books During the COVID-19 Pandemic: A Case Study of Kyushu University Library, Japan, Diversity, Divergence, Dialogue, 10.1007/978-3-030-71305-8_40, 12646 LNCS, 475-483, 2021.03.
9. Emi Nishida, Emi Ishita, Yukiko Watanabe, Yoichi Tomiura, Description of research data in laboratory notebooks: Challenges and opportunities, Proceedings of the Association for Information Science and Technology, 10.1002/pra2.388, 57, 1, e388, 2020.10.
10. Emi Ishita, Satoshi Fukuda, Yoichi Tomiura, Douglas W. Oard, Using text classification to improve annotation quality by improving annotator consistency, Proceedings of the Association for Information Science and Technology, 10.1002/pra2.301, 57, 1, e301, 2020.10.
11. Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi, Takaaki Soeda, Profile-Decomposing Output of Multi-Channel Odor Sensor Array, ECS Meeting Abstracts, MA2020-01, 2020.05.
12. Emi Ishita, Satoshi Fukuda, Toru Oga, Yoichi Tomiura, Douglas W. Oard, Kenneth R. Fleischmann, Cost-Effective Learning for Classifying Human Values, Proceedings of 15th iConference 2020, 2020.03.
13. Keiya Maekawa, Yoichi Tomiura, Satoshi Fukuda, Emi Ishita, Hideaki Uchiyama, Improving OCR for Historical Documents by Modeling Image Distortion, Lecture Notes in Computer Science, 10.1007/978-3-030-34058-2_31, 11853, 312-316, 2019.11.
14. Takaaki Soeda, Zhongyuan Yang, Fumihiro Sassa, Yoichi Tomiura, Kenshi Hayashi, 2D LSPR multi gas sensor array with 4-segmented subpixel using Au/Ag core shell structure, 18th IEEE Sensors, SENSORS 2019, 2019 IEEE Sensors, SENSORS 2019 - Conference Proceedings, 10.1109/SENSORS43011.2019.8956635, 2019.10, An LSPR (Localized Surface Plasmon Resonance) based 2D (2 Dimensional) gas imaging sensor system that can capture the spatial distribution of each constituent of a mixed gas has been developed. The gas image sensor uses a CCD camera to detect the gas-promoted optical changes that occur on the LSPR substrate. Basically, an LSPR gas sensor does not have molecular selectivity, so the identification of gas species is difficult. To overcome this disadvantage, a pixelated LSPR substrate based on an Au/Ag core-shell structure with different gas response properties is fabricated by photo-induced metal growth with a mask-less exposure system that uses a commercial video projector.
15. Satoshi Fukuda, Yoichi Tomiura, Emi Ishita, Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model, Lecture Notes in Computer Science, 10.1007/978-3-030-27618-8_5, 11707, 65-75, 2019.08.
16. Emi Ishita, Satoshi Fukuda, Toru Oga, Douglas W. Oard, Kenneth R. Fleischmann, Yoichi Tomiura, An Shou Cheng, Toward Three-Stage Automation of Annotation for Human Values, iConference 2019, 2019.03, Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.
17. Emi Ishita, Yasuko Hagiwara, Yukiko Watanabe, Yoichi Tomiura, Which Parts of Search Results do Researchers Check when Selecting Academic Documents?, 18th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2018, JCDL 2018 - Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries, 10.1145/3197026.3203867, 345-346, 2018.05, Our goal is to propose an alternative retrieval system for academic documents based on researchers' behavior in practice. In this study, a questionnaire survey was conducted. Question items were developed from the findings of a previous observational study of researchers' behavior. According to the results from 46 respondents, the top three elements checked in the search results were the title, the abstract, and the full-text version. Respondents also checked the "Introduction" section of the full text rather than other sections when they looked for previous research in an unfamiliar field. These results indicate that researchers select documents in different ways depending on the type of document they are looking for.
18. Liang Shang, Chuanjun Liu, Yoichi Tomiura, Kenshi Hayashi, Odorant clustering based on molecular parameter-feature extraction and imaging analysis of olfactory bulb odor maps, Sensors and Actuators, B: Chemical, 10.1016/j.snb.2017.08.024, 255, 508-518, 2018.02, Progress in the molecular biology of olfaction has revealed a close relationship between the structural features of odorants and the response patterns they elicit in the olfactory bulb. Molecular feature-related response patterns, termed odor maps (OMs), may represent information related to basic odor quality. Thus, studying the relationship between OMs and the molecular features of odorants is helpful for better understanding the relationships between odorant structure and odor. Here, we explored the correlation between OMs and the molecular parameters (MPs) of odorants by taking OMs from rat olfactory bulbs and extracting feature profiles of the corresponding odorant molecules. A total of 178 images of glomerular activity in the olfactory bulb corresponding to odorants were taken from the OdorMapDB, a publicly accessible database. The gray value of each pixel was extracted from the images (178 × 357 pixels) to fabricate an image matrix for each odorant. Forty-six molecular feature parameters were calculated using BioChem3D software and used to construct a second matrix for each odorant. Correlation analysis between the two matrices was first carried out by establishing coefficient maps. Results from hierarchical clustering showed that all parameters could be segregated into seven clusters, and each cluster showed a relatively similar response pattern in the olfactory bulb. Using the information from the OMs and MPs, we mapped odorants in 2D space by incorporating dimension-reducing techniques based on principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE).
Artificial neural network models based on the OM and MP feature values were proposed as a means to identify odorant functional groups. An OM-PCA-based model calibrated via extreme learning machine (ELM) achieved 94.81% and 93.02% accuracy for the calibration and validation sets, respectively. Similarly, an MP-t-SNE-based model calibrated by ELM achieved 86.67% and 93.35% accuracy for the calibration set and the validation set, respectively. Thus, this research supports a structure-odor relationship from a data-analysis perspective.
19. Satoshi Fukuda, Yoichi Tomiura, Using Topic Analysis Techniques to Support Comprehensive Research Paper Searches, 21st International Conference on Asian Language Processing, IALP 2017, 10.1109/IALP.2017.8300606, 314-317, 2018.01, In an academic paper search to confirm the originality of a user's research, it is important that the search returns comprehensive results relevant to the user's information need. To achieve comprehensive search results, users often relax an initially restrictive search formula by adding synonyms and expressions similar to the search words with the OR operator, and/or by replacing AND with OR operations. However, it is difficult to anticipate all the terms that authors of relevant papers might have used. In addition, the replacement of AND with OR in search phrases can return a large number of unrelated papers. To overcome these issues, we propose a research paper search method based on topic analysis, which uses Boolean search based on the topics assigned to the search words in the search formula and the abstracts that contain any search word. Our method considers synonyms and expressions similar to the search words, which a user might not anticipate, while limiting the number of papers unrelated to the information need in the search result. To investigate the effectiveness of our method, we conducted experiments using the NTCIR-1 and 2 datasets, and confirmed that our method reduces the number of unrelated papers while maintaining high coverage.
20. Liang Shang, Chuanjun Liu, Yoichi Tomiura, Kenshi Hayashi, Machine-Learning-Based Olfactometer: Prediction of Odor Perception from Physicochemical Features of Odorant Molecules, Analytical Chemistry, 10.1021/acs.analchem.7b02389, 89, 22, 11999-12005, 2017.11, Gas chromatography/olfactometry (GC/O) has been used in various fields as a valuable method to identify odor-active components from a complex mixture. Since human assessors are employed as detectors to obtain the olfactory perception of separated odorants, the GC/O technique is limited by its subjectivity, variability, and high cost of the trained panelists. Here, we present a proof-of-concept model by which odor information can be obtained by machine-learning-based prediction from molecular parameters (MPs) of odorant molecules. The odor prediction models were established using a database of flavors and fragrances including 1026 odorants and corresponding verbal odor descriptors (ODs). Physicochemical parameters of the odorant molecules were acquired by use of molecular calculation software (DRAGON). Ten representative ODs were selected to build the prediction models based on their high frequency of occurrence in the database. The features of the MPs were extracted via either unsupervised (principal component analysis) or supervised (Boruta, BR) approaches and then used as input to calibrate machine-learning models. Predictions were performed by various machine-learning approaches such as support vector machine (SVM), random forest, and extreme learning machine. All models were optimized via parameter tuning and their prediction accuracies were compared. An SVM model combined with feature extraction by BR-C (confirmed only) was found to afford the best results with an accuracy of 97.08%. The models were validated by using the GC/O data of an apple sample to compare the predicted and measured results.
The prediction models can be used as an auxiliary tool in the existing GC/O by suggesting possible OD candidates to the panelists and thus helping to give more objective and correct judgment. In addition, a machine-based GC/O in which the panelist is no longer needed might be expected after further development of the proposed odor prediction technique.
21. Yasuko Hagiwara, Emi Ishita, Emiko Mizutani, Kana Fukushima, Yukiko Watanabe, Yoichi Tomiura, Identifying Key Elements of Search Results for Document Selection in the Digital Age: An Observational Study, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10647 LNCS, 10.1007/978-3-319-70232-2_20, 237-242, 2017.11, Academic database systems are vitally important tools for enabling researchers to find relevant, useful articles. Identifying how researchers select documents from search results is an extremely useful measure for improving the functions or interfaces of academic retrieval systems. This study aims to reveal which elements are checked, and in what order, when researchers select from among search results. It consists of two steps: an observational study of search sessions performed by researchers who volunteered, and a questionnaire to confirm whether extracted elements and patterns are used. This article reports findings from the observational study and introduces questions we developed based on the study. In the observational study we obtained data on nine participants who were asked to search for documents using information retrieval systems. The search sessions were recorded using a voice recorder and by capturing screen images. The participants were also asked to state which elements they checked in selecting documents, along with the reasons for their selections. Three patterns of order of checking were found. In pattern 1, seven researchers used titles and abstracts as the primary elements. In pattern 2, the others used titles and then accessed the full text before making a decision on their selection. In pattern 3, one participant searched for images and accessed the full text from the link in those pictures. We also found participants used novel elements for selecting. We subsequently developed items for a questionnaire reflecting the findings.
22. Emi Ishita, Toru Oga, An Shou Cheng, Kenneth R. Fleischmann, Yasuhiro Takayama, Douglas W. Oard, Yoichi Tomiura, Toward Automating Detection of Human Values in the Nuclear Power Debate, Proceedings of the Association for Information Science and Technology, 10.1002/pra2.2017.14505401127, 54, 1, 714-715, 2017.10, This paper describes the creation of a corpus of newspaper articles about the Fukushima nuclear disaster, a coding frame for content analysis of human values expressed or reflected in that corpus, and preliminary results for automation of the content analysis. Understanding the human values that motivate sentiment towards an idea can help to characterize the basis for that sentiment, and this work is a first step toward applying that approach to positions on controversial events reported in the news.
23. Takafumi Yamamoto, Yoichi Tomiura, Constructing Corpus of Scientific Abstracts Annotated with Sentence Roles, Proc. Seventh International Conference on E-Service and Knowledge Management, 2016.07.
24. Yasuhiro Takayama, Yoichi Tomiura, Kenneth R. Fleischmann, An-Shou Cheng, Douglas W. Oard, Emi Ishita, Automatic Dictionary Extraction and Content Analysis Associated with Human Values, Information Engineering Express, International Institute of Applied Informatics, Vol.1, No.4, 107 – 118, 2015.12.
25. Emi Ishita, Douglas W. Oard, Kenneth R. Fleischmann, Yoichi Tomiura, Yasuhiro Takayama, An-Shou Cheng, Learning curves for automating content analysis: How much human annotation is needed?, Sixth International Conference on E-Service and Knowledge Management (Special Session on Library Science), 2015.07.
26. Yasuhiro Takayama, Yoichi Tomiura, Kenneth R. Fleischmann, An-Shou Cheng, Douglas W. Oard, Emi Ishita, An Automatic Dictionary Extraction and Annotation Method Using Simulated Annealing for Detecting Human Values, Sixth International Conference on E-Service and Knowledge Management (Special Session on Library Science), 2015.07.
27. Kenneth R. Fleischmann, Yasuhiro Takayama, An-Shou Cheng, Yoichi Tomiura, Douglas W. Oard, Emi Ishita, Thematic Analysis of Words that Invoke Values in the Net Neutrality Debate, Proc. iConference 2015, 2015.03.
28. Shinjiro Okaku, Yoichi Tomiura, Emi Ishita, Shosaku Tanaka, Towards Generating Multiple-Choice Tests for Supporting Extensive Reading, Proc. the Seventh International Conference on Mobile, Hybrid, and On-line Learning (eLmL 2015), 2015.02, We propose a method for generating a multiple-choice test, together with its answer, for an English text selected by a learner, so that the learner can self-assess whether they have comprehended the text after reading it. In our method, the system extracts several important sentences from the text, and replaces one word in each of these sentences with its synonym (if possible). One of these sentences is then selected as the correct option, while further changes to the polarities or nouns in the remaining sentences are carried out to generate distractor options for the multiple-choice test. Our method has the potential to make extensive reading in English more effective.
29. Yasuhiro Takayama, Yoichi Tomiura, Emi Ishita, Douglas W. Oard, Kenneth R. Fleischmann, An-Shou Cheng, A Word-Scale Probabilistic Latent Variable Model for Detecting Human Values, Proc. 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014), 1-10, 2014.12, This paper describes a probabilistic latent variable model that is designed to detect human values such as justice or freedom that a writer has sought to reflect or appeal to when participating in a public debate. The proposed model treats the words in a sentence as having been chosen based on specific values; values reflected by each sentence are then estimated by aggregating values associated with each word. The model can determine the human values for a word in light of the influence of the previous word. This design choice was motivated by syntactic structures such as noun+noun, adjective+noun, and verb+adjective. The classifier based on the model was evaluated on a test collection containing 102 manually annotated documents focusing on one contentious political issue, Net neutrality, achieving the highest reported classification effectiveness for this task. We also compared our proposed classifier with a second human annotator. The effectiveness of the proposed classifier was found to be statistically comparable with that of human annotators.
30. Shinjiro Okaku, Yoichi Tomiura, Kou Shu, Shosaku Tanaka, Towards Generating Multiple-Choice Tests for Evaluating Comprehension of Arbitrary English Texts, Proc. IIAI 3rd International Conference on Advanced Applied Informatics, 220-225, 2014.08.
31. Shuhei Otani, Yoichi Tomiura, Extraction of Key Expressions Indicating the Important Sentence from Article Abstracts, Proc. IIAI 3rd International Conference on Advanced Applied Informatics, 216-219, 2014.08, In this study, we aim to extract key expressions that indicate the important sentence describing the originality or contributions of an article from its abstract. The expense of searching academic information increases because of increases in the number of articles, discipline subdivisions, and promotion of interdisciplinary research. Improving the extraction and presentation of the main points from article abstracts will contribute to reducing academic information search expenses. We extracted pseudo-important sentences from each article abstract based on the ratio of the number of words in the identified sentence that appear in the article title to the number of all words in the sentence. After that, for each N-gram, we evaluated the ratio of the number of pseudo-important sentences including that N-gram to the number of all sentences including that N-gram. We then extracted the N-grams with a high ratio as the key expressions.
32. Yuichiro Kobayashi, Shosaku Tanaka, Yoichi Tomiura, Yoshinori Miyazaki, Michio Tokumi, Identifying discipline-specific expressions based on institutional repository, Proc. Digital Humanities Australasia 2014, 2014.03, Since the early 1960s, English for Specific Purposes (ESP) has become one of the prominent fields of English teaching (e.g. Hutchinson and Waters, 1987). It aims to provide learners with academic English skills in a context of tertiary education. In terms of English vocabulary, the Academic Word List (AWL) contains 570 word families selected from academic texts (Coxhead, 2000). The list has been widely known to both academic English teachers and researchers, but does not cover context-bound or topic-dependent vocabulary frequently used in each discipline and sub-discipline.

The most significant words in each academic context differ depending on the discipline, university, laboratory or researcher. Therefore, it is necessary to compile a large corpus which covers a wide range of academic domains to create a more appropriate word list for researchers in different academic fields. What is more, it is important for language learners to acquire discipline-specific expressions to achieve native-like performance in an ESP context.

The purpose of this study is to identify discipline-specific expressions, especially multi-word expressions, using institutional repositories and natural language processing techniques. An institutional repository is an online collection of the intellectual output of a research institution. The recent trend of constructing institutional repositories in Japan allows researchers to share the same academic resources. The present study demonstrates the effectiveness of creating a multi-word list specific to an institution.

We used a gapped n-gram approach for the identification of discipline-specific expressions. An ‘n-gram’ is a contiguous sequence of items, such as words or part-of-speech tags, and a ‘gapped n-gram,’ or ‘skipgram,’ is a refinement of the n-gram approach designed to detect non-contiguous item associations (e.g. Cheng, Greaves, and Warren, 2006). Our algorithm for extracting gapped n-grams is based on the method proposed by Kozawa, Sakai, Sugiki, and Matsubara (2010). First, we filtered the texts in the institutional repository by the number of words to exclude writings that were too short or too long. Second, we used TreeTagger, an automatic chunking program, to identify the basic phrase and clause structures. Third, we detected general patterns of multi-word expressions by lemmatizing words except for participial verbs and by replacing determiners and cardinal numbers with part-of-speech tags. Finally, we counted the number of gapped n-grams that occurred in the writings collected from each academic discipline.

This method enabled us to effectively detect frequent patterns characteristic of each discipline. We applied the method to the QIR, the institutional repository of Kyushu University, using the repository of the Faculty of Information Science and Electrical Engineering (ISEE), and identified discipline-specific expressions for each faculty at the university as an example of the automatic detection of gapped n-grams. We collected 229 writings in the QIR, and successfully obtained 1,061 frequent phrasal expressions specific to ISEE. We expect this study to contribute to the teaching of English for Specific Purposes and to the application of digital libraries to linguistic research.
33. Toshiaki Funatsu, Yoichi Tomiura, Emi Ishita, Kosuke Furusawa, Extracting Representative Words of a Topic Determined by Latent Dirichlet Allocation, Proc. The Sixth International Conference on Information, Process, and Knowledge Management (eKNOW 2014), 2014.03, Determining the topic of a document is necessary to understand the content of the document efficiently. Latent Dirichlet Allocation (LDA) is a method of analyzing topics. In LDA, a topic is treated as an unobservable variable to establish a probabilistic distribution of words. We can interpret the topic with a list of words that appear with high probability in the topic. This method works well when determining a topic included in many documents having a variety of contents. However, it is difficult to interpret the topic just using conventional LDA when determining the topic in a set of article abstracts found by a keyword search, because their contents are limited and similar. We propose a method to estimate representative words of each topic from an LDA result. Experimental results show that our method provides better information for interpreting a topic than LDA does.
34. Yasuhiro Takayama, Yoichi Tomiura, Emi Ishita, Zheng Wang, Douglas W. Oard, Kenneth R. Fleischmann, An-Shou Cheng, Improving Automatic Sentence-Level Annotation of Human Values Using Augmented Feature Vectors, Proc. Pacific Association for Computational Linguistics (PACLING'13), 2013.09, This paper describes an effort to improve identification of human values that are directly or indirectly invoked within the prepared statements of witnesses before legislative and regulatory hearings. We automatically code human values at the sentence level using supervised machine learning techniques trained on a few thousand annotated sentences. To simulate an actual situation, we treat a quarter of the data as labeled for training and the remaining three quarters of the data as unlabeled for test. We find that augmenting the feature space using a combination of lexical and statistical co-occurrence evidence can yield about a 6% relative improvement in F1 using a Support Vector Machine classifier.
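35. Shosaku Tanaka, Naoko Ando, Yoichi Tomiura, Corpus Construction and Copyright: A Web-Sourced Corpus of English Scientific Papers with Quality Information [コーパス構築と著作権 ― Web を源とした質情報付き英語科学論文コーパス], English Corpus Studies, 19, pp.31--41, 2012.06, Taking the project described in paper No. 69 as a concrete example, this paper reports a discussion of how the construction and use of corpora built from Web documents are treated under the revised Copyright Act (amended in 2009, in force from 2010).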
36. Masahiro Shibata, Toshiaki Funatsu, Yoichi Tomiura, Extraction of Alternative Candidates for Unnatural Adjective-Noun Co-occurrence Construction of English, Procedia - Social and Behavioral Sciences, 27, pp.32--41, 2011.12.
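37. Shosaku Tanaka, Masahiro Shibata, Yoichi Tomiura, A Method for Constructing a Web-Sourced Corpus of English Scientific Papers with Quality Information [Webを源とした質情報付き英語科学論文コーパスの構築法], English Corpus Studies, 18, 61-71, 2011.06.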
38. 渡邊由紀子,冨浦洋一,吉田素文,岡崎敦, 九州大学大学院「ライブラリーサイエンス専攻」の構想と意義, 情報管理, 54, 2, pp.53--62, 2011.05.
39. 田上敦士,阿野茂浩,冨浦洋一, 位置情報に基づくP2Pネットワークを用いた情報通信プラットフォーム, 情報処理学会論文誌, Vol.52,No.2,pp.347--358, 2011.02.
40. 中野てい子,冨浦洋一, 日本語学習者の動詞選択における誤用と正用の関係:作文支援のための基礎研究, 自然言語処理, 第18巻,第1号,pp.3--29, 2011.01.
41. 中野てい子,冨浦洋一, 日本語作文支援における共起を利用した代替候補提示システム, 日本教育工学会論文誌, 第34巻,第3号,pp.181--189, 2010.12.
42. Atsushi TAGAMI, Shigehiro ANO, Yoichi TOMIURA, Simulation Analysis of Moving Peer Influence on Location-aware P2P Network, Proc. of International Conference on Advanced Information Networking and Applications (AINA'10), pp.1121--1127, 2010.04.
43. Teiko NAKANO, Yoichi TOMIURA, Providing Appropriate Alternative Co-occurrence Candidates; Towards a Japanese Composition Support System, Proc. of the Ninth IASTED International Conference on Web-Based Education, pp. 173--179, 2010.03.
44. Teiko NAKANO, Yoichi TOMIURA, Evaluation of a Japanese Composition Support System, Proc. of IADIS International Conference e-Society 2010, pp.396--400, 2010.03.
45. 柴田雅博,冨浦洋一,西口友美, 雑談自由対話を実現するためのWWW上の文書からの妥当な候補文選択手法 (2009年), 人工知能学会論文誌, 第24巻,第6号,pp.508--520, 2009.11.
46. Masahiro Shibata, Tomomi Nishiguchi, Yoichi Tomiura, Dialog System for Open-ended Conversation Using Web Documents, Informatica, Vol.33, No.3, pp.277-284, 2009.10.
47. 田中省作,冨浦洋一,安東奈穂子,柴田雅博, Webを源とした英語科学論文コーパスの構築 ―技術的方法論と法的観点からの検討―, 英語コーパス学会第34回大会, 2009.10.
48. M. Shibata, Y. Tomiura, T. Mizuta, Identification among Similar Languages Using Statistical Hypothesis Testing, Proc. of Pacific Association for Computational Linguistics (PACLING'09), pp.47--52, 2009.09.
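49. 田上敦士,佐々木力,長谷川輝之,阿野茂浩,冨浦洋一, 確率的変換に基づくインターネット調査手法の解析, 電子情報通信学会論文誌, Vol.J92-B,No.4,pp.729--740, 2009.04, As a questionnaire survey method that guarantees anonymity over a network, we proposed a method in which each respondent sends a value obtained by probabilistically converting his or her answer (yes/no), and the collector estimates the proportion of yes/no answers as the sample mean of the received values. We showed that, given the number of respondents and the allowable error and confidence level of the estimate, the upper bound on the variance of the probabilistic conversion that satisfies these conditions is uniquely determined. We also defined a degree of anonymity using the error rate incurred when an answer is estimated from the transmitted value, and showed how to design the survey method by finding the probabilistic conversion that is best in terms of anonymity under the given conditions (number of respondents, estimation error, and confidence level).
Note: Responsible for deriving the relationship among the number of respondents, the allowable error and confidence level of the estimate, and the variance of the probabilistic conversion, and for deriving the probabilistic conversion that is best in terms of anonymity.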
50. 冨浦 洋一,青木 さやか,柴田 雅博,行野 顕正, 仮説検定に基づく英文書の母語話者性の判別, 自然言語処理, Vol.16, No.1, pp.23-46, 2009.01.
51. 田中 達也,島田 敬士,有田 大作,谷口 倫一郎,冨浦 洋一, 高速なParzen推定を用いた動的背景モデルによる映像からの物体検出, 映像情報メディア学会誌, Vol.62, No.12, pp.2045-2052, 2008.12.
52. Teiko NAKANO, Yoichi TOMIURA, Measure of Appropriateness of Word Co-occurrence in Japanese for Specific Purposes: Towards a Support System Framework for Writing Technical Japanese, Proc. of Empirical Methods for Asian Language Processing Workshop, pp.133-147, 2008.12.
53. Masahiro Shibata, Tomomi Nishiguchi, Yoichi Tomiura, A Method for Automatically Generating Proper Responses to User's Utterances in Open-ended Conversation by Retrieving Documents on the Web, Proc. of 2008 IEEE International Conference on Information Reuse and Integration (IEEE IRI'08), pp.268-279, 2008.07.
54. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Optimization of Answering Method with Probability Conversion, Proc. of 2008 International Symposium on Applications and the Internet (SAINT'08), pp.249-252, 2008.07.
55. Atsushi TAGAMI, Chikara SAKAKI, Teruyuki HASEGAWA, Shigehiro ANO, Yoichi TOMIURA, Analysis of Answering Method with Probability Conversion for Internet Research, Fifth IEEE Consumer Communications & Networking Conference (CCNC'08), pp.110-111, 2008.01.
56. 行野 顕正,田中 省作,冨浦 洋一,柴田 雅博, 統計的アプローチによる英語スラッシュ・リーディング教材の自動生成, 情報処理学会論文誌, 第48巻,第1号, 2007.01.
57. M. Shibata, Y. Tomiura, H. Matsumoto, T. Nishiguchi, K. Yukino, A. Hino, Developing a Dialog System for New Idea Generation Support, Proc. of 21st International Conference on Computer Processing of Oriental Language, 2006.12.
58. 青木 さやか,冨浦 洋一,行野 顕正,谷川 龍司, 言語識別技術を応用した英語における母語話者文書・非母語話者文書の判別, 情報科学技術レターズ, 第5巻,pp.85--88, 2006.09.
59. 本木 実,冨浦 洋一,高橋 直人, 記号列を入出力とするニューラルネットの学習法, 情報処理学会論文誌, 第47巻,第8号,pp.2279--2791, 2006.08.
60. 行野 顕正,田中 省作,冨浦 洋一,松本 英樹, 低頻度 byte 列を活用した言語識別, 情報処理学会論文誌, 第47巻,第4号,pp.1287--1294, 2006.04.
61. 田中 省作,藤井 宏,冨浦 洋一,徳見 道夫, NS/NNS論文分類モデルに基づく日本人英語科学論文の特徴抽出, 英語コーパス研究, 第13号,pp.75--87, 2006.01.
62. Y. Tomiura, S. Tanaka, T. Hitaka, Estimating Satisfactoriness of Selectional Restriction from Corpus without Thesaurus, ACM Transactions on Asian Language Information Processing, Vol.4, No.4, pp.400--416, 2005.12.
63. 藤井 宏,冨浦洋一,田中省作, Skew Divergence に基づく文書の母語話者性の推定, 自然言語処理(言語処理学会論文誌), Vol. 12, No. 4, pp.79-96, 2005.08.
64. K. YUKINO, S. TANAKA, Y. TOMIURA, H. MATSUMOTO, Robust Language Identification for Similar Languages and short texts using Low-Frequent Byte Strings, Pacific Association for Computational Linguistics 2005 (Pacling 2005), pp.368-373, 2005.08.
65. 柴田雅博,冨浦洋一,田中省作, Web上の語の共起性に基づいたコロケーションの翻訳支援, 情報処理学会論文誌, 第46巻,第6号,pp.1480-1491, 2005.06.
66. 柴田 雅博,田中 省作,冨浦 洋一, コロケーション翻訳支援システムに対する有用性の調査, 九州大学大学院システム情報科学紀要, Vol.10, No.1, pp.45--49, 2005.03.
67. 柴田 雅博,田中 省作,冨浦 洋一, Web文書中の語の共起性を用いたコロケーション翻訳支援システムの実装, 九州大学大学院システム情報科学紀要, Vol.10, No.1, pp.39--44, 2005.03.
68. S. Tanaka, Y. Tomiura, K. Yukino, A System for Extensive Slash Reading Using Web, An Interactive Workshop on Language e-Learning (IWLeL2004), pp.133-138, 2004.12.
69. 藤井 宏,田中省作,冨浦洋一, Skew Divergence に基づく母語話者/非母語話者文書の判別, 情報科学技術レターズ, 第3巻,pp.81-83, 2004.09.
70. 田中省作,丸林哲也,冨浦洋一, 類語集合対応の推定と英語を介した辞書合成への応用, 九州大学大学院システム情報科学紀要, 第9巻,第2号,pp.73-78, 2004.09.
71. M. Motoki, Y. Tomiura, N. Takahashi, Problems of FGREP Module and Their Solution, 3rd IEEE International Conference on Cognitive Informatics (ICCI2004), 10.1109/COGINF.2004.1327479, pp.220-227, 2004.08.
72. 田中 省作,冨浦 洋一,山本 祥平, チャンキング過程を考慮したスラッシュ・リーディング用文書の生成, 情報基盤センター研究報告, Vol.4, pp.1--8, 2004.03.
73. 柴田 雅博,冨浦 洋一,日高 達, 翻訳文法のための構文解析手法, システム情報科学紀要, Vol.9, No.1, pp.31--36, 2004.03.
74. Masahiro SHIBATA, Yoichi TOMIURA, Shosaku TANAKA, A Method for Retrieving Translations of Collocation in Web Data, Asian Symposium on Natural Language Processing to Overcome Language Barriers (in conjunction with IJCNLP-04), 2004.03.
75. 冨浦洋一,田中省作,日高 達, 共起データに基づく名詞の多次元空間への配置, 人工知能学会論文誌, 19巻,1号A, pp.1-9, 2004.01.
76. 冨浦洋一,日高 達, 言語コーパスからの語の共起性の推定, 情報処理学会論文誌, 第45巻,第1号,pp.324-332, 2004.01.
77. 柴田雅博,冨浦洋一,日高 達, 翻訳文法を用いた機械翻訳, 九州大学大学院システム情報科学紀要, 第8巻,第1号,pp.61-66, 2003.03.
78. 田中省作,冨浦洋一, 類語集合による英語を介して導出した対訳候補の絞り込み, 情報科学技術フォーラム 情報技術レターズ, pp.75-76, 2002.09.
79. 柴田雅博,日高達,冨浦洋一, 翻訳文法による機械翻訳とその実装, 情報基盤センター年報, No.2, pp.71-79, 2002.03.
80. TAKAHASHI Naoto, MOTOKI Minoru, SHIMAZU Yoshio, TOMIURA Yoichi, HITAKA Toru, PP-attachment Ambiguity Resolution Using a Neural Network with Modified FGREP Method, the 2nd Workshop on Natural Language Processing and Neural Networks (post-conference workshop of NLPRS2001), pp.1-7, 2001.11.
81. D. トウシンバット,冨浦洋一,日高 達, 汎化された係り受け文脈自由文法の構文解析法, 九州大学大学院システム情報科学紀要, 第5巻,第2号,pp.223-227, 2000.09.
82. 田中省作,冨浦洋一,日高 達, 共起制約を組み込んだ確率文法による名詞句の統語的曖昧さの解消, 九州大学大学院システム情報科学紀要, 第5巻,第1号,pp.69 - 74, 2000.03.
83. 田辺利文,冨浦洋一,日高達, 係り受け文脈自由文法とその日本語への適用, 情報処理学会論文誌, 第41巻, 第1号, pp.36 - 45, 2000.01.
84. 冨浦洋一,日高達, スパースな学習データにおける確率係り受け文脈自由文法の確率パラメタの推定法, 情報処理学会論文誌, 第40巻, 第11号, pp.4055 - 4063, 1999.11.
85. 田中省作,柳瀬康雄,冨浦洋一,日高達, k-NN 推定法に基づいた名詞句の意味関係の推定, 九州大学大学院システム情報科学研究科報告, 第4巻,第2号,pp.159 - 164, 1999.09.
86. 田中省作,冨浦洋一,日高達, 意味範疇の散らばりに基づいた名詞の統語範疇の分類, 情報処理学会論文誌, 第40巻, 第9号, pp.3387 - 3396, 1999.09.
87. 冨浦洋一,日高達, k-NN 推定法に基づく統語的あいまいさ解消法, 電子情報通信学会論文誌 D-II, Vol.J80-D-II, No.9, pp.2475 - 2481, 1997.09.
88. 冨浦洋一,中村貞吾,日高達, 名詞句「NPのNP」の意味構造, 情報処理学会論文誌, 第36巻,第6号,pp.1441 - 1448, 1995.06.
89. 冨浦洋一,市丸夏樹,日高達, 常識推論における推論の選択と文脈処理への応用, 情報処理学会論文誌, Vol.35, No.11, pp.2239 - 2248, 1994.11.
90. 冨浦洋一,中村貞吾,日高達, 最左部分語検索向き辞書データ構造:Prefix-Closed B-tree, 情報処理学会論文誌, 第35巻,第5号,pp.779-789, 1994.05.
91. T. NAKAMURA, Y. TOMIURA, T. HITAKA, Semantic Validity of Japanese Noun Phrases with Adnominal Particles, Proc. of the 2nd Pacific Rim International Conference on Artificial Intelligence, Vol.1, No.2, pp.433--437, 1992.09.
92. Y. TOMIURA, T. NAKAMURA, T. HITAKA, S. YOSHIDA, Logical Form of Hierarchical Relation on Verbs and Extracting it from Definition Sentences in a Japanese Dictionary, Proc. of the 14th International Conference on Computational Linguistics (Coling-92), Vol.2, No.14, pp.574-580, 1992.07.
93. 冨浦洋一,日高達,吉田将, 語義文からの動詞間の上位-下位関係の抽出, 情報処理学会論文誌, Vol.32, No.1, pp.42 - 49, 1991.01.
