Kyushu University Academic Staff Educational and Research Activities Database
List of Presentations
Toshiyuki Shimizu Last modified date:2023.11.27

Associate Professor / Library


Presentations
1. Yasuhito Asano, Yang Cao, Soichiro Hidaka, Zhenjiang Hu, Yasunori Ishihara, Hiroyuki Kato, Keisuke Nakano, Makoto Onizuka, Yuya Sasaki, Toshiyuki Shimizu, Masato Takeichi, Chuan Xiao, Masatoshi Yoshikawa, Bidirectional Collaborative Frameworks for Decentralized Data Management, Communications in Computer and Information Science, 2022.01, Along with the continuous evolution of data management systems for the new market requirements, we are moving from centralized systems towards decentralized systems, where data are maintained in different sites with autonomous storage and computation capabilities. There are two fundamental issues with such decentralized systems: local privacy and global consistency. By local privacy, the data owner wishes to control what information should be exposed and how it should be used or updated by other peers. By global consistency, the systems wish to have a globally consistent and integrated view of all data. In this paper, we report the progress of our BISCUITS (Bidirectional Information Systems for Collaborative, Updatable, Interoperable, and Trusted Sharing) project that attempts to systematically solve these two issues in distributed systems. We present a new bidirectional transformation-based approach to control and share distributed data, propose several distributed architectures for data integration via bidirectional updatable views, and demonstrate the applications of these architectures in ride-sharing alliances and gig job sites.
2. Toshiyuki Shimizu, Joji Kido, Masatoshi Yoshikawa, Keyword Recommendation Methods for Earth Science Data Considering Hierarchical Structure of Vocabularies, ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020), 2020.08.
3. Toshiyuki Shimizu, Hiroki Omori, Masatoshi Yoshikawa, Toward a view-based data cleaning architecture, CoRR, 2019.10.
4. Hideaki Ohashi, Yasuhito Asano, Toshiyuki Shimizu, Masatoshi Yoshikawa, Give and Take: Adaptive Balanced Allocation for Peer Assessments, 25th International Computing and Combinatorics Conference (COCOON 2019), 2019.07.
5. Yasuhito Asano, Soichiro Hidaka, Zhenjiang Hu, Yasunori Ishihara, Hiroyuki Kato, Hsiang-Shang Ko, Keisuke Nakano, Makoto Onizuka, Yuya Sasaki, Toshiyuki Shimizu, Kanae Tsushima, Masatoshi Yoshikawa, A View-based Programmable Architecture for Controlling and Integrating Decentralized Data, CoRR, 2018.03.
6. Yu Nakano, Toshiyuki Shimizu, Masatoshi Yoshikawa, A Visualization of Relationships Among Papers Using Citation and Co-citation Information, 18th International Conference on Asia-Pacific Digital Libraries (ICADL 2016), 2016.12, When we conduct scholarly surveys, we occasionally encounter difficulties in grasping the vast amount of related papers. Because academic papers have relationships, such as citing and cited relationships, we considered utilizing them for supporting scholarly surveys. In this paper, we propose a method for visualizing relationships among papers, and we construct paper graphs using two types of relationships, namely, citation and co-citation. Moreover, we quantify the strengths of citations and co-citations based on their frequency and the positions of co-citations, and show both types of relationships together in a graph. We constructed paper graphs using papers in the database field and discussed their usefulness.
7. Hideaki Ohashi, Toshiyuki Shimizu, Masatoshi Yoshikawa, Flexible Similarity Search for Enriched Trajectories, 11th International Workshop on Spatial and Spatiotemporal Data Mining (SSTDM 2016), 2016.12, In this study, we focus on a method of searching for similar trajectories. In most previous works on searching for similar trajectories, only raw trajectory data have been used. However, to obtain deeper insights, additional time-dependent trajectory features should be utilized depending on the search intent. For instance, to identify soccer players who have similar dribbling patterns, such additional features include the correlations between players' speeds and directions. In addition, when finding similar combination plays, the additional features include the team players' movements. In this paper, we develop a framework to flexibly search for similar trajectories associated with time-dependent features, called enriched trajectories. In this framework, weights, which represent the relative importance of each feature, can be flexibly input. Moreover, to facilitate fast searching, we propose a lower bounding measure of the DTW distance between enriched trajectories. We evaluate the effectiveness of the lower bounding measure using soccer data and synthetic data. Our experimental results suggest that the proposed lower bounding measure is superior to the existing measure and works very well.
8. Hidetsugu Nanba, Tetsuya Sakai, Noriko Kando, Atsushi Keyaki, Koji Eguchi, Kenji Hatano, Toshiyuki Shimizu, Yu Hirate, Atsushi Fujii, NEXTI at NTCIR-12 IMine-2 Task, 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016.06.
9. Youichi Ishida, Toshiyuki Shimizu, Masatoshi Yoshikawa, A Keyword Recommendation Method Using CorKeD Words and Its Application to Earth Science Data, 11th Asia Information Retrieval Societies Conference (AIRS 2015), 2015.12, In various research domains, data providers themselves annotate their own data with keywords from a controlled vocabulary. However, since selecting keywords requires extensive knowledge of the domain and the controlled vocabulary, even data providers have difficulty in selecting appropriate keywords from the vocabulary. Therefore, we propose a method for recommending relevant keywords in a controlled vocabulary to data providers. We focus on a keyword definition, and calculate the similarity between an abstract text of data and the keyword definition. Moreover, considering that there are unnecessary words in the calculation, we extract CorKeD (Corpus-based Keyword Decisive) words from a target domain corpus so that we can measure the similarity appropriately. We conduct an experiment on earth science data, and verify the effectiveness of extracting the CorKeD words, which are the terms that better characterize the domain.
10. Shuhei Shogen, Toshiyuki Shimizu, Masatoshi Yoshikawa, Enrichment of Academic Search Engine Results Pages by Citation-Based Graphs, 11th Asia Information Retrieval Societies Conference (AIRS 2015), 2015.12, Researchers' readings of academic papers make their research more sophisticated and objective. In this paper, we describe a method of supporting scholarly surveys by incorporating a graph based on citation relationships into the results page of an academic search engine. Conventional academic search engines have a problem in that users have difficulty in determining which academic papers are relevant to their needs because it is hard to understand the relationship between the academic papers that appear in the search results pages. Our method helps users to make judgments about the relevance of papers by clearly visualizing the relationship. It visualizes not only academic papers on the results page but also papers that have a strong citation relationship with them. We carefully considered the method of visualization and implemented a prototype with which we conducted a user study simulating scholarly surveys. We confirmed that our method improved the efficiency of scholarly surveys through the user study.
11. Jiyi Li, Yasuhito Asano, Toshiyuki Shimizu, Masatoshi Yoshikawa, A dynamic-static approach of model fusion for document similarity computation, 16th International Conference on Web Information Systems Engineering (WISE 2015), 2015.11, The semantic similarity of text document pairs can be used for valuable applications. There are various existing basic models proposed for representing document content and computing document similarity. Each basic model performs differently in different scenarios. Existing model selection or fusion approaches generate improved models based on these basic models at the granularity of a document collection. These improved models are static for all document pairs and may be appropriate for only some of the document pairs. We propose a dynamic idea of model fusion, and an approach based on a Dynamic-Static Fusion Model (DSFM) at the granularity of document pairs, which is dynamic for each document pair. The dynamic module in DSFM learns to rank the basic models to predict the best basic model for a given document pair. We propose a model categorization method to construct ideal model labels of document pairs for learning in this dynamic module. The static module in DSFM is based on linear regression. We also propose a model selection method to select appropriate candidate basic models for fusion and improve the performance. The experiments on public document collections which contain paragraph pairs and sentence pairs with human-rated similarity illustrate the effectiveness of our approach.
12. Jiyi Li, Toshiyuki Shimizu, Masatoshi Yoshikawa, Document Similarity Computation by Combining Multiple Representation Models, 16th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD 2015), 2015.06, Evaluating semantic similarity of text document pairs is an active research topic. Various models of document representation have been proposed. Each kind of representation model concentrates on a different kind of information from other kinds of models. However, it is difficult for a single model to perform well in all scenarios because of the variety of textual documents. Leveraging these models to complement each other can improve the performance. In this paper, we first analyze the relations among document semantic similarity, human ratings and model performance. Based on the observations, we propose a rational solution of selecting different representation models and fusing the results of these models to compute document similarity for a given document collection. We leverage the performance and relations of different models to select proper models. Our fusion approach proposes a regression function with both nonlinear and linear factors and dynamic weights based on the similarities by various models. We report the effectiveness of our work based on a rated news document collection. The particular version of our general approach for this collection can integrate the information from both brief entity knowledge and detailed word content.
13. Yuhei Kawakami, Atsuto Nishida, Toshiyuki Shimizu, Masatoshi Yoshikawa, Axis-Based Alignment of Scholarly Papers and Its Presentation Slides Considering Document Structure, 16th International Conference on Asia-Pacific Digital Libraries (ICADL 2014), 2014.11, Recently, most researchers use presentation slides to introduce a paper at academic conferences. We can often retrieve and browse papers and presentation slides through websites. We consider that we can obtain information efficiently by using both of them, and we propose a method to align a paper and its presentation slides at a fine granularity. Though there are some existing works on this alignment, our system tries to achieve better accuracy for this problem by proposing two approaches: 1) adjustment by axis alignments; and 2) two-step alignment. The content similarity between each slide and paragraph is unstable due to the small amount of text in the slides. Therefore, we also calculate the content similarity at the section level and consider the ancillary alignment at the section level. Also, we succeeded in obtaining better alignments by adjusting the alignment score or narrowing down the alignment candidates. Finally, for each slide, we calculate the scores of each paragraph, and determine the alignment according to the scores by associating the slide to a sequence of paragraphs. We created a small dataset manually and conducted an experiment to confirm the effectiveness of the proposed method.
14. Naoki Tsujio, Toshiyuki Shimizu, Masatoshi Yoshikawa, A Method for Fine-Grained Document Alignment Using Structural Information, 16th Asia-Pacific Web Conference (APWeb 2014), 2014.09, It is useful to understand the corresponding relationships between the parts of related documents, such as a conference paper and its modified version published as a journal paper, or documents in different versions. However, it is hard to associate corresponding parts which have been heavily modified using only similarity in their content. We propose a method of aligning documents considering not only content information but also structural information in documents. Our method consists of three steps: baseline alignment considering document order, merging, and swapping. We used papers which have been presented at a domestic conference and an international conference, then obtained their alignments by using several methods in our evaluation experiments. The results revealed the effectiveness of the use of document structures.
15. Semantic Knowledge Base Construction from Domain-Specified Metadata.
16. Toshiyuki Shimizu, Tomo Sueki, Masatoshi Yoshikawa, Supporting Keyword Selection in Generating Earth Science Metadata, 37th Annual IEEE Computer Software and Applications Conference (COMPSAC 2013), 2013.07, Vast amounts of earth science data have been stored and managed in various projects. Because earth science requires considerable experience and expertise, we generally generate metadata for earth science datasets. In such metadata, keyword information is important. Controlled keywords such as GCMD Science Keyword are used for metadata. However, selection of suitable keywords is not easy. We propose methods to suggest keywords from summary texts of datasets. We considered simple string matching and inference by Labeled LDA, and applied them to actual earth science metadata.
17. Tsubasa Tanabe, Toshiyuki Shimizu, Masatoshi Yoshikawa, Effective keyword-based XML retrieval using the data-centric and document-centric features, 8th Asia Information Retrieval Societies Conference (AIRS 2012), 2012.12, Extensible Markup Language (XML) is used for not only describing structured documents but also for describing data just for generating XML from relational data. The former is called document-centric XML, and the latter is called data-centric XML. From studies on retrieving data-centric XML by using keyword searches, methods based on LCA have been proposed, while from studies on retrieving document-centric XML, methods based on information retrieval that focus on the granularity of XML elements have been proposed. However, documents generally have both data-centric and document-centric elements, so there are cases in which desired results cannot be returned by using existing research. We propose a method for constructing suitable search results for XML documents that include both data-centric and document-centric elements by considering a user's query intention and element features (data-centric or document-centric). Our experiments show that both data-centric and document-centric elements need to be considered for actual XML documents. © Springer-Verlag 2012.
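18. A Ranking Method for Earth Science Data Integrating Importance and Spatiotemporal Proximity (original title: 重要度と時空間近接度を統合した地球科学データのランキング手法).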
19. Akinori Saito, Takuya Tsugawa, Yusuke Akiya, Toshiyuki Shimizu, Masatoshi Yoshikawa, Development of a data browsing system for geoscience data using geobrowsers, The 1st ICSU World Data System Conference, PS6-11, 2011.09.
20. Ryo Aoto, Toshiyuki Shimizu, Masatoshi Yoshikawa, Propagation of multi-granularity annotations, 22nd International Conference on Database and Expert Systems Applications (DEXA 2011), 2011.09, Data origin or processing information and the metadata that is useful in understanding data can be associated with data by using annotation. Provenance knowledge preserved by annotation is managed by continuously propagating the annotations through the workflow. Models for explicitly associating annotations are generally used for annotation-based provenance management, and techniques for propagating annotations have been proposed. There is also a model for implicitly associating annotations, in which the annotations are associated with data with arbitrary granularity by using queries. We call the implicit model the "multi-granularity annotation" model. Multi-granularity annotation enables flexible association of information. However, no provenance management methods using multi-granularity annotations have been reported. We have developed a method for propagating multi-granularity annotations. We define rules for annotation propagation for each relational algebra operation, and they are used to recalculate the scopes of annotations associated with data. We also addressed the loss of information needed to preserve annotation associations during data derivation and the lack of static data annotations by extending the operations and the association method. Experiments showed that our method requires less space usage and execution time than conventional annotation management methods. © 2011 Springer-Verlag Berlin Heidelberg.
21. Tetsutaro Motomura, Toshiyuki Shimizu, Masatoshi Yoshikawa, Alternative query generation for XML keyword search and its optimization, 22nd International Conference on Database and Expert Systems Applications (DEXA 2011), 2011.08, Much work has been done in XML keyword search since users can obtain various information from XML databases without specific knowledge of the database schema or of query languages. Moreover, some studies have suggested methods of returning information that would help users understand search results. In this paper, we define alternative queries, which can be considered as different aspects of XML keyword search results. In XML keyword search, a keyword may match an unexpected text value or element name, so incorrect results that do not correspond to the users' search intentions may be retrieved. It does not seem useful to generate alternative queries for all the results, since they include several results retrieved by several interpretations. Thus, we propose a method of generating alternative queries from results classified by interpretations. We also propose a stack-based algorithm for generating alternative queries. Finally, the experimental results reveal that our proposal generates alternative queries efficiently. © 2011 Springer-Verlag Berlin Heidelberg.
22. Masashi Tatedoko, Toshiyuki Shimizu, Akinori Saito, Masatoshi Yoshikawa, A retrieval method for earth science data based on integrated use of Wikipedia and domain ontology, 21st International Conference on Database and Expert Systems Applications (DEXA 2010), 2010.09, Due to the recent advancement in observation technologies and progress in information technologies, the total amount of earth science data has increased at an explosive pace. However, it is not easy to search and discover earth science data because earth science requires a high degree of expertise. In this paper, we propose a retrieval method for earth science data which can be used by non-experts such as scientists from other fields, or students interested in earth science. In order to retrieve relevant data sets from a query, which may not include technical terminology, supplementing terms are extracted by utilizing knowledge bases, namely Wikipedia and a domain ontology. We evaluated our method using actual earth science data. The data, the queries, and the relevance assessments for our experiments were made by earth science researchers. The results of our experiments show that our method has achieved good recall and precision. © 2010 Springer-Verlag.
23. Umaporn Supasitthimethee, Toshiyuki Shimizu, Masatoshi Yoshikawa, Kriengkrai Porkaew, Meaningful Interrelated Object Tree for XML Keyword Search, 2nd International Conference on Computer and Automation Engineering (ICCAE 2010), 2010.02, In the research field of XML retrieval with a keyword-based approach, variants of Lowest Common Ancestors (LCAs) have been widely accepted as a way to show how keywords are connected by ancestor relationships. However, returning a whole subtree or a partial subtree based on LCA nodes is insufficient for identifying how subtrees are conceptually related under different tree structures such as ID/IDREF. On the other hand, storing XML documents in the graph model can define richer relationships that the tree model cannot, but the cost of enumerating results is very high. In this paper, we propose a novel Smallest Lowest Object Tree (SLOT) in which keywords are connected through physical connections. In addition, to capture conceptual connections, we also propose the Smallest Interrelated Object Tree (SIOT), which extends ID/IDREF relationships based on SLOT. Finally, our experiment indicates that the proposed approach returns more effective and more semantic results for users.
24. Mika Ichino, Hiroko Kinutani, Masafumi Ono, Toshiyuki Shimizu, Masatoshi Yoshikawa, Kooiti Masuda, Kazuyo Fukuda, Haruko Kawamoto, A Document Centric Metadata Registration Tool Constructing Earth Environmental Data Infrastructure, AGU Fall Meeting 2009 (Session IN23B), San Francisco, California, USA, December 14-18, 2009. (poster), 2009.12.
25. Hiroko Kinutani, Masafumi Ono, Toshiyuki Shimizu, Masashi Tatedoko, Masatoshi Yoshikawa, Development Toward Integrated Management of Earth System Data - 2. A Document Centric Metadata Registration Tool -, 6th GEWEX and 2nd iLEAPS Joint Science Conferences, Melbourne, Australia, August 24-28, 2009. (poster), 2009.08.
26. Umaporn Supasitthimethee, Toshiyuki Shimizu, Masatoshi Yoshikawa, Kriengkrai Porkaew, An Extension of LCA based XML Keyword Search, International Workshop on Information-explosion and Next Generation Search (INGS 2008), 2008.04, One of the most convenient ways to query XML data is a keyword search because it requires neither knowledge of the XML structure nor learning a new user interface. However, a keyword search interface is very flexible. It is hard for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address this challenge, we propose an extension of LCA based XML keyword search. First, to determine a return node, we provide a query syntax with which users can tell the system which node they are really interested in. When users do not explicitly specify return information, our system automatically analyzes and chooses appropriate return nodes by inferring from user keywords. Second, to return a meaningful result, we investigate the problem of the return information in the LCA and the proximity search approaches. To this end, we introduce the Lowest Element Node (LEN) and define our simple rules without any requirement on schema information such as DTD or XML Schema. Our experimental results indicate that our system not only infers the right return nodes but also generates compact and meaningful results.
27. Toshiyuki Shimizu, Masatoshi Yoshikawa, Dynamic Focused Retrieval of XML Documents and Its Evaluation, Workshop on Novel Methodologies for Evaluation in Information Retrieval, 2008.03.
28. Masatoshi Yoshikawa, Toshiyuki Shimizu, A new ranking scheme and result representation for XML information retrieval based on benefit and reading effort, International Conference on Informatics Education and Research for Knowledge-Circulating Society (ICKS 2008), 2008.01, Elements of XML documents greatly vary in size and may nest each other. XML information retrieval (XML-IR) systems are required to take this nature of XML documents into consideration. Because of this nature, top-k search is not suitable for XML-IR. We introduce new concepts of benefit and reading effort of elements. By using these concepts, we propose a new dynamic ranking method that enables us to browse search results of XML-IR systems efficiently. We also study how to display the search results of XML-IR systems. If an XML document was originally composed of paper image pages, such as scholarly articles or books, it would be natural to display result elements by overlaying them on the physical layout of pages in the user interfaces. We propose methods for displaying the result of XML search of scholarly articles and ranking methods based on page units.
29. Yu Suzuki, Masahiro Mitsukawa, Kenji Hatano, Toshiyuki Shimizu, Jun Miyazaki, Hiroko Kinutani, An XML Fragment Retrieval Method with Image and Text using Textual Information Retrieval Techniques, Pre-Proceedings of INEX 2007, 2007.12.
30. Kenji Hatano, Toshiyuki Shimizu, Jun Miyazaki, Yu Suzuki, Hiroko Kinutani, Masatoshi Yoshikawa, Ranking and Presenting Search Results in an RDB-based XML Search Engine, Pre-Proceedings of INEX 2007, 2007.12.
31. Toshiyuki Shimizu, Masatoshi Yoshikawa, A ranking scheme for XML information retrieval based on benefit and reading effort, 10th International Conference on Asian Digital Libraries (ICADL 2007), 2007.12, XML information retrieval (XML-IR) systems search for relevant document fragments in XML documents for given queries. In top-k search, users control the size of output by an integer k. In XML-IR, however, each output element varies widely in size. Consequently, total output size of top-k elements is uncontrollable by simply giving an integer k. In addition, search results may have nesting elements. If a system orders result elements simply by their relevance, we may browse the same content more than once due to the nestings. To handle these problems, we propose a new ranking method that enables us to browse search results of XML-IR systems efficiently by introducing the concepts of benefit and reading effort. We also propose evaluation metrics based on benefit and reading effort, and compare them with existing XML-IR metrics through experiments.
32. Toshiyuki Shimizu, Masatoshi Yoshikawa, XML Information Retrieval Considering Physical Page Layout of Logical Elements, 10th International Workshop on the Web and Databases (WebDB 2007), 2007.06.
33. Toshiyuki Shimizu, Norimasa Terada, Masatoshi Yoshikawa, Development of an XML information retrieval system for queries on contents and structures, Second International Conference on Informatics Research for Development of Knowledge Society Infrastructure (ICKS 2007), 2007.01, We have developed an XML information retrieval system which can process queries by keywords or queries by combination of keywords and structural conditions. Queries by keywords are simple yet useful because users are not required to understand XML query languages or XML schema. While issuing queries by combination of keywords and structural conditions requires users to understand query languages and the underlying XML schema, we can restrict the target document fragments and the search conditions using structures in XML. The system was implemented on top of a relational XML database system developed by our group. The system can process both types of queries under a common relational schema. By carefully designing the database schema, the system handles a huge number of document fragments efficiently. For queries by keywords, we have developed a user-friendly interface for displaying search results. Our experiments using INEX test collection show that the system achieved relatively high precision and can process keyword set queries in acceptable search time.
34. Toshiyuki Shimizu, Norimasa Terada, Masatoshi Yoshikawa, Kikori-KS: An Effective and Efficient Keyword Search System for Digital Libraries in XML, 9th International Conference on Asian Digital Libraries (ICADL 2006), 2006.11.
35. Kei Fujimoto, Toshiyuki Shimizu, Norimasa Terada, Kenji Hatano, Yu Suzuki, Toshiyuki Amagasa, Hiroko Kinutani, Masatoshi Yoshikawa, Implementation of a high-speed and high-precision XML information retrieval system on relational databases, 4th International Conference on Initiative for the Evaluation of XML Retrieval (INEX 2005), 2005.11, This paper describes an XML information retrieval system that we have developed. It is based on a vector space model, and implemented on top of XRel, a relational XML database system that has been developed in our research group. When a query is processed, a large number of fragments are retrieved, because a single XML document usually contains many XML fragments. Keeping all XML fragments degrades retrieval precision and increases query processing time, because some XML fragments are not appropriate as a query target. In existing methods, retrieval targets are manually selected by human experts when an XML collection is stored in the system. Such manual selection is not feasible when many kinds of XML documents are stored in the system. To cope with the problem we propose a method for automatically selecting document-centric fragments by introducing three measurements, namely, period ratio, number of different words, and empirical rules. By deleting inappropriate data-centric fragments from results of keyword query, we can improve the accuracy and performance of our system. Through performance evaluations, we confirmed the improvement of retrieval precision and query processing speed.
36. Toshiyuki Shimizu, Masatoshi Yoshikawa, Full-text and structural XML indexing on B+-tree, 16th International Conference on Database and Expert Systems Applications (DEXA 2005), 2005.08, XML query processing is one of the most active areas of database research. Although the main focus of past research has been the processing of structural XML queries, there are growing demands for a full-text search for XML documents. In this paper, we propose XICS (XML Indices for Content and Structural search), novel indices built on a B+-tree, for the fast processing of queries that involve structural and full-text searches of XML documents. To represent the structural information of XML trees, each node in the XML tree is labeled with an identifier. The identifier contains an integer number representing the path information from the root node. XICS consists of two types of indices, the COB-tree (COntent B+-tree) and the STB-tree (STructure B+-tree). The search keys of the COB-tree are a pair of text fragments in the XML document and the identifiers of the leaf nodes that contain the text, whereas the search keys of the STB-tree are the node identifiers. By using a node identifier in the search keys, we can retrieve only the entries that match the path information in the query. Our experimental results show the efficiency of XICS in query processing.
37. Kei Fujimoto, Toshiyuki Shimizu, Dao Dinh Kha, Masatoshi Yoshikawa, Toshiyuki Amagasa, A mapping scheme of XML documents into relational databases using schema-based path identifiers, International Workshop on Challenges in Web Information Retrieval and Integration (WIRI 2005), 2005.04, In this paper we propose a mapping scheme of XML documents into relational databases. The scheme enables us to store, retrieve and update XML documents efficiently. When storing XML documents in relational databases, XML tree structures must be preserved explicitly. To this end, a label is assigned to nodes in the XML tree. In general, document retrieval and update performance is affected by node labeling schemes. We use SPIDER (Schema based Path IDentifiER), a labeling scheme for XML documents utilizing DTDs that makes retrieval and update more efficient. SPIDER only identifies paths from root node to a node. Thus, multiple nodes appearing in the same path cannot be distinguished by only using SPIDER. We introduced Sibling Dewey Order to identify such nodes. Generally, when a new node is inserted into XML documents, some other nodes need to be relabeled to preserve the order of nodes. In our method, only Sibling Dewey Order is relabeled; SPIDER is not affected. Since the range of relabeling is small, it is possible to update documents efficiently. We stored documents utilizing SPIDER in a relational database and then translated various XPath expressions into SQL using SPIDER. We performed experiments and demonstrated that the proposed scheme outperforms conventional methods both in retrieval and update.
38. 2001 PRMU Algorithm Contest "Recognition of Traffic Signs" Summary Report and Its Prize Winning Algorithm, The Special Interest Group on Pattern Recognition and Media Understanding (PRMU) has been promoting an annual algorithm contest on pattern recognition and media understanding since 1997. The fifth contest focuses on pattern classification and segmentation of traffic signs. The problems of the contest are divided into three levels according to the difficulty of classification: segmented CG images as level 1, segmented real images as level 2, and real scenes as level 3. The summary of the contest is reported first, and then each prize-winning algorithm is described in detail by the original prize-winner.