Updated on 2024/10/07

Information

 

写真a

 
NAKASHIMA YUTO
 
Organization
Faculty of Information Science and Electrical Engineering Department of Informatics Associate Professor
School of Sciences Department of Physics(Concurrent)
Graduate School of Information Science and Electrical Engineering Department of Information Science and Technology(Concurrent)
Title
Associate Professor
Profile
文字列の組合せ論および文字列データ処理アルゴリズムの研究を行なっている. 教育では主に,理学部物理学科情報理学コースの演習科目を担当している.
External link

Research Areas

  • Informatics / Theory of informatics

Degree

  • Ph.D.(Information Science)

Research History

  • Kyushu University Faculty of Information Science and Electrical Engineering, Department of Informatics Associate Professor 

    2024.6 - Present

      More details

    Country:Japan

    researchmap

  • Kyushu University Faculty of Information Science and Electrical Engineering, Department of Informatics Assistant Professor 

    2017.4 - 2024.5

      More details

Research Interests・Research Keywords

  • Research theme: Combinatorics on words

    Keyword: Combinatorics on words

    Research period: 2024

  • Research theme: String Algorithm

    Keyword: String Algorithm

    Research period: 2024

  • Research theme: String Processing Algorithms, Combinatorics on Words

    Keyword: Pattern Matching, Streaming Model, Lyndon Words

    Research period: 2017.4

Awards

  • IPSJ Yamashita SIG Research Award

    2024.3   Information Processing Society of Japan   アルファベット順による lex-parse サイズ比

     More details

  • SPIRE2020 Best Paper Award

    2020.10   SPIRE2020  

Papers

  • On the Number of Non-equivalent Parameterized Squares in a String Reviewed

    Rikuya Hamai, Kazushi Taketsugu, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Proceedings of 31th International Symposium on String Processing and Information Retrieval (SPIRE 2024)   174 - 183   2024.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Edit and Alphabet-Ordering Sensitivity of Lex-Parse Reviewed

    Yuto Nakashima, Dominik Köppl, Mitsuru Funakoshi, Shunsuke Inenaga, Hideo Bannai

    Proceedings of the 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)   75:1 - 75:15   2024.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Computing Maximal Palindromes in Non-standard Matching Models. Reviewed

    Mitsuru Funakoshi, Takuya Mieno, Yuto Nakashima 0001, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of the 35th International Workshop on Combinatorial Algorithms (IWOCA 2024)   165 - 179   2024.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-63021-7_13

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/iwoca/iwoca2024.html#FunakoshiMNIBT24

  • Computing Longest Common Subsequence Under Cartesian-Tree Matching Model. Reviewed

    Taketo Tsujimoto, Hiroki Shibata, Takuya Mieno, Yuto Nakashima 0001, Shunsuke Inenaga

    Proceedings of the 35th International Workshop on Combinatorial Algorithms (IWOCA 2024)   369 - 381   2024.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-63021-7_28

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/iwoca/iwoca2024.html#TsujimotoSMNI24

  • Faster space-efficient STR-IC-LCS computation Reviewed

    Yuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Theoretical Computer Science   1003   114607 - 114607   2024.5   ISSN:0304-3975

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    One of the most fundamental method for comparing two given strings A and B is the longest common subsequence (LCS), where the task is to find (the length) of an LCS of A and B. In this paper, we deal with the STR-IC-LCS1 problem which is one of the constrained LCS problems proposed by Chen and Chao [J. Comb. Optim, 2011]. A string Z is said to be an STR-IC-LCS of three given strings A, B, and P, if Z is a longest string satisfying that (1) Z includes P as a substring and (2) Z is a common subsequence of A and B. We present three efficient algorithms for this problem: First, we begin with a space-efficient solution which computes the length of an STR-IC-LCS in O(n2) time and O((ℓ+1)(n−ℓ+1)) space, where ℓ is the length of an LCS of A and B of length n. When ℓ=O(1) or n−ℓ=O(1), then this algorithm uses only linear O(n) space. Second, we present a faster algorithm that works in O(nr/log⁡r+n(n−ℓ+1)) time, where r is the length of P, while retaining the O((ℓ+1)(n−ℓ+1)) space efficiency. Third, we give an alternative algorithm that runs in O(nr/log⁡r+n(n−ℓ′+1)) time with O((ℓ′+1)(n−ℓ′+1)) space, where ℓ′ denotes the STR-IC-LCS length for input strings A, B, and P.

    DOI: 10.1016/j.tcs.2024.114607

    Scopus

    researchmap

  • Largest Repetition Factorization of Fibonacci Words Reviewed

    Kaisei Kishi, Yuto Nakashima, Shunsuke Inenaga

    Proceedings of 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)   2023.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Linear-Time Computation of Generalized Minimal Absent Words for Multiple Strings. Reviewed

    Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Proceedings of 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)   14240 LNCS   331 - 344   2023.9   ISSN:0302-9743 ISBN:9783031439797, 9783031439803 eISSN:1611-3349

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Springer Nature Switzerland  

    A string w is called a minimal absent word (MAW) for a string S if w does not occur as a substring in S and all proper substrings of w occur in S. MAWs are well-studied combinatorial string objects that have potential applications in areas including bioinformatics, musicology, and data compression. In this paper, we generalize the notion of MAWs to a set $$\mathcal {S} = \{S_1, \ldots, S_k\}$$ of multiple strings. We first describe our solution to the case of $$k = 2$$ strings, and show how to compute the set $$\textsf{M}$$ of MAWs in optimal $$O(n + |\textsf{M}|)$$ time and with O(n) working space, where n denotes the total length of the strings in $$\mathcal {S}$$. We then move on to the general case of $$k > 2$$ strings, and show how to compute the set $$\textsf{M}$$ of MAWs in $$O(n \lceil k / \log n \rceil + |\textsf{M}|)$$ time and with $$O(n (k + \log n))$$ bits of working space, in the word RAM model with machine word size $$\omega = \log n$$. The latter algorithm runs in optimal $$O(n + |\textsf{M}|)$$ time for $$k = O(\log n)$$.

    DOI: 10.1007/978-3-031-43980-3_27

    Scopus

    researchmap

  • Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph. Reviewed

    Hiroki Arimura, Shunsuke Inenaga, Yasuaki Kobayashi, Yuto Nakashima, Mizuki Sue

    Proceedings of 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)   14240 LNCS   28 - 34   2023.9   ISSN:03029743 ISBN:9783031439797

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)  

    In this paper, we present the first study of the computational complexity of converting an automata-based text index structure, called the Compact Directed Acyclic Word Graph (CDAWG), of size e for a text T of length n into other text indexing structures for the same text, suitable for highly repetitive texts: the run-length BWT of size r, the irreducible PLCP array of size r, and the quasi-irreducible LPF array of size e, as well as the lex-parse of size O(r) and the LZ77-parse of size z, where $$r, z \leqslant e$$. As main results, we showed that the above structures can be optimally computed from either the CDAWG for T stored in read-only memory or its self-index version of size e without a text in O(e) worst-case time and words of working space. To obtain the above results, we devised techniques for enumerating a particular subset of suffixes in the lexicographic and text orders using the forward and backward search on the CDAWG by extending the result by Belazzougui et al. in 2015.

    DOI: 10.1007/978-3-031-43980-3_3

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/spire/spire2023.html#ArimuraIKNS23

  • Computing SEQ-IC-LCS of Labeled Graphs Reviewed International journal

    Yuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga

    Proceedings of the Prague Stringology Conference 2023 (PSC 2023)   2023.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • On Sensitivity of Compact Directed Acyclic Word Graphs Reviewed International journal

    Hiroto Fujimaru, Yuto Nakashima, Shunsuke Inenaga

    Proceedings of 14th International Conference on Words (WORDS 2023)   168 - 180   2023.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Optimal LZ-End Parsing Is Hard Reviewed International journal

    Hideo Bannai, Mitsuru Funakoshi, Kazuhiro Kurita, Yuto Nakashima, Kazuhisa Seto, Takeaki Uno

    Proceedings of the 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)   3:1 - 3:11   2023.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Space-Efficient STR-IC-LCS Computation Reviewed International journal

    Yuuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    Proceedings of 47th International Conference on Current Trends in Theory and Practice of Informatics (SOFSEM 2023)   372 - 384   2023.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Parameterized DAWGs: Efficient constructions and bidirectional pattern searches Reviewed

    Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, Masayuki Takeda

    Theoretical Computer Science   933   21 - 42   2022.10   ISSN:0304-3975 eISSN:1879-2294

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    Two strings x and y over Σ∪Π of equal length are said to parameterized match (p-match) if there is a renaming bijection f:Σ∪Π→Σ∪Π that is identity on Σ and transforms x to y (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs) which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have Θ(n2) nodes and edges but PDAWGs have only O(n) nodes and edges, where n is the length of an input string. We also give an O(n|Π|log⁡(|Π|+|Σ|))-time O(n)-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner. This duality also leads us to two further efficient algorithms for p-matching: Given the parameterized suffix tree for the reversal T‾ of the input string T, one can build the PDAWG of T in O(n) time in an offline manner; One can perform bidirectional p-matching in O(mlog⁡(|Π|+|Σ|)+occ) time using O(n) space, where m denotes the pattern length and occ is the number of pattern occurrences in the text T.

    DOI: 10.1016/j.tcs.2022.09.008

    Web of Science

    Scopus

    researchmap

  • Combinatorics of minimal absent words for a sliding window Reviewed

    Tooru Akagi, Yuki Kuhara, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theoretical Computer Science   927   109 - 119   2022.6   ISSN:0304-3975 eISSN:1879-2294

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    A string w is called a minimal absent word (MAW) for another string T if w does not occur in T but the proper substrings of w occur in T. For example, let Σ={a,b,c} be the alphabet. Then, the set of MAWs for string w=abaab is {aaa,aaba,bab,bb,c}. In this paper, we study combinatorial properties of MAWs in the sliding window model, namely, how the set of MAWs changes when a sliding window of fixed length d is shifted over the input string T of length n, where 1≤d<n. We present tight upper and lower bounds on the maximum number of changes in the set of MAWs for a sliding window over T, both in the cases of general alphabets and binary alphabets. Our bounds improve on the previously known best bounds [Crochemore et al., 2020].

    DOI: 10.1016/j.tcs.2022.06.002

    Web of Science

    Scopus

    researchmap

  • Minimal Absent Words on Run-Length Encoded Strings Reviewed International journal

    Tooru Akagi, Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

    Proceedings of the 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)   27:1 - 27:17   2022.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • c-trie++: A dynamic trie tailored for fast prefix searches Reviewed

    Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Information and Computation   285   104794 - 104794   2022.5   ISSN:0890-5401

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    Given a dynamic set K of k strings of total length n whose characters are drawn from an alphabet of size σ, a keyword dictionary is a data structure built on K that provides lookup, prefix search, and update operations on K. Under the assumption that α=w/lg⁡σ characters fit into a single machine word of w bits, we propose a keyword dictionary that represents K in either nlg⁡σ+Θ(klg⁡n) or |T|lg⁡σ+Θ(kw) bits of space, where |T| is the number of nodes of a trie representing K. It supports all operations in O(m/α+lg⁡α) expected time on an input string of length m in the word RAM model. An evaluation of our implementation highlights the practical usefulness of the proposed data structure, especially for prefix searches — one of the most essential keyword dictionary operations.

    DOI: 10.1016/j.ic.2021.104794

    Scopus

    researchmap

  • Factorizing strings into repetitions Reviewed International journal

    Hiroe Inoue, Yoshiaki Matsuoka, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theory of Computing Systems   66 ( 2 )   484 - 501   2022.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

  • Palindromic Trees for a Sliding Window and Its Applications Reviewed

    Takuya Mieno, Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Information Processing Letters   173   106174 - 106174   2022.1   ISSN:0020-0190 eISSN:1872-6119

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    The palindromic tree (a.k.a. eertree) for a string S of length n is a tree-like data structure that represents the set of all distinct palindromic substrings of S, using O(n) space [Rubinchik and Shur, 2018]. It is known that, when S is over an alphabet of size σ and is given in an online manner, then the palindromic tree of S can be constructed in O(nlog⁡σ) time with O(n) space. In this paper, we consider the sliding window version of the problem: For a sliding window of length at most d, we present two versions of an algorithm which maintains the palindromic tree of size O(d) for every sliding window S[i..j] over S, where 1≤j−i+1≤d. The first version works in O(nlog⁡σ′) time with O(d) space where σ′≤d is the maximum number of distinct characters in the windows, and the second one works in O(n+dσ) time with (d+2)σ+O(d) space. We also show how our algorithms can be applied to efficient computation of minimal unique palindromic substrings (MUPS) and minimal absent palindromic words (MAPW) for a sliding window.

    DOI: 10.1016/j.ipl.2021.106174

    Web of Science

    Scopus

    researchmap

  • The Parameterized Suffix Tray Reviewed International journal

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 12th International Conference on Algorithms and Complexity (CIAC 2021)   258 - 270   2021.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Efficiently computing runs on a trie Reviewed

    Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theoretical Computer Science   887   143 - 151   2021.10

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.tcs.2021.07.011

  • Position Heaps for Cartesian-Tree Matching on Strings and Tries Reviewed International journal

    Akio Nishimoto, Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga

    Proceedings of 28th International Symposium in String Processing and Information Retrieval (SPIRE 2021)   241 - 254   2021.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • On the Approximation Ratio of LZ-End to LZ77 Reviewed International journal

    Takumi Ideue, Takuya Mieno, Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    Proceedings of 28th International Symposium in String Processing and Information Retrieval (SPIRE 2021)   114 - 126   2021.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Longest Common Rollercoasters Reviewed International journal

    Kosuke Fujita, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedinds of 28th International Symposium in String Processing and Information Retrieval (SPIRE 2021)   21 - 32   2021.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Grammar Index by Induced Suffix Sorting Reviewed International journal

    Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 28th International Symposium in String Processing and Information Retrieval (SPIRE 2021)   85 - 99   2021.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Counting Lyndon Subsequence Reviewed

    Ryo Hirakawa, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    Proceedings of the Prague Stringology Conference 2021 (PSC 2021)   53 - 60   2021.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Computing Minimal Unique Substrings for a Sliding Window Reviewed

    Takuya Mieno, Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Algorithmica   84 ( 3 )   670 - 693   2021.8   ISSN:0178-4617 eISSN:1432-0541

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Springer Science and Business Media LLC  

    <title>Abstract</title>A substring <italic>u</italic> of a string <italic>T</italic> is called a <italic>minimal unique substring</italic> (<italic>MUS</italic>) of <italic>T</italic> if <italic>u</italic> occurs exactly once in <italic>T</italic> and any proper substring of <italic>u</italic> occurs at least twice in <italic>T</italic>. In this paper, we study the problem of computing MUSs for a sliding window over a given string <italic>T</italic>. We first show how the set of MUSs can change when the window slides over <italic>T</italic>. We then present an <inline-formula><alternatives><tex-math>$$O(n\log \sigma ')$$</tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
    <mml:mrow>
    <mml:mi>O</mml:mi>
    <mml:mo>(</mml:mo>
    <mml:mi>n</mml:mi>
    <mml:mo>log</mml:mo>
    <mml:msup>
    <mml:mi>σ</mml:mi>
    <mml:mo>′</mml:mo>
    </mml:msup>
    <mml:mo>)</mml:mo>
    </mml:mrow>
    </mml:math></alternatives></inline-formula>-time and <italic>O</italic>(<italic>d</italic>)-space algorithm to compute MUSs for a sliding window of size <italic>d</italic> over the input string <italic>T</italic> of length <italic>n</italic>, where <inline-formula><alternatives><tex-math>$$\sigma '\le d$$</tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
    <mml:mrow>
    <mml:msup>
    <mml:mi>σ</mml:mi>
    <mml:mo>′</mml:mo>
    </mml:msup>
    <mml:mo>≤</mml:mo>
    <mml:mi>d</mml:mi>
    </mml:mrow>
    </mml:math></alternatives></inline-formula> is the maximum number of distinct characters in every window.

    DOI: 10.1007/s00453-021-00864-1

    Web of Science

    Scopus

    researchmap

    Other Link: https://link.springer.com/article/10.1007/s00453-021-00864-1/fulltext.html

  • Compressed Communication Complexity of Hamming Distance Reviewed

    Shiori Mitsuya, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Algorithms   14 ( 4 )   116 - 116   2021.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.3390/a14040116

  • Computing longest palindromic substring after single-character or block-wise edits. Reviewed

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theor. Comput. Sci.   859   116 - 133   2021.3

     More details

    Language:Others   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.tcs.2021.01.014

  • Space-efficient algorithms for computing minimal/shortest unique substrings. Reviewed

    Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theor. Comput. Sci.   845   230 - 242   2020.12

     More details

    Language:Others   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.tcs.2020.09.017

  • Towards Efficient Interactive Computation of Dynamic Time Warping Distance. Reviewed

    Akihiro Nishi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)   27 - 41   2020.10

     More details

    Language:Others   Publishing type:Research paper (other academic)  

    DOI: 10.1007/978-3-030-59212-7_3

  • On Repetitiveness Measures of Thue-Morse Words. Reviewed

    Kanaru Kutsukake, Takuya Matsumoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)   213 - 220   2020.10

     More details

    Language:Others   Publishing type:Research paper (other academic)  

    DOI: 10.1007/978-3-030-59212-7_15

  • Lyndon Words, the Three Squares Lemma, and Primitive Squares. Reviewed

    Hideo Bannai, Takuya Mieno, Yuto Nakashima

    Proceedings of 27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)   265 - 273   2020.10

     More details

    Language:Others   Publishing type:Research paper (other academic)  

    DOI: 10.1007/978-3-030-59212-7_19

  • Grammar-compressed Self-index with Lyndon Words Reviewed

    Kazuya Tsuruta, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    情報処理学会論文誌数理モデル化と応用(TOM)   13 ( 2 )   84 - 92   2020.8

     More details

    Language:English  

    Grammar-compressed Self-index with Lyndon Words
    We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure of O(g) words of spacethat can be built from a string T in O(n lg n) expected time, retrieving the starting positions of all occurrences of a pattern P of length m in O(m + lg m lg n + occ lg g) time, where n is the length of T, g is the size of the Lyndon SLP for T, and occ is the number of occurrences of P in T.We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure of O(g) words of spacethat can be built from a string T in O(n lg n) expected time, retrieving the starting positions of all occurrences of a pattern P of length m in O(m + lg m lg n + occ lg g) time, where n is the length of T, g is the size of the Lyndon SLP for T, and occ is the number of occurrences of P in T.

  • Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Ayumi Shinohara

    31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020   2020.6

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The equidistant subsequence pattern matching problem is considered. Given a pattern string P and a text string T, we say that P is an equidistant subsequence of T if P is a subsequence of the text such that consecutive symbols of P in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-parallel algorithms that yield o(n2) time algorithms for finding k-(sub-)cadences and equidistant subsequences. Furthermore, O(n log2 n) and O(n log n) time algorithms, respectively for equidistant and Abelian equidistant matching for the case P = 3, are shown. The algorithms make use of a technique that was recently introduced which can efficiently compute convolutions with linear constraints. 2012 ACM Subject Classification Mathematics of computing ! Combinatorial algorithms.

    DOI: 10.4230/LIPIcs.CPM.2020.12

  • DAWGs for Parameterized Matching Online Construction and Related Indexing Structures

    Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, Masayuki Takeda

    31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020   2020.6

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Two strings x and y over of equal length are said to parameterized match (p-match) if there is a renaming bijection f : That is identity on and transforms x to y (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs) which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have(n2) nodes and edges but PDAWGs have only O(n) nodes and edges, where n is the length of an input string. We also give O(n log( + ))-time O(n)-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner. 2012 ACM Subject Classification Theory of computation ! Pattern matching.

    DOI: 10.4230/LIPIcs.CPM.2020.26

  • Practical grammar compression based on maximal repeats Reviewed

    Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

    Algorithms   13 ( 4 )   2020.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    This study presents an analysis of RePair, which is a grammar compression algorithm known for its simple scheme, while also being practically effective. First, we show that the main process of RePair, that is, the step by step substitution of the most frequent symbol pairs, works within the corresponding most frequent maximal repeats. Then, we reveal the relation between maximal repeats and grammars constructed by RePair. On the basis of this analysis, we further propose a novel variant of RePair, called MR-RePair, which considers the one-time substitution of the most frequent maximal repeats instead of the consecutive substitution of the most frequent pairs. The results of the experiments comparing the size of constructed grammars and execution time of RePair and MR-RePair on several text corpora demonstrate that MR-RePair constructs more compact grammars than RePair does, especially for highly repetitive texts.

    DOI: 10.3390/A13040103

  • C-Trie++ A dynamic trie tailored for fast prefix searches

    Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    2020 Data Compression Conference, DCC 2020 Proceedings - DCC 2020 Data Compression Conference   243 - 252   2020.3

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Given a dynamic set K of k strings of total length n whose characters are drawn from an alphabet of size σ, a keyword dictionary is a data structure built on K that provides locate, prefix search, and update operations on K. Under the assumption that α = w / lg σ characters fit into a single machine word w, we propose a keyword dictionary that represents K in n lg σ + Θ(k lg n) bits of space, supporting all operations in Θ(m / α + lg α) expected time on an input string of length m in the word RAM model. This data structure is underlined with an exhaustive practical evaluation, highlighting the practical usefulness of the proposed data structure, especially for prefix searches - one of the most elementary keyword dictionary operations.

    DOI: 10.1109/DCC47342.2020.00032

  • Fast Algorithms for the Shortest Unique Palindromic Substring Problem on Run-Length Encoded Strings Reviewed

    Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theory of Computing Systems   2020.1

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    For a string S, a palindromic substring S[i.j] is said to be a shortest unique palindromic substring (SUPS) for an interval [s,t] in S, if S[i.j] occurs exactly once in S, the interval [i,j] contains [s,t], and every palindromic substring containing [s,t] which is shorter than S[i.j] occurs at least twice in S. In this paper, we study the problem of answering SUPS queries on run-length encoded strings. We show how to preprocess a given run-length encoded string RLES of size m in O(m) space and O(mlogσRLES+mlogm/loglogm) time so that all SUPSs for any subsequent query interval can be answered in O(logm/loglogm+α) time, where α is the number of outputs, and σRLES is the number of distinct runs of RLES. Additionaly, we consider a variant of the SUPS problem where a query interval is also given in a run-length encoded form. For this variant of the problem, we present two alternative algorithms with faster queries. The first one answers queries in O(loglogm/logloglogm+α) time and can be built in O(mlogσRLES+mlogm/loglogm) time, and the second one answers queries in O(log log m+ α) time and can be built in O(mlogσRLES) time. Both of these data structures require O(m) space.

    DOI: 10.1007/s00224-020-09980-x

  • Minimal Unique Substrings and Minimal Absent Words in a Sliding Window

    Takuya Mieno, Yuki Kuhara, Tooru Akagi, Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020   148 - 160   2020.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. A string w is called a minimal absent word (MAW) of T if w does not occur in T and any proper substring of w occurs in T. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over a given string T. We first show how the set of MUSs can change in a sliding window over T, and present an-time and O(d)-space algorithm to compute MUSs in a sliding window of width d over T, where is the maximum number of distinct characters in every window. We then give tight upper and lower bounds on the maximum number of changes in the set of MAWs in a sliding window over T. Our bounds improve on the previous results in Crochemore et al. (2017).

    DOI: 10.1007/978-3-030-38919-2_13

  • Faster STR-EC-LCS Computation

    Kohei Yamada, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020   125 - 135   2020.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The longest common subsequence (LCS) problem is a central problem in stringology that finds the longest common subsequence of given two strings A and B. More recently, a set of four constrained LCS problems (called generalized constrained LCS problem) were proposed by Chen and Chao [J. Comb. Optim, 2011]. In this paper, we consider the substring-excluding constrained LCS (STR-EC-LCS) problem. A string Z is said to be an STR-EC-LCS of two given strings A and B excluding P if, Z is one of the longest common subsequences of A and B that does not contain P as a substring. Wang et al. proposed a dynamic programming solution which computes an STR-EC-LCS in O(mnr) time and space where [Inf. Process. Lett., 2013]. In this paper, we show a new solution for the STR-EC-LCS problem. Our algorithm computes an STR-EC-LCS in time where denotes the set of distinct characters occurring in both A and B, and L is the length of the STR-EC-LCS. This algorithm is faster than the O(mnr)-time algorithm for short/long STR-EC-LCS (namely, or), and is at least as efficient as the O(mnr)-time algorithm for all cases.

    DOI: 10.1007/978-3-030-38919-2_11

  • An improved data structure for left-right maximal generic words problem

    Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 30th International Symposium on Algorithms and Computation, ISAAC 2019   2019.12

     More details

    Language:English   Publishing type:Research paper (other academic)  

    For a set D of documents and a positive integer d, a string w is said to be d-left-right maximal, if (1) w occurs in at least d documents in D, and (2) any proper superstring of w occurs in less than d documents. The left-right-maximal generic words problem is, given a set D of documents, to preprocess D so that for any string p and for any positive integer d, all the superstrings of p that are d-left-right maximal can be answered quickly. In this paper, we present an O(n log m) space data structure (in words) which answers queries in O(|p| + o log log m) time, where n is the total length of documents in D, m is the number of documents in D and o is the number of outputs. Our solution improves the previous one by Nishimoto et al. (PSC 2015), which uses an O(n log n) space data structure answering queries in O(|p| + r · log n + o · log2 n) time, where r is the number of right-extensions q of p occurring in at least d documents such that any proper right extension of q occurs in less than d documents.

    DOI: 10.4230/LIPIcs.ISAAC.2019.40

  • On the size of the smallest alphabet for Lyndon trees Reviewed

    Yuto Nakashima, Takuya Takagi, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theoretical Computer Science   792   131 - 143   2019.11

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We consider the problem of reverse-engineering the Lyndon tree, i.e., given a full binary ordered tree T with n leaves as input, we are to compute a string w of length n of which Lyndon tree is isomorphic to the input tree T. Hereby we call such a string a solution string. Although the problem is easily solvable in linear time for binary alphabets and unbounded-size alphabets, it is not known how to efficiently find the smallest alphabet size for a solution string. In this paper, we show several new observations concerning this problem. Namely, we show that: 1) For any positive integer n, there exists a full binary ordered tree T with n leaves, s.t. the smallest alphabet size of a solution string for T is ⌊ [Formula presented] ⌋+1. 2) For any full binary ordered tree T with n leaves, there exists a solution string w over an alphabet of size at most ⌊ [Formula presented] ⌋+1. 3) For any full binary ordered tree T, there exists a solution string w over an alphabet of size at most h+1, where h is the height of T. 4) For any complete binary ordered tree T with 2k leaves, there exists a solution string w over an alphabet of size at most 4.

    DOI: 10.1016/j.tcs.2018.06.044

  • Direct Linear Time Construction of Parameterized Suffix and LCP Arrays for Constant Alphabets

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019   382 - 391   2019.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be built first. More formally, for a string over static alphabet and parameterized alphabet, our algorithm runs in time and O(n) words of space, where is the number of distinct symbols of in the string.

    DOI: 10.1007/978-3-030-32686-9_27

  • On Longest Common Property Preserved Substring Queries

    Kazuki Kai, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Tomasz Kociumaka

    Proceedings of 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019   162 - 174   2019.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We revisit the problem of longest common property preserving substring queries introduced by Ayad et al. (SPIRE 2018, arXiv 2018). We consider a generalized and unified on-line setting, where we are given a set X of k strings of total length n that can be pre-processed so that, given a query string y and a positive integer (Formula presented), we can determine the longest substring of y that satisfies some specific property and is common to at least (Formula presented) strings in X. Ayad et al. considered the longest square-free substring in an on-line setting and the longest periodic and palindromic substring in an off-line setting. In this paper, we give efficient solutions in the on-line setting for finding the longest common square, periodic, palindromic, and Lyndon substrings. More precisely, we show that X can be pre-processed in O(n) time resulting in a data structure of O(n) size that answers queries in (Formula presented) time and O(1) working space, where (Formula presented) is the size of the alphabet, and the common substring must be a square, a periodic substring, a palindrome, or a Lyndon word.

    DOI: 10.1007/978-3-030-32686-9_12

  • Compact Data Structures for Shortest Unique Substring Queries

    Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019   107 - 123   2019.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Given a string T of length n, a substring of T is called a shortest unique substring (SUS) for an interval [s, t] if (a) u occurs exactly once in T, (b) u contains the interval [s, t] (i.e.), and (c) every substring v of T with containing [s, t] occurs at least twice in T. Given a query interval, the interval SUS problem is to output all the SUSs for the interval [s, t]. In this article, we propose a bits data structure answering an interval SUS query in output-sensitive time, where is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for. Here, we propose a bits data structure answering a point SUS query in the same output-sensitive time.

    DOI: 10.1007/978-3-030-32686-9_8

  • Computing maximal palindromes and distinct palindromes in a trie

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    23rd Prague Stringology Conference, PSC 2019 Proceedings of the Prague Stringology Conference, PSC 2019   3 - 15   2019.8

     More details

    Language:English   Publishing type:Research paper (other academic)  

    It is known that all maximal palindromes of a given string T of length n can be computed in O(n) time by Manacher's algorithm [J. ACM '75]. Also, all distinct palindromes in T can be computed in O(n) time [Groult et al., Inf. Process. Lett. 2010]. In this paper, we consider the problem of computing maximal palindromes and distinct palindromes of a given trie T (i.e. rooted edge-labeled tree). A trie is a natural generalization of a string which can be seen as a single path tree. We propose algorithms to compute all maximal palindromes and all distinct palindromes in T in O(N log h) time and O(N) space, where N is the number of edges in T and h is the height of T . To our knowledge these are the first sub-quadratic time solutions to these problems.

  • Shortest unique palindromic substring queries on run-length encoded strings

    Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 30th International Workshop on Combinatorial Algorithms, IWOCA 2019   430 - 441   2019.7

     More details

    Language:English   Publishing type:Research paper (other academic)  

    For a string S, a palindromic substring S[i.j] is said to be a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if S[i.j] occurs exactly once in S, the interval [i, j] contains [s, t], and every palindromic substring containing [s, t] which is shorter than S[i.j] occurs at least twice in S. In this paper, we study the problem of answering SUPS queries on run-length encoded strings. We show how to preprocess a given run-length encoded string RLES of size m in O(m) space and O(mlog σRLES + m√log m/log logm) time so that all SUPSs for any subsequent query interval can be answered in O(√log m/log logm+ α) time, where α is the number of outputs, and σRLES is the number of distinct runs of RLES.

    DOI: 10.1007/978-3-030-25005-8_35

  • Computing runs on a trie

    Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019   2019.6

     More details

    Language:English   Publishing type:Research paper (other academic)  

    A maximal repetition, or run, in a string, is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with n edges, we show that the number of runs is less than n. We also show an O(n√log n log log n) time and O(n) space algorithm for counting and finding the shallower endpoint of all runs. We further show an O(n log n) time and O(n) space algorithm for finding both endpoints of all runs. We also discuss how to improve the running time even more.

    DOI: 10.4230/LIPIcs.CPM.2019.23

  • On the size of overlapping Lempel-Ziv and Lyndon factorizations

    Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019   2019.6

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Kärkkäinen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.

    DOI: 10.4230/LIPIcs.CPM.2019.29

  • Faster queries for longest substring palindrome after block edit

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019   2019.6

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ + log log n) time, after a substring in T is replaced by a string of arbitrary length ℓ. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(ℓ + log n) time for each query.

    DOI: 10.4230/LIPIcs.CPM.2019.27

  • The Parameterized Position Heap of a Trie

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Proceedings of 11th International Conference on Algorithms and Complexity, CIAC 2019   237 - 248   2019.5

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Let and be disjoint alphabets of respective size and. Two strings over of equal length are said to parameterized match (p-match) if there is a bijection such that (1) f is identity on and (2) f maps the characters of one string to those of the other string so that the two strings become identical. We consider the p-matching problem on a (reversed) trie and a string pattern P such that every path that p-matches P has to be reported. Let N be the size of the given trie. In this paper, we propose the parameterized position heap for that occupies O(N) space and supports p-matching queries in time, where m is the length of a query pattern P and is the number of paths in to report. We also present an algorithm which constructs the parameterized position heap for a given trie in time and working space.

    DOI: 10.1007/978-3-030-17402-6_20

  • MR-RePair Grammar Compression Based on Maximal Repeats

    Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

    2019 Data Compression Conference, DCC 2019 Proceedings - DCC 2019 2019 Data Compression Conference   508 - 517   2019.3

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.

    DOI: 10.1109/DCC.2019.00059

  • Recovering, counting and enumerating strings from forward and backward suffix arrays

    Yuki Kuhara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    25th International Symposium on String Processing and Information Retrieval, SPIRE 2018 String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings   254 - 267   2018.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The suffix array SAw of a string w of length n is a permutation of [1..n] such that SAw[i]=j iff w[j, n] is the lexicographically i-th suffix of w. In this paper, we consider variants of the reverse-engineering problem on suffix arrays with two given permutations P and Q of [1..n], such that P refers to the forward suffix array of some string w and Q refers to the backward suffix array of the reversed string wR. Our results are the following: (1) An algorithm which computes a solution string over an alphabet of the smallest size, in O(n) time. (2) The exact number of solution strings over an alphabet of size σ. (3) An efficient algorithm which computes all solution strings in the lexicographical order, in time near optimal up to log n factor.

    DOI: 10.1007/978-3-030-00479-8_21

  • Algorithms and combinatorial properties on shortest unique palindromic substrings Reviewed

    Hiroe Inoue, Yuto Nakashima, Takuya Mieno, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Journal of Discrete Algorithms   52-53   122 - 132   2018.9

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    A palindrome is a string that reads the same forward and backward. A palindromic substring P of a string S is called a shortest unique palindromic substring (SUPS) for an interval [s,t] in S, if P occurs exactly once in S, this occurrence of P contains interval [s,t], and every palindromic substring of S which contains interval [s,t] and is shorter than P occurs at least twice in S. The SUPS problem is, given a string S, to preprocess S so that for any subsequent query interval [s,t] all the SUPSs for interval [s,t] can be answered quickly. We present an optimal solution to this problem. Namely, we show how to preprocess a given string S of length n in O(n) time and space so that all SUPSs for any subsequent query interval can be answered in O(α+1) time, where α is the number of outputs. We also discuss the number of SUPSs in a string.

    DOI: 10.1016/j.jda.2018.11.009

  • Right-to-left online construction of parameterized position heaps

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    22nd Prague Stringology Conference, PSC 2018 Proceedings of the Prague Stringology Conference, PSC 2018   91 - 102   2018.8

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Two strings of equal length are said to parameterized match if there is a bijection that maps the characters of one string to those of the other string, so that two strings become identical. The parameterized pattern matching problem is, given two strings T and P, to find the occurrences of substrings in T that parameterized match P. Diptarama et al. [Position Heaps for Parameterized Strings, CPM 2017] proposed an indexing data structure called parameterized position heaps, and gave a left-toright online construction algorithm. In this paper, we present a right-to-left online construction algorithm for parameterized position heaps. For a text string T of length n over two kinds of alphabets Σand II of respective size σ and π, our construction algorithm runs in O(n log(σ+π)) time with O(n) space. Our right-to-left parameterized position heaps support pattern matching queries in O(mlog(σ+π)+mπ+pocc)) time, where m is the length of a query pattern P and pocc is the number of occurrences to report. Our construction and pattern matching algorithms are as efficient as Diptarama et al.'s algorithms.

  • O(n log n)-time text compression by LZ-style longest first substitution

    Akihiro Nishi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    22nd Prague Stringology Conference, PSC 2018 Proceedings of the Prague Stringology Conference, PSC 2018   12 - 26   2018.8

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Mauer et al. [A Lempel-Ziv-style Compression Method for Repetitive Texts, PSC 2017] proposed a hybrid text compression method called LZ-LFS which has both features of Lempel-Ziv 77 factorization and longest first substitution. They showed that LZ-LFS can achieve better compression ratio for repetitive texts, compared to some state-of-the-art compression algorithms. The drawback of Mauer et al.'s method is that their LZ-LFS compression algorithm takes O(n2) time on an input string of length n. In this paper, we show a faster LZ-LFS compression algorithm that works in O(n log n) time. We also propose a simpler version of LZ-LFS that can be computed in O(n) time.

  • Lyndon factorization of grammar compressed texts revisited

    Isamu Furuya, Yuto Nakashima, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018   241 - 2410   2018.5

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We revisit the problem of computing the Lyndon factorization of a string w of length N which is given as a straight line program (SLP) of size n. For this problem, we show a new algorithm which runs in O(P(n,N) + Q(n,N)n log logN) time and O(n logN + S(n,N)) space where P(n,N), S(n,N), Q(n,N) are respectively the pre-processing time, space, and query time of a data structure for longest common extensions (LCE) on SLPs. Our algorithm improves the algorithm proposed by I et al. (TCS '17), and can be more efficient than the O(N)-time solution by Duval (J. Algorithms '83) when w is highly compressible.

    DOI: 10.4230/LIPIcs.CPM.2018.24

  • Longest substring palindrome after edit

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018   105   121 - 1214   2018.5

     More details

    Language:English   Publishing type:Research paper (other academic)  

    It is known that the length of the longest substring palindromes (LSPals) of a given string T of length n can be computed in O(n) time by Manacher's algorithm [J. ACM '75]. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(log(min{ω, log n})) time after single character substitution, insertion, or deletion, where ω denotes the number of distinct characters appearing in T. We also propose an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ+log n) time, after an existing substring in T is replaced by a string of arbitrary length ℓ.

    DOI: 10.4230/LIPIcs.CPM.2018.12

  • Longest lyndon substring after edit

    Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018   191 - 1910   2018.5

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The longest Lyndon substring of a string T is the longest substring of T which is a Lyndon word. LLS(T) denotes the length of the longest Lyndon substring of a string T. In this paper, we consider computing LLS(T′) where T′ is an edited string formed from T. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(log n) time for any single character edit. We also consider a version of the problem with block edits, i.e., a substring of T is replaced by a given string of length l. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(l log σ + log n) time for any block edit where σ is the number of distinct characters in T. We can modify our algorithm so as to output all the longest Lyndon substrings of T′ for both problems.

    DOI: 10.4230/LIPIcs.CPM.2018.19

  • Faster online elastic degenerate string matching

    Kotaro Aoyama, Yuto Nakashima, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018   91 - 910   2018.5

     More details

    Language:English   Publishing type:Research paper (other academic)  

    An Elastic-Degenerate String [Iliopoulus et al., LATA 2017] is a sequence of sets of strings, which was recently proposed as a way to model a set of similar sequences. We give an online algorithm for the Elastic-Degenerate String Matching (EDSM) problem that runs in O(nm √mlogm+ N) time and O(m) working space, where n is the number of elastic degenerate segments of the text, N is the total length of all strings in the text, and m is the length of the pattern. This improves the previous algorithm by Grossi et al. [CPM 2017] that runs in O(nm2 + N) time.

    DOI: 10.4230/LIPIcs.CPM.2018.9

  • Shortest unique palindromic substring queries in optimal time

    Yuto Nakashima, Hiroe Inoue, Takuya Mieno, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    28th International Workshop on Combinational Algorithms, IWOCA 2017 Combinatorial Algorithms - 28th International Workshop, IWOCA 2017, Revised Selected Papers   397 - 408   2018.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    palindrome is a string that reads the same forward and backward. A palindromic substring P of a string S is called a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if P occurs exactly once in S, this occurrence of P contains interval [s, t], and every palindromic substring of S which contains interval [s, t] and is shorter than P occurs at least twice in S. The SUPS problem is, given a string S, to preprocess S so that for any subsequent query interval [s, t] all the SUPSs for interval [s, t] can be answered quickly. We present an optimal solution to this problem. Namely, we show how to preprocess a given string S of length n in O(n) time and space so that all SUPSs for any subsequent query interval can be answered in O(α + 1) time, where α is the number of outputs.

    DOI: 10.1007/978-3-319-78825-8_32

  • Almost linear time computation of maximal repetitions in run length encoded strings

    Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    28th International Symposium on Algorithms and Computation, ISAAC 2017 28th International Symposium on Algorithms and Computation, ISAAC 2017   2017.12

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We consider the problem of computing all maximal repetitions contained in a string that is given in run-length encoding. Given a run-length encoding of a string, we show that the maximum number of maximal repetitions contained in the string is at most m+k-1, where m is the size of the run-length encoding, and k is the number of run-length factors whose exponent is at least 2. We also show an algorithm for computing all maximal repetitions in O(m α (m)) time and O(m) space, where α denotes the inverse Ackermann function.

    DOI: 10.4230/LIPIcs.ISAAC.2017.33

  • The "runs" theorem Reviewed

    Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, Kazuya Tsuruta

    SIAM Journal on Computing   46 ( 5 )   1501 - 1514   2017.9

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].

    DOI: 10.1137/15M1011032

  • On reverse engineering the Lyndon tree

    Yuto Nakashima, Takuya Takagi, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    21st Prague Stringology Conference, PSC 2017 Proceedings of the Prague Stringology Conference, PSC 2017   108 - 117   2017.8

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We consider the problem of reverse engineering the Lyndon tree, i.e., given a full binary ordered tree T with n leaves as input, compute a string w of length n for which it’s Lyndon tree is isomorphic to the input tree. Although the problem is easy and solvable in linear time when assuming a binary alphabet or when there is no limit on the alphabet size, how to efficiently find the smallest alphabet size for a solution string is not known. We show several new observations concerning this problem. Namely, we show that: 1) For any full binary ordered tree T, there exists a solution string w over an alphabet of size at most h + 1, where h is the height of T. 2) For any positive n, there exists a full binary ordered tree T with n leaves, s.t. the smallest alphabet size of the solution string for T is ⌊n2 ⌋ + 1.

  • Inferring strings from Lyndon factorization Reviewed

    Yuto Nakashima, Takashi Okabe, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theoretical Computer Science   689   147 - 156   2017.8

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    The Lyndon factorization of a string w is a unique factorization ℓ1p1,…,ℓmpm of w such that ℓ1,…,ℓm is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: Given a sequence S=((s1,p1),…,(sm,pm)) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i.e., the Lyndon factorization of w is in a form of ℓ1p1,…,ℓmpm with |ℓi|=si for all 1≤i≤m. Firstly, we show that there exists a simple O(n)-time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Secondly, we present an O(n)-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in O(m) time. Fourthly, we give an O(m)-time algorithm to compute an O(m)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.

    DOI: 10.1016/j.tcs.2017.05.038

  • On the size of Lempel-Ziv and Lyndon factorizations

    Juha Kärkkäinen, Dominik Kempa, Yuto Nakashima, Simon J. Puglisi, Arseny M. Shur

    34th Symposium on Theoretical Aspects of Computer Science, STACS 2017 34th Symposium on Theoretical Aspects of Computer Science, STACS 2017   2017.3

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Lyndon factorization and Lempel-Ziv (LZ) factorization are both important tools for analysing the structure and complexity of strings, but their combinatorial structure is very different. In this paper, we establish the first direct connection between the two by showing that while the Lyndon factorization can be bigger than the non-overlapping LZ factorization (which we demonstrate by describing a new, non-trivial family of strings) it is always less than twice the size.

    DOI: 10.4230/LIPIcs.STACS.2017.45

  • Faster Lyndon factorization algorithms for SLP and LZ78 compressed text Reviewed

    Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Theoretical Computer Science   656   215 - 224   2016.12

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We present two efficient algorithms which, given a compressed representation of a string w of length N, compute the Lyndon factorization of w. Given a straight line program (SLP) S of size n that describes w, the first algorithm runs in O(n2+P(n,N)+Q(n,N)nlog⁡n) time and O(n2+S(n,N)) space, where P(n,N), S(n,N), Q(n,N) are respectively the pre-processing time, space, and query time of a data structure for longest common extensions (LCE) on SLPs. Given the Lempel–Ziv 78 encoding of size s for w, the second algorithm runs in O(slog⁡s) time and space.

    DOI: 10.1016/j.tcs.2016.03.005

  • Longest common Abelian factors and large alphabets

    Golnaz Badkobeh, Travis Gagie, Szymon Grabowski, Yuto Nakashima, Simon J. Puglisi, Shiho Sugimoto

    23rd International Symposium on String Processing and Information Retrieval, SPIRE 2016 String Processing and Information Retrieval - 23rd International Symposium, SPIRE 2016, Proceedings   254 - 259   2016.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    Two strings X and Y are considered Abelian equal if the letters of X can be permuted to obtain Y (and vice versa). Recently, Alatabbi et al. (2015) considered the longest common Abelian factor problem in which we are asked to find the length of the longest Abelian-equal factor present in a given pair of strings. They provided an algorithm that uses O(σn2) time and O(σn) space, where n is the length of the pair of strings and σ is the alphabet size. In this paper we describe an algorithm that uses O(n2 log2 n log∗ n) time and O(n log2 n) space, significantly improving Alatabbi et al.’s result unless the alphabet is small. Our algorithm makes use of techniques for maintaining a dynamic set of strings under split, join, and equality testing (Melhorn et al., Algorithmica 17(2), 1997).

    DOI: 10.1007/978-3-319-46049-9_24

  • Computing smallest and largest repetition factorizations in O(n log n) time

    Hiroe Inoue, Yoshiaki Matsuoka, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    20th Prague Stringology Conference, PSC 2016 Proceedings of the Prague Stringology Conference, PSC 2016   135 - 145   2016.8

     More details

    Language:English   Publishing type:Research paper (other academic)  

    A factorization f1, . . . , fm of a string w is called a repetition factorization of w if each factor fi is a repetition, namely, fi = xkx' for some non-empty string x, an integer k ≥ 2, and x' being a proper prefix of x. Dumitran et al. (Proc. SPIRE 2015) proposed an algorithm which computes a repetition factorization of a given string w in O(n) time, where n is the length of w. In this paper, we propose two algorithms which compute smallest/largest repetition factorizations in O(n log n) time. The first algorithm is a simple O(n log n) space algorithm while the second one uses only O(n) space.

  • Constructing LZ78 tries and position heaps in linear time for large alphabets Reviewed

    Yuto Nakashima, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Information Processing Letters   115 ( 9 )   655 - 659   2015.9

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is given as a trie.

    DOI: 10.1016/j.ipl.2015.04.002

  • Computing left-right maximal generic words

    Takaaki Nishimoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    19th Prague Stringology Conference, PSC 2015 Proceedings of the Prague Stringology Conference 2015, PSC 2015   5 - 16   2015.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The maximal generic words problem was proposed by Kucherov et al. (SPIRE 2012). Let D be a set of documents. In this problem, given a pattern P and a threshold δ ≤ |D|, we want to compute all right-maximal extensions of P which occur in at least δ distinct documents. They proposed an O(n)-space data structure which can solve this problem in O(|P| + rocc) time where n is the total length of documents in D and rocc is the number of right-maximal extensions of P. The data structure can be constructed in O(n) time. In this paper, we propose a more generalized problem. Given a pattern P and a threshold δ ≤ |D|, we want to compute all left-right-maximal extensions of P which occur in at least δ distinct documents. We propose an O(n log n)- space data structure which can solve this problem in O(|P| + occ log2 n + rocc log n) time where occ is the number of left-right-maximal extensions of P.

  • A new characterization of maximal repetitions by Lyndon trees

    Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, Kazuya Tsuruta

    26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015 Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015   562 - 571   2015.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We give a new characterization of maximal repetitions (or runs) in strings, using a tree defined on recursive standard factorizations of Lyndon words, called the Lyndon tree. The characterization leads to a remarkably simple novel proof of the linearity of the maximum number of runs p(n) in a string of length n. Furthermore, we show an upper bound of p(n) < 1.5n, which improves on the best upper bound 1.6n (Crochemore & Hie 2008) that does not rely on computational verification. The proof also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string.

    DOI: 10.1137/1.9781611973730.38

  • Inferring strings from Lyndon factorization

    Yuto Nakashima, Takashi Okabe, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    39th International Symposium on Mathematical Foundations of Computer Science, MFCS 2014 Mathematical Foundations of Computer Science 2014 - 39th International Symposium, MFCS 2014, Proceedings   565 - 576   2014.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The Lyndon factorization of a string w is a unique factorization ℓp11,⋯, ℓpmm of w s.t. ℓ1,⋯, ℓm is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: Given a sequence S = ((s1, p1),⋯, (sm, p m)) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i.e., the Lyndon factorization of w is in a form of ℓp11,⋯, ℓpmm with |ℓi| = si for all 1 ≤ i ≤ m. Firstly, we show that there exists a simple O(n)-time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Secondly, we present an O(n)-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in O(m) time. Fourthly, we give an O(m)-time algorithm to compute an O(m)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.

    DOI: 10.1007/978-3-662-44465-8_48

  • Faster lyndon factorization algorithms for SLP and LZ78 compressed text

    Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    20th International Symposium on String Processing and Information Retrieval, SPIRE 2013 String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Proceedings   174 - 185   2013.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We present two efficient algorithms which, given a compressed representation of a string w of length N, compute the Lyndon factorization of w. Given a straight line program (SLP) S of size n and height h that describes w, the first algorithm runs in O(nh(n + log N log n)) time and O(n2) space. Given the Lempel-Ziv 78 encoding of size s for w, the second algorithm runs in O(s log s) time and space.

    DOI: 10.1007/978-3-319-02432-5_21

  • Efficient Lyndon factorization of grammar compressed text

    Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    24th Annual Symposium on Combinatorial Pattern Matching, CPM 2013 Combinatorial Pattern Matching - 24th Annual Symposium, CPM 2013, Proceedings   153 - 164   2013.1

     More details

    Language:English   Publishing type:Research paper (other academic)  

    We present an algorithm for computing the Lyndon factorization of a string that is given in grammar compressed form, namely, a Straight Line Program (SLP). The algorithm runs in O(n4 + mn3 h) time and O(n 2) space, where m is the size of the Lyndon factorization, n is the size of the SLP, and h is the height of the derivation tree of the SLP. Since the length of the decompressed string can be exponentially large w.r.t. n, m and h, our result is the first polynomial time solution when the string is given as SLP.

    DOI: 10.1007/978-3-642-38905-4_16

  • The position heap of a trie

    Yuto Nakashima, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    19th International Symposium on String Processing and Information Retrieval, SPIRE 2012 String Processing and Information Retrieval - 19th International Symposium, SPIRE 2012, Proceedings   360 - 371   2012.10

     More details

    Language:English   Publishing type:Research paper (other academic)  

    The position heap is a text indexing structure for a single text string, recently proposed by Ehrenfeucht et al. [Position heaps: A simple and dynamic text indexing data structure, Journal of Discrete Algorithms, 9(1):100-121, 2011]. In this paper we introduce the position heap for a set of strings, and propose an efficient algorithm to construct the position heap for a set of strings which is given as a trie. For a fixed alphabet our algorithm runs in time linear in the size of the trie. We also show that the position heap can be efficiently updated after addition/removal of a leaf of the input trie.

    DOI: 10.1007/978-3-642-34109-0-38

▼display all

Presentations

  • Optimal LZ-End Parsing Is Hard International conference

    Hideo Bannai, Mitsuru Funakoshi, Kazuhiro Kurita, Yuto Nakashima, Kazuhisa Seto, Takeaki Uno

    34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)  2023.6 

     More details

    Event date: 2023.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Marne-la-Vallée   Country:France  

  • Space-Efficient STR-IC-LCS Computation International conference

    Yuuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    48th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2023)  2023.1 

     More details

    Event date: 2023.1 - 2020.1

    Language:English  

    Venue:Nový Smokovec   Country:Slovakia  

  • Minimal Absent Words on Run-Length Encoded Strings International conference

    Tooru Akagi, Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

    33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)  2022.6 

     More details

    Event date: 2022.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Czech Technical University   Country:Czech Republic  

  • Longest Common Rollercoasters International conference

    Kosuke Fujita, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)  2021.10 

     More details

    Event date: 2021.10

    Language:English  

    Venue:Online   Country:France  

  • On the approximation ratio of LZ-End to LZ77 International conference

    Takumi Ideue, Takuya Mieno, Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)  2021.10 

     More details

    Event date: 2021.10

    Language:English  

    Venue:Online   Country:France  

  • Position Heaps for Cartesian-tree Matching on Strings and Tries International conference

    Akio Nishimoto, Noriki. Fujisato, Yuto Nakashima, Shunsuke Inenaga

    28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)  2021.10 

     More details

    Event date: 2021.10

    Language:English  

    Venue:Online   Country:France  

  • Grammar Index By Induced Suffix Sorting International conference

    Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)  2021.10 

     More details

    Event date: 2021.10

    Language:English  

    Venue:Online   Country:France  

  • Counting Lyndon Subsequence International conference

    Ryo Hirakawa, Yuto Nakashima, Shunsuke Inenaga, Masayuki Takeda

    Prague Stringology Conference 2021 (PSC 2021)  2021.8 

     More details

    Event date: 2021.8

    Language:English  

    Venue:Online   Country:Czech Republic  

  • The Parameterized Suffix Tray International conference

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda

    12th International Conference on Algorithms and Complexity (CIAC 2021)  2021.5 

     More details

    Event date: 2021.5

    Language:English  

    Venue:Online   Country:Cyprus  

  • On repetitiveness measures of Thue-Morse words International conference

    Kanaru Kutsukake, Takuya Matsumoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)  2020.10 

     More details

    Event date: 2020.10

    Language:English  

    Venue:Online   Country:United States  

  • Lyndon Words, the Three Squares Lemma, and Primitive Squares International conference

    Hideo Bannai, Takuya Mieno, Yuto Nakashima

    27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)  2020.10 

     More details

    Event date: 2020.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online   Country:United States  

  • Towards Efficient Interactive Computation of Dynamic Time Warping Distance International conference

    Akihiro Nishi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    27th International Symposium on String Processing and Information Retrieval (SPIRE 2020)  2020.10 

     More details

    Event date: 2020.10

    Language:English  

    Venue:Online   Country:United States  

  • DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures International conference

    Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, Masayuki Takeda

    31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)  2020.6 

     More details

    Event date: 2020.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online   Country:Denmark  

  • Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences International conference

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Ayumi Shinohara

    31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)  2020.6 

     More details

    Event date: 2020.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online   Country:Denmark  

  • c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches International conference

    Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Data Compression Conference 2020  2020.3 

     More details

    Event date: 2020.3

    Language:English  

    Venue:オンライン会議   Country:Other  

  • Faster STR-EC-LCS Computation International conference

    Kohei Yamada, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020  2020.1 

     More details

    Event date: 2020.1

    Language:English  

    Venue:Limassol   Country:Cyprus  

    The longest common subsequence (LCS) problem is a central problem in stringology that finds the longest common subsequence of given two strings A and B. More recently, a set of four constrained LCS problems (called generalized constrained LCS problem) were proposed by Chen and Chao [J. Comb. Optim, 2011]. In this paper, we consider the substring-excluding constrained LCS (STR-EC-LCS) problem. A string Z is said to be an STR-EC-LCS of two given strings A and B excluding P if, Z is one of the longest common subsequences of A and B that does not contain P as a substring. Wang et al. proposed a dynamic programming solution which computes an STR-EC-LCS in O(mnr) time and space where [Inf. Process. Lett., 2013]. In this paper, we show a new solution for the STR-EC-LCS problem. Our algorithm computes an STR-EC-LCS in time where denotes the set of distinct characters occurring in both A and B, and L is the length of the STR-EC-LCS. This algorithm is faster than the O(mnr)-time algorithm for short/long STR-EC-LCS (namely, or), and is at least as efficient as the O(mnr)-time algorithm for all cases.

  • Minimal Unique Substrings and Minimal Absent Words in a Sliding Window International conference

    Takuya Mieno, Yuki Kuhara, Tooru Akagi, Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020  2020.1 

     More details

    Event date: 2020.1

    Language:English  

    Venue:Limassol   Country:Cyprus  

    A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. A string w is called a minimal absent word (MAW) of T if w does not occur in T and any proper substring of w occurs in T. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over a given string T. We first show how the set of MUSs can change in a sliding window over T, and present an-time and O(d)-space algorithm to compute MUSs in a sliding window of width d over T, where is the maximum number of distinct characters in every window. We then give tight upper and lower bounds on the maximum number of changes in the set of MAWs in a sliding window over T. Our bounds improve on the previous results in Crochemore et al. (2017).

  • An improved data structure for left-right maximal generic words problem International conference

    Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    30th International Symposium on Algorithms and Computation, ISAAC 2019  2019.12 

     More details

    Event date: 2019.12

    Language:English  

    Venue:Shanghai   Country:China  

    For a set D of documents and a positive integer d, a string w is said to be d-left-right maximal, if (1) w occurs in at least d documents in D, and (2) any proper superstring of w occurs in less than d documents. The left-right-maximal generic words problem is, given a set D of documents, to preprocess D so that for any string p and for any positive integer d, all the superstrings of p that are d-left-right maximal can be answered quickly. In this paper, we present an O(n log m) space data structure (in words) which answers queries in O(|p| + o log log m) time, where n is the total length of documents in D, m is the number of documents in D and o is the number of outputs. Our solution improves the previous one by Nishimoto et al. (PSC 2015), which uses an O(n log n) space data structure answering queries in O(|p| + r · log n + o · log2 n) time, where r is the number of right-extensions q of p occurring in at least d documents such that any proper right extension of q occurs in less than d documents.

  • Compact Data Structures for Shortest Unique Substring Queries International conference

    Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    26th International Symposium on String Processing and Information Retrieval, SPIRE 2019  2019.10 

     More details

    Event date: 2019.10

    Language:English  

    Venue:Segovia   Country:Spain  

    Given a string T of length n, a substring of T is called a shortest unique substring (SUS) for an interval [s, t] if (a) u occurs exactly once in T, (b) u contains the interval [s, t] (i.e.), and (c) every substring v of T with containing [s, t] occurs at least twice in T. Given a query interval, the interval SUS problem is to output all the SUSs for the interval [s, t]. In this article, we propose a bits data structure answering an interval SUS query in output-sensitive time, where is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for. Here, we propose a bits data structure answering a point SUS query in the same output-sensitive time.

  • On Longest Common Property Preserved Substring Queries International conference

    Kazuki Kai, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Tomasz Kociumaka

    26th International Symposium on String Processing and Information Retrieval, SPIRE 2019  2019.10 

     More details

    Event date: 2019.10

    Language:English  

    Venue:Segovia   Country:Spain  

    We revisit the problem of longest common property preserving substring queries introduced by Ayad et al. (SPIRE 2018, arXiv 2018). We consider a generalized and unified on-line setting, where we are given a set X of k strings of total length n that can be pre-processed so that, given a query string y and a positive integer (Formula presented), we can determine the longest substring of y that satisfies some specific property and is common to at least (Formula presented) strings in X. Ayad et al. considered the longest square-free substring in an on-line setting and the longest periodic and palindromic substring in an off-line setting. In this paper, we give efficient solutions in the on-line setting for finding the longest common square, periodic, palindromic, and Lyndon substrings. More precisely, we show that X can be pre-processed in O(n) time resulting in a data structure of O(n) size that answers queries in (Formula presented) time and O(1) working space, where (Formula presented) is the size of the alphabet, and the common substring must be a square, a periodic substring, a palindrome, or a Lyndon word.

  • Direct Linear Time Construction of Parameterized Suffix and LCP Arrays for Constant Alphabets International conference

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    26th International Symposium on String Processing and Information Retrieval, SPIRE 2019  2019.10 

     More details

    Event date: 2019.10

    Language:English  

    Venue:Segovia   Country:Spain  

    We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be built first. More formally, for a string over static alphabet and parameterized alphabet, our algorithm runs in time and O(n) words of space, where is the number of distinct symbols of in the string.

  • Computing Maximal Palindromes and Distinct Palindromes in a Trie International conference

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Prague Stringology Conference 2019  2019.8 

     More details

    Event date: 2019.8

    Language:English  

    Country:Czech Republic  

  • Shortest unique palindromic substring queries on run-length encoded strings International conference

    Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    30th International Workshop on Combinatorial Algorithms, IWOCA 2019  2019.7 

     More details

    Event date: 2019.7

    Language:English  

    Venue:Pisa   Country:Italy  

    For a string S, a palindromic substring S[i.j] is said to be a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if S[i.j] occurs exactly once in S, the interval [i, j] contains [s, t], and every palindromic substring containing [s, t] which is shorter than S[i.j] occurs at least twice in S. In this paper, we study the problem of answering SUPS queries on run-length encoded strings. We show how to preprocess a given run-length encoded string RLES of size m in O(m) space and O(mlog σRLES + m√log m/log logm) time so that all SUPSs for any subsequent query interval can be answered in O(√log m/log logm+ α) time, where α is the number of outputs, and σRLES is the number of distinct runs of RLES.

  • Computing runs on a trie International conference

    Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019  2019.6 

     More details

    Event date: 2019.6

    Language:English  

    Venue:Pisa   Country:Italy  

    A maximal repetition, or run, in a string, is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with n edges, we show that the number of runs is less than n. We also show an O(n√log n log log n) time and O(n) space algorithm for counting and finding the shallower endpoint of all runs. We further show an O(n log n) time and O(n) space algorithm for finding both endpoints of all runs. We also discuss how to improve the running time even more.

  • On the size of overlapping Lempel-Ziv and Lyndon factorizations International conference

    Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019  2019.6 

     More details

    Event date: 2019.6

    Language:English  

    Venue:Pisa   Country:Italy  

    Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Kärkkäinen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.

  • Faster queries for longest substring palindrome after block edit International conference

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019  2019.6 

     More details

    Event date: 2019.6

    Language:English  

    Venue:Pisa   Country:Italy  

    Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ + log log n) time, after a substring in T is replaced by a string of arbitrary length ℓ. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(ℓ + log n) time for each query.

  • The Parameterized Position Heap of a Trie International conference

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    11th International Conference on Algorithms and Complexity, CIAC 2019  2019.5 

     More details

    Event date: 2019.5

    Language:English  

    Venue:Rome   Country:Italy  

    Let and be disjoint alphabets of respective size and. Two strings over of equal length are said to parameterized match (p-match) if there is a bijection such that (1) f is identity on and (2) f maps the characters of one string to those of the other string so that the two strings become identical. We consider the p-matching problem on a (reversed) trie and a string pattern P such that every path that p-matches P has to be reported. Let N be the size of the given trie. In this paper, we propose the parameterized position heap for that occupies O(N) space and supports p-matching queries in time, where m is the length of a query pattern P and is the number of paths in to report. We also present an algorithm which constructs the parameterized position heap for a given trie in time and working space.

  • MR-RePair Grammar Compression Based on Maximal Repeats

    Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

    2019 Data Compression Conference, DCC 2019  2019.3 

     More details

    Event date: 2019.3

    Language:English  

    Venue:Snowbird   Country:United States  

    We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.

  • Recovering, counting and enumerating strings from forward and backward suffix arrays International conference

    Yuki Kuhara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    25th International Symposium on String Processing and Information Retrieval, SPIRE 2018  2018.10 

     More details

    Event date: 2018.10

    Language:English  

    Venue:Lima   Country:Peru  

    The suffix array SAw of a string w of length n is a permutation of [1..n] such that SAw[i]=j iff w[j, n] is the lexicographically i-th suffix of w. In this paper, we consider variants of the reverse-engineering problem on suffix arrays with two given permutations P and Q of [1..n], such that P refers to the forward suffix array of some string w and Q refers to the backward suffix array of the reversed string wR. Our results are the following: (1) An algorithm which computes a solution string over an alphabet of the smallest size, in O(n) time. (2) The exact number of solution strings over an alphabet of size σ. (3) An efficient algorithm which computes all solution strings in the lexicographical order, in time near optimal up to log n factor.

  • Right-to-left online construction of parameterized position heaps International conference

    Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda.

    Prague Stringology Conference 2018  2018.8 

     More details

    Event date: 2018.8

    Language:English   Presentation type:Oral presentation (general)  

    Country:Czech Republic  

  • O(n log n)-time text compression by LZ-style longest first substitution International conference

    Akihiro Nishi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda.

    Prague Stringology Conference 2018  2018.8 

     More details

    Event date: 2018.8

    Language:English   Presentation type:Oral presentation (general)  

    Country:Czech Republic  

  • Faster online elastic degenerate string matching

    Kotaro Aoyama, Yuto Nakashima, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018  2018.7 

     More details

    Event date: 2018.7

    Language:English  

    Venue:Qingdao   Country:China  

    An Elastic-Degenerate String [Iliopoulus et al., LATA 2017] is a sequence of sets of strings, which was recently proposed as a way to model a set of similar sequences. We give an online algorithm for the Elastic-Degenerate String Matching (EDSM) problem that runs in O(nm √mlogm+ N) time and O(m) working space, where n is the number of elastic degenerate segments of the text, N is the total length of all strings in the text, and m is the length of the pattern. This improves the previous algorithm by Grossi et al. [CPM 2017] that runs in O(nm2 + N) time.

  • Longest substring palindrome after edit

    Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018  2018.7 

     More details

    Event date: 2018.7

    Language:English  

    Venue:Qingdao   Country:China  

    It is known that the length of the longest substring palindromes (LSPals) of a given string T of length n can be computed in O(n) time by Manacher's algorithm [J. ACM '75]. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(log(min{ω, log n})) time after single character substitution, insertion, or deletion, where ω denotes the number of distinct characters appearing in T. We also propose an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ+log n) time, after an existing substring in T is replaced by a string of arbitrary length ℓ.

  • Longest lyndon substring after edit

    Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018  2018.7 

     More details

    Event date: 2018.7

    Language:English  

    Venue:Qingdao   Country:China  

    The longest Lyndon substring of a string T is the longest substring of T which is a Lyndon word. LLS(T) denotes the length of the longest Lyndon substring of a string T. In this paper, we consider computing LLS(T′) where T′ is an edited string formed from T. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(log n) time for any single character edit. We also consider a version of the problem with block edits, i.e., a substring of T is replaced by a given string of length l. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(l log σ + log n) time for any block edit where σ is the number of distinct characters in T. We can modify our algorithm so as to output all the longest Lyndon substrings of T′ for both problems.

  • Lyndon 文字列とテキスト圧縮 Invited

    中島祐人

    2018年電子情報通信学会総合大会  2018.3 

     More details

    Event date: 2018.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:東京電機大学(東京)   Country:Japan  

  • ストリーミングモデルにおける最長Lyndon文字列

    中島祐人

    冬のLAシンポジウム2017  2018.2 

     More details

    Event date: 2018.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:京都大学(京都市)   Country:Japan  

  • Almost linear time computation of maximal repetitions in run length encoded strings International conference

    Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    The 28th International Symposium on Algorithms and Computation (ISAAC 2017)  2017.12 

     More details

    Event date: 2017.12

    Language:English   Presentation type:Oral presentation (general)  

    Country:Thailand  

  • On Reverse Engineering the Lyndon Tree International conference

    Yuto Nakashima, Takuya Takagi, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    The Prague Stringology Conference 2017 (PSC 2017)  2017.8 

     More details

    Event date: 2017.8

    Language:English   Presentation type:Oral presentation (general)  

    Country:Czech Republic  

  • Almost linear time computation of maximal repetitions in run length encoded strings International conference

    Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    12th Workshop on Compression, Text and Algorithms  2017.9 

     More details

    Event date: 2017.6

    Language:English   Presentation type:Oral presentation (general)  

    Country:Italy  

  • On Sensitivity of Compact Directed Acyclic Word Graphs International conference

    Hiroto Fujimaru, Yuto Nakashima, Shunsuke Inenaga

    14th International Conference on Words (WORDS 2023)  2023.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Umeå   Country:Sweden  

  • Computing SEQ-IC-LCS of Labeled Graphs International conference

    Yuki Yonemoto, Yuto Nakashima, Shunsuke Inenaga

    Prague Stringology Conference 2023 (PSC 2023)  2023.8 

     More details

    Language:English  

    Country:Czech Republic  

  • Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph International conference

    Hiroki Arimura, Shunsuke Inenaga, Yasuaki Kobayashi, Yuto Nakashima, Mizuki Sue

    30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)  2023.9 

     More details

    Language:English  

    Country:Czech Republic  

  • Largest Repetition Factorization of Fibonacci Words International conference

    Kaisei Kishi, Yuto Nakashima, Shunsuke Inenaga

    30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)  2023.9 

     More details

    Language:English  

    Country:Czech Republic  

  • Linear-Time Computation of Generalized Minimal Absent Words for Multiple Strings International conference

    Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai

    30th International Symposium on String Processing and Information Retrieval (SPIRE 2023)  2023.9 

     More details

    Language:English  

    Country:Czech Republic  

▼display all

Committee Memberships

  • 情報処理学会 アルゴリズム研究運営委員会   幹事  

    2024.4 - Present   

      More details

  • EATCS Japan Chapter   Organizer   Foreign country

    2023.1 - Present   

  • EATCS Japan Chapter   Secretary  

    2023.1 - Present   

      More details

  • 電子情報通信学会 情報・システムソサイエティ誌 編集委員会   編集委員  

    2022.6 - Present   

      More details

  • 情報処理学会   アルゴリズム研究会運営委員   Domestic

    2022.6 - 2026.6   

  • 情報処理学会 アルゴリズム研究運営委員会   運営委員  

    2022.4 - 2024.3   

      More details

  • 電子情報通信学会 コンピュテーション研究専門委員会   専門委員  

    2020.6 - Present   

      More details

  • 電子情報通信学会   コンピュテーション研究会 専門委員   Domestic

    2020.6 - 2022.5   

▼display all

Academic Activities

  • 組織委員長 International contribution

    35th Annual Symposium on Combinatorial Pattern Matching  ( Japan ) 2024.6

     More details

    Type:Competition, symposium, etc. 

  • PC memer International contribution

    34th Annual Symposium on Combinatorial Pattern Matching  ( France ) 2023.6

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2023

     More details

    Type:Peer review 

    Proceedings of International Conference Number of peer-reviewed papers:8

  • 運営委員

    STRセミナー2022 若手研究者のための大学間合同研究集会  ( Japan ) 2022.9

     More details

    Type:Competition, symposium, etc. 

  • 電子情報通信学会 情報・システムソサイエティ誌

    2022.6 - 2024.6

     More details

    Type:Academic society, research group, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2022

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:3

    Proceedings of International Conference Number of peer-reviewed papers:7

  • Screening of academic papers

    Role(s): Peer review

    2021

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:2

    Proceedings of International Conference Number of peer-reviewed papers:4

  • 座長

    冬のLAシンポジウム2019  ( Japan ) 2020.2

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2020

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:4

    Proceedings of International Conference Number of peer-reviewed papers:4

  • 座長

    コンピュテーション研究会  ( Japan ) 2019.10

     More details

    Type:Competition, symposium, etc. 

  • コメンテータ

    第11回データ工学と情報マネジメントに関するフォーラム  ( Japan ) 2019.3

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2019

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:0

    Number of peer-reviewed articles in Japanese journals:0

    Proceedings of International Conference Number of peer-reviewed papers:6

    Proceedings of domestic conference Number of peer-reviewed papers:0

  • 座長

    夏のLAシンポジウム2018  ( Japan ) 2018.7

     More details

    Type:Competition, symposium, etc. 

  • 座長

    2018年電子情報通信学会総合大会  ( Japan ) 2018.3

     More details

    Type:Competition, symposium, etc. 

  • 座長

    アルゴリズム研究会(167回)  ( Japan ) 2018.3

     More details

    Type:Competition, symposium, etc. 

  • 座長

    冬のLAシンポジウム2017  ( Japan ) 2018.2

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2018

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:4

    Number of peer-reviewed articles in Japanese journals:0

    Proceedings of International Conference Number of peer-reviewed papers:3

    Proceedings of domestic conference Number of peer-reviewed papers:0

  • Screening of academic papers

    Role(s): Peer review

    2017

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:1

    Number of peer-reviewed articles in Japanese journals:0

    Proceedings of International Conference Number of peer-reviewed papers:2

    Proceedings of domestic conference Number of peer-reviewed papers:0

▼display all

Research Projects

  • 辞書式順序依存問題の複雑さの解明

    Grant number:23H04386  2023 - 2024

    Japan Society for the Promotion of Science・Ministry of Education, Culture, Sports, Science and Technology  Grants-in-Aid for Scientific Research  Grant-in-Aid for Transformative Research Areas (A)

    中島 祐人

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

    効率的な文字列アルゴリズムの開発には,文字列が持つ数理的性質の理解が重要である.文字列の特徴を捉える際に利用される構造や性質は,辞書式順序依存な構造と辞書式順序非依存な構造に大別される.辞書式順序依存な構造におけるこれまでの研究では,暗に与えられた辞書式順序のみを考えているに過ぎなかった.本研究では辞書式順序に依存する問題に着目し,組合せ論・計算量理論・アルゴリズム論の三方向から,辞書式順序が与える文字列構造への影響を解明する.

    CiNii Research

  • 広義文字列のアルゴリズムと組合せ論

    Grant number:23K24808  2022.4 - 2026.3

    Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    稲永 俊介, 坂内 英夫, 中島 祐人, Koeppl Dominik

      More details

    Grant type:Scientific research funding

    文字列とは,文字あるいは記号の一本鎖列である.自然言語テキスト,サーバログ,DNA配列など,計算機処理可能な様々なデータを文字列と見なすことができる.1970年代から半世紀にわたって,検索・圧縮・比較・解析・発見などの重要タスクを文字列データ上で高速実行するアルゴリズムが多数提案されてきた.本提案課題では,文字列の定義域を拡張した「広義文字列」という新概念を提唱し,木型文字列/グラフ型文字列/2次元文字列/時系列の高速処理アルゴリズム開発と,それを支える組合せ論的性質の解明に取り組む.

    CiNii Research

  • 広義文字列のアルゴリズムと組合せ論

    Grant number:22H03551  2022 - 2025

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

      More details

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

  • Lyndon文字列による反復性指標解析

    Grant number:21K17705  2021 - 2024

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Early-Career Scientists

    中島 祐人

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

    文字列データを対象としたデータ圧縮においては,辞書式圧縮と呼ばれる様々な手法が知られている.一般に,部分文字列の反復(繰り返し)が多いほど,文字列は圧縮されやすい傾向にあるため,LZ 分解などの文字列構造のサイズは,文字列の反復性を捉えた反復性指標と見なすことができる.最近では,String attractor や部分文字列複雑性に基づいた反復性指標が提案され,反復性への理解が進められているが,本申請課題では,辞書式圧縮とは直接関係のない Lyndon 文字列および関連する文字列構造を導入することで,新たな視点から反復性指標の解析を行い,その本質を明らかにする.

    CiNii Research

  • 辞書式順序に基づいた文字列データ処理法の構築

    2020 - 2022

    JST Strategic Basic Research Program (Ministry of Education, Culture, Sports, Science and Technology)

      More details

    Authorship:Principal investigator  Grant type:Contract research

  • 情報爆縮に基づくIoTデータ処理基盤の構築

    Grant number:18H04098  2018 - 2022

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    竹田 正幸, 定兼 邦彦, 坂内 英夫, 井 智弘, 瀧本 英二, 坂本 比呂志, 畑埜 晃平, 稲永 俊介, 喜田 拓也, 中島 祐人

      More details

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

    IoT時代が到来し, 製造・産業界, モビリティ・交通インフラ, 医療・ヘルスケアなど, 広範な分野において技術革新が生み出され新たな価値が創出されると期待されている. 一方, IoTデータの急増に伴い, クラウドサーバへのデータ処理の集中, ネットワーク回線の逼迫, データ送信の遅延などの技術的問題が深刻化している. 現在のIoTに対する取り組みは実証フェーズのものが多くIoTを用いた価値創出に重点が置かれているため, 上記の問題の抜本的な解決は後回しとなっている. 代表者らは, これまでに, 入力データや計算に用いるデータ構造に潜む冗長性の除去に基づいて計算そのものの効率化を図る「情報爆縮」の研究を行ってきた. 本研究プロジェクトでは, この情報爆縮技術を核に据え, 理論と実用的の両面から, エッジ側とクラウド側の処理を効率化する技術を開発し低コストでデータを収集・集約・送信・蓄積・検索・解析できる新しいIoTデータ処理基盤の構築を目指す.
    このために, 以下の3つの研究項目をおいて研究を行った.
    (A) 圧縮データストリームに対するリアルタイム抽出・集計技術.
    (B) 圧縮データに対する高速質問処理技術.
    (C) 解釈可能な圧縮データ錬成技術.
    前年度に引き続き, 特に(A), (B)に力点を置いて研究を行い、(A)スライド窓における極小ユニーク文字列の列挙・回文木の構築/維持のアルゴリズム、(B)接尾辞配列構築アルゴリズムを基にした高速な圧縮索引構造の提案等、多くの研究成果を得た.

    CiNii Research

  • Lyndon文字列による簡潔で高速な文字列処理アルゴリズム

    Grant number:18K18002  2018 - 2020

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Early-Career Scientists

    中島 祐人

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

    本研究の目的は,Lyndon文字列の性質に基づいて,簡潔で高速な文字列処理アルゴリズムを開発することである.
    本目的の達成のために,Lyndon文字列を中心に,繰り返し構造や回文構造など広く文字列処理アルゴリズムや文字列組合せ論の問題に取り組んだ.Lyndon文字列を中心に,様々な文字列構造の性質の理解や,それらの性質を利用した効率的なアルゴリズムを提案した.

    CiNii Research

  • 情報爆縮基盤技術

    Grant number:25240003  2017

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

      More details

    Authorship:Coinvestigator(s)  Grant type:Scientific research funding

  • ストリーミングモデルにおける文字列処理アルゴリズム基盤

    Grant number:17H06923  2017

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Research Activity start-up

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

▼display all

Educational Activities

  • 主に,理学部物理学科情報理学コースの演習科目を担当している.

Class subject

  • 情報科学

    2024.10 - 2025.3   Second semester

  • 情報代数学演習

    2024.4 - 2024.9   First semester

  • 形式言語理論演習

    2024.4 - 2024.9   First semester

  • 情報代数学演習

    2023.4 - 2023.9   First semester

  • 形式言語理論演習

    2023.4 - 2023.9   First semester

  • 情報代数学演習

    2022.4 - 2022.9   First semester

  • 情報論理学演習

    2022.4 - 2022.9   First semester

  • 形式言語理論演習

    2022.4 - 2022.9   First semester

  • 情報論理学演習

    2021.4 - 2021.9   First semester

  • 情報代数学演習

    2021.4 - 2021.9   First semester

  • 形式言語理論演習

    2021.4 - 2021.9   First semester

  • 形式言語理論演習

    2020.4 - 2020.9   First semester

  • 情報代数学演習

    2020.4 - 2020.9   First semester

  • 情報論理学演習

    2020.4 - 2020.9   First semester

  • 形式言語理論演習

    2019.4 - 2019.9   First semester

  • 情報代数学演習

    2019.4 - 2019.9   First semester

  • 情報論理学演習

    2019.4 - 2019.9   First semester

  • 情報科学

    2018.10 - 2019.3   Second semester

  • 形式言語理論演習

    2018.4 - 2018.9   First semester

  • 情報代数学演習

    2018.4 - 2018.9   First semester

  • 情報論理学演習

    2018.4 - 2018.9   First semester

  • 情報論理学演習

    2017.4 - 2017.9   First semester

  • 形式言語理論演習

    2017.4 - 2017.9   First semester

  • 情報代数学演習

    2017.4 - 2017.9   First semester

▼display all

FD Participation

  • 2024.3   Role:Participation   Title:【シス情FD】高度データサイエンティスト育成事業の取り組みについて

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2023.10   Role:Participation   Title:【シス情FD】価値創造型半導体人材育成センターについて

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2023.4   Role:Speech   Title:【シス情FD】若手教員による研究紹介⑧

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2022.5   Role:Participation   Title:【シス情FD】若手教員による研究紹介④「量子コンピュータ・システム・アーキテクチャの研究~道具になることを目指して~」

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2022.4   Role:Participation   Title:【シス情FD】第4期中期目標・中期計画等について

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2022.1   Role:Participation   Title:【シス情FD】シス情関連の科学技術に対する国の政策動向(に関する私見)

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2021.10   Role:Participation   Title:【シス情FD】熊本高専と九大システム情報との交流・連携に向けて ー 3年半で感じた高専の実像 ー

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2021.7   Role:Participation   Title:若手教員による研究紹介 及び 科研取得のポイント、その他について ②

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2021.6   Role:Participation   Title:若手教員による研究紹介 及び 科研取得のポイントについて ①

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2020.11   Role:Participation   Title:マス・フォア・イノベーション卓越大学院について

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2020.4   Role:Participation   Title:新型コロナウイルスが誘起した社会変化に対するシステム情報科学からの提言

    Organizer:[Undergraduate school/graduate school/graduate faculty]

  • 2018.7   Role:Participation   Title:論文剽窃ソフトの活用方法について

    Organizer:[Undergraduate school/graduate school/graduate faculty]

▼display all