Kyushu University Academic Staff Educational and Research Activities Database
List of Presentations
Yuto Nakashima Last modified date:2022.04.20

Assistant Professor / Mathematical Informatics / Department of Informatics / Faculty of Information Science and Electrical Engineering


Presentations
1. Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches, Data Compression Conference 2020, 2020.03.
2. Kohei Yamada, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Faster STR-EC-LCS Computation, 46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020, 2020.01, The longest common subsequence (LCS) problem is a central problem in stringology that finds the longest common subsequence of given two strings A and B. More recently, a set of four constrained LCS problems (called generalized constrained LCS problem) were proposed by Chen and Chao [J. Comb. Optim, 2011]. In this paper, we consider the substring-excluding constrained LCS (STR-EC-LCS) problem. A string Z is said to be an STR-EC-LCS of two given strings A and B excluding P if, Z is one of the longest common subsequences of A and B that does not contain P as a substring. Wang et al. proposed a dynamic programming solution which computes an STR-EC-LCS in O(mnr) time and space where [Inf. Process. Lett., 2013]. In this paper, we show a new solution for the STR-EC-LCS problem. Our algorithm computes an STR-EC-LCS in time where denotes the set of distinct characters occurring in both A and B, and L is the length of the STR-EC-LCS. This algorithm is faster than the O(mnr)-time algorithm for short/long STR-EC-LCS (namely, or), and is at least as efficient as the O(mnr)-time algorithm for all cases..
3. Takuya Mieno, Yuki Kuhara, Tooru Akagi, Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Minimal Unique Substrings and Minimal Absent Words in a Sliding Window, 46th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2020, 2020.01, A substring u of a string T is called a minimal unique substring (MUS) of T if u occurs exactly once in T and any proper substring of u occurs at least twice in T. A string w is called a minimal absent word (MAW) of T if w does not occur in T and any proper substring of w occurs in T. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over a given string T. We first show how the set of MUSs can change in a sliding window over T, and present an-time and O(d)-space algorithm to compute MUSs in a sliding window of width d over T, where is the maximum number of distinct characters in every window. We then give tight upper and lower bounds on the maximum number of changes in the set of MAWs in a sliding window over T. Our bounds improve on the previous results in Crochemore et al. (2017)..
4. Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, An improved data structure for left-right maximal generic words problem, 30th International Symposium on Algorithms and Computation, ISAAC 2019, 2019.12, For a set D of documents and a positive integer d, a string w is said to be d-left-right maximal, if (1) w occurs in at least d documents in D, and (2) any proper superstring of w occurs in less than d documents. The left-right-maximal generic words problem is, given a set D of documents, to preprocess D so that for any string p and for any positive integer d, all the superstrings of p that are d-left-right maximal can be answered quickly. In this paper, we present an O(n log m) space data structure (in words) which answers queries in O(|p| + o log log m) time, where n is the total length of documents in D, m is the number of documents in D and o is the number of outputs. Our solution improves the previous one by Nishimoto et al. (PSC 2015), which uses an O(n log n) space data structure answering queries in O(|p| + r · log n + o · log2 n) time, where r is the number of right-extensions q of p occurring in at least d documents such that any proper right extension of q occurs in less than d documents..
5. Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Compact Data Structures for Shortest Unique Substring Queries, 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019, 2019.10, Given a string T of length n, a substring of T is called a shortest unique substring (SUS) for an interval [s, t] if (a) u occurs exactly once in T, (b) u contains the interval [s, t] (i.e.), and (c) every substring v of T with containing [s, t] occurs at least twice in T. Given a query interval, the interval SUS problem is to output all the SUSs for the interval [s, t]. In this article, we propose a bits data structure answering an interval SUS query in output-sensitive time, where is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for. Here, we propose a bits data structure answering a point SUS query in the same output-sensitive time..
6. Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Direct Linear Time Construction of Parameterized Suffix and LCP Arrays for Constant Alphabets, 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019, 2019.10, We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be built first. More formally, for a string over static alphabet and parameterized alphabet, our algorithm runs in time and O(n) words of space, where is the number of distinct symbols of in the string..
7. Kazuki Kai, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Tomasz Kociumaka, On Longest Common Property Preserved Substring Queries, 26th International Symposium on String Processing and Information Retrieval, SPIRE 2019, 2019.10, We revisit the problem of longest common property preserving substring queries introduced by Ayad et al. (SPIRE 2018, arXiv 2018). We consider a generalized and unified on-line setting, where we are given a set X of k strings of total length n that can be pre-processed so that, given a query string y and a positive integer (Formula presented), we can determine the longest substring of y that satisfies some specific property and is common to at least (Formula presented) strings in X. Ayad et al. considered the longest square-free substring in an on-line setting and the longest periodic and palindromic substring in an off-line setting. In this paper, we give efficient solutions in the on-line setting for finding the longest common square, periodic, palindromic, and Lyndon substrings. More precisely, we show that X can be pre-processed in O(n) time resulting in a data structure of O(n) size that answers queries in (Formula presented) time and O(1) working space, where (Formula presented) is the size of the alphabet, and the common substring must be a square, a periodic substring, a palindrome, or a Lyndon word..
8. Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Computing Maximal Palindromes and Distinct Palindromes in a Trie, Prague Stringology Conference 2019, 2019.08.
9. Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Shortest unique palindromic substring queries on run-length encoded strings, 30th International Workshop on Combinatorial Algorithms, IWOCA 2019, 2019.07, For a string S, a palindromic substring S[i.j] is said to be a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if S[i.j] occurs exactly once in S, the interval [i, j] contains [s, t], and every palindromic substring containing [s, t] which is shorter than S[i.j] occurs at least twice in S. In this paper, we study the problem of answering SUPS queries on run-length encoded strings. We show how to preprocess a given run-length encoded string RLES of size m in O(m) space and O(mlog σRLES + m√log m/log logm) time so that all SUPSs for any subsequent query interval can be answered in O(√log m/log logm+ α) time, where α is the number of outputs, and σRLES is the number of distinct runs of RLES..
10. Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Computing runs on a trie, 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, 2019.06, A maximal repetition, or run, in a string, is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with n edges, we show that the number of runs is less than n. We also show an O(n√log n log log n) time and O(n) space algorithm for counting and finding the shallower endpoint of all runs. We further show an O(n log n) time and O(n) space algorithm for finding both endpoints of all runs. We also discuss how to improve the running time even more..
11. Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Faster queries for longest substring palindrome after block edit, 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, 2019.06, Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ + log log n) time, after a substring in T is replaced by a string of arbitrary length ℓ. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(ℓ + log n) time for each query..
12. Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, On the size of overlapping Lempel-Ziv and Lyndon factorizations, 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, 2019.06, Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Kärkkäinen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization..
13. Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, The Parameterized Position Heap of a Trie, 11th International Conference on Algorithms and Complexity, CIAC 2019, 2019.05, Let and be disjoint alphabets of respective size and. Two strings over of equal length are said to parameterized match (p-match) if there is a bijection such that (1) f is identity on and (2) f maps the characters of one string to those of the other string so that the two strings become identical. We consider the p-matching problem on a (reversed) trie and a string pattern P such that every path that p-matches P has to be reported. Let N be the size of the given trie. In this paper, we propose the parameterized position heap for that occupies O(N) space and supports p-matching queries in time, where m is the length of a query pattern P and is the number of paths in to report. We also present an algorithm which constructs the parameterized position heap for a given trie in time and working space..
14. Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida, MR-RePair
Grammar Compression Based on Maximal Repeats, 2019 Data Compression Conference, DCC 2019, 2019.03, We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts..
15. Yuki Kuhara, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Recovering, counting and enumerating strings from forward and backward suffix arrays, 25th International Symposium on String Processing and Information Retrieval, SPIRE 2018, 2018.10, The suffix array SAw of a string w of length n is a permutation of [1..n] such that SAw[i]=j iff w[j, n] is the lexicographically i-th suffix of w. In this paper, we consider variants of the reverse-engineering problem on suffix arrays with two given permutations P and Q of [1..n], such that P refers to the forward suffix array of some string w and Q refers to the backward suffix array of the reversed string wR. Our results are the following: (1) An algorithm which computes a solution string over an alphabet of the smallest size, in O(n) time. (2) The exact number of solution strings over an alphabet of size σ. (3) An efficient algorithm which computes all solution strings in the lexicographical order, in time near optimal up to log n factor..
16. Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda., Right-to-left online construction of parameterized position heaps, Prague Stringology Conference 2018, 2018.08.
17. Akihiro Nishi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda., O(n log n)-time text compression by LZ-style longest first substitution, Prague Stringology Conference 2018, 2018.08.
18. Kotaro Aoyama, Yuto Nakashima, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Faster online elastic degenerate string matching, 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018, 2018.07, An Elastic-Degenerate String [Iliopoulus et al., LATA 2017] is a sequence of sets of strings, which was recently proposed as a way to model a set of similar sequences. We give an online algorithm for the Elastic-Degenerate String Matching (EDSM) problem that runs in O(nm √mlogm+ N) time and O(m) working space, where n is the number of elastic degenerate segments of the text, N is the total length of all strings in the text, and m is the length of the pattern. This improves the previous algorithm by Grossi et al. [CPM 2017] that runs in O(nm2 + N) time..
19. Yuki Urabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Longest lyndon substring after edit, 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018, 2018.07, The longest Lyndon substring of a string T is the longest substring of T which is a Lyndon word. LLS(T) denotes the length of the longest Lyndon substring of a string T. In this paper, we consider computing LLS(T′) where T′ is an edited string formed from T. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(log n) time for any single character edit. We also consider a version of the problem with block edits, i.e., a substring of T is replaced by a given string of length l. After O(n) time and space preprocessing, our algorithm returns LLS(T′) in O(l log σ + log n) time for any block edit where σ is the number of distinct characters in T. We can modify our algorithm so as to output all the longest Lyndon substrings of T′ for both problems..
20. Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Longest substring palindrome after edit, 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018, 2018.07, It is known that the length of the longest substring palindromes (LSPals) of a given string T of length n can be computed in O(n) time by Manacher's algorithm [J. ACM '75]. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(log(min{ω, log n})) time after single character substitution, insertion, or deletion, where ω denotes the number of distinct characters appearing in T. We also propose an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(ℓ+log n) time, after an existing substring in T is replaced by a string of arbitrary length ℓ..
21. Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Almost linear time computation of maximal repetitions in run length encoded strings, The 28th International Symposium on Algorithms and Computation (ISAAC 2017), 2017.12.
22. Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Almost linear time computation of maximal repetitions in run length encoded strings, 12th Workshop on Compression, Text and Algorithms, 2017.09.
23. Yuto Nakashima, Takuya Takagi, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, On Reverse Engineering the Lyndon Tree, The Prague Stringology Conference 2017 (PSC 2017), 2017.08.