Kyushu University Researcher Information
List of Presentations
Osamu Maruyama  Last updated: 2024.04.22

Associate Professor / Faculty of Design, Department of Design Futures, Modeling and Optimization


Conference Presentations, etc.
1. Tsukasa Koga, Osamu Maruyama, CBOEP: Generating negative enhancer-promoter interactions to train classifiers, The 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB), 2023.09, [URL], For training and testing enhancer-promoter interaction (EPI) classifiers, the question of which non-positive EPIs should be selected as negative instances must be answered. Most previous methods use the dataset of the EPI classifier TargetFinder, where negative EP pairs are sampled from non-positive EP pairs. Consequently, over 92% of EPIs in the TargetFinder positive and negative sets of cell line GM12878 have a 2-fold or greater positive/negative class imbalance of promoter occurrences between the positive and negative EP pairs. This situation negatively impacts the predictability of EPI classifiers trained using the datasets. Thus, we first proposed the condition that the negative EPIs should satisfy. Second, we devised a method called CBOEP (class balanced occurrences of enhancers and promoters) to generate negative EPI sets that approximately fulfil this condition for a given positive EPI set. CBOEP solves the problem of finding such a set by reducing it to the maximum-flow problem. Third, we applied the generated negative EPI sets to existing EPI classifiers, TransEPI and TargetFinder. The negative datasets lead to higher prediction performance than the existing negative EPI datasets. The source code is available at https://github.com/maruyama-lab-design/CBOEP.
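2. Tsukasa Koga (Kyushu University), Osamu Maruyama (Kyushu University), A method for generating negative examples of enhancer-promoter interactions and its evaluation, The 73rd Bio Research Presentation Meeting (第73回バイオ研究発表会), 2023.03.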
3. Osamu Maruyama, Recurrent neural network approach for predicting DNA methylation inheritance of CpG islands using embedding vectors of variable-length k-mers, The International Symposium "Totipotency and Germ Cell Development", 2022.11.
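4. 成田 浩規 (Kyushu University), Au Yeung Wan Kin (Kyushu University), Hiroyuki Sasaki (Kyushu University), Osamu Maruyama (Kyushu University), Predicting the methylation status of CpG islands using embedding vectors, The 69th Bio Research Presentation Meeting (第69回バイオ研究発表会), 2022.02.
5. Tsukasa Koga (Kyushu University), Osamu Maruyama (Kyushu University), A proposed method for generating negative examples for the enhancer-promoter interaction prediction problem, The 69th Bio Research Presentation Meeting (第69回バイオ研究発表会), 2022.02.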
6. Ryo Shimizu, Wan Kin Au Yeung, Hidehiro Toh, Hiroyuki Sasaki and Osamu Maruyama, Predicting Discriminative Motifs for DNA Methylation in Mammalian Development, 2020 Annual Meeting of the Japanese Society for Bioinformatics / The 9th Joint Conference on Informatics in Biology, Medicine and Pharmacology (IIBMP2020), 2020.09, [URL].
7. Osamu Maruyama, Fumiko Matsuzaki, DegSampler3: Pairwise Dependency Model in Degradation Motif Site Prediction of Substrate Protein Sequences, 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering, BIBE 2019, 2019.10, [URL].
8. Osamu Maruyama, Fumiko Matsuzaki, DegSampler: Collapsed Gibbs Sampler for Detecting E3 Binding Sites, 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering (BIBE), 2018.12, In this paper, we address the problem of finding sequence motifs in substrate proteins specific to E3 ubiquitin ligases (E3s). We formulate a posterior probability distribution of sites by designing a likelihood function based on amino acid indexing and a prior distribution based on the disorderness of protein sequences. These designs are derived from known characteristics of E3 binding sites in substrate proteins. We then devise a collapsed Gibbs sampling algorithm for this posterior probability distribution, called DegSampler. We performed computational experiments using 36 sets of substrate proteins specific to E3s and compared the performance of DegSampler with that of popular motif finders, MEME and GLAM2. The results showed that DegSampler was superior to the others in finding E3 binding motifs. Thus, DegSampler is a promising tool for finding E3 motifs in substrate proteins.
9. Osamu Maruyama, A clustering algorithm based on consensus motifs for sets of positive and negative example sequences, Japanese Society for Bioinformatics (JSBi), 2017.10, [URL].
10. Osamu Maruyama, Limsoon Wong, Regularizing predicted complexes by mutually exclusive protein-protein interactions, International Symposium on Network Enabled Health Informatics, Biomedicine and Bioinformatics, HI-BI-BI 2015, 2015.08, [URL], Protein complexes are key entities in the cell responsible for various cellular mechanisms and biological processes. We propose here a method for predicting protein complexes from a protein-protein interaction (PPI) network, using information on mutually exclusive PPIs. If two interactions are mutually exclusive, they are not allowed to exist simultaneously in the same predicted complex. We introduce a new regularization term which checks whether predicted complexes are connected by mutually exclusive PPIs. This regularization term is added to the scoring function of our earlier protein complex prediction tool, PPSampler2. We show that PPSampler2 with mutually exclusive PPIs outperforms the original one. Furthermore, its performance is superior to that of well-known representative conventional protein complex prediction methods. Thus, it is effective to use mutual exclusiveness of PPIs in protein complex prediction.
11. Daisuke Tatsuke, Osamu Maruyama, Sampling Strategy for Protein Complex Prediction Using Cluster Size Frequency, The 23rd International Conference on Genome Informatics, 2012.12, [URL], In this paper we propose a Markov chain Monte Carlo sampling method for predicting protein complexes from protein-protein interactions (PPIs). Many of the existing tools for this problem are designed more or less based on a density measure of a subgraph of the PPI network. This kind of measure is less effective for smaller complexes. On the other hand, it can be found that the number of complexes of a given size in a database of protein complexes follows a power law. Thus, most of the complexes are small-sized. For example, in CYC2008, a database of curated protein complexes of yeast, 42% of the complexes are heterodimeric, i.e., complexes consisting of two different proteins. In this work, we propose a protein complex prediction algorithm, called PPSampler (Proteins’ Partition Sampler), which is designed based on the Metropolis-Hastings algorithm using a parameter representing a target value of the relative frequency of the number of predicted protein complexes of a particular size. In a performance comparison, PPSampler outperforms other existing algorithms. Furthermore, about half of the predicted clusters that are not matched with any known complexes in CYC2008 are statistically significant by Gene Ontology terms. Some of them can be expected to be true complexes.
12. Daisuke Tatsuke, Osamu Maruyama, MCMC Strategy for Protein Complex Prediction Using Cluster Size Frequency, The 11th IEICE Technical Committee on Information-Based Induction Sciences and Machine Learning (IBISML) meeting / The 15th Information-Based Induction Sciences Workshop (IBIS2012), 2012.11, [URL], In this paper we propose a Markov chain Monte Carlo sampling method for predicting protein complexes from protein-protein interactions (PPIs). Many of the existing tools for this problem are designed more or less based on a density measure of a subgraph of the PPI network. This kind of measure is less effective for smaller complexes. On the other hand, it can be found that the frequency of complexes of size $i$ in a database of protein complexes often follows a power law, $i^{-\gamma}$, where $\gamma$ is a constant. Thus, most of the complexes are small-sized. For example, in CYC2008, a database of curated protein complexes of yeast, 42% of the complexes are heterodimeric, i.e., complexes consisting of two different proteins. In this work, we propose a protein complex prediction algorithm, called PPSampler (Proteins’ Partition Sampler), which is designed based on the Metropolis-Hastings algorithm using a parameter representing a target value of the relative frequency of the number of predicted protein complexes of a particular size. In a performance comparison, PPSampler outperforms other existing algorithms. Furthermore, about half of the predicted clusters that are not matched with any known complexes in CYC2008 are statistically significant by Gene Ontology terms. Some of them can be expected to be true complexes.
13. Osamu Maruyama, Protein complex prediction by sampling, FY2012 MEXT Workshop on Collaborative Research between Mathematics/Mathematical Sciences and Other Sciences and Industry, 2012.11, Protein complexes are important entities that organize various biological processes in the cell, such as signal transduction, gene expression, and molecular transmission. Many proteins are known to perform their intrinsic tasks in association with their specific interacting partners, forming protein complexes. Therefore, an enriched catalog of protein complexes in a cell could accelerate further research to elucidate the mechanisms underlying many biological processes. However, known complexes are still limited. Thus, it is a challenging problem to computationally predict protein complexes. Many existing tools are designed more or less based on density measures of a subgraph of the protein-protein interaction network. This kind of measure is less effective for smaller complexes. On the other hand, it can be found that the frequency distribution of the number of complexes of size, i, in a database of protein complexes is often scale-free, i.e., follows a power-law, i.
14. Osamu Maruyama, Protein Complex Prediction, 2012.10, [URL], In this talk, we will consider the problem of protein complex prediction, which is a challenging problem in computational biology. After a brief introduction of this problem, we will present a few computational models used in prediction algorithms, some of which are based on random walks with restarts and MCMC (Markov chain Monte Carlo) sampling methods.
15. Osamu Maruyama, Protein complex prediction, Joint Workshop of IMS and IMI on Mathematics for Industry: Biological and Climatic Prospects, 2012.09, [URL], In this talk, we will consider the problem of protein complex prediction, which is a challenging problem in computational biology. After a brief introduction of this problem, we will present a few computational models used in prediction algorithms, some of which are based on random walks with restarts and MCMC (Markov chain Monte Carlo) sampling methods.
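16. Daisuke Tatsuke, Osamu Maruyama, A study of a protein complex prediction method based on Markov chain Monte Carlo using the complex size distribution, The 30th IPSJ SIG Bioinformatics Meeting (第30回情報処理学会バイオ情報学研究会), 2012.08, In this study, we propose a sampling method for predicting protein complexes from protein-protein interaction data. Because most existing methods predict complexes based on the density of subgraphs of the protein-protein interaction network, accurate prediction of small complexes is relatively difficult. However, an examination of CYC2008, a representative database of yeast protein complexes, shows that the size distribution of complexes is scale-free and that 42% of the complexes have the minimum size of 2. We therefore propose PPSampler (Proteins’ Partition Sampler), a prediction method based on the Metropolis-Hastings algorithm that exploits information on the size distribution of complexes. Computational experiments confirmed that PPSampler achieves higher accuracy than existing methods.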
17. Osamu Maruyama, Heterodimeric Protein Complex Identification, ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2011, 2011.08, [URL].
18. Osamu Maruyama, Evaluating Protein Sequence Signatures Inferred from Protein-Protein Interaction Data by Gene Ontology Annotations, 2008 IEEE International Conference on Bioinformatics and Biomedicine, 2008.11, [URL].
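19. O. Maruyama, Searching for Regulatory Elements of Alternative Splicing Events Using Phylogenetic Footprinting, The 4th Workshop on Algorithms in Bioinformatics, 2004.09.
20. Osamu Maruyama, An optimal degenerate pattern search algorithm and its application to the transcription factor binding site identification problem, The 91st IPSJ SIG Algorithms Meeting (情報処理学会第91回アルゴリズム研究会), 2003.09.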
21. Osamu Maruyama, Finding optimal degenerate patterns in DNA sequences, European Conference on Computational Biology (ECCB 2003), 2003.09.
22. O. Maruyama, Toward Drawing an Atlas of Hypothesis Classes: Approximating a Hypothesis via Another Hypothesis Model, The 5th International Conference on Discovery Science, 2002.11.
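
As an illustration of the max-flow reduction mentioned in the CBOEP abstract (entry 1), the following is a minimal sketch and not the authors' implementation (see the repository linked in that entry for the actual code). It assumes a simplified formulation: each enhancer and each promoter may appear in the negative set at most as often as it appears in the positive set, and a hypothetical list `candidate_pairs` supplies the non-positive enhancer-promoter pairs that are eligible as negatives. The function names `max_flow` and `generate_negatives` are likewise assumptions made for this example.

```python
from collections import defaultdict, deque


def max_flow(cap, source, sink):
    """Edmonds-Karp maximum flow. `cap` holds residual capacities and is mutated."""
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Push flow along the path and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def generate_negatives(positive_pairs, candidate_pairs):
    """Pick negatives so that no enhancer or promoter occurs more often
    in the negative set than in the positive set (assumed condition)."""
    cap = defaultdict(lambda: defaultdict(int))
    source, sink = ("source",), ("sink",)
    for e, p in positive_pairs:
        cap[source][("E", e)] += 1  # capacity = positive occurrences of enhancer e
        cap[("P", p)][sink] += 1    # capacity = positive occurrences of promoter p
    positives = set(positive_pairs)
    candidates = [pair for pair in set(candidate_pairs) if pair not in positives]
    for e, p in candidates:
        cap[("E", e)][("P", p)] = 1  # each candidate pair can be chosen once
    max_flow(cap, source, sink)
    # A candidate edge with zero residual capacity carries one unit of flow.
    return sorted((e, p) for e, p in candidates if cap[("E", e)][("P", p)] == 0)


positives = [("e1", "p1"), ("e2", "p2")]
candidates = [("e1", "p2"), ("e2", "p1"), ("e3", "p1")]
negatives = generate_negatives(positives, candidates)
print(negatives)
```

In this toy input, enhancer `e3` has no positive occurrence, so it receives no capacity from the source and the pair involving it cannot be selected. A real dataset would typically also restrict candidates, for example by genomic distance, which this sketch leaves to the caller.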
