Kyushu University Academic Staff Educational and Research Activities Database
List of Reports
Seiichi Uchida Last modified date:2022.07.05

Professor / Real World Robotics / Department of Advanced Information Technology / Faculty of Information Science and Electrical Engineering


Reports
1. 17pCP-12 Evaluation of plasma parameters using image analysis of fine particle motion in Ar plasmas.
2. Image-informatics for Biology and Biology for Image-informatics.
3. Faisal Shafait, Dimosthenis Karatzas, Seiichi Uchida, Masakazu Iwamura, Special Issue: Robust Reading Preface, INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 10.1007/s10032-015-0244-0, Vol.18, No.2, pp.109-110, 2015.06.
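4. A Trial for Development of Fundamental Technologies for New Usage of Character and Document Media
As the saying "You are what you read" suggests, reading plays a central role in forming our knowledge and personality. Indeed, we spend an enormous amount of time reading every day to add to and update our knowledge. Yet this effort is neither recorded nor reused, and it is hard to say that we extract value commensurate with the time spent. This article describes Reading-Life Log technology for recording and reusing reading behavior. Specifically, by mutually analyzing people's reading behavior and the objects they read, it aims to understand people through the characters and documents they have read, and to understand characters and documents through the way people read them.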
5. A Trial for Development of Fundamental Technologies for New Usage of Character and Document Media.
6. Molecular genetic application of hyperspectral image sensing as a method for high-throughput quantitative phenotype analysis(Remote sensing of vegetation)
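In biological research, regardless of material or scale, we often need to distinguish the objects under study. The traits we refer to most frequently are, needless to say, visible shape and color. But are we making full use of this information? Color in particular is rarely handled carefully as quantitative information. In satellite and airborne remote sensing, by contrast, spectral characteristics, which can be regarded as precise color information, serve as the main key for quantifying or identifying objects on the ground. The authors are working to build an experimental system that accurately predicts physiological states at the tissue or cell level by using hyperspectral images, which retain high-resolution spectral data, in the laboratory, where low-noise imaging conditions and genetically homogeneous model plants can be exploited. This article introduces an attempt to bring remote sensing, a technology from a different field, into genetics research, and we hope it will provide useful clues for devising new uses of the technology in ecology.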
7. Distribution Analysis of a Large-Scale Pattern Set Using Minimum Spanning Tree
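The ultimate goal of this research is to reveal the "true distribution" of patterns, covering not only the distribution of a single class but also the relationships among multiple classes. Since collecting every pattern that can exist in the pattern space is infeasible, we approach this goal by collecting as many class-labeled patterns as possible and analyzing their distribution structure. Because the aim is to reveal the true distribution, analysis methods that may introduce errors into the neighborhood relations among patterns, such as approximation by some model or dimensionality reduction, are not appropriate. This paper therefore analyzes the distribution structure of large-scale pattern sets with a network analysis method that can preserve the relative positions of patterns without error. Specifically, each pattern is treated as a node, edges are assigned according to neighborhood relations to form a network, and the structure of the network is analyzed. The minimum spanning tree is used to build the network, and a case study on about 500,000 machine-printed digit images and about 800,000 handwritten digit images shows how the pattern distribution changes as the number of patterns increases.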
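As a rough illustration of the network construction in this abstract (each pattern is a node, and a minimum spanning tree links neighbors), the sketch below builds an MST over a handful of labeled points with Prim's algorithm. The toy 2-D points, the Euclidean distance, and the `same_class_edge_ratio` statistic are assumptions made for illustration only; the abstract does not say which distance or which structural measures the authors used.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of edges (i, j, distance) forming the MST.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known link from each node to the tree
    best_from = [-1] * n        # tree node that provides that cheapest link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the cheapest links of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def same_class_edge_ratio(edges, labels):
    """Hypothetical statistic: fraction of MST edges joining same-class patterns."""
    if not edges:
        return 0.0
    same = sum(1 for i, j, _ in edges if labels[i] == labels[j])
    return same / len(edges)

# Toy data standing in for feature vectors of digit images.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["0", "0", "0", "1", "1"]
edges = minimum_spanning_tree(points)
ratio = same_class_edge_ratio(edges, labels)
```

This O(n^2) version is only practical for small sets; a pattern set with hundreds of thousands of images would need a more scalable construction.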
8. Scene Character Extraction by an Optimal Two-Dimensional Segmentation
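In scene character recognition, correctly extracting character regions from scene images is important. However, because characters in scenes come with diverse layouts and complex lighting conditions, character extraction remains an unsolved problem. This paper proposes a method that extracts characters as accurately as possible by using character recognition as a character/non-character classifier for image subregions together with an optimal two-dimensional segmentation. The method exploits the fact that, in the set of binary images obtained by varying the binarization threshold step by step, each character appears clearly at some step, and it generates multiple hypotheses by treating each connected component in the binary image set as a character hypothesis. Characters are then extracted by selecting the optimal hypotheses within a two-dimensional optimization framework that takes the neighborhood of each hypothesis into account. This idea is realized with graph cut and a component tree whose nodes are the character recognition results obtained while varying the threshold.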
9. Multispectral Imaging of Odor Space
Although odor is invisible, it carries important information, for example as a danger signal. Because odor is a vapor, this information includes the kind of odor material, its density, and its spatial distribution. In this study, odor information is extracted as a change in fluorescence by using a fluorescence probe, an agarose gel film mixed with a fluorescent material. In this article, odor materials are discriminated by using a multi-fluorescence probe, in which several fluorescent materials are mixed, together with multispectral imaging. Because the response of the multi-fluorescence probe can be decomposed into the responses of single fluorescence probes for each odor material, the probe can discriminate the spatial distribution of odor density in a mixed odor.
10. Two Topics on Elastic Matching
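Elastic matching is widely used in speech recognition, character recognition, stereo matching, and other tasks. It is formulated as a kind of optimization problem, and in many cases its globally optimal solution can be obtained by dynamic programming. The globally optimal solution can, however, also be obtained by other methods, such as graph cut and integer linear programming. This article shows that using global optimization methods other than dynamic programming makes it possible to realize elastic matching with enhanced functionality.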
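For readers unfamiliar with the dynamic programming baseline that this abstract takes as its starting point, the sketch below computes a globally optimal elastic matching cost between two one-dimensional sequences (classic dynamic time warping). The absolute-difference local cost and the three allowed transitions are common textbook choices assumed here; they are not taken from the article, and the graph cut and integer linear programming variants it discusses are not shown.

```python
def dtw_distance(a, b):
    """Minimum accumulated matching cost between sequences a and b by DP."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimum accumulated cost of matching a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Extend the cheapest of the three admissible predecessor matchings.
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                     cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                     cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # same shape, different timing
shifted = dtw_distance([1, 2, 3], [2, 3, 4])       # different values
```

Because every cell is filled from already optimal sub-solutions, the value at `cost[n][m]` is the global optimum over all monotonic alignments, which is the property the abstract attributes to dynamic programming.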
11. Kai Kunze, Masakazu Iwamura, Koichi Kise, Seiichi Uchida, Shinichiro Omachi, Activity Recognition for the Mind: Toward a Cognitive "Quantified Self", IEEE Computer, 2013.10.
12. 1. Introduction to Bioimage Informatics(Bioimage Informatics).
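13. Simultaneous Alignment of Multiple Fonts by Congealing.
14. Use of Tracking in Reading-Life Log.
15. Behavior Analysis of Nematodes by Clustering.
16. Tracking of Intracellular Particles Based on Network Flow Optimization.
17. Selection of Global Feature Sequences with Total Ordering.
18. Solving the Elastic Matching Problem by Linear Programming.
19. On the Differences between Character and Non-Character Regions in Scene Images.
20. Analysis of Pattern Distribution Structure Using Relative Neighborhood Graphs.
21. DP Matching for Global Features
We propose a DP matching scheme that can use global features, which capture relationships between distant time points in time-series data. Specifically, from the set of global features defined between arbitrary pairs of time points, features that satisfy the Markov property and also have discriminative ability are selected, which makes DP matching applicable to global features. Evaluation experiments on online digit data confirmed the effectiveness of the proposed method.
22. An Attempt at Specific Detection of Chloroplast Dysfunction by Non-Invasive Spectral Imaging.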
23. Development of odor sensing film for odor imaging sensor.
24. Development of odor sensing film for odor imaging sensor.
25. Development of odor sensing film for odor imaging sensor
In recent years, methods for objectively evaluating the quality and quantity of odors have been required to address problems such as odor nuisance, gas explosions, and gas poisoning. It is also necessary to detect and track harmful odors and to remove their sources. In this study, we tried to measure odor materials by fluorescence quenching. Furthermore, we developed an odor sensing film using fluorescence dyes and an odor gas detection system using the film and a CCD camera. The system could detect odor gas and visualize the shape, spread, and concentration distribution of the odor.
26. How do goalkeepers save penalty kicks? : Analysis of kickers' motion by DP matching
It is known that soccer goalkeepers anticipate shot direction of penalty kicks utilizing the difference in kicking action. In this paper, we analyzed penalty takers' kicking actions with DP matching to identify candidate information sources for successful goalkeeping performance. The results showed that pivoting foot, leg, and hip movements were different among the kick directions just before the moment of foot-ball contact.
27. How do goalkeepers save penalty kicks? : Analysis of kickers' motion by DP matching
It is known that soccer goalkeepers anticipate shot direction of penalty kicks utilizing the difference in kicking action. In this paper, we analyzed penalty takers' kicking actions with DP matching to identify candidate information sources for successful goalkeeping performance. The results showed that pivoting foot, leg, and hip movements were different among the kick directions just before the moment of foot-ball contact.
28. The Data-Embedding Pen
Handwriting is one of the oldest media for human beings. Even after the invention of printing technology, handwriting has remained a popular means of communication to this day. In contrast, handwriting has been less connected to the modern cyber-world. In this talk, a data-embedding pen is introduced, which has been developed in the authors' project, called the universal-pattern project, to enhance the value of handwriting in the cyber-world. The data-embedding pen has a unique function: it injects an ink dot sequence along the handwriting. The pattern of the ink dot sequence represents information such as the writer's ID, the writing date, and other meta-information related to the handwriting.
29. Generation of Character Patterns from Sample Character Images
Various character fonts are used depending on purpose or use. However, designing character fonts requires great effort. In this paper, we propose a method for designing fonts with specific characteristics. The proposed method uses the Patch Transform algorithm, which divides an image into small patches and reconstructs them, and Shape Context, a descriptor of shape information. Experimental results show that the proposed method can automatically design fonts to some extent.
30. An Online Character Recognition Method by DP Matching
In this paper, an online character recognition method by DP matching is proposed.
31. Sound Source Detection
The purpose of this paper is to consider a sound source detection method.
32. Character Detection in Scenery Images
This paper tackles the scenery character detection problem, which is one of the most difficult problems of pattern recognition.
33. Text Detection in Scenery Images
This paper tackles text detection in scenery images.
34. Localization of Multiple Persons
We propose a technique for localizing multiple persons.
35. An Optimization Method for Elastic Matching
This paper describes a method for elastic matching of sequential patterns with nonlinear time warping.
36. Visual Odor Sensing Using Fluorescence Dyes.
37. Visual Odor Sensing Using Fluorescence Dyes
Odor tracking has various applications. For example, an odor could be removed completely by identifying its source through odor tracking. Odor tracking is expected to be especially useful in robotics. In this study, we measured fluorescence quenching by odor materials. Moreover, we tried to recognize odor flow with an optical method using cooled CCD camera imaging and fluorescence dyes as odor detection probes. The fluorescence dyes are used in sheet form and can visualize odor shape and flow by capturing odor information such as the odor concentration gradient.
38. Detection of Granular Objects in Cell by Learning
Advances in microscopy have made it possible to observe moving APP-GFPs in cells. Observing their movement is expected to help elucidate the causes of diseases such as Alzheimer's disease. At present, quantitative analysis is performed manually by eye under the microscope, which demands much effort from researchers. In this report, we therefore attempt to detect APP-GFPs in cells as the first step toward analyzing their movement. Specifically, we preprocess the cell image to remove background noise and then attempt a two-class classification between background and APP-GFP at each pixel. For the classification, we use a one-class support vector machine (OCSVM), which has often been used for pattern detection problems. Through several experimental results, we observe the difficulties of the detection problem and consider possible remedies for future research.
39. An Experimental Study toward Massive Character Recognition
In pattern recognition, increasing the number of prototypes is a simple way to improve accuracy. In this paper, we use over 830,000 manually labeled handwritten patterns and examine the effect of the number of prototypes on handwritten numeral recognition. The analysis showed that the error rate decreases by about 40% when the number of prototypes is increased tenfold. Other analyses showed how the feature space changes as the number of prototypes increases.
40. Consistent Localization of Persons by Integrating Global and Local Observations.
41. Sound Source Detection by Learning.
42. H-031 Object Tracking Using Multi-Modal Analytical DP.
43. Sequential Pattern Recognition by Local Classifiers and Dynamic Time Warping
This paper describes a method for recognizing sequential patterns with nonlinear time warping. The proposed method uses a sequence of local classifiers, each of which is prepared to provide a recognition result (i.e., class label) at a certain sample point. In addition, in order to compensate for nonlinear time warping, the local classifier of the point v has to be assigned to the point t_v of the prototype sequential pattern. Consequently, we must solve the optimal labeling problem and the optimal point-to-point correspondence problem (i.e., the optimal mapping from v to t_v) simultaneously. In the proposed method, this multiple optimization problem is tackled by graph cut. Specifically, the α-expansion algorithm, which is an approximation algorithm for graph cut problems, is employed. After solving the problem, the input pattern is recognized by majority voting over the class labels obtained at the local classifiers. Several penalties are introduced to force neighboring local classifiers to have the same class labels and continuous point-to-point correspondence. To assess the validity of the proposed method, it was applied to an online character recognition task.
44. Sequential Pattern Recognition by Local Classifiers and Dynamic Time Warping
This paper describes a method for recognizing sequential patterns with nonlinear time warping. The proposed method uses a sequence of local classifiers, each of which is prepared to provide a recognition result (i.e., class label) at a certain sample point. In addition, in order to compensate for nonlinear time warping, the local classifier of the point v has to be assigned to the point t_v of the prototype sequential pattern. Consequently, we must solve the optimal labeling problem and the optimal point-to-point correspondence problem (i.e., the optimal mapping from v to t_v) simultaneously. In the proposed method, this multiple optimization problem is tackled by graph cut. Specifically, the α-expansion algorithm, which is an approximation algorithm for graph cut problems, is employed. After solving the problem, the input pattern is recognized by majority voting over the class labels obtained at the local classifiers. Several penalties are introduced to force neighboring local classifiers to have the same class labels and continuous point-to-point correspondence. To assess the validity of the proposed method, it was applied to an online character recognition task.
45. Consistent Localization of Persons by Integrating Global and Local Observations
We propose a technique for identifying the positions of multiple persons by integrating global observation from an environment camera and multiple local observations from wearable cameras. Specifically, the proposed technique establishes the optimal matching between candidate positions obtained from the global observation and images from local observations taken from the viewpoints of individual persons. Mathematically, the proposed technique formulates the matching problem as a weighted bipartite matching problem to obtain an optimal and consistent matching of global and local observations. In this paper, the principle of the proposed technique is described and then experimental results are shown for the evaluation of accuracy.
46. Consistent Localization of Persons by Integrating Global and Local Observations
We propose a technique for identifying the positions of multiple persons by integrating global observation from an environment camera and multiple local observations from wearable cameras. Specifically, the proposed technique establishes the optimal matching between candidate positions obtained from the global observation and images from local observations taken from the viewpoints of individual persons. Mathematically, the proposed technique formulates the matching problem as a weighted bipartite matching problem to obtain an optimal and consistent matching of global and local observations. In this paper, the principle of the proposed technique is described and then experimental results are shown for the evaluation of accuracy.
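The weighted bipartite matching formulation mentioned in this abstract can be illustrated with a small sketch. It assumes a hypothetical cost matrix `cost[i][j]` giving the dissimilarity between the view of wearable camera `i` and candidate position `j`, and it finds the minimum-cost assignment by exhaustive search. The abstract does not state how the costs are computed or which solver is used; for more than a few persons, a polynomial-time method such as the Hungarian algorithm would be the usual choice.

```python
from itertools import permutations

def best_assignment(cost):
    """Minimum-cost one-to-one assignment of rows (persons) to columns (positions).

    Brute force over all assignments, so only suitable for small problems.
    Returns (total_cost, assignment) where assignment[i] is the column for row i.
    """
    n = len(cost)
    m = len(cost[0]) if n else 0
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(m), n):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm

# Hypothetical dissimilarities: 3 wearable cameras x 3 candidate positions.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = best_assignment(cost)
```

Solving for all persons jointly is what makes the result consistent: picking the cheapest position for each person independently would assign persons 0 and 1 to the same position in this example.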
47. Sound Source Detection by Learning
Sound source detection in an image is a difficult inverse problem in which the pixels belonging to the sound source area are to be estimated. The purpose of this paper is to consider an accurate sound source detection method using a machine learning framework. Specifically, the proposed method relies on an AdaBoost-based learning scheme for discriminating whether each pixel belongs to a sound source or not. The learning is done by training weak learners to discriminate positive samples (pairs of image features around sound sources and audio features) from negative samples (pairs of image features distant from sound sources and audio features). This learning scheme simply combines the multimodal information (i.e., image and audio) by using some weak learners that discriminate the samples by a single image feature and others that do so by a single audio feature. The performance of this naive implementation based on a simple combination of multimodal information was examined experimentally, and its essential problem was revealed together with a possible remedy.
48. Sound Source Detection by Learning
Sound source detection in an image is a difficult inverse problem in which the pixels belonging to the sound source area are to be estimated. The purpose of this paper is to consider an accurate sound source detection method using a machine learning framework. Specifically, the proposed method relies on an AdaBoost-based learning scheme for discriminating whether each pixel belongs to a sound source or not. The learning is done by training weak learners to discriminate positive samples (pairs of image features around sound sources and audio features) from negative samples (pairs of image features distant from sound sources and audio features). This learning scheme simply combines the multimodal information (i.e., image and audio) by using some weak learners that discriminate the samples by a single image feature and others that do so by a single audio feature. The performance of this naive implementation based on a simple combination of multimodal information was examined experimentally, and its essential problem was revealed together with a possible remedy.
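A minimal sketch of the kind of learning scheme this abstract describes is given below: AdaBoost with weak learners that each threshold a single feature (decision stumps), so some rounds may pick an image feature and others an audio feature. The two-feature samples, their values, and the stump form are assumptions for illustration; the actual features and weak learners of the paper are not given in the abstract.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Weak learner: thresholds one feature and outputs +1 or -1."""
    return polarity if x[feature] >= threshold else -polarity

def train_adaboost(X, y, rounds):
    """Train AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n  # sample weights, uniform at the start
    model = []
    for _ in range(rounds):
        best = None  # (weighted error, feature, threshold, polarity)
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if stump_predict(xi, f, t, pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol)
        err, f, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, t, pol))
        # Increase the weights of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, f, t, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the weighted vote of all weak learners."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1

# Hypothetical samples: (image feature, audio feature); +1 = sound source pixel.
X = [(0.9, 0.2), (0.8, 0.7), (0.7, 0.1), (0.2, 0.8), (0.1, 0.3), (0.3, 0.6)]
y = [1, 1, 1, -1, -1, -1]
model = train_adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

In this toy data only the image feature separates the classes, so every round selects feature 0; with real multimodal data the selected features would show how much each modality contributes.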
49. Object Tracking by a Combination of Discrete DP and Analytical DP
Visual object tracking by the popular dynamic programming (DP) approach requires a huge amount of computation, although it can provide stable tracking results. As a solution to this computational problem, a tracking technique based on "analytical" DP has been proposed. In analytical DP tracking, the tracking cost is approximated locally at each frame as a single quadratic function. By this quadratic approximation, the tracking cost becomes differentiable, and thus it is possible to find the optimal tracking trajectory very efficiently with the analytical DP procedure. However, as a side effect of using a single quadratic function, the tracking accuracy is not sufficient, especially when the original tracking cost is a complicated function. In this paper, we suggest an improved version of the analytical DP tracker, in which the tracking cost is approximated by multiple quadratic functions.
50. Object Tracking by a Combination of Discrete DP and Analytical DP
Visual object tracking by the popular dynamic programming (DP) approach requires a huge amount of computation, although it can provide stable tracking results. As a solution to this computational problem, a tracking technique based on "analytical" DP has been proposed. In analytical DP tracking, the tracking cost is approximated locally at each frame as a single quadratic function. By this quadratic approximation, the tracking cost becomes differentiable, and thus it is possible to find the optimal tracking trajectory very efficiently with the analytical DP procedure. However, as a side effect of using a single quadratic function, the tracking accuracy is not sufficient, especially when the original tracking cost is a complicated function. In this paper, we suggest an improved version of the analytical DP tracker, in which the tracking cost is approximated by multiple quadratic functions.
51. A General Assignment of Supplementary Information
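To avoid misrecognition that is essentially unavoidable with features alone, a framework of pattern recognition with supplementary information has been proposed. In this framework, a small amount of information that assists class determination, called supplementary information, is used together with the features during recognition to improve recognition performance. The supplementary information can be set freely and is usually set to minimize the misrecognition rate. The question is how to set supplementary information so that the misrecognition rate is minimized. Under the ideal condition that correct supplementary information is always available, the problem has already been formulated and an assignment method has been derived. For use in real environments, however, an assignment method that accounts for observation errors in the supplementary information is required. This paper therefore newly formulates the problem with observation errors in the supplementary information taken into account. The formulation is general and remains valid when the supplementary information is error-free. Experiments using the Mahalanobis distance illustrate that the derived assignment method works effectively.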
52. Recognition of Sequential Patterns by Combining Mutually Constrained Local Classifiers
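This paper studies a method for recognizing time-series patterns in which recognition, that is, class label determination, is performed at each sample point (each time point) and the class is finally decided by majority voting over the class labels. One feature of the method is that mutual constraints can be placed among multiple sample points as needed so that they are labeled with the same class as far as possible. This makes it possible to control how class labels are assigned and enables highly flexible classification. The number of combinations of class label assignments grows exponentially with the total number of sample points. We therefore use a minimum cut algorithm on a graph, known as graph cut, to achieve computation in time polynomial in the total number of sample points. Recognition experiments on online character data verified the effectiveness of the method.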
53. Improvement of Accuracy of Document Image Retrieval by Expanding Queries and Databases
In this report, we propose a method to improve the accuracy of document image retrieval for a camera-pen system. The system acquires handwriting on a printed document as digital ink. In this system, document image retrieval is employed to locate the pen-tip position. Features calculated from the foreground image are used to retrieve the document image and the pen-tip position on it. A problem of this system is that severe perspective distortion in the query image prevents us from acquiring an accurate position. To solve this problem, we improve the discrimination power of the features using a perspective invariant. In addition, we propose two expansion methods that bring the database image and the query image closer. Database expansion stores geometrically distorted images in the database, and query expansion generates transformed images from a query image. From the experimental results, we confirm that the best combination is query expansion with the proposed feature.
54. Improvement of Accuracy of Document Image Retrieval by Expanding Queries and Databases
In this report, we propose a method to improve the accuracy of document image retrieval for a camera-pen system. The system acquires handwriting on a printed document as digital ink. In this system, document image retrieval is employed to locate the pen-tip position. Features calculated from the foreground image are used to retrieve the document image and the pen-tip position on it. A problem of this system is that severe perspective distortion in the query image prevents us from acquiring an accurate position. To solve this problem, we improve the discrimination power of the features using a perspective invariant. In addition, we propose two expansion methods that bring the database image and the query image closer. Database expansion stores geometrically distorted images in the database, and query expansion generates transformed images from a query image. From the experimental results, we confirm that the best combination is query expansion with the proposed feature.
55. Improvement of Accuracy of Document Image Retrieval by Expanding Queries and Databases.
56. Real-Time Nonlinear FEM-Based Simulator for Deforming Volume Model of Soft Organ by Neural Network
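This paper proposes a new method for simulating the deformation of a soft organ model using a neural network. The proposed method estimates the deformation of the model from a combination of basic model deformations (hereafter called deformation modes). That is, the deformation modes are computed in advance by the nonlinear finite element method, and a neural network learns the relationship between the external force applied to the organ and the corresponding deformation modes. The trained neural network imitates the estimation of the model's behavior by nonlinear finite element analysis. Experimental results show that the proposed method greatly reduces computational cost while maintaining nearly the same accuracy as nonlinear finite element analysis.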
57. Part-based recognition of handwritten digits
This paper investigates a part-based recognition method for handwritten digits. In the proposed method, the global structure of digit patterns is disregarded by representing each pattern by just a set of partial patterns. The method comprises two steps: first, each of the J partial patterns of a target pattern is classified into one of ten categories ("0"-"9") by nearest neighbor discrimination with a large database of reference partial patterns. Then, the category of the target pattern is determined by majority voting on the J local recognition results.
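The two steps in this abstract (nearest neighbor classification of each partial pattern, then majority voting) can be sketched as follows. The two-dimensional vectors standing in for partial patterns, the squared Euclidean distance, and the tiny reference set are assumptions for illustration; the abstract does not specify how partial patterns are extracted or compared.

```python
from collections import Counter

def nearest_label(part, references):
    """Label of the reference partial pattern closest to `part`.

    references: list of (vector, label) pairs.
    """
    def squared_distance(ref):
        return sum((p - q) ** 2 for p, q in zip(part, ref[0]))
    return min(references, key=squared_distance)[1]

def recognize(parts, references):
    """Classify each of the J parts independently, then take the majority vote."""
    votes = Counter(nearest_label(p, references) for p in parts)
    return votes.most_common(1)[0][0]

# Hypothetical reference partial patterns with their digit categories.
references = [((0.0, 0.0), "0"), ((1.0, 1.0), "1"), ((0.0, 1.0), "7")]
# J = 3 partial patterns taken from one target pattern.
parts = [(0.1, 0.0), (0.9, 1.0), (1.1, 0.9)]
result = recognize(parts, references)
```

Here one part votes for "0" and two vote for "1", so the target is recognized as "1" even though no global structure is used.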
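58. Marcus Liwicki, Seiichi Uchida, Masakazu Iwamura, Shinichiro Omachi, Koichi Kise, Implementation of a Data-Embedding Pen, IEICE Technical Report. PRMU, Pattern Recognition and Media Understanding, Vol.109, No.418, pp.105-110, 2010.02, This report discusses the principle and implementation results of the data-embedding pen, a pen that embeds supplementary information along the stroke during writing. The supplementary information is applied as minute ink droplets from an inkjet nozzle attached to the pen tip. Data are represented by controlling properties such as the intervals between the ink droplets. In experiments, at least 28 bits were successfully embedded and recovered.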
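To make the idea of representing data by droplet intervals concrete, the sketch below encodes each bit as a short or long gap between consecutive dots along a stroke and decodes by thresholding the gaps. The binary short/long scheme and the constants `SHORT_GAP` and `LONG_GAP` are purely hypothetical; the report's actual encoding is not described in this entry beyond the control of droplet intervals.

```python
SHORT_GAP = 1.0  # hypothetical arc-length gap encoding bit 0
LONG_GAP = 2.0   # hypothetical arc-length gap encoding bit 1

def encode(bits):
    """Return dot positions (arc length along the stroke) encoding the bits."""
    positions = [0.0]  # reference dot at the start of the stroke
    for bit in bits:
        positions.append(positions[-1] + (LONG_GAP if bit else SHORT_GAP))
    return positions

def decode(positions):
    """Recover the bits by thresholding the gaps between consecutive dots."""
    threshold = (SHORT_GAP + LONG_GAP) / 2
    return [1 if (b - a) > threshold else 0
            for a, b in zip(positions, positions[1:])]

# A 28-bit payload, matching the number of bits reported in the entry.
payload = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1,
           0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
dots = encode(payload)
recovered = decode(dots)
```

A real system would also have to cope with varying pen speed and imprecise dot placement, which this idealized round trip ignores.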
59. Quality Analysis of Handwriting Recovery from Pen-Tip Camera Images
Quality of a camera-based handwriting pattern acquisition system is analyzed. The system assumes that the camera is mounted around the pen-tip and acquires a frame sequence, where the fine structure of the paper surface, called paper fingerprint, is captured. SURF keypoints extracted from the paper fingerprint are used for establishing the correspondence between two consecutive images. By repeating this process for every pair of consecutive frames, we can obtain a mosaic image which shows the entire image of the reconstructed handwriting pattern. In this paper, we analyze the causes which degrade the quality of the reconstructed handwriting pattern.
60. License plate detection using local features
This paper proposes a method for detecting license plates using a local feature called SURF and concealing the detected plates. SURF is a method for extracting rotation- and scale-invariant local features, like SIFT. In the proposed method, SURF features are first extracted from training patterns capturing license plates. Then, each SURF feature from a test pattern is compared with the features from the training patterns, and finally, the local areas with high similarity are detected as the location of a license plate. After that, the detected plate area is concealed by blurring.
61. Instance-Based Localization Using Local Features
This paper analyzes location recognition using local features and a simple nearest neighbor approach: after the location of each input local feature is determined by nearest neighbor, the final recognition result for the entire input image is determined by voting. Specifically, if N local features are extracted from the input image, N location candidates are determined by the nearest neighbor (or k-nearest neighbor) from stored scenery images at known locations, and then the location with the most votes is selected as the final location recognition result. Image blocks and SURF were employed and examined as local features.
62. Camera Pen System Using Feature Tracking and Document Image Retrieval
This report presents a camera-pen for acquiring handwriting as digital ink. With a pen we write memos on blank paper as well as annotations on existing documents. Therefore, the camera-pen needs to acquire handwriting in both cases. In particular, for annotations, it is essential not only to acquire the handwriting but also to locate it on the document. We fulfill these requirements by combining two existing methods: one for recovering handwriting on blank paper, and the other for locating handwriting based on document image retrieval. The problem to be solved in combining them is the difference in the area the camera must capture (the former requires fine images of smaller areas, while the latter needs larger areas). In the proposed method, this problem is solved by mosaicing the captured images. Handwriting is recovered from the fine images based on SURF features extracted from the paper surface and the characters printed on it. In addition, mosaicing with SURF features allows us to obtain a larger image. Once a sufficiently large image is obtained, document image retrieval is employed to locate the recovered handwriting. From the experimental results, we discuss the effectiveness of the proposed method as well as future work to be explored.
63. Character Detection from Scenery Images Using Scene Context
Character detection in scene images is the process of finding characters in scene images. Conventional character detection approaches have utilized features of character shapes. However, those approaches are limited because the shape and size of characters vary widely. In this article, we propose an approach which focuses on scene context, such as sky, trees, and buildings. For example, characters rarely appear in the sky, and thus by knowing the sky area in a scene image, we can exclude false positives from that area. We confirmed experimentally that the precision of character detection improved by using scene context information.
64. Handwritten Digit Recognition by Analytical DP Matching
Elastic matching is one of the promising techniques for handwritten character recognition. In this technique, an input pattern is nonlinearly fitted to a reference pattern while minimizing their distance as far as possible. Analytical two-dimensional DP matching is a novel elastic image matching method that reduces computation drastically. This method employs a parametric representation of local matching evaluation. This parametric representation allows an analytical optimization process to be introduced into the DP framework and realizes a drastic reduction of computation. In this paper, we evaluate the performance of this method in handwritten digit recognition.
65. Reconstruction of Handwritings via Pen-Tip Camera Images
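This paper demonstrates that handwriting patterns can be reconstructed from pen-tip camera video. Specifically, we show that the trajectory of the pen tip, that is, the handwriting pattern, can be estimated by a video mosaicing method that focuses on the motion of the fine structure of the paper surface, called the paper fingerprint.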
66. Trachea and Esophagus Classification by AdaBoost
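In tracheal intubation, one of the methods of airway management, the larynx is usually first exposed with a laryngoscope and the position of the glottis is confirmed visually. In actual clinical practice, however, the glottis can be hard to confirm visually for various reasons, such as upper airway obstruction. If this incomplete confirmation leads to erroneous intubation of the esophagus, the airway is not secured, which is dangerous, and forced visual inspection may cause complications such as injury to the cervical spine or teeth. Toward safe and reliable tracheal intubation, we aim to develop an automatic endotracheal intubation system with a small camera mounted at the tip of the stylet. As a component function of this system, this paper proposes a method that automatically identifies from the camera images whether the intubation tube has been inserted into the trachea or the esophagus. Because the ring-shaped cartilage surrounding the trachea is characteristically observed in tracheal images, the method first defines features suited to describing this ring pattern and then builds a trachea/esophagus classifier on them with AdaBoost. In experiments, the trachea and esophagus were discriminated with a high classification rate of 97.6%, confirming the effectiveness of the proposed method.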
67. 4. Problem Analysis of Pattern Recognition and Media Understanding(Grand Challenges in Pattern Recognition and Media Understanding)
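This article considers the problems of pattern recognition and media understanding, focusing in particular on image recognition and understanding. We first analyze and organize the difficulties peculiar to image understanding. Next, we take up face image recognition, which has recently spread rapidly, analyze why its practical application succeeded, and clarify what makes it special. Finally, we view the various image recognition and understanding technologies developed so far in a unified way in the space spanned by three axes, namely physical models, statistical models, and semantic models, and consider what approaches should be taken in the future.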
68. 5. The Ten Biggest Challenges in Pattern Recognition and Media Understanding(Grand Challenges in Pattern Recognition and Media Understanding)
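In the field of pattern recognition and media understanding, the problems to be solved in the next ten years include modeling when the image generation process is uncertain, overcoming the difficulties of statistical models, and problems concerning meaning and content. Using as criteria that a theme involves these technical elements, poses a high degree of scientific and technological challenge, and would have a large social impact if successful, we give examples of important themes to be tackled over the next ten years. These are: recognition, understanding, and evaluation of human behavior; association and fully automatic structuring of image information; extraction of meaningful information from visual information; detection and presentation of visual information that people lack; danger prediction by observing situations; image diagnosis in health and medical care; environment recognition by observing people; recognition of characters in general scenes; processing of the enormous amount of information obtained from global-scale sensors; and, ultimately, describing the meaning of images at the same level as humans. This article explains these ten grand challenge themes concretely.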
69. Visual Tracking Based on Global Optimization : DP Tracking
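Tracking an object in video is formulated as the problem of optimally estimating the object's displacement between frames. This paper proposes a tracking method that uses dynamic programming (DP) to obtain the globally optimal solution. DP optimization, conventionally treated as a kind of breadth-first search, has the problem that the search width becomes very large and the amount of computation increases as the image size and the number of parameters grow. To address this, this paper applies an analytical solution of DP to the tracking problem. In this method, the local error function used to evaluate the optimization is approximated by a quadratic function, which introduces optimization by differentiation into the DP optimization process. The optimal solution can be obtained analytically and quickly without breadth-first search, which makes the method particularly effective for tracking problems. This paper presents the formulation of the method and experimental results.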
70. Visual Tracking of an Object with its Motion Information
Tracking of a moving robot in surveillance video is an important task for the coexistence of human beings with robots. An essential technology for managing an environment in which human beings and moving robots coexist is the separation and tracking of moving robots. For this task, the moving robot should be separated from other moving objects, i.e., human beings. We assume that the robot provides its additional motion information to the surveillance system to ease the task. The robot can be tracked apart from the other objects as a moving region that is consistent with the additional motion information. For this purpose, we modify a tracking algorithm based on the particle filter in order to incorporate the additional motion information. The results of an experiment on real surveillance video sequences indicate that the proposed framework can separate and track a moving robot in the presence of several walking persons.
71. Recovering Handwritings via Pen-tip Camera
Toward realization of a "writing-life-log", a camera-based handwriting pattern acquisition system is proposed. The camera is attached around the tip of an ordinary pen. It continuously captures frame images around the pen tip. Our problem is video mosaicing of those frame images by perspective registration of consecutive frames. A key idea is to use the precise structure of the paper surface, called paper fingerprint, for the registration. Specifically, the perspective transformation is estimated by using correspondences of SURF feature points extracted on the paper surface. Since the precise structure can be captured stably as SURF feature points from the pen-tip camera, accurate registration of video frames can be expected.
72. Dewarping of Planar Document Image without Layout Constraints
For user convenience, the processing of document images captured by a digital camera has attracted much attention. However, most existing processing methods require an upright image such as one captured by a scanner. Therefore, we have to remove the perspective distortion of a camera-captured image before processing. Although rectification methods for this distortion exist, most of them work only under certain assumptions: the borders of the document are available, text lines are parallel, a stereo camera or a video sequence is available, and so on. In this paper, we propose a layout-free rectification method that requires none of the above assumptions. We confirm the effectiveness of the proposed method by experiments.
73. Shape Analysis of Characters in Medieval Printed Documents
In the project "The Development of Anglo-Saxon Language and Linguistic Universals" (organized by Senshu University, Japan), an OCR system for medieval English manuscripts is being developed. One difficulty in the development is the variability of character shapes caused by rough paper surfaces, heavy or light print, degradation during binarization, etc. Another difficulty is that each manuscript has its own character shapes, and thus we must prepare document-specific reference patterns through a manual labeling process on 5, 10, or more pages. Using 100 pages of ground-truthed "Pierce Plowman" (printed in 1550), recognition performance was observed under different numbers of labeled pages. The effect of an active shape model for compensating character shape variation was also observed.
74. Challenges in character recognition research
Character recognition and document understanding might be seen as a solved problem, since nowadays there are many commercial and practical systems produced through the great efforts of pioneers in this research area. We can, however, identify a huge number of open problems. In this report, after briefly reviewing those open problems, several challenging problems are proposed to encourage and invite young researchers to this interesting and never-ending research area.
75. Conspicuous Character Patterns
Characters in scene images are often hard to detect, i.e., not conspicuous. Thus, one of the main tasks in camera-based character recognition is the detection of characters in scene images. There have been many attempts at this difficult task. This paper investigates the essence of the task, that is, "what is a conspicuous character image?" To obtain an example of a conspicuous character image, we use the relation between the subspace of non-character images and that of character images. Specifically, we try to select, from the set of character images, the image furthest from the subspace of non-character images.
76. Human Activity Recognition Based on Camera Selection by Boosting
A gesture recognition method for multi-camera surveillance is proposed. The proposed method has the following three characteristics desirable for practical surveillance. First, the final recognition result is obtained by integrating the recognition results from individual cameras in a complementary manner. Second, camera calibration is not necessary. Third, various sensors other than cameras can be incorporated. The complementary integration is done systematically by AdaBoost-based training. In addition, we use the local feature which is less discriminative to the important difference among the gestures.
77. Particle Filter with Mode Switching
Tracking a moving robot in surveillance video is an important task for the coexistence of human beings and robots. An essential technology for managing an environment shared by human beings and moving robots is the separation and tracking of the moving robots. For this task, the moving robot should be separated from other moving objects, i.e., human beings. To ease the task, we assume that the robot provides its motion information to the surveillance system. The robot can then be distinguished from the other objects as the moving region consistent with this motion information. For this purpose, we modify a particle-filter-based tracking algorithm to incorporate the motion information.
78. A Primary Study on a Data-Embedding Pen.
79. HMM for On-Line Handwriting Recognition by Selective Use of Pen-Coordinate Feature and Pen-Direction Feature
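To achieve high-accuracy online character recognition, this paper proposes a hidden Markov model (HMM) that can use the direction feature and the coordinate feature selectively and appropriately. Both are basic features for online character recognition, yet they have entirely different properties: the direction feature is stationary within a line segment, whereas the coordinate feature is always nonstationary. Treating the two features equally within the HMM framework is therefore problematic. In fact, conventional methods have often used only the direction feature and omitted the coordinate feature. The proposed HMM uses the direction feature as the output symbol of the intra-state self-transition and the coordinate feature as the output symbol of the inter-state transition. As a result, the direction feature is evaluated in stationary parts where the line direction is constant, and the coordinate feature is evaluated in transitional parts where the line direction changes. Through stroke-order-free recognition experiments on multi-stroke characters (kanji) and a detailed discussion of the results, we show that this selective use of features greatly improves recognition accuracy over conventional methods.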
80. An HMM Representing Stroke Order Variations and Its Application to Online Character Recognition
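Aiming at more accurate stroke-order-free online character recognition, this paper examines two points: (i) the construction of a statistical model of stroke-order variations and (ii) its use in recognition. Stroke-order-free recognition generally suffers from misrecognition caused by allowing unnatural stroke correspondences, and the proposed stroke-order variation model can suppress such errors. The model is formulated as a probabilistic extension of a graph model for stroke-order-free recognition (the cube graph) and results in a kind of hidden Markov model (HMM) that can handle the likelihood of the character shape and the likelihood of the stroke order simultaneously. Recognition experiments using the publicly available online character database "HANDS-kuchibue.d-97-06-10" demonstrated the effectiveness and validity of introducing the stroke-order variation model.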
81. 2D/3D Registration by Back Projection and Geometrical Constraints
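To easily realize texture mapping, in which a texture image taken by a color sensor is pasted onto a geometric model acquired by a range sensor, it is desirable to determine the relative position between the color and range sensors from the texture image and the geometric model alone. This paper proposes a method that combines a global technique based on geometric constraints with a local technique based on edge correspondence to estimate the relative position and pose between the sensors accurately and robustly against variations in the initial values, thereby registering the texture image to the geometric model. The method first extracts ridge lines and planar regions from the texture image. Next, it back-projects these ridge lines and planar regions onto the geometric model and, while estimating the geometric constraints of the target, obtains an initial estimate of the relative position and pose between the sensors under those constraints. Finally, it determines the relative position and pose based on the correspondence between edges of the texture image and edges of the geometric model. In experiments, the registration success rate improved from 41% to 75% compared with a conventional method based on edge correspondence.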
82. DP-1-3 GRAND CHALLENGE FOR PATTERN RECOGNITION AND MEDIA UNDERSTANDING : FROM THE VIEWPOINT OF DOCUMENT ANALYSIS AND RECOGNITION.
83. Investigation of using pen-coordinate and pen-directive features by HMM in the online character recognition
A new hidden Markov model (HMM) is proposed to represent character strokes of on-line handwriting patterns. The proposed HMM deals with two typical features describing strokes, the pen-direction feature and the pen-coordinate feature. These two features are quite different in their stationarity; the pen-direction feature is stationary within every line segment of a stroke, whereas the pen-coordinate feature is not. To handle these contrasting features in a single HMM, they are used selectively. Specifically, the pen-direction feature is output repeatedly at the intra-state transition, whereas the pen-coordinate feature is output once at the inter-state transition. The usefulness of the proposed HMM over conventional HMMs was shown through stroke-order-free Chinese character recognition experiments.
84. Recognition and Analysis of English Historical Documents : Purpose, Problems, and Preliminary Study
One of the goals of the project "The Development of Anglo-Saxon Language and Linguistic Universals" (organized by Senshu University, Japan) is to develop an OCR system for medieval English manuscripts, that is, historical handwritten English documents. In this report, we discuss characteristics of medieval English printed documents as a preliminary study toward the above goal. In addition, a preliminary recognition experiment was conducted on a small-scale character set from a medieval printed document called "Pierce Plowman" (printed in 1550).
85. An Improvement of Instance-Based Skew Estimation
The purpose of this report is to improve an instance-based deskewing technique that is free from the conventional assumption that text lines are straight and parallel. The instances describe the relation among the skew angle, a skew variant, and a skew invariant in a compact manner. The main idea of the improvement is to increase the number of skew invariants for more stable estimation of the skew angle. An experiment on 55 document images showed that their skew angles were successfully estimated with errors smaller than 2.0 degrees.
86. Boosting-Like Training for Early Recognition and Its Application to Online Character Recognition
This paper describes an algorithm for recognizing sequential patterns at their beginning. The algorithm is based on a boosting-like scheme for training weak learners prepared at individual frames. Training samples misrecognized by the weak learner at a certain frame are heavily weighted when training the weak learner at the next frame. The algorithm was applied to an online character recognition task to show its usefulness.
87. Document Skew Estimation by Instance-Based Learning
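We propose a method that learns, as instances, the variants and invariants of each character under rotation and uses them to estimate the rotation angle of a document image. Because the method efficiently estimates the rotation angle character by character, it does not require the assumption that text lines are laid out straight and parallel, and it is therefore applicable to documents with a variety of layouts.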
88. Relationship Between Errors of Supplementary Information and Misrecognition Rates
Pattern recognition with supplementary information is a new pattern recognition framework that determines an output class by combining a classifier's output with supplementary information suggesting the true class. Under the condition that the supplementary information contains no error, we have proposed a theory that reduces misrecognition rates. However, in the real world, no measurement can be observed without error. Thus, in this paper, we discuss how to reduce misrecognition rates using erroneous supplementary information, and show experimentally that misrecognition rates can be reduced.
89. Supplementary Information Embedment with Area Ratio for Camera-Based Character Recognition
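To recognize characters in real environments with high accuracy using a digital camera as the input device, methods that present supplementary information to assist recognition together with the character image have been studied. The supplementary information must be presented in a form that looks natural to humans and must be extractable robustly against geometric deformation. As a method that satisfies these requirements, this paper proposes presenting supplementary information by means of an area ratio. That is, assuming that a character pattern is printed in two colors, the pattern is designed so that the ratio of the areas of the two color regions takes a specific value. Concretely, a shadow is added to the character, or the outline is drawn in a different color. Such designs are already used for character patterns, and the proposed method merely changes their line width or area. The proposed method can therefore be applied widely to various uses. The area ratio is invariant to affine transformation and is expected to be extracted without error even in environments subject to affine transformation. We create character patterns with embedded supplementary information and conduct an experiment on extracting the information from character patterns in images taken with a digital camera, confirming the effectiveness of the proposed method. We also conduct a character recognition experiment using the supplementary information and confirm that recognition accuracy improves.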
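The affine invariance of the area ratio stated in the abstract above can be checked with a small numerical sketch. The two polygons, the matrix `A`, and the offset `t` are arbitrary illustrative values, and the helper names (`polygon_area`, `affine`, `area_ratio`) are hypothetical; none of them come from the paper. An affine map scales every area by |det A|, so the ratio of two areas is unchanged.

```python
from fractions import Fraction

def polygon_area(pts):
    """Area of a simple polygon with integer vertices (shoelace formula)."""
    s = 0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return Fraction(abs(s), 2)

def affine(pts, A, t):
    """Apply the affine map p -> A p + t to every vertex."""
    (a, b), (c, d) = A
    return [(a * x + b * y + t[0], c * x + d * y + t[1]) for x, y in pts]

def area_ratio(region1, region2):
    """Ratio of the areas of two regions (stand-ins for the two colors)."""
    return polygon_area(region1) / polygon_area(region2)

# Hypothetical two-color design: a square "body" and a triangular "shadow".
body = [(0, 0), (4, 0), (4, 4), (0, 4)]
shadow = [(0, 0), (2, 0), (0, 3)]

# Arbitrary non-singular affine distortion (det A = 5).
A = ((2, 1), (1, 3))
t = (7, -2)

before = area_ratio(body, shadow)
after = area_ratio(affine(body, A, t), affine(shadow, A, t))
print(before, after)  # both are 16/3
```

Exact rational arithmetic is used so that the equality holds without rounding; with real camera images the extracted ratio would only be approximately preserved.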
90. On methods for taking pedestrians' paths for prediction.
91. FSA-Guided Optimal Segmentation and Its Application to Camera-Based Character Recognition
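This paper proposes an optimal segmentation method for one-dimensional signals based on a combination of dynamic programming (DP) and a finite state automaton (FSA). Specifically, a property of the signal (for example, that intervals of high and low signal values alternate) is represented as an FSA and incorporated into the segmentation problem as a constraint, and the globally optimal segmentation under that constraint is obtained efficiently by DP. Introducing the FSA excludes segmentation results that are inconsistent with the signal property, so improved accuracy can be expected. Furthermore, the correspondence between FSA states and intervals makes it possible to assign a meaning to each interval. This paper describes the method in detail and evaluates its effectiveness by applying it to a certain kind of recognition task for character images in real environments.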
92. Fast 3D Shape Reconstruction of Moving Object by Parallel Fast Level Set Method
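The volume intersection method and multi-view stereo have been proposed as methods that acquire geometric and photometric information around the entire circumference of a target object in a scene with many cameras and generate images from arbitrary viewpoints. However, these methods target a single object or multiple objects without occlusion; when multiple objects exist in a scene and mutual occlusion occurs between them, it has been difficult to reconstruct the shape of each object simultaneously. To address this problem, we have previously built a system that applies the Fast Level Set Method, a fast boundary tracking technique, to multiple stereo range images and reconstructs the 3D shapes of multiple target objects robustly against occlusion. In this paper, we implement that system on a PC cluster of eight computers and achieve faster 3D shape reconstruction through parallel computation of the Fast Level Set Method. We also propose a method that, when a target object moves, predicts its direction of motion and reduces the computational load on the computers processing the moving object, so that an accurate 3D shape of the moving object is reconstructed without delay. Furthermore, a measurement experiment on dance shows that, even when the target moves quickly, more accurate 3D shape reconstruction is possible than with the conventional system.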
93. Analytical DP Matching
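Matching by dynamic programming, so-called DP matching, is an elastic matching technique widely used in pattern recognition and image processing. DP matching solves a discretized optimization problem by breadth-first search and has therefore been difficult to apply to problems in which the search width becomes very large. To solve this problem, this paper proposes analytical DP matching. By approximating the local error function used to evaluate the matching by a quadratic function, the method can give an approximate solution (the exact solution of the quadratically approximated problem) analytically, without breadth-first search. This paper derives a matching algorithm for one-dimensional patterns and verifies experimentally, using online character data, that it can be applied to practical problems.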
94. Detection of Similar Sub-Sequence by Logical DP Matching
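This paper proposes a method for detecting similar sub-sequences by logical DP matching. Logical DP matching is an algorithm that performs nonlinear matching between two patterns using a logical function called the support as its criterion. A feature of the method is that the start and end points of the multiple similar sub-sequences that exist between the patterns are determined optimally during the matching process. As an application for evaluating the effectiveness of the method, we also examine the extraction of basic motions from gestures. The experimental results demonstrate the basic performance of the method.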
95. Achieving a 100% Recognition Rate by Using Supplementary Information: Theory and Practice of an Error-Free Pattern Recognition Method.
96. Separation of Touching Characters Using DP Matching
Ideally, an OCR system would partition the set of connected black components on a page into subsets representing individual characters. However, this approach is inadequate if some component partially belongs to several touching characters. We present a DP matching-based method that cuts such a component apart, given a hypothetical classification for the leftmost part. Our method produces better quality cuts than well-known methods, particularly in mathematical expressions, where characters are often slanted and may touch in widely varying configurations. A good cut allows single-character recognition techniques to be applied to the cut part and the residual image, in order to judge whether the hypothetical classification was correct.
97. Evaluation of Data Extraction Accuracy toward the Realization of Data-embedding Pen
In order to use handwriting as a universal man-machine interface, we assume a pen device, the data-embedding pen, which can embed digital data into handwriting with invisible ink in real time. This paper evaluates the accuracy of extracting data from ink dots embedded in real patterns. Furthermore, a method to extract ink dots and recover data is proposed.
98. Desynchronization of Features on Pattern Matching
Desynchronization of feature sequences and its effect on online character recognition based on elastic matching are investigated. The investigation has provided the following results. First, desynchronized use of features (e.g., x-coordinate and y-coordinate) realizes wide-range shape adaptation between character patterns. Second, local desynchronization is quite useful for improving the recognition accuracy.
99. Skew Detection of Document Images by a Combination of Variant and Invariant
A novel deformation estimation technique is proposed and applied to document skew estimation. The proposed method has two properties. First, it utilizes an invariant and a variant of the target deformation to be estimated. Second, it is an instance-based method in which the deformation is estimated by referring to stored instances that describe the relation among the deformation, the variant, and the invariant. A skew estimation experiment on 44 document images showed that the skew angles of 42 document images were successfully estimated with errors smaller than 2.0 degrees.
100. A Predictive DP Matching Algorithm and Its Application to On-Line Character Recognition
For on-line character recognition, predictive DP matching is proposed, in which two physically different features, coordinate features and directional features, are handled in a unified manner. For this unification, the distance of the directional features is converted into a distance of the coordinate features by a feature prediction technique. An experimental result showed that predictive DP matching could attain a higher recognition rate than conventional DP matching, which requires costly optimization of the weight that balances the two features.
101. Statistical Extension of Logical DP Matching for the Detection of Similar Sub-Sequences
The logical DP matching algorithm has been proposed for the detection of similar sub-sequences from two sequential patterns. The logical DP algorithm evaluates the local dissimilarity between the two patterns based on simple Euclidean distance. In this paper, the local dissimilarity is extended to a statistical distance, called the Bhattacharyya distance, to take the spatial variations of each pattern into account. Experiments on the detection of similar sub-sequences of gestures were conducted, and the effect of the statistical extension was confirmed through the detection accuracy.
102. Pattern Recognition with Supplementary Information
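This paper examines a scheme that prevents misrecognition by feeding information about the class to which a pattern belongs (supplementary information) into the classifier together with the pattern and deriving an answer consistent with both. In this scheme, the recognition rate approaches 100% as the amount of supplementary information increases. The challenge is therefore not how to improve recognition performance, as in conventional pattern recognition, but how small the amount of supplementary information needed to achieve a given recognition rate can be made. This paper derives the relationship between how supplementary information is assigned and recognition performance and demonstrates it experimentally.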
103. DP Matching : Fundamentals and Applications
Dynamic programming (DP) matching was developed in the early 1970s and has been widely employed in various pattern recognition and image processing problems as an efficient algorithm that provides optimal elastic matching (nonlinear correspondence) between two patterns. This paper describes DP matching for 1D-1D, 1D-2D, and 2D-2D pattern matching problems, together with techniques for reducing computational complexity. Several combinations with learning algorithms are also described.
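As an illustration of the 1D-1D case mentioned above, the following is a minimal sketch of DP matching between two numeric sequences. It assumes the absolute difference as the local distance and the common symmetric step pattern (vertical, horizontal, diagonal); the function name `dp_matching` and these choices are illustrative assumptions, not necessarily the formulation used in the report.

```python
import math

def dp_matching(a, b):
    """Return (total cost, warping path) of the optimal elastic
    matching between 1D sequences a and b."""
    n, m = len(a), len(b)
    # g[i][j]: minimum accumulated cost of matching a[:i+1] with b[:j+1]
    g = [[math.inf] * m for _ in range(n)]
    # back[i][j]: predecessor cell on the optimal path into (i, j)
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(a[i] - b[j])  # local distance (an assumption)
            if i == 0 and j == 0:
                g[i][j] = d
                continue
            candidates = []
            if i > 0:
                candidates.append((g[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((g[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:
                candidates.append((g[i - 1][j - 1], (i - 1, j - 1)))
            best, prev = min(candidates, key=lambda c: c[0])
            g[i][j] = best + d
            back[i][j] = prev
    # Trace the optimal correspondence back from the end point.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return g[n - 1][m - 1], path

cost, path = dp_matching([0, 1, 2, 3], [0, 1, 1, 2, 3])
print(cost, path)
```

In the usage line, the second sequence repeats one sample, and the nonlinear correspondence absorbs the repetition so the total cost is zero. The table has n × m cells, which is the computational cost that complexity-reduction techniques aim to lower.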
104. A Report on MIRU2006 Young Researchers' Program
The MIRU Young Researchers' Program was held in conjunction with the MIRU2006 symposium to promote exchanges between young researchers in CV and PR. This report gives a summary of the program.
105. Koichi KISE, Masakazu IWAMURA, Yoshio FURUYA, Shinichiro OMACHI, Seiichi UCHIDA, Better Decision Boundary for Pattern Recognition with Supplementary Information, IEICE Technical Report, Vol.106, No.PRMU-376, pp.159-164, 2006.11.
106. Early Recognition and Prediction of Gestures for Embodied Proactive Human Interface
This paper concerns three topics for realizing an embodied "proactive" human interface, in which a humanoid is used as an interface capable of reacting to a user's gesture input before the gesture ends. The first topic is early recognition of gestures: the recognition result of a gesture is provided at the beginning part of the gesture. The second topic is motion prediction: the subsequent posture of the person who makes a gesture is predicted by using the result of early recognition. The third topic is a network model constructed for improving the performance of early recognition and motion prediction. The effectiveness of these methods was shown by experimental results.
107. Effect of Shifting Decision Boundaries in Pattern Recognition with Supplementary Information : Experimental Research Using Artificial Samples From Normal Distributions
Pattern recognition with supplementary information, which differs from conventional pattern recognition, has been proposed. This framework can decrease error rates by using not only a pattern itself but also supplementary information that assists recognition. In the previous report, an experiment using a character data set confirmed that a better recognition rate is achievable by shifting the decision boundaries away from the Bayesian ones. However, this was not a strict proof of achievability because the Bayesian decision boundaries were estimates. In this report, to confirm the achievability in the strict sense, we use artificial samples that follow normal distributions. This enables us to obtain the Bayesian decision boundaries precisely.
108. Telecommunication via embodied proactive interface
The purpose of this research is the development of a new interface called the "proactive interface" for natural telecommunication. The proactive interface has two features. The first is an embodied device using robot technology: instead of virtual media, humanoids are used as the interface for presenting the gestures of a user to a distant user. The second is estimation of the user's intention to compensate for system delays; a recognition-based gesture prediction scheme can be used for the estimation. A two-way telecommunication system connecting two distant campuses was developed to demonstrate the proactive interface.
109. Telecommunication via embodied proactive interface
The purpose of this research is the development of a new interface called the "proactive interface" for natural telecommunication. The proactive interface has two features. The first is an embodied device using robot technology: instead of virtual media, humanoids are used as the interface for presenting the gestures of a user to a distant user. The second is estimation of the user's intention to compensate for system delays; a recognition-based gesture prediction scheme can be used for the estimation. A two-way telecommunication system connecting two distant campuses was developed to demonstrate the proactive interface.
110. Telecommunication via embodied proactive interface
The purpose of this research is the development of a new interface called the "proactive interface" for natural telecommunication. The proactive interface has two features. The first is an embodied device using robot technology: instead of virtual media, humanoids are used as the interface for presenting the gestures of a user to a distant user. The second is estimation of the user's intention to compensate for system delays; a recognition-based gesture prediction scheme can be used for the estimation. A two-way telecommunication system connecting two distant campuses was developed to demonstrate the proactive interface.
111. Recognition and Understanding of Characters and Documents Using Digital Cameras
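With the spread and advancement of digital still cameras and video cameras, there is growing demand to use characters and documents in captured images for information processing. This article explains what can be gained through camera-based recognition and understanding of characters and documents, what problems stand in the way of realizing it, and what efforts are currently under way. It also touches on remaining research issues and introduces new attempts that the authors are pursuing from the viewpoint of application to user interfaces.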
112. Category Data Embedding for Camera-Based Character Recognition
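This study aims to recognize character patterns in the three-dimensional real environment with accuracy comparable to that of barcodes. Character patterns in real environments undergo various distortions depending on the capture conditions, for example projective distortion. Considerable difficulty is therefore expected in achieving this goal as an extension of ordinary character recognition methods. This paper instead examines a scheme that embeds, in the characters themselves, information that reinforces machine readability. Specifically, a stripe pattern is embedded in the character image. The cross ratio computed from the widths of the stripes that make up this pattern always takes a constant value no matter how the character pattern is projectively distorted. Therefore, if categories are associated with cross-ratio values in advance, the extracted cross ratio can be used as a cue for classification at recognition time. Simulation experiments showed that combining the cross ratio with character shape information yields very high recognition accuracy even under projective distortion.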
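The projective invariance of the cross ratio that this abstract relies on can be checked numerically. The sketch below assumes the standard cross ratio of four collinear points, taken here to be the stripe boundaries along a scan line; the stripe widths, the map coefficients, and the helper names (`cross_ratio`, `projective_1d`) are illustrative assumptions rather than the paper's exact definition.

```python
from fractions import Fraction

def cross_ratio(p0, p1, p2, p3):
    """Standard cross ratio of four collinear points given by 1D coordinates."""
    return Fraction((p2 - p0) * (p3 - p1)) / Fraction((p2 - p1) * (p3 - p0))

def projective_1d(x, a, b, c, d):
    """1D projective map x -> (a x + b) / (c x + d), with a d - b c != 0."""
    return Fraction(a * x + b, c * x + d)

# Hypothetical stripe widths 1, 2, 3 give boundaries at 0, 1, 3, 6.
widths = [1, 2, 3]
boundaries = [0]
for w in widths:
    boundaries.append(boundaries[-1] + w)

# Arbitrary projective distortion of the scan line (a d - b c = 5).
distorted = [projective_1d(x, 2, 1, 1, 3) for x in boundaries]

original_cr = cross_ratio(*boundaries)
distorted_cr = cross_ratio(*distorted)
print(original_cr, distorted_cr)  # both are 5/4
```

The individual stripe widths change under the distortion, but the cross ratio does not, which is what allows a cross-ratio value to serve as a category label. Exact rational arithmetic keeps the comparison free of rounding error.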
113. Christopher Malon, Masakazu Suzuki, Seiichi Uchida, Mathematical Symbol Recognition Using Support Vector Machines, IEICE Technical Report. TL, Thought and Language, Vol.105, No.612, pp.49-54, 2006.02, Mathematical formulas challenge an OCR system with a range of similar-looking characters whose bold, calligraphic, and italic varieties must be recognized distinctly, though the fonts to be used in an article are not known in advance. We describe the use of support vector machines (SVM) to learn and predict about 300 classes of styled characters and symbols.
114. Christopher Malon, Masakazu Suzuki, Seiichi Uchida, Mathematical Symbol Recognition Using Support Vector Machines, IEICE Technical Report. PRMU, Pattern Recognition and Media Understanding, Vol.105, No.614, pp.49-54, 2006.02, Mathematical formulas challenge an OCR system with a range of similar-looking characters whose bold, calligraphic, and italic varieties must be recognized distinctly, though the fonts to be used in an article are not known in advance. We describe the use of support vector machines (SVM) to learn and predict about 300 classes of styled characters and symbols.
115. Mosaicing-by-Recognition with Interframe Matching
The authors have investigated a Mosaicing-by-Recognition technique, in which video mosaicing and text recognition are optimized simultaneously and collaboratively in a single step. Specifically, multiple frames in which a long line of text appears are captured by a moving camera and are optimally matched and concatenated under the guidance of the text recognition framework. In this report, we improve the Mosaicing-by-Recognition technique by introducing interframe matching.
116. Mosaicing-by-Recognition with Interframe Matching
The authors have investigated a Mosaicing-by-Recognition technique, in which video mosaicing and text recognition are optimized simultaneously and collaboratively in a single step. Specifically, multiple frames in which a long line of text appears are captured by a moving camera and are optimally matched and concatenated under the guidance of the text recognition framework. In this report, we improve the Mosaicing-by-Recognition technique by introducing interframe matching.
117. On-line data embedding into handwriting patterns
In order to use handwriting as a universal man-machine interface, we assume a pen device, the data-embedding pen, which can embed digital data into handwriting with invisible ink in real time. This paper discusses the system design, applications, and required technologies around the data-embedding pen. In particular, a novel stroke recovery algorithm is proposed for retrieving the embedded data along the writing order. In the algorithm, the embedded data is used to help the recovery. A simulation experiment showed that the algorithm can attain high accuracy in stroke recovery and data retrieval.
118. On-line data embedding into handwriting patterns
In order to use handwriting as a universal man-machine interface, we assume a pen device, the data-embedding pen, which can embed digital data into handwriting with invisible ink in real time. This paper discusses the system design, applications, and required technologies around the data-embedding pen. In particular, a novel stroke recovery algorithm is proposed for retrieving the embedded data along the writing order. In the algorithm, the embedded data is used to help the recovery. A simulation experiment showed that the algorithm can attain high accuracy in stroke recovery and data retrieval.
119. Analytical DP matching and its application to pattern recognition
DP (dynamic programming) matching is one of the most fundamental techniques for various pattern recognition and image processing problems. This report describes a novel DP matching algorithm, called analytical DP matching. Conventional DP matching is organized as a breadth-first search algorithm. Thus, its computational complexity depends on the search width. In contrast, analytical DP matching has a different organization; specifically, it is an analytical solution method and can provide optimal matching with computational complexity which does not depend on the search width. The details of the algorithm and performance evaluation results are discussed in this report.
120. Analytical DP matching and its application to pattern recognition
DP (dynamic programming) matching is one of the most fundamental techniques for various pattern recognition and image processing problems. This report describes a novel DP matching algorithm, called analytical DP matching. Conventional DP matching is organized as a breadth-first search algorithm. Thus, its computational complexity depends on the search width. In contrast, analytical DP matching has a different organization; specifically, it is an analytical solution method and can provide optimal matching with computational complexity which does not depend on the search width. The details of the algorithm and performance evaluation results are discussed in this report.
121. On-line Character Recognition based on Subspace Method and DP Matching
The authors have investigated an online character recognition technique that uses a quadratic discriminant function of a difference vector, which expresses a global feature of a character. This technique successfully reduced misrecognitions due to overfitting. On the other hand, it also caused misrecognitions due to insufficient training samples. In this report, which is a further study of the previous investigation, we investigate a method based on the subspace method and DP matching. The results of a recognition experiment on the UNIPEN database showed the usefulness of the proposed technique.
122. On-line Character Recognition based on Subspace Method and DP Matching
The authors have investigated an online character recognition technique that uses a quadratic discriminant function of a difference vector, which expresses a global feature of a character. This technique successfully reduced misrecognitions due to overfitting. On the other hand, it also caused misrecognitions due to insufficient training samples. In this report, which is a further study of the previous investigation, we investigate a method based on the subspace method and DP matching. The results of a recognition experiment on the UNIPEN database showed the usefulness of the proposed technique.
123. Area Ratio as Supplementary Information for Camera-Based Character Recognition
In order to achieve highly accurate recognition of characters in a scene image taken with a digital camera, there have been some attempts to present supplementary information for recognition together with the character image. The information should be robust against geometric distortions, since an image taken by a digital camera is usually geometrically distorted. In this paper, we propose a method of embedding information in a character pattern by designing the pattern in two colors so that the information is embedded as the area ratio of the two colors. Since the area ratio is affine invariant, it is expected to be extracted correctly even if the character pattern is affine-transformed. We evaluate generated character patterns with the embedded information and discuss the effectiveness of the proposed method.
124. Area Ratio as Supplementary Information for Camera-Based Character Recognition
In order to achieve highly accurate recognition of characters in a scene image taken with a digital camera, there have been some attempts to present supplementary information for recognition together with the character image. The information should be robust against geometric distortions, since an image taken by a digital camera is usually geometrically distorted. In this paper, we propose a method of embedding information in a character pattern by designing the pattern in two colors so that the information is embedded as the area ratio of the two colors. Since the area ratio is affine invariant, it is expected to be extracted correctly even if the character pattern is affine-transformed. We evaluate generated character patterns with the embedded information and discuss the effectiveness of the proposed method.
125. Elastic matching of images : a fundamental technique for both pattern recognition and image processing
This tutorial is concerned with elastic matching, which is one of the most fundamental techniques for both image processing and pattern recognition. For example, elastic matching is used in video compression for compensating motions between consecutive frames. Elastic matching is also used in pattern recognition for evaluating the similarity between two image patterns. Intuitively speaking, elastic matching is "rubber-sheet matching", in which one image is nonlinearly or linearly fitted to another image. From a mathematical viewpoint, elastic matching is formulated as an optimization problem of a 2D-2D mapping function, called the warping function, which specifies the pixel-to-pixel correspondence between two images. The properties of elastic matching are determined by the definition of the warping function and the algorithm for optimizing it.
126. An Efficient Stroke-Order-Free On-Line Character Recognition Algorithm Based on Radical Reference Pattern
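We studied improvements in the speed and recognition accuracy of the cube search method, an online character recognition method that achieves stroke-order freedom by determining the optimal stroke correspondence through graph search. Stroke-order variation was separated into variation within radicals and variation between radicals, and a two-stage cube search algorithm based on radical-level reference patterns was constructed. Guidelines for dividing characters into radicals, including a condition for minimizing the processing cost, were also presented. Recognition experiments on the kyoiku kanji (the kanji taught in Japanese elementary schools) under a fixed-stroke-number condition confirmed improvements in both speed and accuracy, as well as the validity of the cost-minimizing radical division condition.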
127. Current Status and Future Prospects of Camera-Based Character Recognition and Document Image Analysis
The pervasive use of handy digital cameras with higher resolution is now defining new roles for character recognition and document image analysis as a means of analyzing camera-captured images. In this report, we survey the state of the art in research and technologies for camera-based character recognition and document image analysis. We also describe the current position and future prospects of character recognition and document image analysis in comparison with related technologies such as barcodes. In addition, we briefly introduce our research entitled "embedding information on characters using cross ratios", whose final goal is to make character recognition as easy and accurate as barcode reading.
128. Current Status and Future Prospects of Camera-Based Character Recognition and Document Image Analysis
The pervasive use of handy digital cameras with higher resolution is now defining new roles for character recognition and document image analysis as a means of analyzing camera-captured images. In this report, we survey the state of the art in research and technologies for camera-based character recognition and document image analysis. We also describe the current position and future prospects of character recognition and document image analysis in comparison with related technologies such as barcodes. In addition, we briefly introduce our research entitled "embedding information on characters using cross ratios", whose final goal is to make character recognition as easy and accurate as barcode reading.
129. A Data Compression Technique for Stereo-pairs Using Pixel-based Disparity Compensation
In this paper, we describe a data compression technique for stereo-pairs using pixel-based disparity compensation (DC). The bit-rate of the prediction residual of the proposed pixel-based DC is lower than that of the commonly used block-based DC. Although the bit-rate of the disparity becomes high with pixel-based DC, we can mitigate this problem by imposing several restrictions on DC. It is also shown that the performance of the proposed technique can be improved by the following two modifications. First, previous-pixel prediction is selectively used around occlusion areas. Second, instead of pixels, blocks of one-pixel width are employed as the unit of DC. The effectiveness of the proposed technique is observed through experiments.
130. Video Mosaicing for Camera-Based Text Recognition.
131. Video Mosaicing for Camera-Based Text Recognition
In this paper, a mosaicing-by-recognition technique is proposed, where video mosaicing and text recognition are simultaneously and collaboratively optimized in a one-step manner. Specifically, multiple frames in which a long text line is captured by a moving camera are optimally matched and concatenated under the guidance of the text recognition framework. The optimization is performed by a DP-based algorithm and can compensate for the rotation, scaling, and speed fluctuation that appear in text captured by hand-held cameras. The results of an experiment evaluating not only the accuracy of mosaicing but also that of text recognition indicate that the proposed technique is very practical and can provide reasonable results in most cases.
132. Video Mosaicing for Camera-Based Text Recognition
In this paper, a mosaicing-by-recognition technique is proposed, where video mosaicing and text recognition are simultaneously and collaboratively optimized in a one-step manner. Specifically, multiple frames in which a long text line is captured by a moving camera are optimally matched and concatenated under the guidance of the text recognition framework. The optimization is performed by a DP-based algorithm and can compensate for the rotation, scaling, and speed fluctuation that appear in text captured by hand-held cameras. The results of an experiment evaluating not only the accuracy of mosaicing but also that of text recognition indicate that the proposed technique is very practical and can provide reasonable results in most cases.
133. Improvements of On-line Character Recognition Based on Eigen-Deformations
The authors have investigated an online character recognition technique with eigen-deformations, which express the frequent deformations of each category. To reduce overfitting, this technique evaluates the divergence between the eigen-deformations and a fitting result obtained by DP matching between an input pattern and a reference pattern. In this report, which is a further study of the previous investigation, we describe improvements and investigations mainly on (i) how to evaluate the deformations and (ii) how to express the deformations. The results of a recognition experiment on the UNIPEN database showed the usefulness of the proposed technique.
134. Improvements of On-line Character Recognition Based on Eigen-Deformations
The authors have investigated an online character recognition technique with eigen-deformations, which express the frequent deformations of each category. To reduce overfitting, this technique evaluates the divergence between the eigen-deformations and a fitting result obtained by DP matching between an input pattern and a reference pattern. In this report, which is a further study of the previous investigation, we describe improvements and investigations mainly on (i) how to evaluate the deformations and (ii) how to express the deformations. The results of a recognition experiment on the UNIPEN database showed the usefulness of the proposed technique.
135. A Ground-Truthed Mathematical Character and Symbol Image Database.
136. A Ground-Truthed Mathematical Character and Symbol Image Database
This paper is a specification of our ground-truthed mathematical character and symbol image database, called InftyCDB-1. The ground truth of each character is composed of type, font, quality (touched/broken), link (relative position), etc. The database includes all the characters and symbols of 467 pages of 30 articles on mathematics, and is organized so that it can be used as a word image database or as a mathematical formula image database. InftyCDB-1 is a public database and is freely usable for research and development purposes.
137. A Ground-Truthed Mathematical Character and Symbol Image Database
This paper is a specification of our ground-truthed mathematical character and symbol image database, called InftyCDB-1. The ground truth of each character is composed of type, font, quality (touched/broken), link (relative position), etc. The database includes all the characters and symbols of 467 pages of 30 articles on mathematics, and is organized so that it can be used as a word image database or as a mathematical formula image database. InftyCDB-1 is a public database and is freely usable for research and development purposes.
138. Quantity of Information of Recognition : How Many Bits Are Lacking for 100% Recognition?
The ultimate dream in pattern recognition is to achieve a 100% recognition rate. However, this is not easy. In this report, to achieve a 100% recognition rate and a 0% rejection rate, we propose a new framework in which the discriminator receives not only the pattern itself but also supplementary information about the class that the pattern belongs to. For printed characters, experiments showed that 4 bits are required with the leave-one-out (L) method and 1 bit with the resubstitution (R) method. This quantity of information has different characteristics from recognition rates and ambiguity after recognition, and it can serve as a new criterion for evaluating a discriminator.
139. Quantity of Information of Recognition : How Many Bits Are Lacking for 100% Recognition?
The ultimate dream in pattern recognition is to achieve a 100% recognition rate. However, this is not easy. In this report, to achieve a 100% recognition rate and a 0% rejection rate, we propose a new framework in which the discriminator receives not only the pattern itself but also supplementary information about the class that the pattern belongs to. For printed characters, experiments showed that 4 bits are required with the leave-one-out (L) method and 1 bit with the resubstitution (R) method. This quantity of information has different characteristics from recognition rates and ambiguity after recognition, and it can serve as a new criterion for evaluating a discriminator.
140. Foreword: Special section on document image understanding and digital documents.
141. An HMM Implementation for On-line Handwriting Recognition Based on Pen-Coordinate Information and Pen-Direction Information
An on-line handwritten character recognition technique based on a new HMM is proposed. In the proposed HMM, not only pen-direction features but also pen-coordinate features are utilized to describe the shape variation of on-line characters more accurately than conventional HMMs, in which pen-coordinate features are not utilized because of their non-stationarity. Specifically, the proposed HMM outputs a pen-coordinate feature at each inter-state transition and a pen-direction feature at each intra-state transition, i.e., self-loop. Thus, each state of the proposed HMM can specify the starting position and the direction of a line segment by its incoming inter-state transition and its intra-state transition, respectively. The results of recognition experiments on 10-stroke Chinese characters show that the proposed HMM outperforms conventional HMMs.
142. Online Character Recognition Using Elastic Matching and Eigen-deformations(Image Processing)(Next Generation Mobile Communication Systems)
In online character recognition based on elastic matching, such as DP matching, many misrecognitions are due to overfitting, the phenomenon in which a wrong reference pattern is closely fitted to an input pattern by the matching. In this report, a technique to reduce those misrecognitions is proposed, in which the frequent deformations of each category, called eigen-deformations, are employed. In the case of overfitting, the matching between the two patterns will not be expressed by the eigen-deformations of the category of the reference pattern. Thus, overfitting can be detected by evaluating the divergence of the matching result from the eigen-deformations. The results of a recognition experiment showed the usefulness of the proposed technique.
143. Early Recognition and Prediction of Gestures for Proactive Human-Machine Interface
This paper concerns two topics in gesture recognition. The first topic is early recognition, which provides the recognition result of a gesture before the gesture is completed. The second topic is motion prediction, which guesses the subsequent posture of the person making a gesture. The two topics are mutually related and linked to the realization of a proactive human-machine interface. For each topic, a simple technique is developed and examined to reveal its limitations. Possible directions for dealing with the limitations are also discussed as future work.
144. Early Recognition and Prediction of Gestures for Proactive Human-Machine Interface
This paper concerns two topics in gesture recognition. The first topic is early recognition, which provides the recognition result of a gesture before the gesture is completed. The second topic is motion prediction, which guesses the subsequent posture of the person making a gesture. The two topics are mutually related and linked to the realization of a proactive human-machine interface. For each topic, a simple technique is developed and examined to reveal its limitations. Possible directions for dealing with the limitations are also discussed as future work.
145. A Data Compression Technique for Stereopairs Using Pixel-Based Disparity Compensation
In this paper, we describe a data compression technique for stereopairs using pixel-based disparity compensation (DC). The bit-rate of the prediction residual of the proposed pixel-based DC is lower than that of the commonly used block-based DC. Although the bit-rate of the disparity is high with pixel-based DC, we can solve this problem by applying several restrictions to DC. It is also shown that the performance of the proposed technique can be improved by the following two modifications. First, the neighboring pixel of the same image is used for prediction around occlusions. Second, instead of pixels, long-sized blocks are employed as the unit of DC. The effectiveness of the proposed technique is observed through experiments.
146. Study on Proactive Human Interface - Experiments of Prediction-based Active Interface -.
147. Fast Elastic Image Matching Algorithm Based on Coarse-to-Fine DP
In image pattern recognition, elastic matching based on dynamic programming (DP) has been used as an effective technique to obtain a deformation-invariant distance between image patterns. A practical problem of elastic matching is its huge computation time. In this report, a fast elastic matching technique based on coarse-to-fine DP (CFDP) is proposed. In CFDP, a heuristic search strategy is employed to reduce computation time while keeping the global optimality of matching. The effect of the proposed technique on reducing computation time was indicated by experimental results.
148. I-075 Gesture recognition for proactive human interface.
149. A Clustering Algorithm for Elastic Matching-Based Image Pattern Recognition
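We describe a method for setting standard patterns for image pattern recognition based on elastic matching. The method is a kind of clustering, but whereas conventional methods use the Euclidean distance as the criterion, the present method uses the same elastic matching distance that is used at recognition time.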
150. Category-Dependent Elastic Matching Based on a Linear Combination of Eigen-Deformations
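In image pattern recognition, elastic matching has been studied as a way to compensate for deformations in patterns. Whereas conventional methods assume deformation characteristics common to all categories, this paper proposes a method that incorporates deformation characteristics specific to each category. Specifically, any deformation within a category is expressed as a linear combination of several deformations intrinsic to that category. As a result, only the deformations that occur within each category are properly compensated, which suppresses overdeformation and improves computational efficiency. The method is formulated as a kind of nonlinear optimization problem. This paper also describes a solution method and verifies its effectiveness through experiments.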
151. Bookshelf Image Analysis Based on Model Fitting
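This paper proposes a method for detecting the boundaries of individual books in bookshelf images, with the aim of managing books by image processing. Conventional methods detect book boundaries from edges and shadows using line detection methods such as the Hough transform. The present method considers global optimality in addition to such local information: it performs an optimal segmentation of the bookshelf image (into the spine region of each book and the shelf background region) with a dynamic programming-based algorithm and thereby detects the boundaries of each book. In addition, a grammatical model of bookshelf images is incorporated into the optimization formulation to improve accuracy. Experiments confirmed the effectiveness of the method both qualitatively and quantitatively.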
152. Online Character Recognition Using Eigen-Deformations
In online character recognition based on elastic matching, such as DP matching, many misrecognitions are due to overfitting, the phenomenon in which a wrong reference pattern is closely fitted to an input pattern. In this report, a technique to suppress those misrecognitions is proposed, in which the frequent deformations of each category, called eigen-deformations, are utilized. In the case of overfitting, the matching between the two patterns will diverge from the eigen-deformations of the category of the reference pattern. Thus, overfitting can be detected by evaluating the divergence. The result of a recognition experiment showed the usefulness of the proposed technique.
153. Online Character Recognition Using Eigen-Deformations
In online character recognition based on elastic matching, such as DP matching, many misrecognitions are due to overfitting, the phenomenon in which a wrong reference pattern is closely fitted to an input pattern. In this report, a technique to suppress those misrecognitions is proposed, in which the frequent deformations of each category, called eigen-deformations, are utilized. In the case of overfitting, the matching between the two patterns will diverge from the eigen-deformations of the category of the reference pattern. Thus, overfitting can be detected by evaluating the divergence. The result of a recognition experiment showed the usefulness of the proposed technique.
154. Quantitative Analysis of Mathematical Documents
Mathematical documents are analyzed from several viewpoints to develop practical OCR for mathematical and other scientific documents. Specifically, the following four viewpoints are quantified using a large-scale database of mathematical documents, which contains 670,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of math expressions. The results of those analyses clarify the difficulties in recognizing math documents and suggest promising directions for overcoming them.
155. Quantitative Analysis of Mathematical Documents
Mathematical documents are analyzed from several viewpoints to develop practical OCR for mathematical and other scientific documents. Specifically, the following four viewpoints are quantified using a large-scale database of mathematical documents, which contains 670,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of math expressions. The results of those analyses clarify the difficulties in recognizing math documents and suggest promising directions for overcoming them.
156. Clustering Method for Image Pattern Recognition Based on Elastic Matching
A technique for setting standard patterns for image pattern recognition based on elastic matching is investigated. The proposed technique is a kind of clustering technique; such techniques generally provide standard patterns as the centroids of the distribution of training patterns in pattern space. In conventional clustering techniques, the centroid is defined as the local center of gravity under the Euclidean distance metric. In contrast, the proposed technique employs an elastic matching distance as the metric. Thus, with the proposed technique the same elastic matching-based metric is used consistently at the standard pattern setting stage and the recognition stage, whereas with the conventional technique different metrics are used in those stages. Experimental results showed that high recognition rates can be attained with the standard patterns provided by the proposed technique because of the consistency of the metric.
157. A Technique for Setting Standard Patterns for Image Pattern Recognition Based on Elastic Matching
A technique for setting standard patterns for image pattern recognition based on elastic matching is investigated. The proposed technique is a kind of clustering technique; such techniques generally provide standard patterns as the centroids of the distribution of training patterns in pattern space. In conventional clustering techniques, the centroid is defined as the local center of gravity under the Euclidean distance metric. In contrast, the proposed technique employs an elastic matching distance as the metric. Thus, with the proposed technique the same elastic matching-based metric is used consistently at the standard pattern setting stage and the recognition stage, whereas with the conventional technique different metrics are used in those stages. Experimental results showed that high recognition rates can be attained with the standard patterns provided by the proposed technique because of the consistency of the metric.
158. A Technique for Setting Standard Patterns for Image Pattern Recognition Based on Elastic Matching
A technique for setting standard patterns for image pattern recognition based on elastic matching is investigated. The proposed technique is a kind of clustering technique; such techniques generally provide standard patterns as the centroids of the distribution of training patterns in pattern space. In conventional clustering techniques, the centroid is defined as the local center of gravity under the Euclidean distance metric. In contrast, the proposed technique employs an elastic matching distance as the metric. Thus, with the proposed technique the same elastic matching-based metric is used consistently at the standard pattern setting stage and the recognition stage, whereas with the conventional technique different metrics are used in those stages. Experimental results showed that high recognition rates can be attained with the standard patterns provided by the proposed technique because of the consistency of the metric.
159. A Technique for Setting Standard Patterns for Image Pattern Recognition Based on Elastic Matching
A technique for setting standard patterns for image pattern recognition based on elastic matching is investigated. The proposed technique is a kind of clustering technique; such techniques generally provide standard patterns as the centroids of the distribution of training patterns in pattern space. In conventional clustering techniques, the centroid is defined as the local center of gravity under the Euclidean distance metric. In contrast, the proposed technique employs an elastic matching distance as the metric. Thus, with the proposed technique the same elastic matching-based metric is used consistently at the standard pattern setting stage and the recognition stage, whereas with the conventional technique different metrics are used in those stages. Experimental results showed that high recognition rates can be attained with the standard patterns provided by the proposed technique because of the consistency of the metric.
160. HANDWRITTEN CHARACTER RECOGNITION USING A CLASS-DEPENDENT DEFORMATION MODEL.
161. Handwritten character recognition using elastic matching based on a category-dependent deformation model
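In handwritten character recognition, elastic matching has been studied as a way to compensate for handwriting deformations. Whereas conventional methods assume deformation characteristics common to all categories, this report proposes a method that incorporates deformation characteristics specific to each category. Specifically, any deformation within a category is expressed as a weighted sum of several deformations intrinsic to that category. As a result, only the deformations that occur within each category are properly compensated, which suppresses overdeformation and improves computational efficiency. The method is formulated as a kind of nonlinear optimization problem. This report also describes an approximate solution method.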
162. Detection and Segmentation of Touching Characters in Mathematical Expressions
A technique for the detection and segmentation of touching characters in mathematical expressions is presented. In the detection stage, a connected component initially recognized as belonging to some category is judged to be a candidate for touching characters if its feature values deviate from the standard feature values of the category. In the segmentation stage, the two component characters of the candidate are decided by comparison with touching character images synthesized from two single-character images. Experimental results showed that the technique is effective in improving the recognition accuracy of mathematical expressions.
163. Book Boundary Detection in Bookshelf Image Using Dynamic Programming
For the automatic management of book arrangement on bookshelves, several systems that extract book information, such as titles, from bookshelf images have been studied. In this paper, we propose a technique for detecting the boundaries of the books in a bookshelf image. In the present technique, book boundary detection is formulated as an optimal slant estimation problem under an FSA model of bookshelf images. The globally optimal solution is searched for by a dynamic programming-based algorithm. The effectiveness of the present technique was shown by experiments.
164. A preliminary study of pixel-based motion compensation
For efficient video compression, a pixel-based motion compensation technique where motion is determined at every pixel is investigated. The pixel-based motion compensation technique is expected to provide better performance on the minimization of prediction error than conventional block-based motion compensation techniques. In order to suppress the increase of motion information, several constraints are imposed on the motions of neighboring pixels. Experimental results have shown that the present technique can attain higher compression rates than a block-based technique for image sequences with large motion.
165. A preliminary study of pixel-based motion compensation
For efficient video compression, a pixel-based motion compensation technique where motion is determined at every pixel is investigated. The pixel-based motion compensation technique is expected to provide better performance on the minimization of prediction error than conventional block-based motion compensation techniques. In order to suppress the increase of motion information, several constraints are imposed on the motions of neighboring pixels. Experimental results have shown that the present technique can attain higher compression rates than a block-based technique for image sequences with large motion.
166. Two-Dimensional Warping for Face Image Matching.
167. Implementation of Kanji Learning System with Stroke Order Checking.
168. Slant Correction for Bookshelf Image Using Dynamic Programming
To build book databases or manage book arrangement automatically, systems that extract book information, such as titles, from bookshelf images have been studied. In such systems, the slant of each book in a bookshelf image may degrade the extraction accuracy. In this paper, a slant correction technique for bookshelf images is proposed, in which the slant correction problem is formulated as an optimal estimation problem of local slant angles at all horizontal positions. The optimal estimate is provided by a dynamic programming-based algorithm. In the present technique, color features are utilized to improve the slant estimation accuracy. The effectiveness of the present technique was shown by experiments.
169. A Priori Knowledge Free Piecewise Linear Two-Dimensional Warping
Piecewise linear two-dimensional warping (PL2DW) is a practical elastic image matching technique where the pixel-to-pixel correspondence function between a pair of image patterns is defined as a piecewise linear 2D-2D mapping. For accurate matching, the boundary points of linearization, called "pivots", should be placed at the bending and stretching points of image patterns. In conventional PL2DW, it is assumed that the pivots are properly placed by users before their mapping is optimized. This assumption, however, is acceptable only when a priori knowledge about the deformation characteristics of the image patterns is available. In this paper, an improved PL2DW technique is proposed. In this technique, the placement of the pivots is optimized simultaneously with their mapping. As a result, pivots are placed automatically at the bending and stretching points of the target, and therefore accurate matching is obtained without any a priori knowledge.
170. Eigen-Deformations of Handwritten Characters
Deformations in handwritten characters can be considered to have their own peculiar directions. For example, handwritten characters of class "A" are often deformed by a global slant transformation, whereas they are not deformed so as to resemble "R". In this paper, the extraction of such peculiar deformation directions, called eigen-deformations, is investigated. The key idea is the principal component analysis of a set of deformations collected automatically by elastic matching. Experimental results showed that the typical deformations of each character class were extracted as eigen-deformations. In addition, it was experimentally shown that those eigen-deformations are useful for improving the performance of an elastic matching-based recognition system by suppressing overdeformation.
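171. Mohammad Asad Ronee, Seiichi Uchida, Hiroaki Sakoe, Handwritten Character Recognition Using Piecewise Linear Two-Dimensional Warping, IEICE Technical Report, PRMU (Pattern Recognition and Media Understanding), Vol.100, No.701, pp.31-38, 2001.03, This report examines the effectiveness of piecewise linear two-dimensional warping in absorbing character distortion in handwritten character recognition. Piecewise linear two-dimensional warping is a kind of piecewise-linearized 2D-2D mapping intended for elastic image matching, and it has enough degrees of freedom to absorb most of the shape distortions in handwritten character images, such as rotation and local displacement. The warp is controlled through the boundary points of the piecewise linearization and is optimized by a dynamic programming (DP)-based algorithm. Thanks to the piecewise linearization, the computational cost of the algorithm is kept at a practical level. Recognition experiments on the 26 uppercase English letter categories of ETL6 confirmed the superiority of the method over several conventional DP-based two-dimensional warping methods. A comparison with other two-dimensional warping methods with higher degrees of freedom showed nearly equal recognition rates in most categories, confirming that the piecewise linearization has little adverse effect.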
172. Speaker Normalization Based on Piecewise Linear Frequency Warping
An efficient algorithm for speaker-independent spoken word recognition is presented. This algorithm is based on time-frequency warping with inter-frame consistency, where each frame of an input pattern is mapped to a reference pattern by controlling the mapping of several points (pivots) on the frame. The mapping of non-pivot points is given by linear interpolation between the mappings of two consecutive pivots. The optimal mapping is obtained by a dynamic programming-based algorithm. The computational complexity of the algorithm is lower than that of the previous time-frequency warping algorithm with inter-frame consistency. Experimental results show advantageous characteristics of the present algorithm.
173. Nonuniform Slant Correction of Handwritten Word
Slant correction is an indispensable technique for improving accuracy in handwritten word recognition. In conventional slant correction techniques, uniform slant correction is performed under the assumption that each word is written with a constant slant. In this paper, a nonuniform slant correction technique is presented in which the slant correction problem is formulated as an optimal estimation problem of local slant angles at all horizontal positions. The optimal estimation is performed by a dynamic programming-based algorithm. Experimental results show the advantage of the present technique over the uniform slant correction technique.
174. Slant Correction of Handwritten Word Using Two-Dimensional Warping
Slant correction of characters is necessary in the segmentation of a handwritten word into component characters. Conventional slant correction techniques estimate the average slant angle of the component characters and correct only uniform slant, leaving a residual error for each character. In this paper, a slant correction technique that can correct nonuniform slant well is proposed. In the present technique, the slant correction problem is formulated as a nonlinear mapping problem of slanted strokes onto vertical straight lines. A dynamic programming-based two-dimensional warping algorithm is then applied to optimize the mapping. The effectiveness of the present technique was shown by experiments.
175. Piecewise Linear Two-Dimensional Warping.
176. A Handwritten Character Recognition Experiment Using Monotonic and Continuous Two-Dimensional Warping.
177. Handwritten Character Recognition Experiment Using Monotonic and Continuous Two-Dimensional Warping
A handwritten character recognition experiment using a monotonic and continuous two-dimensional warping algorithm is reported. This warping algorithm is based on dynamic programming and searches for the optimal pixel-to-pixel mapping between two given images subject to two-dimensional monotonicity and continuity constraints. Experimental comparisons with rigid matching and local perturbation show the superior performance of the monotonic and continuous warping in character recognition.
178. A Priori Knowledge Free Piecewise Linear Two-Dimensional Warping.
179. An Efficient Elastic Image Matching and Its Application to Handwritten Character Recognition : Dutch Roll Warping.
180. An Efficient Elastic Image Matching and Its Application to Handwritten Character Recognition : Dutch Roll Warping
An efficient elastic image matching technique is investigated with application to off-line handwritten character recognition. In the present technique, each column of an image is mapped to another by controlling the mappings of the two endpoints of the column and linearly interpolating between them. Thus, the technique can adjust nonlinear deformation of images, with local slant and translation. The dynamic programming (DP) based algorithm searches for the optimal mapping with a reasonable amount of computation. The effectiveness of the present technique was indicated by recognition experiments on handwritten English alphabets.
181. Experimental Study on Handwritten Character Recognition Using Piecewise Linear Two-Dimensional Warping
In this paper, a character recognition experiment using piecewise linear two-dimensional warping, a dynamic programming-based elastic image matching technique, is reported. The present technique requires far less computation than previous methods, such as monotonic and continuous two-dimensional warping, since a line segment is used instead of a pixel as the unit of mapping. This simplification does not decrease the recognition accuracy, as shown by the experimental result. Additional results also show the superiority of the present technique over template matching and two-dimensional warping techniques based on combinations of orthogonal one-dimensional warpings.
182. S Uchida, H Sakoe, An approximation algorithm for two-dimensional warping, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, Vol.E83D, No.1, pp.109-111, 2000.01, A new efficient two-dimensional warping algorithm is presented, in which sub-optimal warping is attained by iterating DP-based local optimization of the warp on a sequence of partially overlapping subplanes. An experimental comparison with a conventional approximation algorithm based on beam search DP establishes the relative superiority of the proposed algorithm.
183. Speaker Normalization Based on Piecewise Linear Frequency Warping.
184. Piecewise Linear Two-Dimensional Warping.
185. Piecewise Linear Two-Dimensional Warping
A new efficient algorithm for two-dimensional warping is proposed. Each line in an image is warped to another image by controlling the mappings of several points (pivots) on the line and linearly interpolating between them. A dynamic programming-based algorithm searches for the globally optimal position of each pivot. The computational complexity of the algorithm is far lower than that of the conventional monotonic and continuous two-dimensional warping reported by the authors. The characteristics of the present warping were investigated through experiments.
186. An Approximation Algorithm for Monotonic and Continuous Two-Dimension Warping.
187. Handwritten Hiragana character recognition using monotonic and continuous two-dimensional warping
Rigid template matching is one of the simplest techniques for off-line character recognition. The major drawback of the technique is its sensitivity to deformations of characters, such as translation, rotation, and nonlinear deformation. In this paper, a handwritten Hiragana character recognition experiment using the monotonic and continuous two-dimensional warping algorithm previously reported by the authors is described. This warping algorithm is based on dynamic programming (DP) and searches for the optimal pixel-to-pixel mapping between two given images subject to two-dimensional monotonicity and continuity constraints. Experimental comparisons with rigid template matching and a local perturbation method show the superiority of the two-dimensional warping.
188. Speaker Normalization Based on Time-Frequency Warp with Inter-Frame Consistency
A new algorithm for speaker-independent spoken word recognition is presented. The algorithm is based on the time-frequency warping technique, in which frequency axis warping is performed in addition to time axis warping in order to compensate for spectral differences between individual speakers. In the conventional algorithm, frequency axis warping is determined independently at each frame (i.e., at each time). Such warping tends to yield excessive deformations of the time-frequency plane. To suppress these excessive deformations, inter-frame consistency of frequency axis warping is newly introduced as a constraint on the warping. The optimal warping is obtained by dynamic programming under this constraint. As an implementation technique, beam search-based acceleration is also investigated. Experimental results indicate advantageous characteristics of the present algorithm over the conventional algorithm.
189. Monotonic and Continuous Two-Dimensional Warping Based on Dynamic Programming
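Two-dimensional warping, defined as the pixel-to-pixel mapping that achieves the maximum match between two images, can be regarded as a template matching method that adapts to deformations arising in patterns. This paper proposes a new framework for two-dimensional warping and presents a basic study of it. The first feature of the method is that it can construct a warp that preserves the topology of a pattern while retaining two-dimensional degrees of freedom. This property is realized by monotonicity and continuity constraints on the warp. The second feature is that dynamic programming (DP), formulated so that optimality over the whole image is guaranteed, is used to search for the maximum match. The use of DP also brings advantages, such as the absence of any differentiability requirement on the objective function. Experiments confirmed the basic characteristics of the proposed method.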
190. An Efficient Algorithm for Two-Dimensional Warping.
191. A Markov Model Formulation of Two-Dimensional Warping Problem and Its Solution by Dynamic Programming
The authors have proposed a monotonic and continuous two-dimensional warping method based on dynamic programming (DP). It can achieve the optimal pel-to-pel correspondence between two images while preserving topological features, which is a useful property for many pattern matching problems. However, a huge amount of computational resources is required to obtain the optimal warping. In this paper, we propose a new representation of the two-dimensional warping problem. It is described as a Markovian-type decision process based on a left-to-right FSA. The number of allowed transitions is significantly smaller than in the previous representation, so the complexity of the DP-based algorithm is remarkably reduced. A sub-optimal algorithm based on a pruning technique is also investigated.
192. Speaker-Independent Word Recognition Based on Frequency Warp with Warp Constraint between Consecutive Frames
An improvement of the frequency warp technique, applied to speaker-independent word recognition, was investigated. In the conventional method, the frequency warp was applied independently to each frame, and it was feared that significant discontinuities might occur in the spectrum transition. A new algorithm was proposed in which continuity between the warps of consecutive frames is enforced by a constraint on the dynamic programming search. A preliminary experiment showed that significant cases of erroneous recognition by the conventional algorithm can be eliminated by the new algorithm.
193. Speaker-Independent Word Recognition Based on Frequency Warp with Warp Constraint between Consecutive Frames
An improvement of the frequency warp technique, applied to speaker-independent word recognition, was investigated. In the conventional method, the frequency warp was applied independently to each frame, and it was feared that significant discontinuities might occur in the spectrum transition. A new algorithm was proposed in which continuity between the warps of consecutive frames is enforced by a constraint on the dynamic programming search. A preliminary experiment showed that significant cases of erroneous recognition by the conventional algorithm can be eliminated by the new algorithm.
194. Speaker-Independent Word Recognition Based on Frequency Warp with Warp Constraint between Consecutive Frames
An improvement of the frequency warp technique, applied to speaker-independent word recognition, was investigated. In the conventional method, the frequency warp was applied independently to each frame, and it was feared that significant discontinuities might occur in the spectrum transition. A new algorithm was proposed in which continuity between the warps of consecutive frames is enforced by a constraint on the dynamic programming search. A preliminary experiment showed that significant cases of erroneous recognition by the conventional algorithm can be eliminated by the new algorithm.
195. Practical Improvements for Monotonous and Continuous Two Dimensional - Two Dimensional Warping
The monotonous and continuous two dimensional-two dimensional warping method, proposed by the authors, is expected to be a very useful tool for many pattern analysis and recognition problems. It simulates topology-preserving nonlinear deformation of an image. However, its high computational complexity makes practical application difficult. In this paper, we introduce a pruning technique to the method, by which a sub-optimal warp is obtained in polynomial time. Several experiments demonstrate that the warp between practical images is improved in both accuracy and efficiency by using this pruning technique with a penalty.
196. A Monotonous and Continuous Planar Warping for Pattern Matching
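Two-dimensional warping is defined as the optimal mapping from each pixel of an input image to a pixel of a model image, and it is classified into several types according to the constraints imposed on the mapping and the mapping algorithm. This paper proposes a nonparametric two-dimensional warping that introduces continuity as a new constraint, in addition to the monotonicity considered by Levin et al. The continuity constraint yields a warp that preserves the neighborhood relations of pixels, and applications to corresponding-point detection between two images, character recognition, image database retrieval, frequency warping, and similar tasks can be expected. Dynamic programming (DP) can be applied to the problem of finding a monotonic and continuous two-dimensional warp, and the optimal solution can be obtained. We also show that introducing a pruning method greatly reduces the computational cost.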
197. A Monotonous and Continuous Planar Warping for Pattern Matching
Planar warping is generally defined as determining the optimal pixel correspondence between an input image and a model image. Several classes can be defined according to the constraints imposed on the warping. In this paper, a nonparametric, continuous, and monotonous planar warping is investigated. This type of planar warping simulates topology-preserving nonlinear deformation of an image. The use of dynamic programming enabled us to implement a practical algorithm for determining the warp. The computational complexity and the use of a context-sensitive local metric are also discussed.
198. An Efficient CYK-based Algorithm for Continuous Speech Recognition
An efficient implementation of CYK-based continuous speech recognition is investigated. First, the word-level and sentence-level processes in Ney's algorithm were reorganized so that they proceed in synchronization with the input frames. Then, beam search pruning was incorporated into the two processing levels. A new acceleration technique, beam data-driven parsing, was successfully introduced. Considerable improvements in computational and memory efficiency were established through a sentence speech recognition experiment.