Updated on 2025/03/06

Information

 


 
THOMAS GABRIEL FRANCIS DIEGO
 
Organization
Faculty of Information Science and Electrical Engineering, Department of Advanced Information Technology, Associate Professor
School of Engineering, Department of Electrical Engineering and Computer Science (Concurrent)
Graduate School of Information Science and Electrical Engineering, Department of Information Science and Technology (Concurrent)
Title
Associate Professor
Contact information
Email address
Tel
092-802-3571
Profile
My research is related to the automatic 3D modeling of static and dynamic indoor scenes using consumer-grade RGB-D cameras. At first, I focused my research activities on the 3D registration task (i.e., how to align two parts of the same object captured from different viewpoints). Then I focused on deriving more suitable 3D models (i.e., 3D mathematical representations) for the efficient fusion of sequences of (aligned) depth images for large-scale, real-time 3D modeling of indoor scenes. Recently, my research has aimed at handling animation in the 3D modeling process. More precisely, I have focused on the animation and 3D modeling of the human body.
Education and career:
- 2017-present: Assistant Professor at Kyushu University, Fukuoka (Japan).
- 2015-2017: JSPS post-doctoral researcher at Kyushu University, Fukuoka (Japan).
- 2012-2015: Post-doctoral researcher at the National Institute of Informatics, Tokyo (Japan).
- 2009-2012: Ph.D. course at the National Institute of Informatics, Tokyo (Japan), as a student of SOKENDAI (Best Student Award).
- 2005-2008: Master's course at ENSIMAG-INPG (engineering school of computer science and mathematics), Grenoble, option IRV (Image and Virtual Reality); diploma with honors.
- 2003-2005: Two-year intensive course in science and mathematics at the undergraduate level in preparation for the competitive entrance exams to French engineering schools, Lycée Champollion, Grenoble.

Degree

  • Master (France)

  • Ph.D.

Research History

  • I was a post-doctoral researcher at the National Institute of Informatics from April 2012 to March 2015.

Research Interests・Research Keywords

  • Research theme: Digital humans

    Keyword: generative AI; 3D and 4D capture; motion retargeting; gesture

    Research period: 2023.1

  • Research theme: AI-based avatar animation synthesis

    Keyword: deep learning; avatar animation; dense deformation; texture

    Research period: 2021.6 - 2022.6

  • Research theme: Aerial-based outdoor 3D scene mapping

    Keyword: aerial drone; RGB-D SLAM; outdoor scene

    Research period: 2020.4 - 2022.4

  • Research theme: 3D shape from a single image

    Keyword: Deep learning, 3D shape estimation

    Research period: 2019.4 - 2021.8

  • Research theme: Mediated Reality Agents for educational applications targeting young children

    Keyword: Education support, Virtual Reality, Augmented Reality, Mixed Reality, Virtual Assistant, RGB-D camera.

    Research period: 2018.5 - 2020.6

  • Research theme: High frame-rate 3D reconstruction with multiple cameras

    Keyword: RGB-D camera; high frame rate; multi-view set-up; real time; distributed system; GPU optimization; volumetric reconstruction; fast and uncontrolled motion

    Research period: 2017.12 - 2018.2

  • Research theme: Human body 3D reconstruction in dynamic scenes

    Keyword: RGB-D camera; fast motion; skeleton; deforming bounding boxes; volumetric depth fusion; ICP; GPU optimization; large-scale scene

    Research period: 2017.4 - 2018.2

  • Research theme: Facial 3D reconstruction and expression tracking

    Keyword: RGB-D camera; Facial expression; Blendshape; Template mesh; Texturing; 3D modeling; Retargeting; Deviation mapping; Real-time.

    Research period: 2015.4 - 2018.2

  • Research theme: 3D reconstruction of large-scale static indoor scenes using consumer-grade RGB-D cameras

    Keyword: RGB-D camera; SLAM; Depth Fusion; 3D modeling; Camera tracking; Loop closure

    Research period: 2012.4 - 2017.4

Awards

  • MIRU Nagao Award

    2024.8   Meeting on Image Recognition and Understanding (MIRU)   3D Shape Modeling with Adaptive Centroidal Voronoi Tesselation on Signed Distance Field

    Diego Thomas (Kyushu University), Jean-Sebastien Franco (INRIA), Edmond Boyer (INRIA)

     More details

    Award type:Award from Japanese society, conference, symposium, etc.  Country:Japan

    Award rationale: This research addresses 3D reconstruction from multi-view images and proposes a new neural-field representation based on adaptive centroidal Voronoi tessellation. Exploiting the representation's ability to directly express object surfaces with arbitrary normals, the method achieves higher reconstruction accuracy with fewer cells than conventional methods based on axis-aligned discretization. The paper also proposes implementation techniques such as a fast optimization scheme and differentiable rendering; with its high degree of completeness in both idea and implementation, it is a paper worthy of the MIRU Nagao Award.

  • MIRU Excellent Paper Award

    2024.8   Meeting on Image Recognition and Understanding (MIRU)   Text-Guided Diverse Scene Interaction Synthesis by Disentangling Actions from Scenes

    Hitoshi Teshima (Kyushu University), Naoki Wake (Microsoft), Diego Thomas (Kyushu University), Yuta Nakashima (Osaka University), Hiroshi Kawasaki (Kyushu University), Katsushi Ikeuchi (Microsoft)

     More details

    Country:Japan

    Award rationale: This research tackles the challenging problem of generating motions that involve interactions with objects in a scene. It proposes a practical pipeline that can be trained only on existing datasets containing no scene information: key poses are extracted from motions generated from the action instruction alone, their contact with the scene is estimated with existing foundation models, and the trajectories leading to those poses are then generated. Taking on the difficult problem setting of motion generation conditioned on the surrounding environment and confirming its effectiveness deserves high praise.

  • Best paper award

    2019.11   The 9th Pacific-Rim Symposium on Image and Video Technology (PSIVT 2019)   This prize was received after the presentation by Akihiko Sayo of his joint work on "Human shape reconstruction with loose clothes from partially observed data by pose specific deformation"

  • Best poster presentation award

    2019.11   Machine Perception and Robotics (MPR 2019)   This prize was received after the presentation by Hayato Onizuka of his work on "Regression of 3D body shapes from a single image in a tetrahedral volume"

  • Outstanding research achievement and contribution to APSCIT 2019 Annual Meeting Invited Presentation

    2019.7   Asia Pacific Society for Computing and Information Technology   This prize was received after an invited talk at APSCIT 2019 organized in Sapporo.

  • Best poster award

    2019.2   IW-FCV2019   This prize was received after the poster presentation by Maxence Remy of the joint research "Merging SLAM and photometric stereo for 3D reconstruction with a moving camera"

  • Outstanding reviewer

    2015.7   MIRU 2015   Outstanding reviewer

  • Best student award

    2012.3   National Institute of Informatics   Best student award at the end of my Ph.D. course.


Papers

  • TetraTSDF: 3D human reconstruction from a single image with a tetrahedral outer shell Reviewed International journal

    Hayato Onizuka, Zehra Hayirci, Diego Thomas, Akihiro Sugimoto, Hideaki Uchiyama, Rin-ichiro Taniguchi

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition   2020.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Recovering the 3D shape of a person from its 2D appearance is ill-posed due to ambiguities. Nevertheless, with the help of convolutional neural networks (CNN) and prior knowledge on the 3D human body, it is possible to overcome such ambiguities to recover detailed 3D shapes of human bodies from single images. Current solutions, however, fail to reconstruct all the details of a person wearing loose clothes. This is because of either (a) a huge memory requirement that cannot be maintained even on modern GPUs or (b) a compact 3D representation that cannot encode all the details. In this paper, we propose the tetrahedral outer shell volumetric truncated signed distance function (TetraTSDF) model for the human body, and its corresponding part connection network (PCN) for 3D human body shape regression. Our proposed model is compact, dense, accurate, and yet well suited for CNN-based regression tasks. Our proposed PCN allows us to learn the distribution of the TSDF in the tetrahedral volume from a single image in an end-to-end manner. Results show that our proposed method allows us to reconstruct detailed shapes of humans wearing loose clothes from single RGB images.
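
    As an illustration of the representation this work builds on, the following is a minimal sketch of a projective truncated signed distance function (TSDF) evaluated at a set of 3D sample points; the paper's contribution is to restrict such samples to a tetrahedral outer shell around the body, which this sketch does not reproduce. All names and the camera model are illustrative assumptions, not the authors' code.

        import numpy as np

        def truncated_sdf(points, depth, K, trunc=0.05):
            # Signed distance of 3D sample points (Nx3, camera coordinates)
            # to the surface observed in an HxW depth map, truncated to
            # [-trunc, trunc] as in standard TSDF fusion.
            u = (K[0, 0] * points[:, 0] / points[:, 2] + K[0, 2]).round().astype(int)
            v = (K[1, 1] * points[:, 1] / points[:, 2] + K[1, 2]).round().astype(int)
            h, w = depth.shape
            ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
            sdf = np.full(len(points), np.nan)
            sdf[ok] = np.clip(depth[v[ok], u[ok]] - points[ok, 2], -trunc, trunc)
            return sdf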

  • Human shape reconstruction with loose clothes from partially observed data by pose specific deformation Reviewed International journal

    #Akihiko Sayo, #Hayato Onizuka, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, and Katsushi Ikeuchi

    The 9th Pacific-Rim Symposium on Image and Video Technology   2019.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Recent approaches for full-body reconstruction use a statistical shape model, which is built upon accurate full-body scans of people in skin-tight clothes. Such a model can be fitted to a point cloud of a person wearing loose clothes; however, it cannot represent the detailed shape of loose clothes, such as wrinkles and/or folds. In this paper, we propose a method that reconstructs a 3D model of a full-body human with loose clothes by reproducing the deformations as displacements from the skin-tight body mesh. We take advantage of a statistical shape model as the base shape of the full-body human mesh, and then obtain displacements from the base mesh by non-rigid registration. To efficiently represent such displacements, we use lower-dimensional embeddings of the deformations. This enables us to regress the coefficients corresponding to the small number of bases. We also propose a method to reconstruct shape from only a single 3D scanner, which is realized by shape fitting to only visible meshes as well as intra-frame shape interpolation. Our experiments with both unknown scenes and partial body scans confirm the reconstruction ability of our proposed method.

  • Revisiting Depth Image Fusion with Variational Message Passing Reviewed International journal

    Diego Thomas, Ekaterina Sirazitdinova, Akihiro Sugimoto, Rin-ichiro Taniguchi

    International Conference on 3D Vision 2019   2019.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    The running average approach has long been perceived as the best choice for fusing depth measurements captured by a consumer-grade RGB-D camera into a global 3D model. This strategy, however, assumes exact correspondences between points in a 3D model and points in the captured RGB-D images. Such assumption does not hold true in many cases because of errors in motion tracking, noise, occlusions, or inconsistent surface sampling during measurements. Accordingly, reconstructed 3D models suffer unpleasant visual artifacts. In this paper, we visit the depth fusion problem from a probabilistic viewpoint and formulate it as a probabilistic optimization using variational message passing in a Bayesian network. Our formulation enables us to fuse depth images robustly, accurately, and fast for high quality RGB-D keyframe creation, even if exact point correspondences are not always available. Our formulation also allows us to smoothly combine depth and color information for further improvements without increasing computational speed. The quantitative and qualitative comparative evaluation on built keyframes of indoor scenes show that our proposed framework achieves promising results for reconstructing accurate 3D models while using low computational power and being robust against misalignment errors without post-processing.
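
    For contrast with the probabilistic formulation above, here is a minimal sketch of the classic weighted running-average fusion rule that the paper revisits, with per-element distance D and weight W updated by each new measurement; variable names and the weight cap are illustrative assumptions.

        import numpy as np

        def running_average_update(D, W, d_obs, w_obs=1.0, w_max=100.0):
            # Classic depth-fusion baseline: a weighted running average that
            # implicitly assumes exact point correspondences.
            ok = ~np.isnan(d_obs)
            D[ok] = (W[ok] * D[ok] + w_obs * d_obs[ok]) / (W[ok] + w_obs)
            W[ok] = np.minimum(W[ok] + w_obs, w_max)
            return D, W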

  • Landmark-guided deformation transfer of template facial expressions for automatic generation of avatar blendshapes Reviewed International journal

    Hayato Onizuka, Diego Thomas, Hideaki Uchiyama, Rin-ichiro Taniguchi

    Proceedings of the IEEE International Conference on Computer Vision Workshops   2019.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Blendshape models are commonly used to track and re-target facial expressions to virtual avatars using RGB-D cameras and without using any facial marker. When using blendshape models, the target avatar model must possess a set of key-shapes that can be blended depending on the estimated facial expression. Creating realistic set of key-shapes is extremely difficult and requires time and professional expertise. As a consequence, blendshape-based re-targeting technology can only be used with a limited amount of pre-built avatar models, which is not attractive for the large public. In this paper, we propose an automatic method to easily generate realistic key-shapes of any avatar that map directly to the source blendshape model (the user is only required to select a few facial landmarks on the avatar mesh). By doing so, captured facial motion can be easily re-targeted to any avatar, even when the avatar has largely different shape and topology compared with the source template mesh. Our experimental results show the accuracy of our proposed method compared with the state-of-the-art method for mesh deformation transfer.
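
    A minimal sketch of the blendshape synthesis that such retargeting relies on: the avatar pose is the neutral mesh plus a weighted sum of key-shape offsets, with weights taken from the tracked expression. Array shapes and names are illustrative assumptions.

        import numpy as np

        def blend(neutral, key_shapes, weights):
            # neutral: Vx3 vertices; key_shapes: KxVx3 key-shape meshes;
            # weights: K values estimated from the captured expression.
            offsets = key_shapes - neutral[None, :, :]
            return neutral + np.tensordot(weights, offsets, axes=1)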

  • Dense 3D reconstruction by combining photometric stereo and key frame-based SLAM with a moving smartphone and its flashlight Reviewed International journal

    @Remy Maxence, Hideaki Uchiyama, Hiroshi Kawasaki, Diego Thomas, Vincent Nozick, Hideo Saito

    International Conference on 3D vision   2019.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    The standard photometric stereo is a technique to densely reconstruct objects' surfaces using light variation under the assumption of a static camera with a moving light source. In this work, we use photometric stereo to reconstruct dense 3D scenes while moving the camera and the light together. In such a non-static case, camera poses as well as correspondences between pixels of each frame are required to apply photometric stereo. ORB-SLAM is a technique that can be used to estimate camera poses. To retrieve correspondences, our idea is to start from a sparse 3D mesh obtained with ORB-SLAM and then densify the mesh by a plane sweep method using a multi-view photometric consistency. By combining ORB-SLAM and photometric stereo, it is possible to reconstruct dense 3D scenes with an off-the-shelf smartphone and its embedded torchlight. Note that SLAM systems usually struggle with textureless objects, which is effectively compensated by the photometric stereo in our method. Experiments are conducted to show that our proposed method gives better results than SLAM alone or COLMAP, especially for partially textureless surfaces.
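
    A minimal sketch of the classic Lambertian photometric stereo step mentioned above, assuming known light directions; the paper's setting (moving camera and light, SLAM-provided poses and correspondences) is more involved.

        import numpy as np

        def photometric_stereo(I, L):
            # I: MxN stacked intensities (M lightings, N pixels); L: Mx3
            # light directions. Solve I = L @ (albedo * normal) per pixel.
            G, *_ = np.linalg.lstsq(L, I, rcond=None)   # 3xN
            albedo = np.linalg.norm(G, axis=0)
            normals = G / np.maximum(albedo, 1e-8)
            return normals, albedo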

  • SegmentedFusion: 3D human body reconstruction using stitched bounding boxes Reviewed International journal

    Shih-Hsuan Yao, Diego Thomas, Akihiro Sugimoto, Shang-Hong Lai, Rin-Ichiro Taniguchi

    2018 International Conference on 3D Vision (3DV)   2018.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper presents SegmentedFusion, a method possessing the capability of reconstructing non-rigid 3D models of a human body by using a single depth camera with skeleton information. Our method estimates a dense volumetric 6D motion field that warps the integrated model into the live frame by segmenting a human body into different parts and building a canonical space for each part. The key feature of this work is that a deformed and connected canonical volume for each part is created, and it is used to integrate data. The dense volumetric warp field of one volume is represented efficiently by blending a few rigid transformations. Overall, SegmentedFusion is able to scan a non-rigidly deformed human surface as well as to estimate the dense motion field by using a consumer-grade depth camera. The experimental results demonstrate that SegmentedFusion is robust against fast inter-frame motion and topological changes. Since our method does not require prior assumption, SegmentedFusion can be applied to a wide range of human motions.
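
    A minimal sketch of the idea of representing a dense warp field by blending a few rigid transformations; the per-part weights here are illustrative assumptions (SegmentedFusion derives them from the body-part segmentation).

        import numpy as np

        def warp_point(p, transforms, weights):
            # Blend a few rigid transforms (R, t), one per body part, with
            # weights that should fall off with distance to each part.
            warped = sum(w * (R @ p + t) for (R, t), w in zip(transforms, weights))
            return warped / sum(weights)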

  • Augmented blendshapes for real-time simultaneous 3d head modeling and facial motion capture Reviewed International journal

    Diego Thomas, Rin-Ichiro Taniguchi

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition   2016.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a method to build in real-time animated 3D head models using a consumer-grade RGB-D camera. Our framework is the first one to provide simultaneously comprehensive facial motion tracking and a detailed 3D model of the user's head. Anyone's head can be instantly reconstructed and his facial motion captured without requiring any training or pre-scanning. The user starts facing the camera with a neutral expression in the first frame, but is free to move, talk and change his face expression as he wills otherwise. The facial motion is tracked using a blendshape representation while the fine geometric details are captured using a Bump image mapped over the template mesh. We propose an efficient algorithm to grow and refine the 3D model of the head on-the-fly and in real-time. We demonstrate robust and high-fidelity simultaneous facial motion tracking and 3D head modeling results on a wide range of subjects with various head poses and facial expressions. Our proposed method offers interesting possibilities for animation production and 3D video telecommunications.

  • Range Image Registration Using a Photometric Metric under Unknown Lighting Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    IEEE transactions on pattern analysis and machine intelligence   2013.9

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Based on the spherical harmonics representation of image formation, we derive a new photometric metric for evaluating the correctness of a given rigid transformation aligning two overlapping range images captured under unknown, distant, and general illumination. We estimate the surrounding illumination and albedo values of points of the two range images from the point correspondences induced by the input transformation. We then synthesize the color of both range images using albedo values transferred using the point correspondences to compute the photometric reprojection error. This way allows us to accurately register two range images by finding the transformation that minimizes the photometric reprojection error. We also propose a practical method using the proposed photometric metric to register pairs of range images devoid of salient geometric features, captured under unknown lighting. Our method uses a hypothesize-and-test strategy to search for the transformation that minimizes our photometric metric. Transformation candidates are efficiently generated by employing the spherical representation of each range image. Experimental results using both synthetic and real data demonstrate the usefulness of the proposed metric.
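
    A minimal sketch of the image-formation model underlying such a photometric metric: irradiance evaluated from nine order-2 spherical-harmonics lighting coefficients, and a reprojection error comparing colors synthesized with transferred albedos. Function names are illustrative assumptions, and the clamped-cosine convolution factors are assumed folded into the estimated coefficients.

        import numpy as np

        def sh_irradiance(n, c):
            # n: Nx3 unit normals; c: 9 estimated lighting coefficients.
            x, y, z = n[:, 0], n[:, 1], n[:, 2]
            B = np.stack([np.full_like(x, 0.282095),
                          0.488603 * y, 0.488603 * z, 0.488603 * x,
                          1.092548 * x * y, 1.092548 * y * z,
                          0.315392 * (3 * z ** 2 - 1),
                          1.092548 * x * z, 0.546274 * (x ** 2 - y ** 2)], axis=1)
            return B @ c

        def photometric_error(albedo_transferred, normals, c, observed):
            # Synthesize colors under the estimated lighting and compare
            # with the observed colors to score a candidate alignment.
            return np.mean((albedo_transferred * sh_irradiance(normals, c)
                            - observed) ** 2)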

  • ActiveNeuS: Neural Signed Distance Fields for Active Stereo Reviewed International journal

    Kazuto Ichimaru, Takaki Ikeda, Diego Thomas, Takafumi Iwaguchi, Hiroshi Kawasaki

    International Conference on 3D Vision   2024.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Weakly-Supervised 3D Reconstruction of Clothed Humans via Normal Maps International journal

    Jane Wu, Diego Thomas, Ronald Fedkiw

    arXiv   2023.11

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

  • A Two-step Approach for Interactive Animatable Avatars Reviewed International journal

    #Takumi Kitamura, Diego Thomas, Hiroshi Kawasaki, Naoya Iwamoto

    COMPUTER GRAPHICS INTERNATIONAL 2023   2023.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a new two-step human body animation technique based on displacement mapping that can learn a detailed deformation space, runs at interactive rates (more than 30 fps) and can be directly integrated into standard animation environments. To achieve real-time animation we employ the template-based approach and model pose-dependent deformations with 2D displacement images. We propose our own template model to facilitate and automatize training data preparation. Key to achieving detailed animation with few artifacts is to learn pose-dependent displacements directly in the pose space, without having to predict skinning weights. In order to generalize to totally new motions we employ a two-step approach where the first step contains knowledge about general human motion while the second step contains information about user-specific motion. Our experimental results show that our proposed method can animate an avatar up to 300 times faster than baselines while keeping a similar or even better level of detail.

  • ACT2G: Attention-based Contrastive Learning for Text-to-Gesture Generation Reviewed International journal

    #Teshima Hitoshi, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Proceedings of the ACM on Computer Graphics and Interactive Techniques, 6   2023.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Deep Gesture Generation for Social Robots Using Type-Specific Libraries Reviewed International journal

    Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)   2022.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Body language such as conversational gesture is a powerful way to ease communication. Conversational gestures do not only make a speech more lively but also contain semantic meaning that helps to stress important information in the discussion. In the field of robotics, giving conversational agents (humanoid robots or virtual avatars) the ability to properly use gestures is critical, yet remains a task of extraordinary difficulty. This is because, given only a text as input, there are many possibilities and ambiguities in generating an appropriate gesture. Different from previous works, we propose a new method that explicitly takes into account the gesture types to reduce these ambiguities and generate human-like conversational gestures. Key to our proposed system is a new gesture database built on the TED dataset that allows us to map a word to one of three types of gestures: “Imagistic” gestures, which express the content …

  • Self-calibration of multiple-line-lasers based on coplanarity and Epipolar constraints for wide area shape scan using moving camera Reviewed International journal

    Genki Nagamatsu, Takaki Ikeda, Takafumi Iwaguchi, Diego Thomas, Jun Takamatsu, Hiroshi Kawasaki

    2022 26th International Conference on Pattern Recognition (ICPR)   2022.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    High-precision three-dimensional scanning systems have been intensively researched and developed. Recently, for the acquisition of large-scale scenes with high density, the simultaneous localisation and mapping (SLAM) technique is preferred because of its simplicity: a single sensor is moved around freely during 3D scanning. However, to integrate multiple scans, the captured data as well as the position of each sensor must be highly accurate, making these systems difficult to use in environments not accessible by humans, such as underwater, inside the body, or in outer space. In this paper, we propose a new, flexible system with multiple line lasers that reconstructs dense and accurate 3D scenes. The advantages of our proposed system are (1) no need for synchronization or pre-calibration between the lasers and the camera, and (2) the ability to reconstruct 3D scenes in extreme conditions, such as underwater. We propose a …

  • 3D pedestrian localization using multiple cameras: a generalizable approach Reviewed International journal

    João Paulo Lima, Rafael Roberto, Lucas Figueiredo, Francisco Simões, Diego Thomas, Hideaki Uchiyama, Veronica Teichrieb

    Machine Vision and Applications   2022.7

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Pedestrian detection is a critical problem in many areas, such as smart cities, surveillance, monitoring, autonomous driving, and robotics. AI-based methods have made tremendous progress in the field in the last few years, but good performance is limited to data that match the training datasets. We present a multi-camera 3D pedestrian detection method that does not need to be trained using data from the target scene. The core idea of our approach consists in formulating consistency in multiple views as a graph clique cover problem. We estimate pedestrian ground location on the image plane using a novel method based on human body poses and person’s bounding boxes from an off-the-shelf monocular detector. We then project these locations onto the ground plane and fuse them with a new formulation of a clique cover problem from graph theory. We propose a new vertex ordering strategy to define fusion …

  • Generalizable Online 3D Pedestrian Tracking with Multiple Cameras Reviewed International journal

    Victor Lyra, Isabella de Andrade, Joao Paulo Lima, Rafael Roberto, Lucas Figueiredo, Joao Marcelo Teixeira, Diego Thomas, Hideaki Uchiyama, Veronica Teichrieb

    Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022)   2022.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    3D pedestrian tracking using multiple cameras is still a challenging task with many applications such as surveillance, behavioral analysis, statistical analysis, and more. Many of the existing tracking solutions involve training the algorithms on the target environment, which requires extensive time and effort. We propose an online 3D pedestrian tracking method for multi-camera environments based on a generalizable detection solution that does not require training with data of the target scene. We establish temporal relationships between people detected in different frames by using a combination of graph matching algorithm and Kalman filter. Our proposed method obtained a MOTA and MOTP of 77.1% and 96.4%, respectively on the test split of the public WILDTRACK dataset. Such results correspond to an improvement of approximately 3.4% and 22.2%, respectively, compared to the best existing online technique. Our experiments also demonstrate the advantages of using appearance information to improve the tracking performance.
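
    As one plausible instantiation of the Kalman filtering used for temporal association, here is a minimal constant-velocity filter on a pedestrian's ground-plane position; class name, noise levels and time step are illustrative assumptions.

        import numpy as np

        class GroundPlaneKF:
            # State [x, y, vx, vy]; measurements are ground-plane detections.
            def __init__(self, xy0, dt=1.0, q=1e-2, r=1e-1):
                self.x = np.array([xy0[0], xy0[1], 0.0, 0.0])
                self.P = np.eye(4)
                self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt
                self.H = np.eye(2, 4)
                self.Q, self.R = q * np.eye(4), r * np.eye(2)

            def predict(self):
                self.x = self.F @ self.x
                self.P = self.F @ self.P @ self.F.T + self.Q
                return self.x[:2]

            def update(self, z):
                S = self.H @ self.P @ self.H.T + self.R
                K = self.P @ self.H.T @ np.linalg.inv(S)
                self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
                self.P = (np.eye(4) - K @ self.H) @ self.P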

  • Refining OpenPose With a New Sports Dataset for Robust 2D Pose Estimation Reviewed International journal

    Takumi Kitamura, Hitoshi Teshima, Diego Thomas, Hiroshi Kawasaki

    Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision   2022.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    3D marker-less motion capture can be achieved by triangulating estimated multi-view 2D poses. However, when the 2D pose estimation fails, the 3D motion capture also fails. This is particularly challenging for the sports performances of athletes, which involve extreme poses. In extreme poses (like having the head down), state-of-the-art 2D pose estimators such as OpenPose do not work at all. In this paper, we propose a new method to improve the training of 2D pose estimators for extreme poses by leveraging a new sports dataset and our proposed data augmentation strategy. Our results show significant improvements over previous methods for 2D pose estimation of athletes performing acrobatic moves, while keeping state-of-the-art performance on standard datasets.

  • Integration of gesture generation system using gesture library with DIY robot design kit Reviewed International journal

    Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, David Baumert, Hiroshi Kawasaki, Katsushi Ikeuchi

    2022 IEEE/SICE International Symposium on System Integration (SII)   2022.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Self-calibrated dense 3D sensor using multiple cross line-lasers based on light sectioning method and visual odometry Reviewed International journal

    Genki Nagamatsu, Jun Takamatsu, Takafumi Iwaguchi, Diego Thomas, Hiroshi Kawasaki

    2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)   2021.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation Reviewed International journal

    Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, Yuta Nakashima, Katsushi Ikeuchi

    2021 IEEE International Conference on Image Processing (ICIP)   2021.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a new 2D pose refinement network that learns to predict the human bias in the estimated 2D pose. There are biases in 2D pose estimations that are due to differences between annotations of 2D joint locations based on annotators’ perception and those defined by motion capture (MoCap) systems. These biases are crafted into publicly available 2D pose datasets and cannot be removed with existing error reduction approaches. Our proposed pose refinement network allows us to efficiently remove the human bias in the estimated 2D poses and achieve highly accurate multi-view 3D human pose estimation.

  • Unsupervised 3D Human Pose Estimation in Multi-view-multi-pose Video Reviewed International journal

    #Cheng Sun, Diego Thomas, Hiroshi Kawasaki

    2020 25th International Conference on Pattern Recognition (ICPR)   5959 - 5964   2021.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Analysis and classification of gestures in ted talks Reviewed International journal

    Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    IEICE Technical Report   2020.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • On-the-fly Extrinsic Calibration of Non-Overlapping in-Vehicle Cameras based on Visual SLAM under 90-degree Backing-up Parking Reviewed International journal

    Kazuki Nishiguchi, Hideaki Uchiyama, Kazutaka Hayakawa, Jun Adachi, Diego Thomas, Atsushi Shimada, Rin-Ichiro Taniguchi

    2020 IEEE Intelligent Vehicles Symposium (IV)   2020.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Real-time Simultaneous 3D Head Modeling and Facial Motion Capture with an RGB-D camera International journal

    Diego Thomas

    arXiv preprint arXiv:2004.10557   2020.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

  • Generating a consistent global map under intermittent mapping conditions for large-scale vision-based navigation Reviewed International journal

    Kazuki Nishiguchi, Walid Bousselham, Hideaki Uchiyama, Diego Thomas, Atsushi Shimada, Rin Ichiro Taniguchi

    15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2020   2020.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Localization is the process of computing sensor poses based on vision technologies such as visual Simultaneous Localization And Mapping (vSLAM). It can generally be applied to navigation systems. To achieve this, a global map is essential: the relocalization process requires a single consistent map represented in a unified coordinate system. However, a large-scale global map cannot be created at once due to insufficient visual features at some moments. This paper presents an interactive method to generate a consistent global map from intermittent maps created independently by vSLAM, via global reference points. First, vSLAM is applied to individual image sequences to create maps independently. At the same time, multiple reference points with known latitude and longitude are interactively recorded in each map. Then, the coordinate system of each individual map is converted into one that has metric scale and axes unified with the reference points. Finally, the individual maps are merged into a single map based on the relative position of each origin. In the evaluation, we show the results of map merging and relocalization with our dataset to confirm the effectiveness of our method for navigation tasks. In addition, a report on participating in the navigation competition in a practical environment is also discussed.
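
    The per-map coordinate conversion can be illustrated by fitting a similarity transform (Umeyama's method) from each map's reference points to their global positions; this sketch is an assumption about one reasonable implementation, not the authors' code.

        import numpy as np

        def similarity_transform(src, dst):
            # Scale s, rotation R, translation t with dst ~ s * R @ src + t,
            # estimated from corresponding Nx3 point sets via SVD.
            mu_s, mu_d = src.mean(0), dst.mean(0)
            S, D = src - mu_s, dst - mu_d
            U, sig, Vt = np.linalg.svd(D.T @ S / len(src))
            C = np.eye(3)
            if np.linalg.det(U) * np.linalg.det(Vt) < 0:
                C[2, 2] = -1.0
            R = U @ C @ Vt
            s = np.trace(np.diag(sig) @ C) / S.var(0).sum()
            t = mu_d - s * R @ mu_s
            return s, R, t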

  • Regression of 3D human body shapes from a single image in a tetrahedral volume International journal

    #Hayato Onizuka, Diego Thomas, @Zehra Hayirci, Akihiro Sugimoto, Hideaki Uchiyama, Rin-ichiro Taniguchi

    THE 15TH JOINT WORKSHOP ON MACHINE PERCEPTION AND ROBOTICS   2019.11

     More details

    Language:English   Publishing type:Research paper (bulletin of university, research institution)  

    Reconstructing a 3D shape from a single 2D image is an ill-posed problem. This is because different 3D shapes may produce the same 2D image. Nevertheless, under some conditions and with the help of deep neural networks (DNN), approximate solutions can be obtained. The recent advances in convolutional neural networks (CNNs) for 3D object shape reconstruction from a single image are particularly thrilling for the case of 3D human body shape retrieval. The 3D human body has been extensively studied and modelled using standard computer vision techniques, which gives us a sufficient amount of prior knowledge to constrain the 3D shape recovery problem using DNN. Current solutions, however, fail to reconstruct the fine details of the body due to the huge amount of memory required, which cannot be maintained even on modern GPUs. In this paper, we propose the tetrahedral volumetric truncated signed distance function (TSDF) model for the human body, and its corresponding part connection network (PCN) for detailed shape regression. Our proposed 3D representation requires a low amount of memory and allows us to reconstruct detailed shapes from a single RGB image. Experimental results using real data demonstrate that our proposed method is promising.

  • Real-Time Facial Motion Capture Using RGB-D Images Under Complex Motion and Occlusions Reviewed International journal

    @Joao Otavio de Lucena, Joao Paulo Lima, Diego Thomas, Veronica Teichrieb

    THE SYMPOSIUM ON VIRTUAL AND AUGMENTED REALITY 2019   2019.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present a technique for capturing facial performance in real time using an RGB-D camera. Such a method can be applied to face augmentation by leveraging facial expression changes. The technique is able to perform both 3D facial modeling and facial motion tracking without the need for pre-scanning or training for a specific user.
    The proposed approach builds on an existing method that we refer to as FaceCap, which uses a blendshape representation and a Bump image for tracking facial motion and capturing geometric details. The original FaceCap algorithm fails in some scenarios with complex motion and occlusions, mainly due to problems in the face detection and tracking steps. FaceCap also has problems with the Bump image filtering step, which generates outliers, causing more distortion on the 3D augmented blendshape.
    In order to solve these problems, we propose two refinements: (a) a new framework for face detection and landmark localization based on the state-of-the-art methods MTCNN and CE-CLM, respectively; and (b) a simple but effective modification of the filtering step, removing reconstruction failures in the eye region.
    Experiments showed that the proposed approach can deal with unconstrained scenarios, such as large head pose variations and partial occlusions, while achieving real-time execution.

  • MeRA: An Interactive Mediated Reality Agent for Educational Application Reviewed International journal

    @Guillaume Quiniou, Frederic Rayar, Diego Thomas

    International Symposium on Mixed and Augmented Reality | ISMAR 2019   2019.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    The recent developments of Mixed Reality devices and advances in 3D scene understanding and mapping unlock new possibilities for richer interactions between users, the surrounding 3D environment, and virtual agents. In this work, we present MeRA: an interactive Mediated Reality agent for ludic and educational applications. The agent evolves in a mediated tabletop environment and can help the user learn, play or create Tangram, a jigsaw-like traditional game. This opens exciting new perspectives for the educational support of young children, who require active and human-like interactions.

  • Landmark-guided deformation transfer of template facial expressions for automatic generation of avatar blendshapes Reviewed International journal

    #Hayato Onizuka, Diego Thomas, Hideaki Uchiyama, Rin-ichiro Taniguchi

    The 2nd Workshop on 3D Reconstruction in the Wild (3DRW2019) in conjunction with ICCV2019   2019.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Blendshape models are commonly used to track and re-target facial expressions to virtual avatars using RGB-D cameras and without using any facial marker. When using blendshape models, the target avatar model must possess a set of key-shapes that can be blended depending on the estimated facial expression. Creating realistic set of key-shapes is extremely difficult and requires time and professional expertise. As a consequence, blendshape-based re-targeting technology can only be used with a limited amount of pre-built avatar models, which is not attractive for the large public. In this paper, we propose an automatic method to easily generate realistic key-shapes of any avatar that map directly to the source blendshape model (the user is only required to select a few facial landmarks on the avatar mesh). By doing so, captured facial motion can be easily re-targeted to any avatar, even when the avatar has largely different shape and topology compared with the source template mesh. Our experimental results show the accuracy of our proposed method compared with the state-of-the-art method for mesh deformation transfer.

  • Blended-Keyframes for Mobile Mediated Reality Applications Reviewed International journal

    #Yu Xue, Diego Thomas, Frederic Rayar, Hideaki Uchiyama, Rin-ichiro Taniguchi, Boacai Yin

    IEEE International Symposium on Mixed and Augmented Reality (ISMAR)   2019.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    With the recent developments of Mixed Reality (MR) devices and advances in 3D scene understanding, MR applications on mobile devices are becoming available to a large part of society. These applications allow users to mix virtual content into the surrounding environment. However, the ability to mediate (i.e., modify or alter) the surrounding environment remains a difficult and unsolved problem that limits the degree of immersion of current MR applications on mobile devices. In this paper, we present a method to mediate 2D views of a real environment using a single consumer-grade RGB-D camera and without the need to pre-scan the scene. Our proposed method creates in real time a dense and detailed keyframe-based 3D map of the real scene and takes advantage of semantic instance segmentation to isolate target objects. We show that our proposed method allows us to remove target objects in the environment and to replace them with their virtual counterparts, which are built on-the-fly. Such an approach is well suited for creating mobile Mediated Reality applications.

  • An Application for Next-Generation Education Using Mediated Reality

    #Xue Yu, Diego Thomas, Frederic Rayar, Hideaki Uchiyama, Yin Baocai, Rin-ichiro Taniguchi

    The 22nd Meeting on Image Recognition and Understanding (MIRU2019)   2019.8

     More details

    Language:Japanese   Publishing type:Research paper (other academic)  

    In recent years, research on Mediated Reality devices and on 3D scene understanding for 3D reconstruction has progressed, making new means of interaction between virtual agents, people, and the surrounding environment applicable. In this paper, we introduce a Mediated Reality agent and propose a method to build a Mediated Reality environment on a mobile device using an off-the-shelf handheld RGB-D camera.

  • 3D Body and Background Reconstruction in a Large-scale Indoor Scene using Multiple Depth Cameras Reviewed International journal

    Daisuke Kobayashi, Diego Thomas, Hideaki Uchiyama, Rin-ichiro Taniguchi

    2019 12th Asia Pacific Workshop on Mixed and Augmented Reality (APMAR)   2019.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    3D reconstruction of indoor scenes that contain a non-rigidly moving human body using depth cameras is a task of extraordinary difficulty. Despite intensive efforts from researchers in the 3D vision community, existing methods are still limited to reconstructing small-scale scenes. This is because of the difficulty of tracking the camera motion when a target person moves in a totally different direction. Due to the narrow field of view (FoV) of consumer-grade red-green-blue-depth (RGB-D) cameras, a target person (generally placed at about 2-3 meters from the camera) covers most of the FoV of the camera. Therefore, there are not enough features from the static background to track the motion of the camera. In this paper, we propose a system which reconstructs a moving human body and the background of an indoor scene using multiple depth cameras. Our system is composed of three Kinects that are approximately set in the same line and face the same direction so that their FoVs do not overlap (to avoid interference). Owing to this setup, we capture images of a person moving in a large-scale indoor scene. The three Kinect cameras are calibrated with a robust method that uses three large non-parallel planes. A moving person is detected by using human skeleton information, and is reconstructed separately from the static background. By separating the human body and the background, static 3D reconstruction can be adopted for the static background area while a method specialized for the human body area can be used to reconstruct the 3D model of the moving person. The experimental results show the performance of the proposed system for human body reconstruction in a large-scale indoor scene.

  • Solving monocular visual odometry scale factor with adaptive step length estimates for pedestrians using handheld devices Reviewed International journal

    Nicolas Antigny, Hideaki Uchiyama, Myriam Servières, Valérie Renaudin, Diego Thomas, Rin-ichiro Taniguchi

    MDPI Sensors   2019.1

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    The urban environments represent challenging areas for handheld device pose estimation (i.e., 3D position and 3D orientation) over large displacements. It is even more challenging with the low-cost sensors and computational resources that are available in pedestrian mobile devices (i.e., monocular camera and Inertial Measurement Unit). To address these challenges, we propose a continuous pose estimation based on monocular Visual Odometry. To solve the scale ambiguity and suppress the scale drift, an adaptive pedestrian step length estimation is used for the displacements on the horizontal plane. To complete the estimation, a handheld equipment height model, with respect to the Digital Terrain Model contained in Geographical Information Systems, is used for the displacement on the vertical axis. In addition, an accurate pose estimation based on the recognition of known objects is punctually used to correct the pose estimate and reset the monocular Visual Odometry. To validate the benefit of our framework, experimental data have been collected on a 0.7 km pedestrian path in an urban environment for various people. The proposed solution achieves a positioning error of 1.6-7.5% of the walked distance, and confirms the benefit of using an adaptive step length compared to a fixed step length.
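
    A minimal sketch of the scale-resolution idea: over a walking segment, the unscaled horizontal visual-odometry path length is compared with the distance implied by the estimated step lengths. Names and shapes are illustrative assumptions.

        import numpy as np

        def vo_scale(vo_positions, step_lengths):
            # vo_positions: Nx3 unscaled VO positions; step_lengths: meters
            # per detected step over the same segment.
            xy = np.asarray(vo_positions)[:, :2]        # horizontal plane
            vo_dist = np.linalg.norm(np.diff(xy, axis=0), axis=1).sum()
            return np.sum(step_lengths) / max(vo_dist, 1e-9)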

  • Incremental 3D Cuboid Modeling with Drift Compensation Reviewed International journal

    Masashi Mishima, Hideaki Uchiyama, Diego Thomas, Rin-ichiro Taniguchi, Rafael Roberto, João Lima, Veronica Teichrieb

    MDPI Sensors   2019.1

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    This paper presents a framework of incremental 3D cuboid modeling by using the mapping results of an RGB-D camera based simultaneous localization and mapping (SLAM) system. This framework is useful in accurately creating cuboid CAD models from a point cloud in an online manner. While performing the RGB-D SLAM, planes are incrementally reconstructed from a point cloud in each frame to create a plane map. Then, cuboids are detected in the plane map by analyzing the positional relationships between the planes, such as orthogonality, convexity, and proximity. Finally, the position, pose, and size of a cuboid are determined by computing the intersection of three perpendicular planes. To suppress the false detection of the cuboids, the cuboid shapes are incrementally updated with sequential measurements to check the uncertainty of the cuboids. In addition, the drift error of the SLAM is compensated by the registration of the cuboids. As an application of our framework, an augmented reality-based interactive cuboid modeling system was developed. In the evaluation at cluttered environments, the precision and recall of the cuboid detection were investigated, compared with a batch-based cuboid detection method, so that the advantages of our proposed method were clarified.
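
    The cuboid-corner computation from three perpendicular planes reduces to a 3x3 linear system; a minimal sketch under the plane parameterization n . x = d (an assumption about the representation):

        import numpy as np

        def cuboid_corner(planes):
            # planes: three (normal, offset) pairs with normal . x = offset.
            N = np.array([n for n, _ in planes])
            d = np.array([d for _, d in planes])
            return np.linalg.solve(N, d)     # intersection point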

  • Indoor Positioning System Based on Chest-Mounted IMU Reviewed International journal

    Chuanhua Lu, Hideaki Uchiyama, Diego Thomas, Atsushi Shimada, Rin-ichiro Taniguchi

    MDPI Sensors   2019.1

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Demand for indoor navigation systems has been rapidly increasing with regard to location-based services. As a cost-effective choice, inertial measurement unit (IMU)-based pedestrian dead reckoning (PDR) systems have been developed for years because they do not require external devices to be installed in the environment. In this paper, we propose a PDR system based on a chest-mounted IMU as a novel installation position for body-suit-type systems. Since the IMU is mounted on a part of the upper body, the framework of the zero-velocity update cannot be applied because there are no periodical moments of zero velocity. Therefore, we propose a novel regression model for estimating step lengths only with accelerations to correctly compute step displacement by using the IMU data acquired at the chest. In addition, we integrated the idea of an efficient map-matching algorithm based on particle filtering into our system to improve positioning and heading accuracy. Since our system was designed for 3D navigation, which can estimate position in a multifloor building, we used a barometer to update pedestrian altitude, and the components of our map are designed to explicitly represent building-floor information. With our complete PDR system, we were awarded second place in 10 teams for the IPIN 2018 Competition Track 2, achieving a mean error of 5.2 m after the 800 m walking event.

  • FusionMLS: Highly dynamic 3D reconstruction with consumer-grade RGB-D cameras Reviewed International journal

    Siim Meerits, Diego Thomas, Vincent Nozick, Hideo Saito

    Computational Visual Media   2018.12

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Multi-view dynamic three-dimensional reconstruction has typically required the use of custom shutter-synchronized camera rigs in order to capture scenes containing rapid movements or complex topology changes. In this paper, we demonstrate that multiple unsynchronized low-cost RGB-D cameras can be used for the same purpose. To alleviate issues caused by unsynchronized shutters, we propose a novel depth frame interpolation technique that allows synchronized data capture from highly dynamic 3D scenes. To manage the resulting huge number of input depth images, we also introduce an efficient moving least squares-based volumetric reconstruction method that generates triangle meshes of the scene. Our approach does not store the reconstruction volume in memory, making it memory-efficient and scalable to large scenes. Our implementation is completely GPU based and works in real time. The results shown herein, obtained with real data, demonstrate the effectiveness of our proposed method and its advantages compared to state-of-the-art approaches.

  • Sparse cost volume for efficient stereo matching Reviewed International journal

    Chuanhua Lu, Hideaki Uchiyama, Diego Thomas, Atsushi Shimada, Rin-ichiro Taniguchi

    MDPI Remote Sensing   2018.11

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Stereo matching has been solved as a supervised learning task with convolutional neural networks (CNN). However, CNN-based approaches basically require huge memory use. In addition, it is still challenging to find correct correspondences between images in ill-posed dim and sensor-noise regions. To solve these problems, we propose Sparse Cost Volume Net (SCV-Net), achieving high accuracy, low memory cost and fast computation. The idea of the cost volume for stereo matching was initially proposed in GC-Net. In our work, by making the cost volume compact and proposing an efficient similarity evaluation for the volume, we achieve faster stereo matching while improving the accuracy. Moreover, we propose to use weight normalization instead of the commonly-used batch normalization for stereo matching tasks. This improves the robustness not only to sensor noise in images but also to the batch size in the training process. We evaluated our proposed network on the Scene Flow and KITTI 2015 datasets; its performance overall surpasses that of GC-Net. Compared with GC-Net, our SCV-Net achieves: (1) 73.08% lower GPU memory cost; (2) 61.11% lower processing time; (3) an improvement of the 3PE from 2.87% to 2.61% on the KITTI 2015 dataset.
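
    A minimal sketch of a stereo cost volume built from left/right feature maps; sampling disparities with a stride is shown only as one simple way of making the volume compact, and may differ from SCV-Net's actual design.

        import numpy as np

        def cost_volume(feat_l, feat_r, max_disp, stride=2):
            # feat_l, feat_r: CxHxW feature maps; correlation score per
            # candidate disparity d (right map shifted right by d pixels).
            C, H, W = feat_l.shape
            disps = range(0, max_disp, stride)
            vol = np.zeros((len(disps), H, W))
            for i, d in enumerate(disps):
                if d == 0:
                    vol[i] = (feat_l * feat_r).sum(0)
                else:
                    vol[i, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, :-d]).sum(0)
            return vol   # argmax over axis 0 gives a coarse disparity map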

  • RGB-D SLAM based incremental cuboid modeling Reviewed International journal

    Masashi Mishima, Hideaki Uchiyama, Diego Thomas, Rin-ichiro Taniguchi, Rafael Roberto, Veronica Teichrieb

    The European Conference on Computer Vision (ECCV) workshops, 2018   2018.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper presents a framework for incremental 3D cuboid modeling combined with RGB-D SLAM. While performing RGB-D SLAM, planes are incrementally reconstructed from point clouds. Then, cuboids are detected in the planes by analyzing the positional relationships between the planes: orthogonality, convexity, and proximity. Finally, the position, pose and size of a cuboid are determined by computing the intersection of three perpendicular planes. In addition, the cuboid shapes are incrementally updated to suppress false detections with sequential measurements. As an application of our framework, an augmented reality-based interactive cuboid modeling system is introduced. In the evaluation in a cluttered environment, the precision and recall of the cuboid detection are improved with our framework owing to stable plane detection, compared with a batch-based method.

  • Live structural modeling using RGB-D SLAM Reviewed International journal

    Nicolas Olivier, Hideaki Uchiyama, Masashi Mishima, Diego Thomas, Rin-Ichiro Taniguchi, Rafael Roberto, João Paulo Lima, Veronica Teichrieb

    2018 IEEE International Conference on Robotics and Automation (ICRA)   2018.5

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper presents a method for localizing primitive shapes in a dense point cloud computed by the RGB-D SLAM system. To stably generate a shape map containing only primitive shapes, the primitive shape is incrementally modeled by fusing the shapes estimated at previous frames in the SLAM, so that an accurate shape can be finally generated. Specifically, the history of the fusing process is used to avoid the influence of error accumulation in the SLAM. The point cloud of the shape is then updated by fusing the points in all the previous frames into a single point cloud. In the experimental results, we show that metric primitive modeling in texture-less and unprepared environments can be achieved online.

  • Synthesis of environment maps for mixed reality Reviewed International journal

    David R Walton, Diego Thomas, Anthony Steed, Akihiro Sugimoto

    2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)   2017.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    When rendering virtual objects in a mixed reality application, it is helpful to have access to an environment map that captures the appearance of the scene from the perspective of the virtual object. It is straightforward to render virtual objects into such maps, but capturing and correctly rendering the real components of the scene into the map is much more challenging. This information is often recovered from physical light probes, such as reflective spheres or fisheye cameras, placed at the location of the virtual object in the scene. For many application areas, however, real light probes would be intrusive or impractical.
    Ideally, all of the information necessary to produce detailed environment maps could be captured using a single device. We introduce a method using an RGBD camera and a small fisheye camera, contained in a single unit, to create environment maps at any location in an indoor scene. The method combines the output from both cameras to correct for their limited field of view and the displacement from the virtual object, producing complete environment maps suitable for rendering the virtual content in real time. Our method improves on previous probeless approaches by its ability to recover high-frequency environment maps. We demonstrate how this can be used to render virtual objects which shadow, reflect and refract their environment convincingly.

  • Fast 3D point cloud segmentation using supervoxels with geometry and color for 3D scene understanding Reviewed International journal

    Francesco Verdoja, Diego Thomas, Akihiro Sugimoto

    IEEE International Conference on Multimedia and Expo (ICME), 2017   2017.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Segmentation of 3D colored point clouds is a research field with renewed interest thanks to the recent availability of inexpensive consumer RGB-D cameras and its importance as an unavoidable low-level step in many robotic applications. However, the nature of 3D data makes the task challenging and, thus, many different techniques are being proposed, all of which incur high computational costs. This paper presents a novel fast method for 3D colored point cloud segmentation. It starts with supervoxel partitioning of the cloud, i.e., an oversegmentation of the points in the cloud. It then leverages a novel metric exploiting both geometry and color to iteratively merge the supervoxels into a 3D segmentation in which the hierarchical structure of partitions is maintained. The algorithm also has computational complexity linear in the size of the input. Experimental results over two publicly available datasets demonstrate that our proposed method outperforms state-of-the-art techniques.
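
    The merge loop can be sketched as follows, assuming per-supervoxel centroids and mean colors are available; the equal-weight distance below is a stand-in for the paper's metric, and the greedy union-find merging is only illustrative of the hierarchical scheme.

```python
import numpy as np

def merge_supervoxels(centroids, mean_colors, adjacency, tau):
    """Greedy merging of adjacent supervoxels with a combined
    geometry+color dissimilarity (illustrative, not the paper's metric).

    centroids   : (N, 3) supervoxel centroids.
    mean_colors : (N, 3) mean colors on a scale comparable to geometry.
    adjacency   : iterable of (i, j) index pairs of adjacent supervoxels.
    tau         : merging threshold.
    Returns one cluster label per supervoxel.
    """
    parent = list(range(len(centroids)))

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def dist(i, j):
        geo = np.linalg.norm(centroids[i] - centroids[j])
        col = np.linalg.norm(mean_colors[i] - mean_colors[j])
        return 0.5 * geo + 0.5 * col  # illustrative equal weighting

    # Merge the cheapest adjacent pairs first (hierarchical flavor).
    for i, j in sorted(adjacency, key=lambda p: dist(*p)):
        if dist(i, j) < tau and find(i) != find(j):
            parent[find(j)] = find(i)
    return [find(i) for i in range(len(centroids))]
```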

  • Parametric surface representation with bump image for dense 3D modeling using an RGB-D camera Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    International Journal of Computer Vision   123 ( 2 )   2017.6

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    When constructing a dense 3D model of an indoor static scene from a sequence of RGB-D images, the choice of the 3D representation (e.g. 3D mesh, cloud of points or implicit function) is of crucial importance. In the last few years, the volumetric truncated signed distance function (TSDF) and its extensions have become popular in the community and widely used for the task of dense 3D modelling using RGB-D sensors. However, as this representation is voxel based, it offers few possibilities for manipulating and/or editing the constructed 3D model, which limits its applicability. In particular, the amount of data required to maintain the volumetric TSDF rapidly becomes huge, which limits portability. Moreover, simplifications (such as mesh extraction and surface simplification) significantly reduce the accuracy of the 3D model (especially in the color space), and editing the 3D model is difficult. We propose a novel compact, flexible and accurate 3D surface representation based on parametric surface patches augmented by geometric and color texture images. Simple parametric shapes such as planes are roughly fitted to the input depth images, and the deviations of the 3D measurements from the fitted parametric surfaces are fused into a geometric texture image (called the Bump image). A confidence and a color texture image are also built. Our 3D scene representation is accurate yet memory efficient. Moreover, updating or editing the 3D model becomes trivial since it is reduced to manipulating 2D images. Our experimental results demonstrate the advantages of our proposed 3D representation through a concrete indoor scene reconstruction application.
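
    The core of the representation can be sketched for a single planar patch: each point's signed deviation from the fitted plane is accumulated into a 2D image indexed by the plane's parameterization. The confidence-weighted running fusion and color texture of the paper are omitted; the names and the simple per-cell averaging are ours.

```python
import numpy as np

def bump_image(points, origin, u_axis, v_axis, normal, res, size):
    """Fuse point deviations from a fitted plane into a 2D Bump image.

    The plane is parameterized by (origin, u_axis, v_axis); `normal`
    is its unit normal, `res` the cell size and `size` the patch extent.
    Per-cell averaging stands in for the paper's confidence-weighted
    running fusion.
    """
    h = w = int(size / res)
    bump = np.zeros((h, w), dtype=np.float32)
    count = np.zeros((h, w), dtype=np.float32)
    rel = points - origin
    u = (rel @ u_axis / res).astype(int)   # column index on the plane
    v = (rel @ v_axis / res).astype(int)   # row index on the plane
    dev = rel @ normal                     # signed deviation from plane
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    np.add.at(bump, (v[ok], u[ok]), dev[ok])
    np.add.at(count, (v[ok], u[ok]), 1.0)
    return np.where(count > 0, bump / np.maximum(count, 1.0), 0.0)
```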

  • Modeling large-scale indoor scenes with rigid fragments using RGB-D cameras Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    Computer Vision and Image Understanding   2017.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    Hand-held consumer depth cameras have become a commodity tool for constructing 3D models of indoor environments in real time. Recently, many methods to fuse low-quality depth images into a single dense and high-fidelity 3D model have been proposed. Nonetheless, dealing with large-scale scenes remains a challenging problem. In particular, the accumulation of small errors due to imperfect camera localization becomes crucial at large scale and results in dramatic deformations of the built 3D model. These deformations have to be corrected whenever possible (when a loop exists, for example). To facilitate such correction, we use a structured 3D representation where points are clustered into several planar patches that compose the scene. We then propose a two-stage framework to build, in detail and in real time, a large-scale 3D model. The first stage (the local mapping) generates local structured 3D models with rigidity constraints from short subsequences of RGB-D images. The second stage (the global mapping) aggregates all local 3D models into a single global model in a geometrically consistent manner. Thanks to our structured 3D representation, minimizing deformations of the global model reduces to re-positioning the planar patches of the local models. This allows efficient, yet accurate computations. Our experiments using real data confirm the effectiveness of our proposed method.
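
    Why global corrections become cheap with this representation can be shown with a sketch: since the scene is stored as planar patches attached to local fragments, applying a loop-closure correction amounts to one rigid transform per patch rather than re-fusing raw depth. The data layout below is illustrative.

```python
import numpy as np

def reposition_patches(patches, fragment_of, corrected_poses):
    """Apply corrected rigid fragment poses to their planar patches.

    patches         : list of (M_i, 3) arrays of patch points.
    fragment_of     : fragment index of each patch.
    corrected_poses : list of 4x4 rigid transforms from global mapping.
    One matrix product per patch repositions the whole model, which is
    the efficiency argument of the structured representation.
    """
    out = []
    for pts, f in zip(patches, fragment_of):
        T = corrected_poses[f]
        homog = np.c_[pts, np.ones(len(pts))]  # homogeneous coordinates
        out.append((homog @ T.T)[:, :3])
    return out
```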

  • Multi-view facial landmark detector learned by the Structured Output SVM Reviewed International journal

    Michal Uřičář, Vojtěch Franc, Diego Thomas, Akihiro Sugimoto, Václav Hlaváč

    Image and Vision Computing   47   2016.3

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We propose a real-time multi-view landmark detector based on Deformable Part Models (DPM). The detector is composed of a mixture of tree-based DPMs, each component describing landmark configurations in a specific range of viewing angles. The use of view-specific DPMs makes it possible to capture a large range of poses and to deal with the problem of self-occlusions. Parameters of the detector are learned from annotated examples by the Structured Output Support Vector Machines algorithm. The learning objective is directly related to the performance measure used for detector evaluation. The tree-based DPM allows a globally optimal landmark configuration to be found by dynamic programming. We propose a coarse-to-fine search strategy which enables real-time processing by dynamic programming even on high-resolution images. Empirical evaluation on “in the wild” images shows that the proposed detector is competitive with state-of-the-art methods in terms of speed and accuracy, yet it keeps the guarantee of finding a globally optimal estimate, in contrast to other methods.
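
    The globally optimal search can be illustrated with a toy max-sum dynamic program over a tree of parts, abstracting the DPM part responses into per-node unary scores over K candidate positions. This is a sketch of the principle, not the released detector.

```python
import numpy as np

def best_configuration(appearance, children, pair_cost, root=0):
    """Globally optimal part positions on a tree by dynamic programming.

    appearance : dict node -> (K,) unary scores over K candidate positions.
    children   : dict node -> list of child nodes (tree rooted at `root`).
    pair_cost  : dict (parent, child) -> (K, K) deformation costs.
    Returns (best total score, position index per node).
    """
    msg, arg = {}, {}

    def up(node):  # leaves-to-root pass: max-marginalize the children
        score = appearance[node].astype(float).copy()
        for c in children.get(node, []):
            up(c)
            table = msg[c][None, :] - pair_cost[(node, c)]
            arg[c] = table.argmax(axis=1)  # best child state per parent state
            score += table.max(axis=1)
        msg[node] = score

    up(root)
    pos = {root: int(msg[root].argmax())}

    def down(node):  # root-to-leaves pass: read back the argmaxes
        for c in children.get(node, []):
            pos[c] = int(arg[c][pos[node]])
            down(c)

    down(root)
    return float(msg[root].max()), pos
```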

    DOI: https://doi.org/10.1016/j.imavis.2016.02.004

  • Real-time multi-view facial landmark detector learned by the structured output SVM Reviewed International journal

    Michal Uřičář, Vojtěch Franc, Diego Thomas, Akihiro Sugimoto, Václav Hlaváč

    11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015   2015.5

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    While the problem of facial landmark detection has recently been receiving considerable attention in the computer vision community, most methods deal only with near-frontal views, and only a few truly multi-view detectors are available that are capable of detection over a wide range of yaw angles (e.g., φ ∈ (−90°, 90°)). We describe a multi-view facial landmark detector based on Deformable Part Models, which treats the problem of simultaneous landmark detection and viewing-angle estimation within a structured output classification framework. We present an easily extensible and flexible framework which provides real-time performance on “in the wild” images, evaluated on the challenging “Annotated Facial Landmarks in the Wild” database. We show that our detector achieves better results than the current state of the art in terms of localization error.

    DOI: 10.1109/FG.2015.7284810

  • A two-stage strategy for real-time dense 3D reconstruction of large-scale scenes. Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    IEEE European Conference on Computer Vision Workshops (ECCVW), 2014   2014.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    The frame-to-global-model approach is widely used for accurate 3D modeling from sequences of RGB-D images. Because no perfect camera tracking system yet exists, the accumulation of small errors generated when registering and integrating successive RGB-D images causes deformations of the 3D model being built up. In particular, the deformations become significant when the scale of the scene to model is large. To tackle this problem, we propose a two-stage strategy to build, in detail, a large-scale 3D model with minimal deformations, where the first stage creates accurate small-scale 3D scenes in real time from short subsequences of RGB-D images while the second stage re-organises all the results from the first stage in a geometrically consistent manner to reduce deformations as much as possible. By employing planar patches as the 3D scene representation, our proposed method runs in real time to build accurate 3D models with minimal deformations even for large-scale scenes. Our experiments using real data confirm the effectiveness of our proposed method.

  • A flexible scene representation for 3D reconstruction using an RGB-D camera Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    IEEE International Conference on Computer Vision (ICCV), 2013   2013.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Updating a global 3D model with live RGB-D measurements has proven to be successful for 3D reconstruction of indoor scenes. Recently, a Truncated Signed Distance Function (TSDF) volumetric model and a fusion algorithm have been introduced (KinectFusion), showing significant advantages such as computational speed and accuracy of the reconstructed scene. This algorithm, however, is expensive in memory when constructing and updating the global model. As a consequence, the method is not well scalable to large scenes. We propose a new flexible 3D scene representation using a set of planes that is cheap in memory use and, nevertheless, achieves accurate reconstruction of indoor scenes from RGB-D image sequences. Projecting the scene onto different planes significantly reduces the size of the scene representation, which allows us to generate a global textured 3D model with lower memory requirements while keeping accuracy and ease of updating with live RGB-D measurements. Experimental results demonstrate that our proposed flexible 3D scene representation achieves accurate reconstruction, while keeping scalability for large indoor scenes.

  • Learning to discover objects in RGB-D images using correlation clustering Reviewed International journal

    Michael Firman, Diego Thomas, Simon Julier, Akihiro Sugimoto

    IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013   1107 - 1112   2013.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We introduce a method to discover objects from RGB-D image collections which does not require a user to specify the number of objects expected to be found. We propose a probabilistic formulation to find pairwise similarity between image segments, using a classifier trained on labelled pairs from the recently released RGB-D Object Dataset. We then use a correlation clustering solver to both find the optimal clustering of all the segments in the collection and to recover the number of clusters. Unlike traditional supervised learning methods, our training data need not be of the same class or category as the objects we expect to discover. We show that this parameter-free supervised clustering method has superior performance to traditional clustering methods.
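
    A toy greedy solver conveys the flavor of this step, assuming a symmetric matrix of pairwise log-odds similarities (positive means "same object"). A real correlation-clustering solver optimizes the objective more globally; the sketch only shows how the number of clusters emerges from the signs of the similarities.

```python
import numpy as np

def greedy_correlation_clustering(W):
    """Greedily cluster N segments from pairwise log-odds similarities.

    W : (N, N) symmetric matrix; W[i, j] > 0 suggests i and j belong
    to the same object. No cluster count is given: a new cluster opens
    whenever a segment fits none of the existing ones.
    """
    n = len(W)
    labels = -np.ones(n, dtype=int)
    k = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        labels[i] = k  # open a new cluster with segment i as seed
        for j in range(i + 1, n):
            # Attach j if its total affinity to the cluster is positive.
            if labels[j] < 0 and W[j, labels == k].sum() > 0:
                labels[j] = k
        k += 1
    return labels
```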

  • Compact and accurate 3-d face modeling using an rgb-d camera: Let's open the door to 3-d video conference Reviewed International journal

    Pavan Kumar Anasosalu, Diego Thomas, Akihiro Sugimoto

    The IEEE International Conference on Computer Vision (ICCV) Workshops   2013.5

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present a method for producing an accurate and compact 3-D face model in real time using a low-cost RGB-D sensor like the Kinect camera. We extend and use Bump Images for highly accurate and low-memory 3-D reconstruction of the human face. Bump Images are generated by representing the Cartesian coordinates of points on the face in the spherical coordinate system whose origin is the center of the head. After initialization, the Bump Images are updated in real time with every RGB-D frame with respect to the current viewing direction and head pose that are estimated using the frame-to-global-model registration strategy. While the high accuracy of the representation allows fine details to be recovered, the low memory use opens up new possible applications of consumer depth cameras such as 3-D video conferencing. We validate our approach by quantitatively comparing our result with the result obtained by a commercial high-resolution laser scanner. We also discuss the potential of our proposed method for a 3-D video conferencing application with existing internet speeds.
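
    The representation can be sketched in a few lines, assuming points already expressed in the head-centered frame: each point's radius is stored in a 2D image indexed by its two spherical angles. The per-pixel running fusion and confidence map of the real-time system are omitted, and the function name is ours.

```python
import numpy as np

def spherical_bump_image(points, head_center, height=256):
    """Store face points as radii indexed by spherical angles.

    Each point becomes (r, theta, phi) about `head_center`; the radius
    is written into a 2D image indexed by the two angles. Minimal
    sketch: later points overwrite earlier ones instead of being fused.
    """
    width = 2 * height
    img = np.zeros((height, width), dtype=np.float32)
    rel = points - head_center
    r = np.linalg.norm(rel, axis=1)
    theta = np.arccos(np.clip(rel[:, 2] / np.maximum(r, 1e-9), -1, 1))
    phi = np.arctan2(rel[:, 1], rel[:, 0])
    v = ((theta / np.pi) * (height - 1)).astype(int)
    u = (((phi + np.pi) / (2 * np.pi)) * (width - 1)).astype(int)
    img[v, u] = r
    return img
```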

  • Robust simultaneous 3D registration via rank minimization Reviewed International journal

    Diego Thomas, Yasuyuki Matsushita, Akihiro Sugimoto

    Second International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), 2012   2012.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present a robust and accurate 3D registration method for a dense sequence of depth images taken from unknown viewpoints. Our method simultaneously estimates multiple extrinsic parameters of the depth images to obtain a registered full 3D model of the scanned scene. By arranging the depth measurements in a matrix form, we formulate the problem as a simultaneous estimation of multiple extrinsics and a low-rank matrix, which corresponds to the aligned depth images as well as a sparse error matrix. Unlike previous approaches that use sequential or heuristic global registration approaches, our solution method uses an advanced convex optimization technique for obtaining a robust solution via rank minimization. To achieve accurate computation, we develop a depth projection method that has minimum sensitivity to sampling by reading projected depth values in the input depth images. We demonstrate the effectiveness of the proposed method through extensive experiments and compare it with previous standard techniques.
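
    The low-rank-plus-sparse structure at the core of this formulation is the robust PCA relaxation of rank minimization. A plain iterative-thresholding sketch (not the paper's accelerated solver) looks as follows, with the columns of D holding the projected depth measurements.

```python
import numpy as np

def robust_pca(D, iters=200):
    """Decompose D into low-rank L (consistent geometry) plus sparse S
    (gross errors) by alternating singular-value and soft thresholding.
    Standard parameter choices from the RPCA literature; illustrative
    of the relaxation, not of the paper's exact solver.
    """
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = 0.25 * m * n / np.abs(D).sum()
    S = np.zeros_like(D)
    Y = np.zeros_like(D)  # dual variable
    soft = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
    for _ in range(iters):
        # Singular value thresholding: low-rank update.
        U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Soft thresholding: sparse error update.
        S = soft(D - L + Y / mu, lam / mu)
        Y += mu * (D - L - S)
    return L, S
```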

  • Range Image Registration Based on Photometry Reviewed

    Diego Thomas

    PhD thesis, The National Institute of Informatics, SOKENDAI, Tokyo, Japan   2012.3

     More details

    Language:English   Publishing type:Doctoral thesis  

    3D modeling of a real scene refers to constructing a virtual representation of the scene, generally simplified, that can be used or modified at will. Constructing such a 3D model by hand is a laborious and time-consuming task, and automating the whole process has attracted growing interest in the computer vision field. In particular, the task of registering (i.e. aligning) different parts of the scene (called range images) acquired from different viewpoints is of crucial importance when constructing 3D models. During the last decades, researchers have concentrated their efforts on this problem and proposed several methodologies to automatically register range images. Thereby, key-point detectors and descriptors have been utilized to match points across different range images using geometric features or textural features. Several similarity metrics have also been proposed to identify the overlapping regions. In spite of the advantages of the current methods, several limitation cases have been reported. In particular, when the scene lacks discriminative geometric features, the difficulty of accounting for the changes in appearance of the scene observed in different poses, or from different viewpoints, significantly degrades the performance of the current methods. We address this issue by investigating the use of photometry (i.e. the relationship between geometry, reflectance properties and illumination) for range image registration. First, we propose a robust descriptor using albedo that is permissive to errors in the illumination estimation. Second, we propose an albedo extraction technique for specular surfaces that enlarges the range of materials we can deal with. Third, we propose a photometric metric under unknown lighting that allows registration of range images without any assumptions on the illumination. With these proposed methods, we significantly enlarge the practicability and range of applications of range image registration.

  • Illumination-free photometric metric for range image registration Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    IEEE Workshop on Applications of Computer Vision (WACV), 2012   2012.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper presents an illumination-free photometric metric for evaluating the goodness of a rigid transformation aligning two overlapping range images, under the assumption of a Lambertian surface. Our metric is based on photometric re-projection error rather than on feature detection and matching. We synthesize the color of one image using the albedo of the other image to compute the photometric re-projection error. The unknown illumination and albedo are estimated from the correspondences induced by the input transformation using the spherical harmonics representation of image formation. This allows us to derive an illumination-free photometric metric for range image alignment. We use a hypothesize-and-test method to search for the transformation that minimizes our illumination-free photometric function. Transformation candidates are efficiently generated by employing the spherical representation of each image. Experimental results using synthetic and real data show the usefulness of the proposed metric.
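
    The inner estimation step can be sketched as a least-squares fit of second-order spherical harmonics lighting coefficients under the Lambertian model I ≈ albedo · (l · Y(n)); the basis constants are dropped for brevity and all names are illustrative.

```python
import numpy as np

def sh_basis(normals):
    """Second-order spherical harmonics basis of unit normals
    (scaling constants omitted for brevity)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([np.ones_like(x), x, y, z,
                     x * y, x * z, y * z,
                     x ** 2 - y ** 2, 3 * z ** 2 - 1], axis=1)

def estimate_lighting(normals, albedo, intensity):
    """Least-squares 9-coefficient lighting estimate from corresponding
    normals, albedos and observed intensities."""
    A = albedo[:, None] * sh_basis(normals)
    l, *_ = np.linalg.lstsq(A, intensity, rcond=None)
    return l

def photometric_error(normals, albedo, intensity, l):
    """Re-projection error of the synthesized intensities: low when the
    candidate rigid transformation (which induced the correspondences)
    is correct, which is how the metric scores an alignment."""
    pred = (albedo[:, None] * sh_basis(normals)) @ l
    return float(np.mean((pred - intensity) ** 2))
```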

  • Robustly registering range images using local distribution of albedo Reviewed International journal

    Diego Thomas and Akihiro Sugimoto

    Computer Vision and Image Understanding   115 ( 5 )   649 - 667   2011.5

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We propose a robust method for registering overlapping range images of a Lambertian object under a rough estimate of illumination. Because reflectance properties are invariant to changes in illumination, albedo is promising for range image registration of Lambertian objects lacking discriminative geometric features under variable illumination. We use adaptive regions in our method to model the local distribution of albedo, which enables us to stably extract reliable attributes of each point against illumination estimates. We use a level-set method to grow robust and adaptive regions to define these attributes. A similarity metric between two attributes is also defined to match points in the overlapping area. Moreover, remaining mismatches are efficiently removed using the rigidity constraint of surfaces. Our experiments using synthetic and real data demonstrate the robustness and effectiveness of our proposed method.

    DOI: https://doi.org/10.1016/j.cviu.2010.11.016

  • Range image registration of specular objects under complex illumination Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    Fifth International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT2010)   2010.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present a method for range image registration of specular objects devoid of salient geometric properties under a complex lighting environment. Our method uses illumination consistency between two range images to detect specular highlights, which are used to obtain diffuse reflection components. By using light information estimated from the specular highlights and the diffuse reflection components, we extract albedo at the surface of an object, even under an unknown complex lighting environment. We then robustly register the two range images using the extracted albedo. This technique can handle various kinds of illumination situations and can be applied to a wide range of materials. Our experiments using synthetic and real data show the effectiveness, robustness and accuracy of our proposed method.

  • Range image registration of specular objects Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    Proc. of CVWW’10   2010.2

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present a method for range image registration of specular objects devoid of salient geometric properties under a complex lighting environment. We propose to use illumination consistency between two range images to detect specular highlights, which are used to obtain diffuse reflection components. By using light information estimated from the specular highlights and the diffuse reflection components, we extract photometric features invariant to changes in pose and illumination, even under an unknown complex lighting environment. We then robustly register the two range images using these features. This technique can handle various kinds of illumination situations and can be applied to a wide range of materials. Our experiments using synthetic data show the effectiveness, robustness and accuracy of our proposed method.

  • Robust range image registration using local distribution of albedo Reviewed International journal

    Diego Thomas, Akihiro Sugimoto

    IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), 2009   2009.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a robust registration method for range images under a rough estimate of illumination. Because reflectance properties are invariant to changes in illumination, they are promising for range image registration of objects lacking discriminative geometric features under variable illumination. In our method, we use adaptive regions to model the local distribution of reflectance, which enables us to stably extract reliable attributes of each point against illumination estimation. We use a level set method to grow robust and adaptive regions to define these attributes. A similarity metric between two attributes is defined using principal component analysis to find matches. Moreover, remaining mismatches are efficiently removed using the rigidity constraint of surfaces. Our experiments using synthetic and real data demonstrate the robustness and effectiveness of our proposed method.

▼display all

Presentations

  • TetraTSDF: 3D human reconstruction from a single image with a tetrahedral outer shell International conference

    Hayato Onizuka, Zehra Hayirci, Diego Thomas, Akihiro Sugimoto, Hideaki Uchiyama, Rin-ichiro Taniguchi

    IEEE/CVF Conference on Computer Vision and Pattern Recognition  2020.6 

     More details

    Event date: 2021.5

    Language:English  

    Country:Other  

  • Human shape reconstruction with loose clothes from partially observed data by pose specific deformation International conference

    Akihiko Sayo, Hayato Onizuka, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Pacific-Rim Symposium on Image and Video Technology  2019.11 

     More details

    Event date: 2021.5

    Language:English  

    Country:Australia  

  • ActiveNeuS: Neural Signed Distance Fields for Active Stereo International conference

    Kazuto Ichimaru, Takaki Ikeda, Diego Thomas, Takafumi Iwaguchi, Hiroshi Kawasaki

    International Conference on 3D Vision  2024.3 

     More details

    Event date: 2024.3

    Language:English  

    Venue:Davos   Country:Switzerland  

  • A Two-step Approach for Interactive Animatable Avatars International conference

    Takumi Kitamura, Naoya Iwamoto, Hiroshi Kawasaki, Diego Thomas

    COMPUTER GRAPHICS INTERNATIONAL 2023  2023.8 

     More details

    Event date: 2023.8 - 2023.9

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Shanghai   Country:China  

    We propose a two-step human body animation technique that generates pose-dependent detailed deformations in real time on a standard animation pipeline. To accomplish real-time animation, we utilize a template-based approach and represent pose-dependent deformations using 2D displacement maps. To generalize to totally new motions, we employ a two-step strategy: 1) the first step aligns the topology of the Skinned Multi-Person Linear Model (SMPL) [23] to our proposed template model; 2) the second step models detailed clothes and muscle deformation for the specific motion. Our experimental results show that our proposed method can animate an avatar up to 30 times faster than baselines while keeping a similar or even better level of detail.
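
    Step 2 can be illustrated with a minimal sketch: displacement values sampled from the predicted 2D map offset the template vertices along their normals. The network that predicts the map and the SMPL topology alignment of step 1 are not reproduced; the names are ours.

```python
import numpy as np

def apply_displacement_map(vertices, normals, uvs, disp_map):
    """Offset template vertices along their normals by values sampled
    from a pose-dependent 2D displacement map (nearest-neighbor lookup).

    vertices, normals : (V, 3) template vertices and unit normals.
    uvs               : (V, 2) texture coordinates in [0, 1].
    disp_map          : (H, W) displacement values for the current pose.
    """
    h, w = disp_map.shape
    u = np.clip((uvs[:, 0] * (w - 1)).astype(int), 0, w - 1)
    v = np.clip((uvs[:, 1] * (h - 1)).astype(int), 0, h - 1)
    d = disp_map[v, u]                       # per-vertex displacement
    return vertices + normals * d[:, None]   # offset along the normals
```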

  • Refining OpenPose With a New Sports Dataset for Robust 2D Pose Estimation International conference

    Takumi Kitamura, Hitoshi Teshima, Diego Thomas, Hiroshi Kawasaki

    IEEE/CVF Winter Conference on Applications of Computer Vision  2022.1 

     More details

    Event date: 2022.1

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Hawaii (online)   Country:United States  

    3D markerless motion capture can be achieved by triangulating estimated multi-view 2D poses. However, when the 2D pose estimation fails, the 3D motion capture also fails. This is particularly challenging for sports performances of athletes, which involve extreme poses. In extreme poses (such as having the head down), state-of-the-art 2D pose estimators such as OpenPose do not work at all. In this paper, we propose a new method to improve the training of 2D pose estimators for extreme poses by leveraging a new sports dataset and our proposed data augmentation strategy. Our results show significant improvements over previous methods for 2D pose estimation of athletes performing acrobatic moves, while keeping state-of-the-art performance on standard datasets.
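
    The kind of augmentation the paper motivates can be sketched for the exact quarter-turn case: rotating both the image and its 2D keypoints so that upright-biased estimators also see head-down configurations during training. This is an illustration, not the paper's pipeline; arbitrary angles would need an interpolating warp instead of np.rot90.

```python
import numpy as np

def rotate_sample_90(image, keypoints, k):
    """Rotate an image and its (x, y) keypoints by k quarter turns
    counter-clockwise, e.g. k=2 turns an upright pose head-down."""
    h, w = image.shape[:2]
    img = np.rot90(image, k % 4)
    x = keypoints[:, 0].astype(float)
    y = keypoints[:, 1].astype(float)
    for _ in range(k % 4):
        # One CCW quarter turn in image coordinates maps
        # (x, y) -> (y, w - 1 - x); swap width/height for the next turn.
        x, y = y, (w - 1) - x
        h, w = w, h
    return img, np.stack([x, y], axis=1)
```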

    Other Link: https://openaccess.thecvf.com/content/WACV2022W/CV4WS/html/Kitamura_Refining_OpenPose_With_a_New_Sports_Dataset_for_Robust_2D_WACVW_2022_paper.html

  • Efficient OpenPose retraining for markerless motion capture of athletes

    #Takumi Kitamura, Hiroshi Kawasaki, Diego Thomas

    IPSJ Joint Research Meeting of the 248th SIG on Natural Language Processing and the 226th SIG on Computer Vision and Image Media  2021.5 

     More details

    Event date: 2021.5

    Language:Japanese   Presentation type:Symposium, workshop panel (public)  

    Country:Japan  

  • Analysis and Classification of Gestures in TED Talks

    Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Technical Meeting on Pattern Recognition and Media Understanding (PRMU 2020)  2020.10 

     More details

    Event date: 2021.5

    Language:Japanese   Presentation type:Symposium, workshop panel (public)  

    Country:Japan  

  • Unsupervised 3D Human Pose Estimation in Multi-view-multi-pose Video International conference

    Cheng Sun, Diego Thomas, Hiroshi Kawasaki

    25th International Conference on Pattern Recognition (ICPR)  2021.1 

     More details

    Event date: 2021.5

    Language:English  

    Country:Other  

  • 3D human body reconstruction using RGB-D camera Invited International conference

    Diego Thomas

    Asia Pacific Society for Computing and Information Technology 2019 Annual Meeting  2019.7 

     More details

    Event date: 2019.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Sapporo, Hokkaido   Country:Japan  

    Consumer-grade RGB-D cameras have become the commodity tool to build dense 3D models of indoor scenes. Motivated by the strong demand to build high-fidelity personal 3D avatars, many efforts are now being made to use RGB-D cameras to automatically reconstruct high-quality 3D models of the human body. This is a very difficult task because the human body moves non-rigidly during the scanning process. How to simultaneously reconstruct the detailed 3D shape of the body while accurately tracking the non-rigid motion is the main challenge that all successful systems must solve. In addition, to be usable on portable devices such as smartphones, solutions that require little memory and low computational power are needed. In this talk, we will first briefly review existing successful strategies for real-time 3D human body reconstruction. Then, we will present our proposed solution for 3D human body reconstruction that is light in memory consumption and computational power. Our main idea is to separate the full-body non-rigid reconstruction into multiple nearly-rigid reconstructions of body parts that are tightly stitched together.

  • VMPFusion: Variational Message Passing for dynamic 3D face reconstruction Invited International conference

    Diego Thomas

    IDS/JFLI workshop  2018.5 

     More details

    Event date: 2019.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Osaka   Country:Japan  

    In this talk I will describe a probabilistic approach for dynamic 3D face modeling using a consumer-grade RGB-D camera. The goal of this research is to formulate a strategy to fuse noisy 3D measurements captured with a Kinect camera into a 3D facial model without relying on explicit point correspondences. We propose to tackle this challenging problem with the Variational Message Passing (VMP) algorithm, which optimizes a variational distribution using a message-passing procedure on a graphical model. We show the validity of our formulation with real-data experiments.

  • 3D Modeling of Large-Scale Indoor Scenes Using RGB-D Cameras Invited International conference

    Diego Thomas, Akihiro Sugimoto

    The 1st International Conference on Advanced Imaging  2015.6 

     More details

    Event date: 2018.6

    Language:English   Presentation type:Oral presentation (general)  

    Venue:National Center of Science, Tokyo, Japan   Country:Japan  

  • Synthesis of environment maps for mixed reality

    David R. Walton, Diego Gabriel Francis Thomas, Anthony Steed, Akihiro Sugimoto

    16th IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2017  2017.11 

     More details

    Event date: 2017.10

    Language:English  

    Venue:Nantes   Country:France  

    When rendering virtual objects in a mixed reality application, it is helpful to have access to an environment map that captures the appearance of the scene from the perspective of the virtual object. It is straightforward to render virtual objects into such maps, but capturing and correctly rendering the real components of the scene into the map is much more challenging. This information is often recovered from physical light probes, such as reflective spheres or fisheye cameras, placed at the location of the virtual object in the scene. For many application areas, however, real light probes would be intrusive or impractical. Ideally, all of the information necessary to produce detailed environment maps could be captured using a single device. We introduce a method using an RGBD camera and a small fisheye camera, contained in a single unit, to create environment maps at any location in an indoor scene. The method combines the output from both cameras to correct for their limited field of view and the displacement from the virtual object, producing complete environment maps suitable for rendering the virtual content in real time. Our method improves on previous probeless approaches by its ability to recover high-frequency environment maps. We demonstrate how this can be used to render virtual objects which shadow, reflect and refract their environment convincingly.

  • Fast 3D point cloud segmentation using supervoxels with geometry and color for 3D scene understanding

    Francesco Verdoja, Diego Gabriel Francis Thomas, Akihiro Sugimoto

    2017 IEEE International Conference on Multimedia and Expo, ICME 2017  2017.8 

     More details

    Event date: 2017.7

    Language:English  

    Venue:Hong Kong  

    Segmentation of 3D colored point clouds is a research field with renewed interest thanks to the recent availability of inexpensive consumer RGB-D cameras and its importance as an unavoidable low-level step in many robotic applications. However, the nature of 3D data makes the task challenging and, thus, many different techniques are being proposed, all of which incur high computational costs. This paper presents a novel fast method for 3D colored point cloud segmentation. It starts with supervoxel partitioning of the cloud, i.e., an oversegmentation of the points in the cloud. It then leverages a novel metric exploiting both geometry and color to iteratively merge the supervoxels into a 3D segmentation in which the hierarchical structure of partitions is maintained. The algorithm also has computational complexity linear in the size of the input. Experimental results over two publicly available datasets demonstrate that our proposed method outperforms state-of-the-art techniques.

  • Augmented blendshapes for real-time simultaneous 3D head modeling and facial motion capture

    Diego Gabriel Francis Thomas, Rin-Ichiro Taniguchi

    2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016  2016 

     More details

    Event date: 2016.6 - 2016.7

    Language:English  

    Venue:Las Vegas   Country:United States  

    We propose a method to build animated 3D head models in real time using a consumer-grade RGB-D camera. Our framework is the first to provide simultaneously comprehensive facial motion tracking and a detailed 3D model of the user's head. Anyone's head can be instantly reconstructed and their facial motion captured without requiring any training or pre-scanning. The user starts facing the camera with a neutral expression in the first frame, but is otherwise free to move, talk and change facial expression at will. The facial motion is tracked using a blendshape representation while the fine geometric details are captured using a Bump image mapped over the template mesh. We propose an efficient algorithm to grow and refine the 3D model of the head on-the-fly and in real time. We demonstrate robust and high-fidelity simultaneous facial motion tracking and 3D head modeling results on a wide range of subjects with various head poses and facial expressions. Our proposed method offers interesting possibilities for animation production and 3D video telecommunications.
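
    The blendshape part of the representation reduces to a weighted sum of expression offsets over a neutral mesh; a one-function evaluation sketch (without the Bump image detail layer) is shown below.

```python
import numpy as np

def blendshape_mesh(neutral, deltas, weights):
    """Evaluate a blendshape face model.

    neutral : (V, 3) neutral-expression vertices.
    deltas  : (B, V, 3) per-blendshape vertex offsets.
    weights : (B,) expression coefficients tracked per frame.
    The tracked expression is the neutral mesh plus the weighted sum
    of offsets; fine geometric detail would be added on top via the
    Bump image mapped over this template.
    """
    return neutral + np.tensordot(weights, deltas, axes=1)
```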

  • Dense 3D reconstruction using RGB-D cameras Invited International conference

    Diego Thomas

    International Conference on 3DVision 2014  2014.12 

     More details

    Event date: 2014.12

    Language:English   Presentation type:Public lecture, seminar, tutorial, course, or other speech  

    Venue:3DV2014, Tokyo, Japan.   Country:Japan  

    The generation of fine 3D models from RGB-D (color plus depth) measurements is of great interest to the computer vision community. Although the 3D reconstruction pipeline has been widely studied in the last decades, a new era started recently with the advent of low-cost consumer depth cameras (called RGB-D cameras) that capture RGB-D images at video rate (e.g., Microsoft Kinect or Asus Xtion Pro). The availability of 3D measurements to the public has brought its own revolution to the scientific community, with many projects and applications using RGB-D cameras.

    In this tutorial, we will give an overview of existing 3D reconstruction methods using a single RGB-D camera, covering various 3D representations: point-based representations (SURFELS), implicit volumetric representations (TSDF), patch-based representations and parametric representations. These different 3D scene representations give us powerful tools to build virtual representations of the real world in real time from RGB-D cameras. We can reconstruct not only small-scale static scenes but also large-scale scenes and dynamic scenes. We will also discuss current trends in depth sensing and future challenges for 3D scene reconstruction.

  • A two-stage strategy for real-time dense 3D reconstruction of large-scale scenes

    Diego Gabriel Francis Thomas, Akihiro Sugimoto

    13th European Conference on Computer Vision, ECCV 2014  2015.1 

     More details

    Event date: 2014.9

    Language:English  

    Venue:Zurich   Country:Switzerland  

    The frame-to-global-model approach is widely used for accurate 3D modeling from sequences of RGB-D images. Because no perfect camera tracking system yet exists, the accumulation of small errors generated when registering and integrating successive RGB-D images causes deformations of the 3D model being built up. In particular, the deformations become significant when the scale of the scene to model is large. To tackle this problem, we propose a two-stage strategy to build, in detail, a large-scale 3D model with minimal deformations, where the first stage creates accurate small-scale 3D scenes in real time from short subsequences of RGB-D images while the second stage re-organises all the results from the first stage in a geometrically consistent manner to reduce deformations as much as possible. By employing planar patches as the 3D scene representation, our proposed method runs in real time to build accurate 3D models with minimal deformations even for large-scale scenes. Our experiments using real data confirm the effectiveness of our proposed method.

  • A flexible scene representation for 3D reconstruction using an RGB-D camera

    Diego Gabriel Francis Thomas, Akihiro Sugimoto

    2013 14th IEEE International Conference on Computer Vision, ICCV 2013  2013 

     More details

    Event date: 2013.12

    Language:English  

    Venue:Sydney, NSW   Country:Australia  

    Updating a global 3D model with live RGB-D measurements has proven to be successful for 3D reconstruction of indoor scenes. Recently, a Truncated Signed Distance Function (TSDF) volumetric model and a fusion algorithm have been introduced (KinectFusion), showing significant advantages such as computational speed and accuracy of the reconstructed scene. This algorithm, however, is expensive in memory when constructing and updating the global model. As a consequence, the method is not well scalable to large scenes. We propose a new flexible 3D scene representation using a set of planes that is cheap in memory use and, nevertheless, achieves accurate reconstruction of indoor scenes from RGB-D image sequences. Projecting the scene onto different planes significantly reduces the size of the scene representation, which allows us to generate a global textured 3D model with lower memory requirements while keeping accuracy and ease of updating with live RGB-D measurements. Experimental results demonstrate that our proposed flexible 3D scene representation achieves accurate reconstruction, while keeping scalability for large indoor scenes.

  • Compact and accurate 3-D face modeling using an RGB-D camera Let's open the door to 3-D video conference

    Pavan Kumar Anasosalu, Diego Gabriel Francis Thomas, Akihiro Sugimoto

    2013 14th IEEE International Conference on Computer Vision Workshops, ICCVW 2013  2013 

     More details

    Event date: 2013.12

    Language:English  

    Venue:Sydney, NSW   Country:Australia  

    We present a method for producing an accurate and compact 3-D face model in real time using a low-cost RGB-D sensor like the Kinect camera. We extend and use Bump Images for highly accurate and low-memory 3-D reconstruction of the human face. Bump Images are generated by representing the Cartesian coordinates of points on the face in the spherical coordinate system whose origin is the center of the head. After initialization, the Bump Images are updated in real time with every RGB-D frame with respect to the current viewing direction and head pose that are estimated using the frame-to-global-model registration strategy. While the high accuracy of the representation allows fine details to be recovered, the low memory use opens up new possible applications of consumer depth cameras such as 3-D video conferencing. We validate our approach by quantitatively comparing our result with the result obtained by a commercial high-resolution laser scanner. We also discuss the potential of our proposed method for a 3-D video conferencing application with existing internet speeds.

  • Learning to discover objects in RGB-D images using correlation clustering

    Michael Firman, Diego Gabriel Francis Thomas, Simon Julier, Akihiro Sugimoto

    2013 26th IEEE/RSJ International Conference on Intelligent Robots and Systems: New Horizon, IROS 2013  2013.12 

     More details

    Event date: 2013.11

    Language:English  

    Venue:Tokyo   Country:Japan  

    We introduce a method to discover objects from RGB-D image collections which does not require a user to specify the number of objects expected to be found. We propose a probabilistic formulation to find pairwise similarity between image segments, using a classifier trained on labelled pairs from the recently released RGB-D Object Dataset. We then use a correlation clustering solver to both find the optimal clustering of all the segments in the collection and to recover the number of clusters. Unlike traditional supervised learning methods, our training data need not be of the same class or category as the objects we expect to discover. We show that this parameter-free supervised clustering method has superior performance to traditional clustering methods.

  • Robust simultaneous 3D registration via rank minimization

    Diego Gabriel Francis Thomas, Yasuyuki Matsushita, Akihiro Sugimoto

    2nd Joint 3DIM/3DPVT Conference: 3D Imaging, Modeling, Processing, Visualization and Transmission, 3DIMPVT 2012  2012 

     More details

    Event date: 2012.10

    Language:English  

    Venue:Zurich   Country:Switzerland  

    We present a robust and accurate 3D registration method for a dense sequence of depth images taken from unknown viewpoints. Our method simultaneously estimates multiple extrinsic parameters of the depth images to obtain a registered full 3D model of the scanned scene. By arranging the depth measurements in a matrix form, we formulate the problem as a simultaneous estimation of multiple extrinsics and a low-rank matrix, which corresponds to the aligned depth images as well as a sparse error matrix. Unlike previous approaches that use sequential or heuristic global registration approaches, our solution method uses an advanced convex optimization technique for obtaining a robust solution via rank minimization. To achieve accurate computation, we develop a depth projection method that has minimum sensitivity to sampling by reading projected depth values in the input depth images. We demonstrate the effectiveness of the proposed method through extensive experiments and compare it with previous standard techniques.

  • Illumination-free photometric metric for range image registration

    Diego Gabriel Francis Thomas, Akihiro Sugimoto

    2012 IEEE Workshop on the Applications of Computer Vision, WACV 2012  2012 

     More details

    Event date: 2012.1

    Language:English  

    Venue:Breckenridge, CO   Country:United States  

    This paper presents an illumination-free photometric metric for evaluating the goodness of a rigid transformation aligning two overlapping range images, under the assumption of a Lambertian surface. Our metric is based on photometric re-projection error rather than on feature detection and matching. We synthesize the color of one image using the albedo of the other image to compute the photometric re-projection error. The unknown illumination and albedo are estimated from the correspondences induced by the input transformation using the spherical harmonics representation of image formation. This allows us to derive an illumination-free photometric metric for range image alignment. We use a hypothesize-and-test method to search for the transformation that minimizes our illumination-free photometric function. Transformation candidates are efficiently generated by employing the spherical representation of each image. Experimental results using synthetic and real data show the usefulness of the proposed metric.

  • Robust range image registration using local distribution of albedo

    Diego Gabriel Francis Thomas, Akihiro Sugimoto

    2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009  2009 

     More details

    Event date: 2009.9 - 2009.10

    Language:English  

    Venue:Kyoto   Country:Japan  

    We propose a robust registration method for range images under a rough estimate of illumination. Because reflectance properties are invariant to changes in illumination, they are promising for range image registration of objects lacking discriminative geometric features under variable illumination. In our method, we use adaptive regions to model the local distribution of reflectance, which enables us to stably extract reliable attributes of each point against illumination estimation. We use a level set method to grow robust and adaptive regions to define these attributes. A similarity metric between two attributes is defined using principal component analysis to find matches. Moreover, remaining mismatches are efficiently removed using the rigidity constraint of surfaces. Our experiments using synthetic and real data demonstrate the robustness and effectiveness of our proposed method.

▼display all

Professional Memberships

  • IEEE

Academic Activities

  • Area chair

    The 27th Symposium on Image Recognition and Understanding (MIRU2024)  ( Kumamoto Japan ) 2024.8

     More details

    Type:Competition, symposium, etc. 

    Number of participants:1,000

  • Screening of academic papers

    Role(s): Peer review

    2023

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:3

    Proceedings of International Conference Number of peer-reviewed papers:25

    Proceedings of domestic conference Number of peer-reviewed papers:3

  • Program committee International contribution

    CVPR2022  ( New Orleans, Louisiana United States of America ) 2022.6

     More details

    Type:Competition, symposium, etc. 

    Number of participants:10,000

  • Senior Program Committee International contribution

    AAAI 2022  ( Vancouver Canada ) 2022.2 - 2022.3

     More details

    Type:Competition, symposium, etc. 

    Number of participants:8,000

  • Screening of academic papers

    Role(s): Peer review

    2022

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:8

    Proceedings of International Conference Number of peer-reviewed papers:18

    Proceedings of domestic conference Number of peer-reviewed papers:3

  • Senior Program Committee Member International contribution

    30th International Joint Conference on Artificial Intelligence (IJCAI-21)  ( Montreal Canada ) 2021.8 - 2021.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:1,000

  • Program committee International contribution

    CVPR 2021  ( Online United States of America ) 2021.6 - 2021.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:10,000

  • Program committee International contribution

    WACV 2021  ( United States of America ) 2021.3 - 2021.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:400

  • Session chair

    The 83rd National Convention of IPSJ  ( Online Japan ) 2021.3 - 2021.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:100

  • Local chair International contribution

    3D Vision (3DV 2020)  ( Fukuoka Japan ) 2020.11

     More details

    Type:Competition, symposium, etc. 

    Number of participants:300

  • Program committee International contribution

    CVPR 2020  ( Seattle, Washington United States of America ) 2020.6

     More details

    Type:Competition, symposium, etc. 

    Number of participants:8,000

  • Program committee International contribution

    WACV 2020  ( Aspen United States of America ) 2020.3

     More details

    Type:Competition, symposium, etc. 

    Number of participants:1,000

  • Screening of academic papers

    Role(s): Peer review

    2020

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:4

    Number of peer-reviewed articles in Japanese journals:2

    Proceedings of International Conference Number of peer-reviewed papers:25

    Proceedings of domestic conference Number of peer-reviewed papers:4

  • Program chair International contribution

    Machine Perception and Robotics (MPR 2019)  ( Biwako Kusatsu Campus (BKC), Ritsumeikan University Japan ) 2019.11

     More details

    Type:Competition, symposium, etc. 

    Number of participants:80

  • Area chair International contribution

    The 9th Pacific-Rim Symposium on Image and Video Technology (PSIVT 2019)  ( Sydney Australia ) 2019.11

     More details

    Type:Competition, symposium, etc. 

    Number of participants:80

  • Screening of academic papers

    Role(s): Peer review

    2019

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:15

    Proceedings of International Conference Number of peer-reviewed papers:25

  • Publicity chair International contribution

    The 12th International Workshop on Information Search, Integration, and Personalization (ISIP2018)  ( Kyushu University, Fukuoka, Japan Japan ) 2018.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:40

  • Publicity chair International contribution

    The 12th International Workshop on Information Search, Integration, and Personalization (ISIP 2018)  ( Kyushu University, Fukuoka Japan ) 2018.5

     More details

    Type:Competition, symposium, etc. 

    Number of participants:50

  • Screening of academic papers

    Role(s): Peer review

    2018

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:20

    Proceedings of International Conference Number of peer-reviewed papers:20

  • Local arrangement chair International contribution

    JFLI-KYUDAI JOINT WORKSHOP ON INFORMATICS  ( Ito Campus, Kyushu University, Fukuoka, Japan Japan ) 2017.9

     More details

    Type:Competition, symposium, etc. 

    Number of participants:15

  • Screening of academic papers

    Role(s): Peer review

    2017

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:10

    Proceedings of International Conference Number of peer-reviewed papers:24

  • Program Committee International contribution

    SITIS2016  ( Naples Italy ) 2016.11 - 2016.12

     More details

    Type:Competition, symposium, etc. 

  • Program Committee

    MIRU2016  ( Hamamatsu Japan ) 2016.8

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2016

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:2

    Proceedings of International Conference Number of peer-reviewed papers:19

    Proceedings of domestic conference Number of peer-reviewed papers:1

  • Program Committee

    MIRU2015  ( Osaka Japan ) 2015.7

     More details

    Type:Competition, symposium, etc. 

  • Screening of academic papers

    Role(s): Peer review

    2015

     More details

    Type:Peer review 

    Number of peer-reviewed articles in foreign language journals:1

    Proceedings of International Conference Number of peer-reviewed papers:12

  • Program committee

    MIRU2014  ( Okayama Japan ) 2014.7

     More details

    Type:Competition, symposium, etc. 

▼display all

Research Projects

  • A new data-driven approach to bring humanity into virtual worlds with computer vision

    Grant number:23H03439  2023 - 2025

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

      More details

    Authorship:Principal investigator  Grant type:Scientific research funding

  • NeRF-based multi-view 3D shape reconstruction using Centroidal Voronoi Tessellation International coauthorship

    2022.4 - 2023.6

    Kyushu University (Japan) 

      More details

    Authorship:Principal investigator 

    We investigate the use of CVT to jointly optimize 3D shape, appearance and discretization of the 3D space for high definition 3D mesh reconstruction from multi-view images.

  • Multi-Camera 3D Pedestrian Detection with Domain Adaptation and Generalization

    2022 - 2023

    Japan Society for the Promotion of Science  JSPS Invitational Fellowships for Research in Japan (short term)

      More details

    Authorship:Principal investigator  Grant type:Joint research

  • AI-based animation of 3D avatars.

    2021.6 - 2022.5

    Joint research

      More details

    Authorship:Principal investigator  Grant type:Other funds from industry-academia collaboration

  • Deep human avatar animation International coauthorship

    2021.5 - 2022.5

    Japan 

      More details

    Authorship:Principal investigator 

    This is a joint research project with HUAWEI about learning to generate avatar animations from 2D videos in real time.

  • Realistic environment rendering with real humans for architecture project visualization

    2021.4 - 2022.5

      More details

    Authorship:Principal investigator 

    This is a joint project with Professor Koga (architecture design) and Professor Ochiai (Maths for Industry) about generating immersive virtual environments of architectural projects to support design and evaluation.

  • Multi-view 3D pedestrian localisation International coauthorship

    2021.3 - 2023.4

    Brazil 

      More details

    Authorship:Coinvestigator(s) 

    The project is about identifying, localizing and tracking pedestrians in 3D from multi-view videos.

  • A new approach for supporting architectural works with virtual reality environments.

    2021 - 2022

    QR Tsubasa (Tsubasa Project)

      More details

    Authorship:Principal investigator  Grant type:On-campus funds, funds, etc.

  • Weakly-supervised human 3D body shape estimation from single images International coauthorship

    2020.9 - 2021.8

    U.S.A 

      More details

    Authorship:Coinvestigator(s) 

    We are working on a solution to learn to estimate 3D shape of human bodies from 2D observation in an unsupervised manner.

  • Dynamic human motion tracking using dual quaternion algebra International coauthorship

    2020.7 - 2022.3

    Japan 

      More details

    Authorship:Coinvestigator(s) 

    Joint research project with Vincent Nozick from Gustave Eiffel University in France. This project is about reconstructing the non-rigid motion of human bodies captured by RGB-D cameras.

  • Human body 3D shape estimation, animation and gesture synthesis

    2020.4 - 2021.3

    Joint research

      More details

    Authorship:Principal investigator  Grant type:Other funds from industry-academia collaboration

  • Personalized avatars with real emotions for next generation holoportation systems International coauthorship

    2020.1 - 2021.1

    Microsoft Research Asia 

      More details

    Authorship:Principal investigator 

    Personalized avatars are the key to more natural communication in the virtual space. If you can express yourself not only with your own voice, but with your own body, expressions and emotions, you can communicate better. This is also a powerful way to avoid being deceived by fake characters, and there is a huge demand for real avatars and emotes, with a big business opportunity. When communicating in the virtual space it is important to transmit real expressions and real emotions, but it is also important to keep the possibility of remaining anonymous. While ultra-realistic avatars that have someone's own appearance, skin and face will surely break anonymity, body motion and gesture can convey a large part of real expressions and emotions without revealing a person's identity. In this project, we aim at capturing full-body 3D motion and fine gestures and re-targeting them into a mixed reality telepresence system (also called holoportation) deployed on the Microsoft HoloLens. To achieve our objective there are three main challenges to tackle: (1) detailed 3D motion of the human body must be captured from standard RGB cameras; (2) the human motion must be faithfully re-targeted to a virtual avatar, which may have different animation characteristics than the human; (3) the avatar must be displayed in 3D with the HoloLens while considering the surrounding illumination conditions. Fundamental findings unveiled in the project will provide new insights for human motion estimation, re-targeting to other bodies with different kinematics, and environment mapping with mixed reality devices.

  • Two-year training and international research program

    2020 - 2022

    SENTAN-Q

    Authorship: Principal investigator  Grant type: On-campus funds

  • 3D shape estimation and motion retargeting from 2D videos for future Holoportation systems.

    2020

    QR Wakaba challenge

    Authorship: Principal investigator  Grant type: On-campus funds

  • Unifying multiple RGB and depth cameras for real-time large-scale dynamic 3D modeling with unmanned micro aerial vehicles.

    2019.4 - 2021.4

    KAKENHI 

    Authorship: Principal investigator

    The project is about real-time 3D reconstruction of large-scale dynamic scenes (i.e., scenes containing one or more moving objects to be modeled, possibly with shape deformation) from unmanned micro aerial vehicles. The objective is to investigate the fusion of multiple RGB and depth sensors mounted on multiple micro aerial vehicles for real-time 3D reconstruction of large-scale dynamic 3D scenes; a minimal TSDF fusion sketch appears after this project list. The fundamental algorithms unveiled here will be used to build large-scale dynamic 3D models and provide the necessary tools for real-time, automatic dynamic 3D scene understanding.

  • Unifying multiple RGB and depth cameras for real-time large-scale dynamic 3D modeling with unmanned micro aerial vehicles

    Grant number: 19K20297  2019 - 2020

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Early-Career Scientists

    Authorship: Principal investigator  Grant type: Scientific research funding

  • Facial motion capture International coauthorship

    2017.10 - 2018.9

    Huawei Technologies Japan K.K. (China)

    Authorship: Collaborating Investigator(s) (not designated on Grant-in-Aid)

    This project is divided into three stages: the first stage roughly evaluates our base algorithm; the second stage evaluates the robustness of the overall reconstruction, i.e., the ability to transfer the facial expression of any person to any 3D avatar; and the third stage improves the quality of the facial model (to provide a complete facial model, we need to add eyeballs and a mouth).

  • Facial motion capture system

    2017.9 - 2018.8

    Joint research

    Authorship: Collaborating Investigator(s) (not designated on Grant-in-Aid)  Grant type: Other funds from industry-academia collaboration

  • Free-form dynamic 3D scene reconstruction at high resolution

    2017 - 2018

    Start-up Support Fund (スタートアップ支援経費)

    Authorship: Co-investigator(s)  Grant type: On-campus funds

  • Large-scale and dynamic 3D reconstruction using an RGB-D camera

    2015 - 2017

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for JSPS Fellows

    Authorship: Principal investigator  Grant type: Scientific research funding

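The sketches below illustrate, in simplified form, some of the techniques named in the projects above. Each is a minimal sketch under stated assumptions, written for this page, not the project's actual implementation.

For the multi-view pedestrian localization project: a pedestrian detected in several calibrated views can be placed in 3D by linear (DLT) triangulation. The intrinsics, camera poses and pixel detections below are toy values.

    import numpy as np

    def triangulate(points_2d, proj_mats):
        # Linear (DLT) triangulation of one 3D point from N calibrated views.
        # points_2d: list of (u, v) pixel detections, one per camera.
        # proj_mats: list of 3x4 projection matrices P = K [R | t].
        rows = []
        for (u, v), P in zip(points_2d, proj_mats):
            rows.append(u * P[2] - P[0])  # cross-product form of x = P X
            rows.append(v * P[2] - P[1])
        # Homogeneous least squares: smallest right singular vector
        _, _, vt = np.linalg.svd(np.stack(rows))
        X = vt[-1]
        return X[:3] / X[3]

    # Toy example: two cameras 1 m apart observing a point at (0, 0, 5)
    K = np.array([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
    X = np.array([0.0, 0.0, 5.0, 1.0])
    x1 = (P1 @ X)[:2] / (P1 @ X)[2]
    x2 = (P2 @ X)[:2] / (P2 @ X)[2]
    print(triangulate([x1, x2], [P1, P2]))  # approximately [0, 0, 5]
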
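For the weakly-supervised body shape project: the core idea of weak 2D supervision can be expressed as a differentiable reprojection loss that compares projected 3D joints with 2D detections, so no 3D ground truth is needed. This sketch assumes a PyTorch setup; the predicted joints, detections and intrinsics are illustrative.

    import torch

    def reprojection_loss(joints_3d, joints_2d, K):
        # Project predicted 3D joints with the pinhole model and penalize
        # their distance to 2D detections; no 3D labels appear in the loss.
        proj = (K @ joints_3d.T).T           # (N, 3) homogeneous pixels
        proj = proj[:, :2] / proj[:, 2:3]    # dehomogenize
        return ((proj - joints_2d) ** 2).sum(dim=-1).mean()

    # Toy usage: gradients flow back to the 3D prediction
    pred = torch.tensor([[0.0, 0.0, 3.0], [0.2, -0.1, 3.0]], requires_grad=True)
    K = torch.tensor([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])
    det = torch.tensor([[330.0, 250.0], [360.0, 220.0]])
    loss = reprojection_loss(pred, det, K)
    loss.backward()                          # d(loss)/d(pred) is now available
    print(loss.item(), pred.grad)
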
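For the dual quaternion motion tracking project: the basic primitive is dual quaternion linear blending (DLB), which interpolates rigid transformations while avoiding the "candy-wrapper" artifacts of linear blend skinning. This is a hand-rolled sketch; it assumes quaternions lie in the same hemisphere and uses the usual first-order normalization.

    import numpy as np

    def quat_mul(a, b):
        # Hamilton product of quaternions stored as (w, x, y, z)
        w1, x1, y1, z1 = a
        w2, x2, y2, z2 = b
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def rigid_to_dq(q, t):
        # Encode rotation q and translation t as a unit dual quaternion
        return q, 0.5 * quat_mul(np.array([0.0, *t]), q)

    def dq_transform(dq, p):
        # Apply a unit dual quaternion (real, dual) to a 3D point p
        real, dual = dq
        conj = real * np.array([1.0, -1, -1, -1])
        rotated = quat_mul(quat_mul(real, np.array([0.0, *p])), conj)[1:]
        t = 2.0 * quat_mul(dual, conj)[1:]   # recover the translation
        return rotated + t

    def dq_blend(dqs, weights):
        # DLB: weighted sum of dual quaternions, renormalized by the
        # norm of the real part (first-order normalization)
        real = sum(w * dq[0] for dq, w in zip(dqs, weights))
        dual = sum(w * dq[1] for dq, w in zip(dqs, weights))
        n = np.linalg.norm(real)
        return real / n, dual / n

    # Blend a 90-degree rotation about z with a pure translation, half-half
    q_rot = np.array([np.cos(np.pi/4), 0.0, 0.0, np.sin(np.pi/4)])
    dq_a = rigid_to_dq(q_rot, np.zeros(3))
    dq_b = rigid_to_dq(np.array([1.0, 0, 0, 0]), np.array([0.0, 0.0, 1.0]))
    blended = dq_blend([dq_a, dq_b], [0.5, 0.5])
    print(dq_transform(blended, np.array([1.0, 0.0, 0.0])))
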
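For the aerial multi-sensor 3D modeling project: the standard way to merge depth maps from several calibrated cameras into one model is volumetric TSDF fusion, where each depth map updates a running weighted average per voxel. The grid size, truncation distance and camera poses below are illustrative placeholders.

    import numpy as np

    def integrate(tsdf, weight, depth, K, cam_from_world, voxel_size, trunc=0.08):
        # Integrate one depth map into a truncated signed distance volume.
        # Calling this once per (calibrated) camera fuses any number of
        # depth sensors into the same volume.
        n = tsdf.shape[0]
        # World coordinates of every voxel center (volume anchored at origin)
        vox = np.indices((n, n, n)).reshape(3, -1).T * voxel_size
        pts = np.hstack([vox, np.ones((len(vox), 1))]) @ cam_from_world.T
        z = pts[:, 2]
        z_safe = np.maximum(z, 1e-6)
        pix = pts[:, :3] @ K.T
        u = np.round(pix[:, 0] / z_safe).astype(int)
        v = np.round(pix[:, 1] / z_safe).astype(int)
        h, w = depth.shape
        ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        d = depth[v.clip(0, h - 1), u.clip(0, w - 1)]
        sdf = d - z                        # signed distance along the optical axis
        ok &= (d > 0) & (sdf > -trunc)     # drop voxels far behind the surface
        new = np.clip(sdf / trunc, -1.0, 1.0)
        t, wv = tsdf.reshape(-1), weight.reshape(-1)  # views into the volumes
        t[ok] = (t[ok] * wv[ok] + new[ok]) / (wv[ok] + 1.0)
        wv[ok] += 1.0

    # Toy usage: fuse one synthetic depth map; the zero level set of `vol`
    # is the fused surface (extract it with, e.g., marching cubes).
    vol = np.ones((32, 32, 32), np.float32)
    wts = np.zeros_like(vol)
    K = np.array([[100.0, 0, 32], [0, 100, 24], [0, 0, 1]])
    integrate(vol, wts, np.full((48, 64), 1.2, np.float32), K, np.eye(4), 0.05)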

Educational Activities

  • I am teaching practical exercises in data science. In this class we teach program implementation for each student's research subject, and we also provide individual guidance during the lecture time.
    I am teaching the class "Information Science" for first-year students of the University.
    I have been teaching the class "Programming in Python" for first-year students of the University.
    I am teaching the classes "Digital Humans I & II" at the Faculty of Information Science and Electrical Engineering.
    I am teaching the experimental class "Distributed Robots" at the Faculty of Information Science and Electrical Engineering.

Class subject

  • Digital Humans II

    2024.6 - 2024.8   Summer quarter

  • Digital Humans I

    2024.4 - 2024.6   Spring quarter

  • Information Science (in English)

    2023.10 - 2024.3   Second semester

  • Discussion in Information Science and Technology II

    2023.10 - 2024.3   Second semester

  • Academic Writing in Information Science and Technology II

    2023.10 - 2024.3   Second semester

  • Presentation in Information Science and Technology

    2023.10 - 2024.3   Second semester

  • Digital Humans II

    2023.6 - 2023.8   Summer quarter

  • [Full year] Seminar in Information Science and Technology

    2023.4 - 2024.3   Full year

  • [Full year] Research in Information Science and Technology I

    2023.4 - 2024.3   Full year

  • [Full year] Exercises in Information Science and Technology

    2023.4 - 2024.3   Full year

  • Discussion in Information Science and Technology I

    2023.4 - 2023.9   First semester

  • Reading in Information Science and Technology

    2023.4 - 2023.9   First semester

  • Academic Writing in Information Science and Technology I

    2023.4 - 2023.9   First semester

  • Distributed Robots Experiments

    2023.4 - 2023.6   Spring quarter

  • Digital Humans I

    2023.4 - 2023.6   Spring quarter

  • Information Science (in English)

    2022.10 - 2023.3   Second semester

  • Data Science Exercises I

    2022.10 - 2023.3   Second semester

  • Data Science Exercises II

    2022.10 - 2023.3   Second semester

  • Exercises in Advanced Information Technology III

    2021.10 - 2022.3   Second semester

  • Exercises in Advanced Information Technology I

    2021.10 - 2022.3   Second semester

  • Data Science Exercises II

    2021.10 - 2022.3   Second semester

  • Data Science Exercises I

    2021.10 - 2022.3   Second semester

  • Programming Exercises (P)

    2021.6 - 2021.8   Summer quarter

  • Exercises in Advanced Information Technology II

    2021.4 - 2021.9   First semester

  • Data Science Exercises I

    2020.10 - 2021.3   Second semester

  • Data Science Exercises II

    2020.10 - 2021.3   Second semester

  • Information Science

    2020.4 - 2020.9   First semester

  • Information Science

    2019.10 - 2020.3   Second semester

  • Data Science Exercises I

    2019.4 - 2019.9   First semester

  • Data Science Exercises II

    2019.4 - 2019.9   First semester

  • Data Science Exercises I

    2018.4 - 2018.9   First semester

  • Data Science Exercises II

    2018.4 - 2018.9   First semester

  • Data Science Exercises II

    2017.4 - 2017.9   First semester

  • Data Science Exercises I

    2017.4 - 2017.9   First semester

Social Activities

  • JSPS Science Dialogue

    Fukui Prefectural Wakasa High School (Wakasa-city, Fukui)  2017.1

    Audience: Infants, schoolchildren, junior high school students, high school students

    Type: Seminar, workshop

Travel Abroad

  • 2016.12

    Staying country name 1: France   Staying institution name 1: INRIA Grenoble

  • 2011.3 - 2011.7

    Staying country name 1: China   Staying institution name 1: Microsoft Research Asia

  • 2010.2

    Staying country name 1: Czech Republic   Staying institution name 1: Center for Machine Perception (CMP)