TransPoser: Transformer as an Optimizer for Joint Object Shape and Pose Estimation

Authors: Yuta Yoshitake, Mai Nishimura, Shohei Nobuhara, Ko Nishino


We propose a novel method for joint estimation of shape and pose of rigid objects from their sequentially observed RGB-D images. In sharp contrast to past approaches that rely on complex non-linear optimization, we propose to formulate it as a neural optimization that learns to efficiently estimate the shape and pose. We introduce Deep Directional Distance Function (DeepDDF), a neural network that directly outputs the depth image of an object given the camera viewpoint and viewing direction, for efficient error computation in 2D image space. We formulate the joint estimation itself as a Transformer which we refer to as TransPoser. We fully leverage the tokenization and multi-head attention to sequentially process the growing set of observations and to efficiently update the shape and pose with a learned momentum, respectively. Experimental results on synthetic and real data show that DeepDDF achieves high accuracy as a category-level object shape representation and TransPoser achieves state-of-the-art accuracy efficiently for joint shape and pose estimation.
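To make the idea of a directional distance function concrete, the following is a minimal sketch and not the authors' DeepDDF. It swaps the learned network for an analytic stand-in (a sphere at the world origin) and shows how one query per pixel, given a camera position and a viewing direction, yields a depth image directly. The names `sphere_ddf` and `render_depth`, the pinhole camera layout, and all parameter values are assumptions made for illustration. Distances are measured along each ray, and the camera is assumed to sit outside the object.

```python
import math


def sphere_ddf(origin, direction, radius=1.0):
    """Analytic directional distance to a sphere centred at the world origin.

    Returns the distance along the unit vector `direction` from `origin` to
    the first surface hit, or None if the ray misses. This is a hypothetical
    stand-in for a learned DDF; the camera is assumed to be outside the sphere.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    b = ox * dx + oy * dy + oz * dz                  # origin . direction
    c = ox * ox + oy * oy + oz * oz - radius ** 2    # |origin|^2 - r^2
    disc = b * b - c
    if disc < 0.0:
        return None                                  # ray misses the sphere
    t = -b - math.sqrt(disc)                         # nearer intersection
    return t if t > 0.0 else None


def render_depth(cam_pos, size=5, focal=2.0, ddf=sphere_ddf):
    """Query the DDF once per pixel for a pinhole camera looking along -z."""
    image = []
    for row in range(size):
        line = []
        for col in range(size):
            # Pixel centre on an image plane spanning [-1, 1] at distance `focal`.
            u = (col + 0.5) / size * 2.0 - 1.0
            v = 1.0 - (row + 0.5) / size * 2.0
            norm = math.sqrt(u * u + v * v + focal * focal)
            direction = (u / norm, v / norm, -focal / norm)
            line.append(ddf(cam_pos, direction))
        image.append(line)
    return image


# Camera three units in front of a unit sphere.
depth = render_depth(cam_pos=(0.0, 0.0, 3.0))
```

Because each pixel is a single function evaluation, a predicted depth image can be compared against an observed one in 2D image space without ray marching, which is the efficiency argument the abstract makes. In the paper's setting the analytic function would be replaced by a trained network that is also conditioned on the object's shape.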


