The 31st British Machine Vision (Virtual) Conference 2020 : Learning 3D Global Human Motion Estimation from Unpaired, Disjoint Datasets

Learning 3D Global Human Motion Estimation from Unpaired, Disjoint Datasets

Julian Habekost, Takaaki Shiratori, Yuting Ye and Taku Komura

Keywords: 3D human pose estimation, 3D global motion estimation, unpaired training, 2D-to-3D pose regression, monocular motion capture, human time sequence modeling

Abstract: We propose a novel method to compute both the local and global 3D motion of the human body from a monocular 2D video. Our approach is trained only on unpaired sets of 2D keypoints from target videos and 3D motion capture data. The target video dataset is assumed to lack any ground truth, so our supervision signal comes from motion datasets that are fully disjoint from the target datasets. For each time step, a temporal convolutional generator configures the human pose in global space to satisfy both a reprojection loss and an adversarial loss. The translational and rotational global motion is then derived and converted into an egocentric representation in a differentiable manner for adversarial learning. We compare our system to state-of-the-art architectures that use the Human3.6M dataset for paired training and demonstrate comparable precision, even though our system is never trained on the ground-truth Human3.6M 3D motion capture data. Because its training data are unpaired and disjoint, our system can be trained on a large set of videos and 3D motion capture data, which can considerably expand the range of applicable motion data types.
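The two ingredients named in the abstract, a reprojection loss and a conversion from global motion to an egocentric representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the pinhole camera with a single focal length, the ground-plane root trajectory, the yaw-only heading, and the function names `reprojection_loss` and `to_egocentric` are all assumptions made for illustration. NumPy is used for readability; every operation is differentiable, so the same arithmetic could be ported to an autodiff framework.

```python
import numpy as np


def reprojection_loss(joints_3d, keypoints_2d, focal=1.0):
    """Mean squared distance between projected 3D joints and 2D keypoints.

    Assumes a pinhole camera at the origin looking down +z (hypothetical setup).
    joints_3d:    (T, J, 3) camera-space joint positions with z > 0
    keypoints_2d: (T, J, 2) detected 2D keypoints in the same normalized units
    """
    # Perspective division: (x, y, z) -> focal * (x / z, y / z)
    projected = focal * joints_3d[..., :2] / joints_3d[..., 2:3]
    return float(np.mean(np.sum((projected - keypoints_2d) ** 2, axis=-1)))


def to_egocentric(root_pos, yaw):
    """Express a global root trajectory as per-frame body-local velocities.

    Assumes motion on the ground plane and a heading given by yaw only.
    root_pos: (T, 2) global ground-plane root positions
    yaw:      (T,)   global heading angle in radians
    Returns (T-1, 2) displacements in the body frame and (T-1,) yaw changes.
    """
    delta = root_pos[1:] - root_pos[:-1]
    c, s = np.cos(yaw[:-1]), np.sin(yaw[:-1])
    # Rotate each global displacement by -yaw so it is relative to the heading.
    local_forward = c * delta[:, 0] + s * delta[:, 1]
    local_lateral = -s * delta[:, 0] + c * delta[:, 1]
    # Wrap the heading change into (-pi, pi] without a discontinuity.
    diff = yaw[1:] - yaw[:-1]
    dyaw = np.arctan2(np.sin(diff), np.cos(diff))
    return np.stack([local_forward, local_lateral], axis=-1), dyaw
```

In this sketch, a body that moves along the direction it is facing produces a purely forward local displacement regardless of where it is or which way it faces in the global frame. That invariance is the reason an egocentric representation is convenient as input to a motion discriminator.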

Poster Session 1

