Qianxing Li

and 4 more

3D human pose estimation (3DHPE) aims to estimate 3D joint positions from images. The state of the art in 3DHPE is dominated by deep learning models, whose accuracy depends strongly on the choice of loss function. Existing 3DHPE methods usually define the loss as the Euclidean distance between the predicted joint locations and the ground-truth joint locations, which conflates two different kinds of error: error caused by differences in pose structure, and all other error. These two kinds of error have clearly different characteristics and should not be treated equally, so decoupling them and optimizing them separately is one way to improve 3DHPE accuracy. However, existing human pose representations are not well suited to distinguishing them. To tackle this problem, we propose a novel Multi-Anchor Offset human Representation (MAOR), which locates each joint by its offsets from a group of selected high-precision joints, called Multi-Anchors. With MAOR, the pose error related to distortion of the spatial structure can be measured independently of other errors, which helps improve the accuracy of pose estimation. We then propose a novel MAOR-based coarse-to-fine diffusion model (MAOR-DiffPose) for pose estimation, which optimizes the different types of pose error step by step. First, a MAOR-based Denoising Process (MDP) is devised to explicitly optimize the spatial structure of 3D poses by describing poses with MAOR; it also improves the inductive learning ability of MAOR-DiffPose by extracting view-independent features. Second, a Joint Coordinate based Denoising Process assisted by MAOR (JCDPaM) is devised, which meaningfully expands the input features by combining MAOR with the joint-coordinate pose representation and optimizes the joint coordinates of 3D poses with the assistance of MAOR. MAOR-DiffPose achieves accurate 3DHPE by iterating the MDP and JCDPaM modules. Comprehensive experiments on the widely used 3DHPE benchmarks Human3.6M and MPI-INF-3DHP show that the proposed method outperforms state-of-the-art methods.
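
To illustrate the idea of separating structural error from other error, the following is a minimal sketch. It is not the formulation used in MAOR-DiffPose, which the abstract does not spell out. It assumes that an anchor-offset description is simply the set of difference vectors from each joint to each anchor joint. The anchor indices, the function names (`maor`, `joint_error`, `structure_error`), and the toy pose are all hypothetical choices made for this example.

```python
import math

# Hypothetical anchor choice: indices of joints treated as "high-precision".
# The abstract does not say which joints are used; these are placeholders.
ANCHORS = (0, 1)


def maor(pose, anchors=ANCHORS):
    """Describe each joint by its offsets from every anchor joint.

    pose: list of (x, y, z) joint positions.
    Returns one entry per joint, each a list of offset vectors (one per anchor).
    """
    return [
        [tuple(joint[k] - pose[a][k] for k in range(3)) for a in anchors]
        for joint in pose
    ]


def joint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def structure_error(pred, gt, anchors=ANCHORS):
    """Mean Euclidean distance between the anchor-offset vectors of two poses."""
    dists = [
        math.dist(off_p, off_g)
        for joint_p, joint_g in zip(maor(pred, anchors), maor(gt, anchors))
        for off_p, off_g in zip(joint_p, joint_g)
    ]
    return sum(dists) / len(dists)


# Toy 4-joint pose, and a prediction that is the same pose shifted by 0.1 in x.
gt = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.0, 0.5)]
shifted = [(x + 0.1, y, z) for x, y, z in gt]

print(joint_error(shifted, gt))      # about 0.1: every joint is displaced
print(structure_error(shifted, gt))  # about 0: the pose structure is unchanged
```

In this toy case a rigid shift of the whole pose is fully charged to the joint-coordinate error, while the offset-based error stays near zero (up to floating-point rounding) because the relative arrangement of the joints has not changed. Moving a single joint instead would raise the offset-based error, which is the kind of separation the abstract attributes to MAOR.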