PoseFormerV2

Transformer-based 3D pose estimation model with strong temporal reasoning, ideal for jump and spin kinematics.

Primary tasks
3D pose estimation, Motion reconstruction
Modalities
RGB video
Architecture
Token pyramid transformer encoder with pose refinement head
Frameworks
PyTorch
Availability
open-source
Maintainer
Peking University & Tencent AI Lab
Released
Jul 2023
Benchmarks
MPJPE 39.5 mm / P-MPJPE 24.2 mm on Human3.6M

Why It Matters

PoseFormerV2 delivers low-latency, high-fidelity pose keypoints even under rapid rotations. That makes it a strong backbone for estimating exit edges, axis lean, and air position changes during triples and quads.
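One of the downstream quantities mentioned above, axis lean, can be derived directly from the 3D keypoints. A minimal sketch, assuming a y-up coordinate frame and hypothetical hip/shoulder midpoint inputs (names and conventions are illustrative, not part of the PoseFormerV2 API):

```python
import math

def axis_lean_deg(hip_mid, shoulder_mid):
    """Angle between the hip-to-shoulder axis and vertical, in degrees.

    hip_mid / shoulder_mid are (x, y, z) tuples, assumed to be midpoints
    of left/right hip and shoulder keypoints in a y-up world frame.
    0 degrees means a perfectly upright rotation axis.
    """
    dx = shoulder_mid[0] - hip_mid[0]
    dy = shoulder_mid[1] - hip_mid[1]
    dz = shoulder_mid[2] - hip_mid[2]
    horiz = math.hypot(dx, dz)          # horizontal displacement of the axis
    return math.degrees(math.atan2(horiz, dy))
```

Per-frame lean angles computed this way can then be tracked across a jump to quantify axis tilt at take-off versus mid-air.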

Suggested integration steps:

  1. Pre-train on Human3.6M, then adapt with transfer learning on figure-skating clips.
  2. Pair with a blade contact detector to stabilise foot keypoints during take-off and landing frames.
  3. Export intermediate representations for downstream scoring models that need access to pose trajectories.
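Step 2 above can be sketched in a few lines. Assuming per-frame foot keypoints and a boolean blade-contact mask from a separate detector (all names hypothetical), a simple hold strategy pins the foot to its position at first contact:

```python
def stabilise_foot_keypoints(foot_xyz, contact):
    """Suppress foot-keypoint jitter during blade-contact frames.

    foot_xyz: list of (x, y, z) foot positions, one per frame (hypothetical input).
    contact:  list of bools from a blade-contact detector (hypothetical input).
    During contact, the foot is held at its first-contact position; a real
    pipeline might blend estimates or use a Kalman filter instead.
    """
    out, held = [], None
    for pos, on_ice in zip(foot_xyz, contact):
        if on_ice:
            if held is None:
                held = pos      # latch position at the first contact frame
            out.append(held)
        else:
            held = None         # airborne: pass the raw estimate through
            out.append(pos)
    return out
```

The hold strategy is deliberately crude; the point is that take-off and landing frames get a consistent foot anchor for edge analysis.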

Expect to tune the temporal window length: windows of 81–121 frames cover a full jump cycle without extra padding.
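Slicing clips into such windows can be sketched as below. The function and its stride parameter are illustrative assumptions, not part of PoseFormerV2 itself:

```python
def jump_windows(num_frames, window=81, stride=27):
    """Yield (start, end) index pairs for overlapping temporal windows.

    window=81 frames spans roughly 2.7 s at 30 fps, enough for a full jump
    cycle; stride controls the overlap between consecutive windows.
    Clips shorter than one window are returned whole (caller pads them).
    """
    if num_frames < window:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)   # extra window to cover the tail
    return [(s, s + window) for s in starts]
```

Overlapping windows let a scorer pick, per jump, the crop that centres the take-off frame rather than relying on fixed segmentation.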