VideoMAE v2

Masked-autoencoder video backbone that provides strong general-purpose features for figure skating element discovery.

Primary tasks
Representation learning, Element classification
Modalities
RGB video
Architecture
Plain Vision Transformer (ViT) masked autoencoder with dual masking (tube masking in the encoder, running-cell masking in the decoder)
Frameworks
PyTorch
Availability
open-source
Maintainer
OpenGVLab (Shanghai AI Laboratory) with Nanjing University
Released
Apr 2023
Top-1 on Kinetics-400: 88.9%
Top-1 on Something-Something V2: 75.6%
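
As a rough illustration of the tube masking named under Architecture, the sketch below draws one random spatial mask over the patch grid and repeats it for every temporal slot, so a hidden patch stays hidden for the whole clip. The 14x14 grid, the 8 temporal slots, and the 90% ratio are illustrative assumptions, and `tube_mask` is a hypothetical helper, not a function from the VideoMAE v2 codebase.

```python
import random


def tube_mask(num_temporal, grid_h, grid_w, mask_ratio, seed=None):
    """Return a [num_temporal][grid_h * grid_w] list of booleans (True = masked).

    The same spatial pattern is reused for every temporal slot, which is what
    turns the mask into a set of "tubes" running through time.
    """
    rng = random.Random(seed)
    num_patches = grid_h * grid_w
    num_masked = int(mask_ratio * num_patches)

    # Pick which spatial positions are hidden, once per clip.
    masked_positions = set(rng.sample(range(num_patches), num_masked))
    frame_mask = [i in masked_positions for i in range(num_patches)]

    # Repeat the identical spatial mask along the time axis.
    return [list(frame_mask) for _ in range(num_temporal)]


# Illustrative values (assumptions): 8 temporal slots, 14x14 patches, 90% masked.
mask = tube_mask(num_temporal=8, grid_h=14, grid_w=14, mask_ratio=0.9, seed=0)
visible_per_slot = sum(not m for m in mask[0])
```

Because every temporal slot shares the same pattern, the encoder cannot recover a hidden patch by copying it from a neighbouring frame, which is the usual motivation given for masking in tubes.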

Why It Matters

VideoMAE v2 is a dependable feature extractor for downstream classifiers such as jump vs. spin detectors or entry-edge labellers. Its self-supervised pretraining on masked clips encourages features that capture global motion cues rather than static broadcast overlays.
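
One lightweight way to put backbone features behind a jump vs. spin detector is a nearest-centroid probe over clip embeddings. The sketch below assumes the embeddings have already been extracted, one vector per clip. The 4-dimensional toy vectors, the labels, and the helper names are hypothetical placeholders, not outputs or APIs of VideoMAE v2.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def fit(embeddings, labels):
    """Compute one centroid per class from precomputed clip embeddings."""
    by_class = {}
    for vec, label in zip(embeddings, labels):
        by_class.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_class.items()}


def predict(centroids, embedding):
    """Return the class whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine(centroids[label], embedding))


# Toy stand-ins for backbone embeddings (assumption: real ones are much wider).
train = [[0.9, 0.1, 0.0, 0.2], [0.8, 0.2, 0.1, 0.1],
         [0.1, 0.9, 0.3, 0.0], [0.0, 0.8, 0.2, 0.1]]
labels = ["jump", "jump", "spin", "spin"]

centroids = fit(train, labels)
prediction = predict(centroids, [0.85, 0.15, 0.05, 0.1])
```

A trained linear layer on the same embeddings is the more common probe. The centroid version is shown only because it needs no training loop and makes the "frozen features plus small classifier" pattern easy to see.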