Why It Matters
VideoMAE v2 serves as a dependable feature extractor for downstream classifiers such as jump vs. spin detectors or entry-edge labellers. Its features capture global motion cues rather than overfitting to broadcast overlays.
Recommended Usage
- Pre-train on unlabelled rink footage so the encoder internalises rink geometry and camera pans.
- Feed the pooled features into lightweight heads (e.g., temporal convolutions or transformers) for rapid element detection.
- In multi-modal systems, combine the video features with audio embeddings to improve detection of jump take-off sounds or edge swishes.
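
The last two recommendations can be illustrated with a small, dependency-free sketch. It assumes that pooled VideoMAE v2 features and audio embeddings have already been extracted as one vector per clip and are time-aligned; the function names (`fuse`, `temporal_conv_scores`, `detect`), the toy feature values, and the hand-set kernel weights are all hypothetical and exist only for illustration. In practice the head would be trained and would more likely be written with a deep-learning framework.

```python
import math
from typing import List, Sequence

Vector = List[float]


def fuse(video: Sequence[Vector], audio: Sequence[Vector]) -> List[Vector]:
    """Concatenate per-clip video and audio embeddings (simple late fusion)."""
    if len(video) != len(audio):
        raise ValueError("video and audio sequences must be time-aligned")
    return [list(v) + list(a) for v, a in zip(video, audio)]


def temporal_conv_scores(
    seq: Sequence[Vector], kernel: Sequence[Vector], bias: float = 0.0
) -> List[float]:
    """Slide a K x D kernel along the time axis (no padding); one logit per window."""
    k = len(kernel)
    scores = []
    for t in range(len(seq) - k + 1):
        total = bias
        for j in range(k):
            total += sum(w * x for w, x in zip(kernel[j], seq[t + j]))
        scores.append(total)
    return scores


def detect(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return window indices whose sigmoid probability exceeds the threshold."""
    return [t for t, s in enumerate(scores) if 1.0 / (1.0 + math.exp(-s)) > threshold]


# Toy inputs (hypothetical): 5 clips, 2-D video features, 1-D audio embeddings.
video_feats = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [1.0, 0.9], [0.1, 0.0]]
audio_feats = [[0.0], [0.0], [1.0], [1.0], [0.0]]

fused = fuse(video_feats, audio_feats)            # 5 clips x 3 dims
kernel = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]       # hand-set weights, K=2, D=3
scores = temporal_conv_scores(fused, kernel, bias=-4.0)
print(detect(scores))                             # indices of windows flagged as elements
```

Concatenation is only one possible fusion strategy; it is used here because it keeps the head small and makes the contribution of each modality easy to inspect.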