VideoMAE v2

Why It Matters

VideoMAE v2 is a dependable feature extractor for downstream classifiers such as jump vs. spin detectors or entry-edge labellers. It captures global motion cues without overfitting to broadcast overlays.

Recommended Usage

Pre-train with unlabelled rink footage to internalise rink geometry and camera pans.
Feed the pooled features into lightweight heads (e.g., temporal convolutions or transformers) for rapid element detection.
Combine with audio embeddings to improve detection of jump take-off sounds or edge swishes in multi-modal systems.
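
The last two recommendations can be sketched in a few lines. The example below is a minimal, dependency-free illustration rather than a reference implementation. It assumes each clip has already been reduced to one pooled VideoMAE v2 feature vector and one audio embedding. The function names (fuse, temporal_conv, detect), the simple concatenation fusion, the kernel weights, and the threshold are all hypothetical choices made for illustration. In practice the head would be trained in a deep-learning framework.

```python
from typing import List, Sequence

Vector = List[float]


def fuse(video: Sequence[Vector], audio: Sequence[Vector]) -> List[Vector]:
    """Late fusion (assumed scheme): concatenate per-clip video and audio embeddings."""
    if len(video) != len(audio):
        raise ValueError("video and audio must cover the same number of clips")
    return [list(v) + list(a) for v, a in zip(video, audio)]


def temporal_conv(
    features: Sequence[Vector], kernel: Sequence[Vector], bias: float = 0.0
) -> List[float]:
    """Single-channel temporal convolution with zero 'same' padding.

    features: T clips x D dims. kernel: K taps x D dims, K odd.
    As in most deep-learning libraries, this is a cross-correlation
    (the kernel is not flipped). Returns one score per clip.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    scores = []
    for t in range(len(features)):
        total = bias
        for j, tap in enumerate(kernel):
            idx = t + j - half
            # Clips outside the sequence are treated as zeros.
            if 0 <= idx < len(features):
                total += sum(w * x for w, x in zip(tap, features[idx]))
        scores.append(total)
    return scores


def detect(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return indices of clips whose score exceeds the (hypothetical) threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data: three clips with 1-D video and 1-D audio features.
video_feats = [[0.0], [1.0], [0.0]]
audio_feats = [[0.0], [1.0], [0.0]]
fused = fuse(video_feats, audio_feats)

# Hand-set kernel that only looks at the centre clip; a real head learns these weights.
kernel = [[0.0, 0.0], [0.5, 0.5], [0.0, 0.0]]
scores = temporal_conv(fused, kernel)
hits = detect(scores)
```

With these toy inputs only the middle clip is flagged. Concatenation is the simplest fusion choice; it keeps the head small but requires the video and audio streams to be aligned to the same clip boundaries.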

Useful Datasets

FSD-10