What is video classification?

Popular models:
- CNN + RNN
- ConvNet on T stacked RGB frames (input of size W \times H \times 3T)
- 3D ConvNet
- Pose network (but loses a lot of context information)
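To make the difference between the first three models concrete, the sketch below computes only the input shape each one would see for the same clip. It is an illustration under stated assumptions, not any particular library's API: the function names, the clip size (16 frames of 112 x 112 RGB), and the axis order (following the W, H, channels order used above) are all hypothetical. The pose network is left out because its input depends on the pose estimator used.

```python
# Hypothetical clip dimensions, chosen only for illustration.
T, W, H, C = 16, 112, 112, 3  # frames, width, height, RGB channels


def cnn_rnn_input(t, w, h, c=3):
    """CNN + RNN: a 2D CNN sees one frame at a time, and the RNN
    then consumes the resulting sequence of t per-frame features."""
    per_frame = (w, h, c)       # shape fed to the 2D CNN for each frame
    return [per_frame] * t      # one entry per time step


def stacked_frames_input(t, w, h, c=3):
    """2D ConvNet on stacked frames: time is folded into the channel
    axis, so the network sees a single image with c * t channels."""
    return (w, h, c * t)


def conv3d_input(t, w, h, c=3):
    """3D ConvNet: time stays a separate axis, so the kernels can
    slide over it just as they slide over width and height."""
    return (w, h, t, c)


def num_values(shape):
    """Total number of scalar values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


if __name__ == "__main__":
    print("CNN + RNN      :", len(cnn_rnn_input(T, W, H)), "frames of", cnn_rnn_input(T, W, H)[0])
    print("stacked frames :", stacked_frames_input(T, W, H))
    print("3D ConvNet     :", conv3d_input(T, W, H))
```

The stacked-frame and 3D inputs hold exactly the same number of values; what differs is whether time is treated as extra channels (mixed by the first convolution) or as its own axis that the kernels move along.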