TSM: Temporal Shift Module for Efficient Video Understanding, online demo with NVIDIA Nano

November 30, 2019


Video recognition is computationally expensive. Here we propose a Temporal Shift Module (TSM) to enable efficient video recognition on edge devices. Here is a low-power board, the NVIDIA Jetson Nano. It costs only $99 and runs at only 8 watts. We show a hand gesture recognition demo running in real time on this board. Here is the output of our model, and here is the frame rate of the demo. With the demo, we can recognize hand gestures like thumbs up and thumbs down. It can also recognize zoom in and zoom out, which is useful in driving scenarios where we can tell the map to zoom in or zoom out. It can also recognize gestures like swiping left, and you can push your hand in to tell the car to stop. It can also recognize gestures like drumming fingers. Our model runs at about 300M MACs per frame, and the model size is only 14 MB. With the lightweight shift operation, our model can achieve 3D CNN performance at 2D CNN cost, enabling real-time AI applications.
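
To give a feel for why the shift operation is so lightweight, here is a minimal plain-Python sketch of a temporal shift. It is an illustration under stated assumptions, not the code running in the demo: the function name `temporal_shift`, the `fold_div` parameter, and the choice to shift one eighth of the channels in each direction are assumptions made for this example, and a real model would do the same thing with tensor slicing inside a 2D CNN. The point is that the operation only moves existing feature values between neighboring frames, so it adds no multiply-accumulates of its own.

```python
def temporal_shift(frames, fold_div=8):
    """Shift part of the channels along the time axis.

    frames: list of T frames, each a list of C per-channel feature values.
    fold_div: assumed hyperparameter; C // fold_div channels move in each
              direction, the remaining channels stay in place.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    # Start from zeros so positions with no neighboring frame are zero-padded.
    shifted = [[0] * num_channels for _ in range(num_frames)]

    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1   # first group: take the value from the next frame
            elif c < 2 * fold:
                src = t - 1   # second group: take the value from the previous frame
            else:
                src = t       # remaining channels are left untouched
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy input: 3 frames, 8 channels, value = 10 * frame_index + channel_index.
frames = [[10 * t + c for c in range(8)] for t in range(3)]
shifted = temporal_shift(frames)
print(shifted[1])  # [20, 1, 12, 13, 14, 15, 16, 17]
```

In the middle frame of the toy input, channel 0 now holds the value from the next frame and channel 1 the value from the previous frame, while the other channels are unchanged. A following 2D convolution therefore sees information from three time steps at once, which is how temporal modeling comes at essentially 2D cost.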

1 Comment

  • Duman Care Khrisne, November 14, 2019 at 5:36 am

    wow i like this research verry much