Mobile Visual Analytics: A CVPR 2021 Tutorial
Overview of the Tutorial
As mobile devices and real-time applications become increasingly ubiquitous, research on mobile visual analytics is increasingly urgent. The purpose of mobile visual analytics is to enable entirely new on-device experiences. In the past few years, effective techniques have been developed to address challenging issues such as real-time detection and segmentation under dynamically changing scenes and constrained computational resources.
In this tutorial, we first introduce the preliminaries and state-of-the-art approaches to several mobile visual tasks. Then, we present in-depth studies and analyses of recently developed mobile visual techniques, including efficient backbone network design for mobile usage, generative adversarial networks for mobile applications, and context-aware algorithms for dynamic backgrounds under moving cameras. The tutorial covers two main aspects of mobile visual analytics: methods that accelerate AI algorithms on computationally resource-constrained devices, and techniques that leverage contextual information in dynamic environments on mobile devices.
Tutorial Zoom Link
YouTube Link
Organizers
Schedule of the Tutorial
10:00-10:10 AM (EST), June 20, 2021: Opening Remarks (Tao Chen)
10:10-11:10 AM (EST), June 20, 2021: Convolutional Networks for Mobile Applications (Gao Huang)
11:10 AM-12:10 PM (EST), June 20, 2021: Generative Adversarial Network for Mobile Applications (Gang Yu)
12:10-1:10 PM (EST), June 20, 2021: Context-aware Mobile Visual Analysis (Tao Chen)
1:10-1:30 PM (EST), June 20, 2021: Questions and Panel Discussion
Official time: 10:10-11:10 AM (EST), June 20, 2021
Presenter: Dr. Gao Huang
Short description:
Convolutional neural networks (CNNs) have been widely used as backbone models in computer vision, and network architecture innovations are pushing forward the application of deep learning on mobile devices. This part of the tutorial will first review state-of-the-art efficient CNN backbones such as MobileNet, ShuffleNet, and CondenseNet, which adopt static structures and parameters. Then, we will focus on dynamic convolutional networks, which improve inference efficiency with data-dependent architectures or parameters. Specifically, we will introduce three types of dynamic networks: sample-wise, spatial-wise, and temporal-wise dynamic models. Their advantages and limitations, as well as future directions, will be discussed.
Presentation
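To make the sample-wise dynamic idea concrete, below is a minimal NumPy sketch (not from the tutorial materials; the function names, weights, and the 0.9 confidence threshold are all illustrative) of early-exit inference, one common form of sample-wise dynamic network: a cheap early head answers confidently on "easy" inputs, and only "hard" inputs pay for the deeper stage.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(x, w_early, w_late, threshold=0.9):
    # Cheap early classifier head on shallow features.
    logits_early = w_early @ x
    p_early = softmax(logits_early)
    if p_early.max() >= threshold:
        # Confident enough: skip the expensive deeper stage.
        return int(p_early.argmax()), "early"
    # Otherwise run the deeper (more costly) stage on top.
    p_late = softmax(w_late @ logits_early)
    return int(p_late.argmax()), "late"

# A confident sample exits early; an ambiguous one goes deep.
w_early = 5.0 * np.eye(2)
w_late = np.array([[0.0, 1.0], [1.0, 0.0]])
label, path = early_exit_predict(np.array([1.0, 0.0]), w_early, w_late)
label2, path2 = early_exit_predict(np.array([0.1, 0.1]), w_early, w_late)
```

The per-sample compute cost therefore varies with input difficulty, which is exactly the property that distinguishes dynamic networks from static backbones like MobileNet.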
Official time: 11:10 AM-12:10 PM (EST), June 20, 2021
Presenter: Dr. Gang Yu
Short description:
The Generative Adversarial Network (GAN) was introduced in 2014 and has developed rapidly over the past five years. In industry, GANs have been deployed on mobile devices in impressive applications such as face cartoon generation and human pose transfer. In this tutorial, we will present the techniques underlying these GAN-based applications. More specifically, we will start with an introduction to the GAN technique. Then we will describe GAN applications in different domains such as faces, humans, and fonts. Finally, techniques that are important for mobile applications will also be highlighted.
Presentation
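As background for the GAN introduction above, the following NumPy sketch (illustrative, not part of the tutorial materials) computes the two standard GAN objectives: the discriminator's binary cross-entropy on real versus generated samples, and the commonly used non-saturating generator loss -log D(G(z)).

```python
import numpy as np

def bce(p, y):
    # Binary cross-entropy for discriminator scores p in (0, 1).
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def gan_losses(d_real, d_fake):
    # Discriminator: push real scores toward 1, fake scores toward 0.
    d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
    # Generator (non-saturating): push scores of its fakes toward 1.
    g_loss = bce(d_fake, np.ones_like(d_fake))
    return d_loss, g_loss

# At a 50/50 "confused discriminator" point both losses equal log 2 terms.
d_loss, g_loss = gan_losses(np.array([0.5]), np.array([0.5]))
```

During training these two losses are minimized alternately, which is what makes the setup adversarial: improving one player's objective worsens the other's.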
Official time: 12:10-1:10 PM (EST), June 20, 2021
Presenter: Dr. Tao Chen
Short description:
The above-mentioned design schemes for lightweight models mainly focus on network structure and optimization-objective design (with different objectives for different tasks). They still do not consider the dynamically changing background and foreground under moving cameras, nor do they fully utilize contextual information, both temporal and spatial, for inference. In this part, we will discuss effective context-aware analysis techniques for moving cameras and study in depth how to utilize temporal context, such as object motion, and spatial context, such as the objects surrounding an object of interest, for fast and accurate mobile visual analytics. In particular, typical scenarios such as object-level motion detection and context-aware pedestrian intrusion detection will be introduced.
Presentation
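To illustrate the moving-camera setting described above, here is a minimal NumPy sketch (an assumption-laden toy, not the presenter's method; the translation-only camera model and the threshold of 10 are illustrative) of object-level motion detection with camera-motion compensation: shift the previous frame by the estimated global camera translation, then threshold the residual difference so that only independently moving objects remain.

```python
import numpy as np

def motion_mask(prev_frame, cur_frame, cam_shift, thresh=10):
    # Compensate the estimated global camera translation (dy, dx)
    # by shifting the previous frame accordingly.
    dy, dx = cam_shift
    compensated = np.roll(np.roll(prev_frame, dy, axis=0), dx, axis=1)
    # After compensation, static background should cancel out;
    # large residuals indicate independently moving objects.
    residual = np.abs(cur_frame.astype(int) - compensated.astype(int))
    return residual > thresh

# Toy scene: a static landmark that follows the camera shift, and an
# object that moves on its own and therefore survives compensation.
prev = np.zeros((6, 6), dtype=int)
prev[1, 1] = 255   # static landmark
prev[4, 4] = 200   # moving object
cur = np.zeros((6, 6), dtype=int)
cur[1, 2] = 255    # landmark displaced only by the camera
cur[4, 2] = 200    # object moved independently
mask = motion_mask(prev, cur, cam_shift=(0, 1))
```

In practice the global shift would come from an ego-motion estimate (e.g. feature matching or a homography), but the compensate-then-difference structure is the core temporal-context idea.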