Mobile Visual Analytics

Overview of the Tutorial

As mobile devices and real-time applications become ubiquitous, research on mobile visual analytics is increasingly urgent. The goal of mobile visual analytics is to enable entirely new on-device experiences. Over the past few years, effective techniques have been developed to address challenges such as real-time detection and segmentation in dynamically changing scenes under constrained computational resources.

In this tutorial, we first introduce preliminaries and state-of-the-art approaches for several mobile visual tasks. We then present in-depth studies and analyses of recently developed mobile visual techniques, including efficient backbone network design for mobile use, fast vision algorithms for downstream tasks such as object detection and segmentation, and context-aware algorithms for moving cameras with dynamic backgrounds. The tutorial covers two main aspects of mobile visual analytics: methods that accelerate AI algorithms on devices with constrained computational resources, and techniques that leverage contextual information in dynamic environments.

Organizers

Dr. Chen Tao

Dr. Gao Huang

Dr. Gang Yu

Backbone Convolutional Network for Mobile Applications
Official time
TBD
Presenter
Dr. Chen Tao
Short description
Convolutional neural networks (CNNs) are widely used as backbone models in computer vision, and innovations in network architecture are advancing the deployment of deep learning on mobile devices. This part of the tutorial first reviews state-of-the-art efficient CNN backbones designed with human expertise, such as MobileNet, ShuffleNet, and CondenseNet. It then reviews and discusses neural architecture search (NAS), which aims to develop efficient backbone CNNs through an automated process. Finally, we introduce another important line of research that improves the inference efficiency of deep networks through dynamic architectures. Unlike mainstream CNN backbones with static architectures, dynamic models can change their depth, width, or parameters at inference time, conditioned on each input sample, which substantially reduces computational redundancy. The advantages of dynamic models and future directions will also be discussed.
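To make the idea of input-conditioned depth concrete, here is a minimal sketch of early-exit inference, which is only one of several possible dynamic-architecture strategies and is not necessarily the method presented in the tutorial. It assumes each stage has its own classifier head and that the network stops as soon as the softmax confidence reaches a threshold. All names (`early_exit_predict`, `double`, `identity_head`, `STAGES`, `HEADS`) and the toy stages are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Stage = Callable[[Vector], Vector]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Vector,
    stages: Sequence[Stage],
    classifiers: Sequence[Stage],
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Run stages in order and stop once a head is confident enough.

    Returns (predicted class index, number of stages executed).
    The last stage always produces a prediction.
    """
    features = x
    for depth, (stage, classifier) in enumerate(zip(stages, classifiers), start=1):
        features = stage(features)
        probs = softmax(classifier(features))
        confidence = max(probs)
        if confidence >= threshold or depth == len(stages):
            return probs.index(confidence), depth
    raise ValueError("at least one stage is required")


# Toy stand-ins (assumptions): each stage sharpens the features by doubling
# them, and each head simply reads the features as class logits.
def double(features: Vector) -> Vector:
    return [2.0 * v for v in features]


def identity_head(features: Vector) -> Vector:
    return list(features)


STAGES = [double] * 4
HEADS = [identity_head] * 4

if __name__ == "__main__":
    print(early_exit_predict([2.0, 0.0], STAGES, HEADS))  # clear-cut input
    print(early_exit_predict([0.0, 0.3], STAGES, HEADS))  # ambiguous input
```

In this toy setup the clear-cut input exits after the first stage, while the ambiguous one needs more stages, which is where the saving in average computation would come from.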
Presentation
Detection and Segmentation for Mobile Analysis
Official time
TBD
Presenter
Dr. Gao Huang
Short description
Object detection and segmentation are two typical tasks in computer vision. Mobile detection and segmentation are becoming increasingly critical in everyday on-device applications such as face detection and recognition and augmented reality. In this part, we focus on how to design efficient, high-performance downstream networks for detection and segmentation. Specifically, for detection we review and discuss task-specific criteria such as a good bounding-box regression objective and a categorical classification objective. For segmentation, we discuss how to design a network structure and head that handle both small and large objects. We also investigate the shape mismatch between arbitrarily changing objects and fixed-size receptive fields; addressing this mismatch has been shown to be valuable for fast semantic segmentation of video frames.
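As an illustration of the two detection objectives mentioned above, the following sketch combines an IoU-based box regression loss with a cross-entropy classification loss. This is a common generic formulation, not necessarily the objective covered in the tutorial. The function names, the `(x1, y1, x2, y2)` box format, and the `box_weight` parameter are assumptions made for the example.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def box_area(b: Box) -> float:
    """Area of a box; degenerate boxes have zero area."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Box regression objective: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)


def cross_entropy(class_probs: Sequence[float], label: int) -> float:
    """Classification objective: negative log-probability of the true class."""
    return -math.log(max(class_probs[label], 1e-12))


def detection_loss(
    pred_box: Box,
    target_box: Box,
    class_probs: Sequence[float],
    label: int,
    box_weight: float = 1.0,  # hypothetical balancing factor
) -> float:
    """Weighted sum of the classification and box regression terms."""
    return cross_entropy(class_probs, label) + box_weight * iou_loss(pred_box, target_box)


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(detection_loss((0, 0, 2, 2), (1, 1, 3, 3), [0.7, 0.3], label=0))
```

Note that a plain IoU loss gives no gradient signal for non-overlapping boxes, which is one reason the choice of regression objective matters in practice.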
Presentation
Context-aware Mobile Visual Analytics
Official time
TBD
Presenter
Dr. Gang Yu
Short description
The lightweight model designs discussed above focus mainly on network structure and on optimization objectives, which differ from task to task. They do not consider the dynamically changing background and foreground seen by moving cameras, and they do not fully use temporal and spatial context during inference. In this part, we discuss effective context-aware analysis techniques for moving cameras and study in depth how to use temporal context, such as object motion, and spatial context, such as the objects surrounding an object of interest, for fast and accurate mobile visual analytics. In particular, we introduce typical scenarios such as object-level motion detection and context-aware pedestrian intrusion detection.
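One simple way to picture how temporal and spatial context can interact under a moving camera is sketched below. It assumes, as a strong simplification, that camera motion between two frames is a pure image-plane translation, estimated as the median displacement of surrounding context points. An object is flagged as moving when its own displacement differs from that estimate by more than a threshold. This is an illustrative assumption, not the tutorial's method, and all names and the `threshold` value are hypothetical.

```python
import math
from statistics import median
from typing import Sequence, Tuple

Point = Tuple[float, float]


def displacement(prev: Point, curr: Point) -> Point:
    """Frame-to-frame displacement of a single point."""
    return (curr[0] - prev[0], curr[1] - prev[1])


def estimate_camera_motion(
    context_prev: Sequence[Point], context_curr: Sequence[Point]
) -> Point:
    """Median displacement of the surrounding context points.

    Assumes the two sequences are non-empty, matched by index, and that most
    context points belong to the static background.
    """
    shifts = [displacement(p, c) for p, c in zip(context_prev, context_curr)]
    return (median(s[0] for s in shifts), median(s[1] for s in shifts))


def is_object_moving(
    obj_prev: Point,
    obj_curr: Point,
    context_prev: Sequence[Point],
    context_curr: Sequence[Point],
    threshold: float = 2.0,  # pixels; hypothetical value
) -> bool:
    """True if the object's motion is not explained by camera motion alone."""
    cam_dx, cam_dy = estimate_camera_motion(context_prev, context_curr)
    dx, dy = displacement(obj_prev, obj_curr)
    residual = math.hypot(dx - cam_dx, dy - cam_dy)
    return residual > threshold


if __name__ == "__main__":
    # The camera pans so that every background point shifts by (5, 0).
    background_prev = [(10.0, 10.0), (50.0, 20.0), (80.0, 60.0)]
    background_curr = [(15.0, 10.0), (55.0, 20.0), (85.0, 60.0)]
    print(is_object_moving((30.0, 30.0), (35.0, 30.0), background_prev, background_curr))
    print(is_object_moving((30.0, 30.0), (39.0, 31.0), background_prev, background_curr))
```

A real system would need a richer camera model (for example, one that accounts for rotation and depth), but the same idea of subtracting context-derived background motion from object motion still applies.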
Presentation