ICIP 2019 Tutorial on
Image-to-Image Translation

Ming-Yu Liu, Ting-Chun Wang

Time and Location:
Sunday, Sep 22, Half Day AM


Image-to-Image Translation

Image-to-image translation refers to the problem of generating a new image based on an input image. Many tasks in image processing can be formulated as image-to-image translation problems, including image super-resolution, image inpainting, and style transfer.

In this short course, we will cover both the basics and applications of image-to-image translation. We will start the course by reviewing the generative adversarial network (GAN) framework proposed by Goodfellow et al., a popular generative model that serves as the backbone of various state-of-the-art image-to-image translation methods thanks to its extraordinary capability to generate crisp, sharp images. We will cover several popular GAN variants as well as their conditional extensions.
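To make the adversarial game concrete before the tutorial, here is a deliberately tiny sketch (not from the tutorial materials; all names and hyperparameters are illustrative): a one-parameter-family generator and a logistic-regression discriminator play the GAN game on 1-D Gaussian "images", with the generator trained on the non-saturating objective.

```python
import numpy as np

# A deliberately tiny GAN: the "images" are samples from a 1-D Gaussian.
# Generator g(z) = a*z + b, discriminator d(x) = sigmoid(w*x + c).
rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

a, b = 1.0, 0.0      # generator parameters
w, c = 0.1, 0.0      # discriminator parameters
lr = 0.01

for step in range(2000):
    x_real = rng.normal(3.0, 1.0, size=64)   # data distribution N(3, 1)
    z = rng.normal(size=64)
    x_fake = a * z + b

    # Discriminator gradient ascent on log d(real) + log(1 - d(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator gradient ascent on the non-saturating objective log d(fake).
    d_fake = sigmoid(w * x_fake + c)
    gx = (1 - d_fake) * w            # d/dx_fake of log d(x_fake)
    a += lr * np.mean(gx * z)
    b += lr * np.mean(gx)

# The generator's output mean (b) should have drifted toward the data mean (3).
print(f"learned generator: x = {a:.2f}*z + {b:.2f}")
```

The alternating updates above are the essence of the framework; real image-to-image models replace the scalar affine maps with deep convolutional networks but keep the same two-player objective.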

Next, we will give a formal definition of the image-to-image translation problem. We will unify the notation and categorize existing works by their learning settings: the supervised setting (the input-output relationship is observed), the unsupervised setting (the input-output relationship is not observed), the semi-supervised setting, the multimodal setting, and the few-shot setting. We will discuss the representative works in each setting in detail, covering their network designs, objective functions, training strategies, and limitations. Applications to various image processing tasks will be discussed.
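In the supervised setting, a representative generator objective (in the spirit of pix2pix by Isola et al.) combines a conditional adversarial term with a weighted L1 reconstruction term against the paired ground truth. The numpy sketch below is a minimal illustration under assumed toy shapes; the function name, the shapes, and the weighting `lam=100.0` are illustrative choices, not a fixed prescription:

```python
import numpy as np

def supervised_i2i_loss(d_fake_logits, g_output, target, lam=100.0):
    """Generator loss for supervised image-to-image translation:
    a non-saturating adversarial term plus a lambda-weighted L1
    reconstruction term against the paired ground-truth image."""
    # softplus(-logit) == -log sigmoid(logit), computed stably
    adv = np.mean(np.log1p(np.exp(-d_fake_logits)))
    l1 = np.mean(np.abs(g_output - target))
    return adv + lam * l1

rng = np.random.default_rng(0)
fake = rng.normal(size=(4, 32, 32, 3))    # generator outputs (toy shapes)
real = rng.normal(size=(4, 32, 32, 3))    # paired ground-truth images
logits = rng.normal(size=(4,))            # discriminator scores on fake images
print(supervised_i2i_loss(logits, fake, real))
```

When the generator reproduces the target exactly and fools the discriminator (large positive logits), both terms vanish, which is the intended optimum of the combined objective.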

We will then move to the video-to-video translation problem, a natural extension of the image-to-image translation problem. We will discuss techniques for generating convincing visual dynamics as well as techniques for ensuring temporal consistency. We will then present an integration of an existing 3D rendering engine with a video-to-video translation model for creating a new form of computer graphics, and we will discuss its strengths and weaknesses as a graphics rendering engine.
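One common way to encourage temporal consistency is to penalize the current frame's output for deviating from the previous output warped along the motion between frames. The sketch below is a toy illustration of that idea only: a real system would warp with estimated optical flow, whereas the hypothetical `warp_shift` here uses a single integer displacement.

```python
import numpy as np

def warp_shift(frame, dy, dx):
    # Toy stand-in for flow-based warping: shift the frame by an
    # integer displacement (dy, dx) with wrap-around borders.
    return np.roll(frame, shift=(dy, dx), axis=(0, 1))

def temporal_consistency_loss(out_t, out_prev, dy, dx):
    # L1 distance between the current output and the warped previous
    # output -- small when successive outputs agree along the motion.
    return np.mean(np.abs(out_t - warp_shift(out_prev, dy, dx)))

# A pattern translating down by one pixel per frame is perfectly
# consistent under the matching displacement (dy=1, dx=0).
prev = np.arange(64, dtype=float).reshape(8, 8)
curr = np.roll(prev, shift=(1, 0), axis=(0, 1))
print(temporal_consistency_loss(curr, prev, 1, 0))      # 0.0
print(temporal_consistency_loss(curr, prev, 0, 0) > 0)  # True
```

The point of the example is the structure of the penalty, not the warp itself: the loss is zero exactly when the output video changes the way the motion says it should.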

Finally, we will conclude the course by discussing the conditions needed for image-to-image translation to work. We will cover practical methods for meeting these conditions, including how to collect training data, along with troubleshooting tips. We will outline the remaining challenges and potential research problems.

Time Title Speaker Affil.
09:00 - 09:05 Welcome Ming-Yu Liu NVIDIA
09:05 - 09:30 A short tutorial on GANs Ting-Chun Wang NVIDIA
09:30 - 10:00 Supervised Image-to-Image Translation Part 1 Ting-Chun Wang NVIDIA
10:00 - 10:30 Supervised Image-to-Image Translation Part 2 Ming-Yu Liu NVIDIA
10:30 - 11:00 Coffee Break
11:00 - 11:20 Unsupervised Image-to-Image Translation Ming-Yu Liu NVIDIA
11:20 - 11:40 Few-shot Unsupervised Image-to-Image Translation Ming-Yu Liu NVIDIA
11:40 - 12:00 Video-to-Video Translation Ting-Chun Wang NVIDIA
12:00 - 12:30 Few-Shot Adaptive Video-to-Video Translation Ting-Chun Wang NVIDIA