Journal of University of Science and Technology of China ›› 2021, Vol. 51 ›› Issue (1): 1-11. DOI: 10.52396/JUST-2020-0022

• Research Article •

MOVIE: Mesh oriented video inpainting network

LIU Sen, ZHANG Zhizheng, YU Tao, CHEN Zhibo*   

  1. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China
  • Online: 2021-01-31  Published: 2021-05-27
  • Contact: *E-mail: chenzhibo@ustc.edu.cn
  • About author: Sen Liu received the B.S. degree in computer science from the Beijing University of Posts and Telecommunications, Beijing, China, in 2013. He is currently working towards the PhD degree at the School of Information Science and Technology, University of Science and Technology of China. His research interests include artificial intelligence, deep learning, video coding, computer vision, pattern recognition, and reinforcement learning.
    Zhibo Chen (M'01-SM'11) received the B.S. and PhD degrees from the Department of Electrical Engineering, Tsinghua University, in 1998 and 2003, respectively. He is now a professor at the University of Science and Technology of China. His research interests include image and video compression, visual quality of experience assessment, immersive media computing, and intelligent media computing. He has more than 100 publications and more than 50 granted EU and US patent applications. He is an IEEE Senior Member and Secretary (Chair-Elect) of the IEEE Visual Signal Processing and Communications Committee. He was TPC chair of IEEE PCS 2019 and an organization committee member of ICIP 2017 and ICME 2013, and has served as Track chair in IEEE ISCAS and Area chair in IEEE VCIP.
    Zhizheng Zhang (S'19) received the B.S. degree in electronic information engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2016. He is currently pursuing the PhD degree at the University of Science and Technology of China, Hefei, China. His current research interests include reinforcement learning, few-shot learning, and intelligent media computing.
    Tao Yu is currently pursuing the PhD degree with the Department of Electronic Engineering and Information Science, University of Science and Technology of China. He received the B.S. degree in Electronics and Information Engineering from Anhui University in 2018. His research interests include computer vision, image processing, and reinforcement learning.
  • Supported by:
    National Natural Science Foundation of China (61571413, 61632001).

Abstract: Video inpainting aims to fill holes across frames using limited spatio-temporal context. Existing schemes still struggle to achieve precise spatio-temporal coherence, especially in hole areas, because they model motion trajectories inaccurately. In this paper, we introduce a flexible shape-adaptive mesh as the basic processing unit and mesh flow as the motion representation, which can describe complex motions in hole areas more precisely and efficiently. We propose a Mesh Oriented Video Inpainting nEtwork, dubbed MOVIE, to estimate mesh flows and then complete the hole regions in the video. Specifically, we first design a mesh flow estimation module and a mesh flow completion module that estimate the mesh flow for visible contents and for holes sequentially, which decouples the two estimation problems for easier optimization. A hybrid loss function is further introduced to optimize flow estimation for the visible regions, the entire frames, and the inpainted regions, respectively. We then design a polishing network to correct the distortion that mesh flow transformation introduces into the inpainted results. Extensive experiments show that MOVIE not only achieves a speed-up of over four times in completing the missing areas, but also yields better inpainting quality on both quantitative and perceptual metrics.
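
To illustrate the idea of mesh flow as a motion representation, the following minimal sketch expands motion vectors stored only at mesh vertices into a per-pixel flow field by bilinear interpolation. It is an illustration under stated assumptions, not the MOVIE implementation: it assumes a regular grid mesh (the paper uses a shape-adaptive mesh), and the function name `mesh_flow_to_dense` and its parameters are hypothetical.

```python
def mesh_flow_to_dense(vertex_flow, cell, height, width):
    """Bilinearly interpolate per-vertex motion into a per-pixel flow field.

    Assumption (not from the paper): the mesh is a regular grid, and
    vertex_flow[r][c] holds the (dx, dy) displacement of the vertex at
    pixel position (x = c * cell, y = r * cell). The grid must cover the
    frame, i.e. it needs at least 2 rows and 2 columns of vertices.
    """
    last_row = len(vertex_flow) - 2      # index of the last mesh cell row
    last_col = len(vertex_flow[0]) - 2   # index of the last mesh cell column
    dense = []
    for y in range(height):
        r = min(y // cell, last_row)     # mesh cell row containing pixel y
        ty = (y - r * cell) / cell       # vertical weight inside the cell
        row = []
        for x in range(width):
            c = min(x // cell, last_col)
            tx = (x - c * cell) / cell   # horizontal weight inside the cell
            tl, tr = vertex_flow[r][c], vertex_flow[r][c + 1]
            bl, br = vertex_flow[r + 1][c], vertex_flow[r + 1][c + 1]
            # Blend the four corner displacements, component by component.
            row.append(tuple(
                (1 - ty) * ((1 - tx) * tl[k] + tx * tr[k])
                + ty * ((1 - tx) * bl[k] + tx * br[k])
                for k in range(2)
            ))
        dense.append(row)
    return dense


# One mesh cell covering a 5x5 frame: the right-hand vertices move 4 px in x.
vertex_flow = [[(0.0, 0.0), (4.0, 0.0)],
               [(0.0, 0.0), (4.0, 0.0)]]
dense = mesh_flow_to_dense(vertex_flow, cell=4, height=5, width=5)
```

In this toy case four vertex vectors describe the motion of 25 pixels, which hints at why a sparse mesh can be a compact way to represent motion inside a hole where no pixel evidence is available.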

Key words: mesh flow, deep neural networks, video inpainting
