Learning 3D Generative Models

CVPR 2020 Workshop, Seattle, WA

14th of June 2020

Please use this Google form to give us feedback on how the workshop went.


The past several years have seen an explosion of interest in generative modeling: unsupervised models which learn to synthesize new elements from the training data domain. Such models have been used to breathtaking effect for generating realistic images, especially of human faces, which are in some cases indistinguishable from reality. The unsupervised latent representations learned by these models can also prove powerful when used as feature sets for supervised learning tasks.

Thus far, the vision community's attention has mostly focused on generative models of 2D images. However, in computer graphics, there has been a recent surge of activity in generative models of three-dimensional content: learnable models which can synthesize novel 3D objects, or even larger scenes composed of multiple objects. As the vision community turns from passive, internet-image-based vision toward more embodied vision tasks, these kinds of 3D generative models become increasingly important: as unsupervised feature learners, as training data synthesizers, as a platform to study 3D representations for 3D vision tasks, and as a way of equipping an embodied agent with a 3D "imagination" about the kinds of objects and scenes it might encounter.

With this workshop, we aim to bring together researchers working on generative models of 3D shapes and scenes with researchers and practitioners who can use these generative models to improve embodied vision tasks. For our purposes, we define "generative model" to include methods that synthesize geometry unconditionally as well as from sensory inputs (e.g. images), language, or other high-level specifications. Vision tasks that can benefit from such models include scene classification and segmentation, 3D reconstruction, human activity recognition, robotic visual navigation, question answering, and more.

Call for Papers

We invite novel full papers of 4 to 6 pages (extended abstracts are not allowed) on tasks related to data-driven 3D generative modeling or tasks leveraging generated 3D content. Paper topics may include, but are not limited to:

  • Generative models for 3D shape and 3D scene synthesis
  • Generating 3D shapes and scenes from real world data (images, videos, or scans)
  • Representations for 3D shapes and scenes
  • Unsupervised feature learning for embodied vision tasks via 3D generative models
  • Training data synthesis/augmentation for embodied vision tasks via 3D generative models

Submission: We encourage submissions of up to 6 pages, excluding references and acknowledgements. Submissions should follow the CVPR format. Reviewing will be single-blind. Accepted works will be published in the CVPR 2020 proceedings (online/app, IEEE Xplore, and CVF open access). Due to the archival nature of these publications, we are looking for work that has not been published before. Submissions will be handled through the CMT paper management system.

Important Dates

Paper Submission Deadline: March 30, 2020, Anywhere on Earth (UTC-12)
Notification to Authors: April 13, 2020
Camera-Ready Deadline: April 20, 2020
Workshop Date: June 14, 2020


In the table of events below, links labeled "Video" redirect to pre-recorded talks. If an event does not have a "Video" link, then it is a live session. After a live session finishes, a "Video" link will be added which redirects to the Zoom recording of the session. The links labeled "Zoom/chat" redirect to the CVPR internal webpage for the corresponding schedule item. These pages require a CVPR registration to access (to prevent Zoom-bombing).

For links to the content for the posters and spotlight presentations, please see the "Accepted Papers" section below.

8:45am - 9:00am Welcome and Introduction Zoom/chat
9:00am - 9:25am Invited Talk 1 (Daniel Aliaga)
Urban Scene Generation
Video Zoom/chat
9:25am - 9:50am Invited Talk 2 (Evangelos Kalogerakis)
What Can Go Here?
Video Zoom/chat
9:50am - 10:10am Spotlight Talks (Poster Session 1)
10:10am - 10:55am Poster Session 1
10:55am - 11:20am Invited Talk 3 (Georgia Gkioxari)
Beyond 2D Visual Recognition
Video Zoom/chat
11:20am - 11:45am Invited Talk 4 (Jitendra Malik) Video Zoom/chat
11:45am - 12:00pm 3D-FRONT Dataset Announcement
Dataset website
Video Zoom/chat
12:00pm - 1:00pm Lunch Break
1:00pm - 1:25pm Invited Talk 5 (Paul Guerrero)
Structuring Shapes and Shape Distributions
Video Zoom/chat
1:25pm - 1:50pm Invited Talk 6 (Vladimir Kim)
Neural Mesh Processing
Video Zoom/chat
1:50pm - 2:10pm Spotlight Talks (Poster Session 2)
2:10pm - 2:55pm Poster Session 2
2:55pm - 3:20pm Invited Talk 7 (Jiajun Wu)
Program Synthesis for 3D Scene Understanding and Manipulation
Video Zoom/chat
3:20pm - 3:45pm Invited Talk 8 (Sanja Fidler)
A.I. for Robotics Simulation
Video Zoom/chat
3:45pm - 4:30pm Panel Discussion Video Zoom/chat

Accepted Papers

Poster Session 1 (10:10am - 10:55am)

VoronoiNet: General Functional Approximators with Local Support
Francis Williams, Jérôme Parent-Lévesque, Derek Nowrouzezahrai, Daniele Panozzo, Kwang Moo Yi, Andrea Tagliasacchi
Spotlight Presentation: Video | Slides | Zoom/chat
Poster Session: Poster | Zoom/chat

Deep Octree-based CNNs with Output-Guided Skip Connections for 3D Shape and Scene Completion
Peng-Shuai Wang, Yang Liu, Xin Tong
Spotlight Presentation: Video | Slides | Zoom/chat
Poster Session: Poster | Zoom/chat

PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
Rundi Wu, Yixin Zhuang, Kai Xu, Hao Zhang, Baoquan Chen
Spotlight Presentation: Video | Slides | Zoom/chat
Poster Session: Poster | Zoom/chat

Generalized Autoencoder for Volumetric Shape Generation
Yanran Guan, Tansin Jahan, Oliver van Kaick
Spotlight Presentation: Video | Zoom/chat
Poster Session: Poster | Zoom/chat

Poster Session 2 (2:10pm - 2:55pm)

Topology-Aware Single-Image 3D Shape Reconstruction
Qimin Chen, Vincent Nguyen, Feng Han, Raimondas Kiveris, Zhuowen Tu
Spotlight Presentation: Video | Zoom/chat
Poster Session: Poster | Zoom/chat

Geometry to the Rescue: 3D Instance Reconstruction from a Cluttered Scene
Lin Li, Salman Khan, Nick Barnes
Spotlight Presentation: Video | Zoom/chat
Poster Session: Poster | Zoom/chat

Mesh Variational Autoencoders with Edge Contraction Pooling
Yu-Jie Yuan, Yu-Kun Lai, Jie Yang, Qi Duan, Hongbo Fu, Lin Gao
Spotlight Presentation: Video | Slides | Zoom/chat
Poster Session: Poster | Zoom/chat

BSP-Net: Generating Compact Meshes via Binary Space Partitioning
Zhiqin Chen, Andrea Tagliasacchi, Hao Zhang
Spotlight Presentation: Video | Slides | Zoom/chat
Poster Session: Poster | Zoom/chat

Invited Speakers

Jitendra Malik received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Kanpur in 1980 and the PhD degree in Computer Science from Stanford University in 1985. In January 1986, he joined the University of California at Berkeley, where he is currently the Arthur J. Chick Professor in the Department of Electrical Engineering and Computer Sciences. Since January 2018, he has also been Research Director and Site Lead of Facebook AI Research in Menlo Park. Prof. Malik's research group has worked on many different topics in computer vision, computational modeling of human vision, computer graphics, and the analysis of biological images. Several well-known concepts and algorithms arose in this research, such as anisotropic diffusion, normalized cuts, high dynamic range imaging, shape contexts, and R-CNN. He has mentored more than 60 PhD students and postdoctoral fellows. His publications have received numerous best paper awards, including five test-of-time awards: the Longuet-Higgins Prize for papers published at CVPR (twice) and the Helmholtz Prize for papers published at ICCV (three times). He received the 2013 IEEE PAMI-TC Distinguished Researcher in Computer Vision Award, the 2014 K.S. Fu Prize from the International Association for Pattern Recognition, the 2016 ACM-AAAI Allen Newell Award, the 2018 IJCAI Award for Research Excellence in AI, and the 2019 IEEE Computer Society Computer Pioneer Award. He is a fellow of the IEEE and the ACM. He is a member of the National Academy of Engineering and the National Academy of Sciences, and a fellow of the American Academy of Arts and Sciences.

Sanja Fidler is an Assistant Professor at the University of Toronto, and a Director of AI at NVIDIA, leading a research lab in Toronto. Prior to coming to Toronto, in 2012/2013, she was a Research Assistant Professor at the Toyota Technological Institute at Chicago, an academic institute located on the campus of the University of Chicago. She did her postdoc with Prof. Sven Dickinson at the University of Toronto in 2011/2012. She finished her PhD in 2010 at the University of Ljubljana in Slovenia in the group of Prof. Ales Leonardis. In 2010, she was a visitor in Prof. Trevor Darrell's group at UC Berkeley and ICSI.

Daniel Aliaga does research primarily in the area of 3D computer graphics, overlapping with computer vision and visualization, and maintains strong multi-disciplinary collaborations outside of computer science. His research activities are divided into three groups: a) his pioneering work in the multi-disciplinary area of inverse modeling and design; b) his first-of-its-kind work in codifying information into images and surfaces; and c) his compelling work in a visual computing framework including high-quality 3D acquisition methods. Dr. Aliaga’s inverse modeling and design is particularly focused on digital city planning applications that provide innovative “what-if” design tools, enabling urban stakeholders from cities worldwide to automatically integrate, process, analyze, and visualize the complex interdependencies between urban form, function, and the natural environment.

Evangelos Kalogerakis is an Associate Professor of Computer Science at the University of Massachusetts Amherst. His research deals with the development of visual computing and machine learning techniques that help people easily create and process representations of the 3D visual world, including 3D models of objects and scenes, 3D scans, animations, shape collections, and images. His research is supported by NSF awards and donations from Adobe. He was a postdoctoral researcher at Stanford University from 2010 to 2012 (advised by Leo Guibas and Vladlen Koltun). He obtained his PhD from the University of Toronto in 2010 (advised by Aaron Hertzmann and Karan Singh). He graduated from the Department of Electrical and Computer Engineering, Technical University of Crete, in 2005 (undergraduate thesis advised by Stavros Christodoulakis).

Jiajun Wu is a Visiting Faculty Researcher at Google Research, New York City, working with Noah Snavely. In Fall 2020, he will join Stanford University as an Assistant Professor of Computer Science. He studies machine perception, reasoning, and their interaction with the physical world, drawing inspiration from human cognition. He completed his PhD at MIT, advised by Bill Freeman and Josh Tenenbaum, and received his undergraduate degrees from Tsinghua University, working with Zhuowen Tu. He has also spent time at the research labs of Microsoft, Facebook, and Baidu.

Vladimir Kim is a Senior Research Scientist at Adobe Research Seattle. He works on geometry analysis algorithms at the intersection of graphics, vision, and machine learning, enabling novel interfaces for creative tasks. His recent research focuses on making it easier to understand, model, manipulate, and process geometric data such as models of 3D objects, interior environments, articulated characters, and fonts.

Georgia Gkioxari is a research scientist at FAIR. She received her PhD from UC Berkeley, where she was advised by Jitendra Malik. She did her bachelors in ECE at NTUA in Athens, Greece, where she worked with Petros Maragos. In the past, she has spent time at Google Brain and Google Research, where she worked with Navdeep Jaitly and Alexander Toshev.

Paul Guerrero recently joined Adobe Research in London, working on the analysis of shapes and irregular structures, such as graphs, meshes, or vector graphics, by combining methods from machine learning, optimization, and computational geometry. Previously, he was a post-doctoral researcher at the Smart Geometry Processing Group, UCL. He completed his PhD at the Institute of Computer Graphics and Algorithms, Vienna University of Technology, and at the Visual Computing Center at KAUST.

Organizers

Daniel Ritchie
Brown University
Florian Golemo
MILA, Element AI
Angel X. Chang
Simon Fraser University
Siddhartha Chaudhuri
Adobe Research, IIT Bombay
Qixing Huang
UT Austin
Derek Nowrouzezahrai
McGill, MILA
Pedro O. Pinheiro
Element AI
Sai Rajeswar
MILA, Element AI
Manolis Savva
Simon Fraser University
David Vazquez
Element AI
Hao (Richard) Zhang
Simon Fraser University


Thanks to visualdialog.org for the webpage format.