Seminar: Deep Neural Network for Text-to-Image Synthesis

Xin Huang
Ph.D. Oral Comprehensive

Supervisory Committee: Drs. Minglun Gong, Yuanzhu Chen and Oscar Meruvia-Pastor

Department of Computer Science
Thursday, April 18, 2019, 1:00 p.m., Room EN 2022


Abstract

With breakthroughs in machine learning techniques, Deep Neural Networks (DNNs) and their variants have been applied to areas such as computer vision and natural language processing with great success. Image synthesis has been one of the most well-studied topics in computer vision since the rapid growth of DNNs. The objective of image synthesis is to generate new images in order to reduce design time or create competitive visual effects.

Generating images from natural language is one of the main sub-fields of image synthesis, with many exciting and practical applications such as photo editing and computer-aided content creation. In this proposal, I give a general introduction to this topic and summarize both conventional and state-of-the-art models in this area. Moreover, I propose the Hierarchically-fused Generative Adversarial Network (HfGAN), a novel model for text-to-image generation based on adaptive feature fusion. Quantitative evaluations show that this model is more efficient and noticeably outperforms the previous state-of-the-art methods. Finally, I present my future research plans in the text-to-image synthesis area.