Synthesizing natural images with generative models such as Generative Adversarial Networks (GANs) has received significant attention recently, driven by advances in deep learning. Existing generative models are typically trained with relatively simple loss functions derived from the L1/L2 norms, because of their simplicity and their desirable properties in statistics and estimation. From a perceptual viewpoint, however, mean squared error (the L2 norm) correlates only weakly with image quality. This work studies the effect of incorporating statistics that quantify the 'naturalness' of an image. In particular, distances derived from Natural Scene Statistics are used as a constraint while learning the generative model. Specifically, the performance of Multiscale Structural Similarity (MS-SSIM) and Visual Information Fidelity (VIF) is analyzed, along with the advantages and shortcomings of each.
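As a rough illustration of what a perceptual term added to a pixel-wise loss might look like, the sketch below combines mean squared error with a structural-similarity penalty. It is a deliberate simplification and not the project's implementation: it computes a single-scale SSIM from whole-image statistics rather than the windowed, multiscale MS-SSIM, and it does not cover VIF. The names `global_ssim`, `combined_loss`, and the weight `alpha` are hypothetical, and images are assumed to be scaled to [0, 1].

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-scale SSIM computed from whole-image statistics.

    Simplification (assumption): real SSIM / MS-SSIM averages this
    quantity over local windows and, for MS-SSIM, over several scales.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    # Small constants that stabilise the ratios when means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


def combined_loss(generated, target, alpha=0.5):
    """Weighted sum of a perceptual term (1 - SSIM) and pixel-wise MSE.

    `alpha` is a hypothetical mixing weight: 0 gives plain MSE,
    1 gives the SSIM-based term only.
    """
    generated = np.asarray(generated, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((generated - target) ** 2)
    perceptual = 1.0 - global_ssim(generated, target)
    return alpha * perceptual + (1.0 - alpha) * mse


# Usage: compare a random "image" with a noisy copy of itself.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
noisy = image + 0.1 * rng.standard_normal(image.shape)
print(combined_loss(image, image), combined_loss(noisy, image))
```

To serve as a training constraint, the same expression would need to be written in an automatic-differentiation framework so that gradients can flow back to the generator; NumPy is used here only to keep the sketch self-contained.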
This project was undertaken as part of the Vision Systems course during January-May 2019. Further details about the project and results are available here.