Attributing the pixels of an input image to a certain category is an important and well-studied problem in computer vision, with applications ranging from weakly supervised localisation to understanding hidden effects in the data. In recent years, approaches based on interpreting a previously trained neural network classifier have become the de facto state of the art and are commonly used on both medical and natural image datasets. On medical data, they have been widely used for discovering disease effects. Unfortunately, such approaches have a significant shortcoming that may lead to only a subset of the category-specific features being detected. To address this problem, we developed a novel feature attribution technique based on Generative Adversarial Networks that does not suffer from this limitation. The proposed method performs substantially better than the state of the art at identifying disease effects on 3D neuroimaging data from patients with mild cognitive impairment (MCI) and Alzheimer’s disease (AD). Moreover, the proposed framework allows easy incorporation of prior knowledge about the disease formation. For AD patients, the method produces compellingly realistic disease effect maps that closely resemble the observed effects. We believe that, in the future, GAN-based frameworks may offer an alternative to classical population-wide disease effect analyses such as voxel-based morphometry.