Large image datasets annotated with pixel-level semantics are necessary to train and evaluate supervised deep-learning models, yet they are very expensive to build in terms of the human effort required. However, recent developments such as DatasetGAN open the possibility of leveraging generative systems to automatically synthesise large numbers of images along with their pixel-level annotations. This work analyses DatasetGAN and proposes a novel architecture that utilises the semantic information of neighbouring pixels to achieve significantly better performance. Additionally, the overfitting observed in the original architecture is thoroughly investigated, and modifications are proposed to mitigate it. Finally, the implementation has been redesigned to greatly reduce the memory requirements of DatasetGAN, and a comprehensive study of the impact of the number of classes in the segmentation task is presented.