Awesome Augmentations
Pixel-level Transforms
Arithmetic
- Add values to the pixels of images with possibly different values for neighbouring pixels. Source: Imgaug
- Fill one or more rectangular areas in an image using a fill mode. See paper “Improved Regularization of Convolutional Neural Networks with Cutout” by DeVries and Taylor. Source: Imgaug
- Invert the input image by subtracting pixel values from 255. Source: Albumentations
- Add noise sampled from gaussian distributions elementwise to images. Source: Imgaug
- Add noise sampled from laplace distributions elementwise to images. Source: Imgaug
- Add noise sampled from poisson distributions elementwise to images. Source: Imgaug
- Multiply all pixels in an image with a specific value, thereby making the image darker or brighter. Source: Imgaug
- Multiply values of pixels with possibly different values for neighbouring pixels, making each pixel darker or brighter. Source: Imgaug
- Augmenter that sets a certain fraction of pixels in images to zero. Source: Imgaug
- Replace pixels in images with salt noise, i.e. white-ish pixels. Source: Imgaug
- Replace rectangular areas in images with white-ish pixel noise. Source: Imgaug
- Replace pixels in images with pepper noise, i.e. black-ish pixels. Source: Imgaug
- Replace rectangular areas in images with black-ish pixel noise. Source: Imgaug
- Replace pixels in images with salt/pepper noise (white/black-ish colors). Source: Imgaug
- Replace rectangular areas in images with white/black-ish pixel noise. Source: Imgaug
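A minimal sketch of how several of the arithmetic transforms above can be chained with imgaug; the parameter ranges are illustrative placeholders, not recommendations.

```python
import numpy as np
import imgaug.augmenters as iaa

# Chain a few arithmetic augmenters: value shift, elementwise Gaussian
# noise, brightness multiplication, and salt-and-pepper pixel noise.
aug = iaa.Sequential([
    iaa.Add((-40, 40), per_channel=0.5),
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),
    iaa.Multiply((0.8, 1.2)),
    iaa.SaltAndPepper(0.03),
])

images = np.random.randint(0, 255, (4, 128, 128, 3), dtype=np.uint8)
images_aug = aug(images=images)
```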
Artistic
Blend
- Blend images from two branches along a vertical linear gradient. Source: Imgaug
- Blend images from two branches along a horizontal linear gradient. Source: Imgaug
- Alpha-blend two image sources using alpha/opacity values sampled per pixel. Source: Imgaug
- Alpha-blend two image sources using non-binary masks generated per image. Source: Imgaug
Blur
- Blur the input image using a Gaussian filter with a random kernel size. Source: Albumentations
- Blur the input image using a median filter with a random aperture linear size. Source: Albumentations
- Bilateral filters blur homogeneous and textured areas, while trying to preserve edges. Source: Imgaug
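A short sketch of the blur transforms above using Albumentations; the `blur_limit` values and probabilities are placeholders.

```python
import numpy as np
import albumentations as A

# Apply either a Gaussian blur (random odd kernel size in [3, 7]) or a
# median blur (random odd aperture size up to 5), each with 50% chance.
transform = A.Compose([
    A.OneOf([
        A.GaussianBlur(blur_limit=(3, 7)),
        A.MedianBlur(blur_limit=5),
    ], p=0.5),
])

image = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
blurred = transform(image=image)["image"]
```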
Color
- Convert the input RGB image to grayscale. If the mean pixel value for the resulting image is greater than 127, invert the resulting grayscale image. Source: Albumentations
- Augment RGB image using FancyPCA from Krizhevsky's paper "ImageNet Classification with Deep Convolutional Neural Networks". Source: Albumentations
- Randomly shift values for each channel of the input RGB image. Source: Albumentations
- Randomly change hue, saturation and value of the input image. Source: Albumentations
- Randomly change brightness and contrast of the input image. Source: Albumentations
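A sketch composing the color transforms above with Albumentations; all probabilities and shift limits are placeholder values.

```python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.ToGray(p=0.1),               # grayscale, inverted if mean > 127
    A.FancyPCA(alpha=0.1, p=0.3),  # Krizhevsky-style PCA color shift
    A.RGBShift(r_shift_limit=20, g_shift_limit=20, b_shift_limit=20, p=0.3),
    A.HueSaturationValue(p=0.3),
    A.RandomBrightnessContrast(p=0.3),
])

image = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]
```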
Contrast
- Apply Contrast Limited Adaptive Histogram Equalization to the input image. Source: Albumentations
- Apply CLAHE to all channels of images in their original colorspaces. Source: Imgaug
- Adjust image contrast by scaling pixel values to 255*((v/255)**gamma). Source: Imgaug
- Adjust image contrast to 255*1/(1+exp(gain*(cutoff-I_ij/255))). Source: Imgaug
- Adjust image contrast by scaling pixels to 255*gain*log_2(1+v/255). Source: Imgaug
- Apply Histogram Eq. to L/V/L channels of images in HLS/HSV/Lab colorspaces. Source: Imgaug
- Apply Histogram Eq. to all channels of images in their original colorspaces. Source: Imgaug
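The three formula-based entries above translate directly into NumPy; a minimal sketch for uint8 images, with all gains and cutoffs as placeholder arguments.

```python
import numpy as np

def gamma_contrast(img, gamma):
    # 255 * ((v / 255) ** gamma)
    v = img.astype(np.float32) / 255.0
    return (255 * v ** gamma).astype(np.uint8)

def sigmoid_contrast(img, gain, cutoff):
    # 255 * 1 / (1 + exp(gain * (cutoff - v / 255)))
    v = img.astype(np.float32) / 255.0
    return (255 / (1 + np.exp(gain * (cutoff - v)))).astype(np.uint8)

def log_contrast(img, gain):
    # 255 * gain * log_2(1 + v / 255), clipped back to the uint8 range
    v = img.astype(np.float32) / 255.0
    return np.clip(255 * gain * np.log2(1 + v), 0, 255).astype(np.uint8)
```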
Compression
- Decreases image quality by downscaling and upscaling back. Source: Albumentations
Convolutional
- Augmenter that sharpens images and overlays the result with the original image. Source: Imgaug
- Augmenter that embosses images and overlays the result with the original image. Source: Albumentations
Corruption
Edges
- Augmenter that detects all edges in images, marks them in a black and white image and then overlays the result with the original image. Source: Imgaug
- Augmenter that detects edges that have certain directions and marks them in a black and white image and then overlays the result with the original image. Source: Imgaug
Pooling
Segmentation
- Completely or partially transform images to their superpixel representation. Source: Imgaug
- Uniformly sample Voronoi cells on images and average colors within them. Source: Imgaug
- Sample Voronoi cells from regular grids and color-average them. Source: Imgaug
- Sample Voronoi cells from image-dependent grids and color-average them. Source: Imgaug
Weather
Spatial-level transforms
Affine
- Apply affine transformations that differ between local neighbourhoods. Source: Imgaug
- Randomly apply affine transforms: translate, scale and rotate the input. Source: Albumentations
Crop
- Crop images down until their height/width is a multiple of a value. Source: Imgaug
- Crop images equally on all sides until H/W are multiples of given values. Source: Imgaug
- Torchvision's variant of cropping a random part of the input and rescaling it to some size. Source: Albumentations
- Crop a random part of the input and rescale it to some size. Source: Albumentations
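A sketch of random resized cropping with Albumentations, assuming the classic `height`/`width` signature (newer releases use a `size` argument instead); the target size and scale range are placeholders.

```python
import numpy as np
import albumentations as A

# Crop a random region covering 50-100% of the image area,
# then rescale it to a fixed 224x224 output.
transform = A.Compose([
    A.RandomResizedCrop(height=224, width=224, scale=(0.5, 1.0), p=1.0),
])

image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
cropped = transform(image=image)["image"]  # always 224x224
```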
Distortion
- Augmenter that applies other augmenters in a polar-transformed space. Source: Imgaug
Flip
- Flip the input either horizontally, vertically or both horizontally and vertically. Source: Albumentations
Pad
- Pad images equally on all sides up to given minimum heights/widths. Source: Imgaug
- Pad images equally on all sides until H/W are multiples of given values. Source: Imgaug
- Pad images equally on all sides until H/W is a power of a base. Source: Imgaug
- Pad images equally on all sides until H/W matches an aspect ratio. Source: Imgaug
- Pad images equally on all sides until their height & width are identical. Source: Imgaug
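A sketch of the padding helpers above, assuming imgaug ≥ 0.4 where these augmenters are available; the multiple of 32 is an illustrative choice.

```python
import numpy as np
import imgaug.augmenters as iaa

# Pad height/width up to multiples of 32, then pad to a square shape.
aug = iaa.Sequential([
    iaa.PadToMultiplesOf(height_multiple=32, width_multiple=32),
    iaa.PadToSquare(),
])

image = np.random.randint(0, 255, (100, 90, 3), dtype=np.uint8)
padded = aug(image=image)
```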
Rotate
- Rotate the input by an angle selected randomly from the uniform distribution. Source: Albumentations
- Randomly rotate the input by 90 degrees zero or more times. Source: Albumentations
Size
- Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image. Source: Albumentations
- Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image. Source: Albumentations
Awesome Articles:
A list of awesome articles and tutorials for easy understanding of deep learning and data augmentation!
- Automating Data Augmentation: Practice, Theory and New Direction
- A Beginner's Guide To Understanding Convolutional Neural Networks
- A Beginner's Guide to Generative Adversarial Networks (GANs)
- Overview of GAN Structure | Generative Adversarial Networks
- Generative Adversarial Network (GAN) for Dummies — A Step By Step Tutorial
- Data Augmentation For Deep Learning Algorithms
Awesome Libraries and Frameworks
A list of awesome deep learning libraries and frameworks in Python!
- Tensorflow
- Keras
- Pytorch
- Catalyst Classification
- Catalyst Semantic Segmentation
- Detectron 2
- Imgaug
- Albumentations
Awesome Surveys:
A list of awesome surveys in many different subjects of deep learning!
- A survey on Image Data Augmentation for Deep Learning
- Link: Springer
- Authors: Connor Shorten and Taghi M. Khoshgoftaar
- Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy
- Link: Arxiv
- Authors: Zhengwei Wang, Qi She, Tomas E. Ward
- A Survey on Generative Adversarial Networks: Variants, Applications, and Training
- Link: Arxiv
- Authors: Abdul Jabbar, Xi Li, Bourahla Omar
Awesome Papers:
- Generative Visual Manipulation on the Natural Image Manifold
- Link: Arxiv
- Authors: Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros
- Code: Github
- Project: http://efrosgans.eecs.berkeley.edu/iGAN/
The authors proposed a method to help non-artistic users modify or create new images with simple image transformations. First, the input image is converted into the closest feature vector in the latent space of a Generative Adversarial Network (GAN). Then, through a user-friendly interface, the user can apply transformations like color change, sketching, and warping to the image. The feature vector in the latent space is adjusted to match those transformations. In the end, a new photo-realistic image is generated, with the object in the original image transformed according to the user's modifications.
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
The proposed approach is a Generative Adversarial Network (GAN), called CycleGAN, that learns how to translate images from a source domain to a target domain. For example, it can translate images containing horses into images containing zebras, change the weather, and translate one painting style into another. The advantage of the proposed method over its predecessors is that CycleGAN doesn't need paired training data to achieve this task.
- Improved Regularization of Convolutional Neural Networks with Cutout
The proposed method, cutout, is a data augmentation technique that removes square regions of the image by placing a grey square mask at random positions in the image. This technique forces the network to learn the object representation even under partial occlusion, while reducing overfitting.
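A minimal NumPy sketch of the cutout idea described above: fill a square region at a random position with a constant value. The `mask_size` and `fill` values are placeholders.

```python
import numpy as np

def cutout(image, mask_size=16, fill=127):
    img = image.copy()
    h, w = img.shape[:2]
    # Pick a random center; the square is clipped at the image borders.
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(0, cy - mask_size // 2), min(h, cy + mask_size // 2)
    x1, x2 = max(0, cx - mask_size // 2), min(w, cx + mask_size // 2)
    img[y1:y2, x1:x2] = fill  # grey square mask
    return img
```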
- Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
The authors proposed a neural network to perform style transfer between images. That is, the proposed method takes an image of a real scene and outputs the same image as if it were a painting. The network uses a VGG-19 encoder to extract features from both the real image A and a painting example B. These features are fed into the Adaptive Instance Normalization (AdaIN) layer, proposed by the authors, to perform style transfer between the images. The output of AdaIN is passed to a decoder to generate the image in the new painting style; the decoder is trained, with the help of the same VGG-based encoder, to learn how to perform this conversion.
- Random Erasing Data Augmentation
In the proposed data augmentation, a rectangle is positioned on top of training images. The size and position of the rectangle are random, that is, it can have any size and lie anywhere in the image. The pixels of the image inside the rectangle are set to random values, creating a noise area inside the image. This technique improves classification by making the Convolutional Neural Network (CNN) more robust to object occlusion. It also helps to reduce overfitting when training the network model.
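A NumPy sketch of the mechanism described above: a rectangle of random size and position is filled with random pixel values. The area and aspect-ratio ranges are placeholder choices, and uint8 images are assumed.

```python
import numpy as np

def random_erasing(image, min_area=0.02, max_area=0.2):
    img = image.copy()
    h, w = img.shape[:2]
    # Sample a rectangle with random area and aspect ratio.
    area = np.random.uniform(min_area, max_area) * h * w
    aspect = np.random.uniform(0.3, 3.3)
    rh = min(h, int(round(np.sqrt(area * aspect))))
    rw = min(w, int(round(np.sqrt(area / aspect))))
    y = np.random.randint(0, h - rh + 1)
    x = np.random.randint(0, w - rw + 1)
    # Fill the rectangle with random values (uint8 assumed).
    img[y:y + rh, x:x + rw] = np.random.randint(
        0, 256, (rh, rw) + img.shape[2:], dtype=img.dtype)
    return img
```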
- Dataset Augmentation in Feature Space
- Link: Arxiv
- Authors: Terrance DeVries, Graham W. Taylor
In general, data augmentations are applied to the input images, usually before the training step of a deep neural network. In this work, the authors proposed applying data augmentations in the feature space generated by a neural network. To achieve that, they used an encoder-decoder network. The encoder converts the input image into the feature space, where augmentations such as noise, interpolation, or extrapolation are applied. The augmented features can then be converted back into an image through the decoder, or used directly for classification through fully connected layers.
- Improving Deep Learning using Generic Data Augmentation
- Link: Arxiv
- Authors: Luke Taylor, Geoff Nitschke
The authors performed a series of experiments and comparisons of different data augmentation techniques in the classification problem. They focused on geometric methods (flipping, rotation, and cropping), and photometric methods (color jittering, edge enhancement, and fancy PCA).
- Emotion Classification with Data Augmentation Using Generative Adversarial Networks
- Link: Arxiv
- Authors: Xinyue Zhu, Yifan Liu, Zengchang Qin, Jiahong Li
The authors utilized a CycleGAN to generate images of emotions for the classification problem. More specifically, they performed dataset balancing by applying the CycleGAN to generate new examples for the classes with fewer samples in the training dataset. They used the classes with more examples in the training set as the reference to generate the new images of the other classes.
- The Effectiveness of Data Augmentation in Image Classification using Deep Learning
- Link: Arxiv
- Authors: Luis Perez, Jason Wang
The authors evaluated different augmentation techniques for the classification problem. They compared classical augmentations like shift, zoom, rotation, flip, and distortion with GAN-based augmentations. They also proposed an augmentation called neural augmentation. In the proposed approach, a CNN takes two images and combines them to generate a new one.
- Smart Augmentation: Learning an Optimal Data Augmentation Strategy
- Link: IEEE Xplore
- Authors: Joseph Lemley, Shabab Bazrafkan, Peter Corcoran
In this work, the authors proposed a data augmentation called Smart Augmentation. In this approach, they use two networks, A and B. The former is a generative model, and the latter is a standard classification model. The generative model, network A, receives a set of images from a given class and learns to generate new examples of that class. The loss of network B is used to improve the results of network A, so A learns to generate images in a way that improves the classification results of B.
- Biomedical Data Augmentation Using Generative Adversarial Neural Networks
- Link: Research Gate
- Authors: Francesco Calimeri, Aldo Marzullo, Claudio Stamile, Giorgio Terracina
The authors evaluated a data augmentation technique based on a Generative Adversarial Network (GAN) to generate new images for the Magnetic Resonance Imaging (MRI) classification problem.
- Research on Data Augmentation for Image Classification Based on Convolution Neural Networks
- Link: IEEE Xplore
- Authors: Jia Shijie, Wang Ping, Jia Peiyi, Hu Siping
The authors evaluated several data augmentation techniques on CIFAR10 and ImageNet classification datasets. They evaluated Flipping, Cropping, Shifting, PCA jittering, Color jittering, Noise, Rotation, GAN, and WGAN.
- Albumentations: Fast and Flexible Image Augmentations
- Link: Arxiv
- Authors: Alexander Buslaev, Alex Parinov, Eugene Khvedchenya, Vladimir I. Iglovikov, Alexandr A. Kalinin
- Code: Github
- Documentation: https://albumentations.ai/docs/
The authors proposed Albumentations, a Python library with a large number of data augmentation techniques for deep learning applications. They showed that the augmentations from the Albumentations library helped to improve results in different deep learning applications such as image classification, object detection, and semantic segmentation.
- AutoAugment: Learning Augmentation Policies from Data
- Link: Arxiv
- Authors: Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le
- Code: Github
- Details: https://ai.googleblog.com/2018/06/improving-deep-learning-performance.html
The proposed augmentation framework learns the best augmentation strategies for a given classification problem. The authors created a search space with several augmentation techniques. A Recurrent Neural Network (RNN) samples an augmentation strategy used to train a Convolutional Neural Network (CNN). The RNN is updated based on the validation accuracy achieved by the CNN, and thereby learns the best augmentation strategy for the problem at hand.
- Data Augmentation by Pairing Samples for Images Classification
- Link: Arxiv
- Authors: Hiroshi Inoue
The author proposed a data augmentation called SamplePairing. This technique takes two random images from the training dataset and mixes them by taking a per-pixel average of both images. The label of the first image is used as the label of the mixed image. Training with this augmentation alone does not achieve good results; however, a network trained with this augmentation and then fine-tuned without it showed improvements on the classification problem.
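A sketch of the pairing step described above: average two training images per pixel and keep the label of the first one. Function and argument names are illustrative.

```python
import numpy as np

def sample_pairing(image_a, label_a, image_b):
    # Per-pixel average of the two images; the label of image A is kept.
    mixed = (image_a.astype(np.float32) + image_b.astype(np.float32)) / 2
    return mixed.astype(image_a.dtype), label_a
```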
- Data Augmentation using Random Image Cropping and Patching for Deep CNNs
A mixing augmentation method called random image cropping and patching (RICAP) is proposed. In this method, four images are randomly cropped and patched together, generating a new mixed image. For the label, the one-hot codes of the four images are summed, with each label weighted by its proportion in the newly generated image.
- DADA: Deep Adversarial Data Augmentation for Extremely Low Data Regime Classification
The authors proposed a data augmentation technique called deep adversarial data augmentation (DADA). The proposed augmentation is based on Generative Adversarial Networks (GANs) to generate new examples for small datasets.
- Modeling Visual Context is Key to Augmenting Object Detection Datasets
- Link: Arxiv
- Authors: Nikita Dvornik, Julien Mairal, Cordelia Schmid
The authors trained a Convolutional Neural Network to learn the context of the objects in the dataset. They cropped each object from its image to create a context image, and the network is trained to learn the probability of different objects appearing in that background. This network then guides the proposed data augmentation, where objects are placed on new background images.
- GAN-based Synthetic Medical Image Augmentation for Increased CNN Performance in Liver Lesion Classification
- Link: Arxiv
- Authors: Maayan Frid-Adar, Idit Diamant, Eyal Klang, Michal Amitai, Jacob Goldberger, Hayit Greenspan
The authors evaluated data augmentation techniques for the computed tomography (CT) classification problem. They evaluated classical augmentations like translation, rotation, scaling, flipping, and shearing. They also evaluated two Generative Adversarial Networks (GANs): the Deep Convolutional GAN (DCGAN) and the Auxiliary Classifier GAN (ACGAN).
- Chest x-ray generation and data augmentation for cardiovascular abnormality classification
- Link: SPIE Digital Library
- Authors: Ali Madani, Mehdi Moradi, Alexandros Karargyris, Tanveer Syeda-Mahmood
The authors applied a Generative Adversarial Network (GAN) to generate chest x-ray images.
- AugGAN: Cross Domain Adaptation with GAN-based Data Augmentation
- Link: Open Access
- Authors: Sheng-Wei Huang, Che-Tsung Lin, Shu-Ping Chen, Yen-Yi Wu, Po-Hao Hsu, Shang-Hong Lai
The authors proposed a Generative Adversarial Network (GAN) for the image-to-image translation problem. The input image is fed into an encoder network and transformed into an encoded feature space. The encoded features feed two networks in parallel: a decoder network and a semantic segmentation network, which share their weights. The output of the decoder network is fed into an inverse network to reconstruct the original image.
- Biomedical image augmentation using Augmentor
- Link: Academic
- Authors: Marcus D. Bloice, Peter M. Roth, Andreas Holzinger
- Code: Github
- Documentation: https://augmentor.readthedocs.io/en/master/
The authors developed a package for biomedical image augmentation.
- CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
- Link: Arxiv
- Authors: Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo
- Code: Github
The proposed augmentation strategy mixes two training samples A and B by cutting a rectangular area out of sample A and replacing it with a patch of sample B, generating a new training sample C as the combination of both. The label is also adjusted to match the proportion of each sample in the new image. The advantage of this method over those that simply fill a rectangular region of the image with zeros or random noise is that no information is lost, which increases training efficiency.
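A NumPy sketch of this recipe; the Beta(1, 1) sampling of the mixing ratio is an illustrative default, and one-hot float labels are assumed.

```python
import numpy as np

def cutmix(image_a, label_a, image_b, label_b):
    h, w = image_a.shape[:2]
    lam = np.random.beta(1.0, 1.0)  # proportion to keep from sample A
    # Rectangle whose area is (1 - lam) of the image, at a random center.
    rh, rw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - rh // 2, 0, h), np.clip(cy + rh // 2, 0, h)
    x1, x2 = np.clip(cx - rw // 2, 0, w), np.clip(cx + rw // 2, 0, w)
    mixed = image_a.copy()
    mixed[y1:y2, x1:x2] = image_b[y1:y2, x1:x2]   # paste patch from B
    lam = 1 - (y2 - y1) * (x2 - x1) / (h * w)     # adjust for border clipping
    mixed_label = lam * label_a + (1 - lam) * label_b  # one-hot labels
    return mixed, mixed_label
```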
- Fast AutoAugment
The authors proposed an algorithm to automatically find the best augmentation policies to train a classification network. The proposed framework is similar to its predecessor, AutoAugment, but focuses on reducing the time needed to find the best augmentations for the problem. In the proposed algorithm, the training dataset is split into K folds, with each fold split into two sets, Dm and Da. The K sets of Dm are used to train neural networks, and the K sets of Da are used to find the best augmentation policies. On each Da, a Bayesian optimization algorithm selects B augmentation policies, which are validated on the neural networks trained with the Dm sets. For each fold, the top-N policies are selected and concatenated, generating a single set of augmentation policies. In the end, the network is trained on the entire training dataset plus the best augmentations and evaluated on the validation dataset.
- RandAugment: Practical Automated Data Augmentation with a Reduced Search Space
The proposed method also tries to remove the human from the loop of selecting the best augmentation techniques for network training. Similar methods make the training step extremely complex and time-consuming; this method simplifies and speeds up the process by simply selecting a random set of N augmentations from the list of all possible augmentations.
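A toy sketch of that random selection step: pick N transforms uniformly from a candidate list. The candidate list and the absence of per-transform magnitudes are simplifications, not the paper's exact setup.

```python
import random

def rand_augment(image, transforms, n=2):
    # Sample n transforms (with replacement) and apply them in sequence.
    for op in random.choices(transforms, k=n):
        image = op(image)
    return image

# Usage with any image-in/image-out callables, e.g.:
# augmented = rand_augment(img, [autocontrast, equalize, rotate30], n=2)
```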
- Augmentation for small object detection
- Link: Arxiv
- Authors: Mate Kisantal, Zbigniew Wojna, Jakub Murawski, Jacek Naruniec, Kyunghyun Cho
- Code: Github
In the proposed work, the authors showed that the detection and segmentation of small objects in images is still an open problem. In state-of-the-art networks based on anchor boxes, small objects match a limited number of anchors, which harms the detection process. To improve the network's ability to find small objects, a data augmentation is proposed: in images containing small objects, the authors copy each small object and paste many copies over the image. This augmentation lets the network generate more anchor boxes for such objects, which helps it detect them.
- InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting
- Link: Arxiv
- Authors: Hao-Shu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yong-Lu Li, Cewu Lu
- Code: Github
The authors proposed an augmentation technique called Random InstaBoost. In this approach, given an image and its respective segmentation mask, they cut the object from the scene and place it elsewhere on the image. The hole left by the object in its original location is closed by an inpainting technique.
- Few-Shot Unsupervised Image-to-Image Translation
- Link: Arxiv
- Authors: Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz
- Code: Github
The idea behind the proposed method is to perform image-to-image translation with few examples of a target class. To achieve this, a Generative Adversarial Network called FUNIT is proposed. FUNIT works in two stages: training and deployment. In training, the network is trained with several images from different classes, also called source images. In deployment, FUNIT receives a small set of images from a target class not seen in training, and the network is able to translate examples from the source classes to the target class.
-
- Link: ACM Digital Library
- Authors: Rui Ma, Pin Tao, Huiyun Tang
The authors evaluated seven popular data augmentations in a semantic segmentation problem. They evaluated color transform, flipping, projection transform, JPEG compression, cropping, local shift, and local copy.
-
- Link: IEEE Xplore
- Authors: Shuangting Liu, Jiaqi Zhang, Yuxin Chen, Yifan Liu, Zengchang Qin, Tao Wan
The authors proposed a data augmentation technique based on Generative Adversarial Networks (GANs). They trained a GAN to generate realistic images from segmentation masks. Then, they manually created segmentation masks containing the classes they wanted to augment and used the GAN to generate a corresponding image.
- PanDA: Panoptic Data Augmentation
- Link: Arxiv
- Authors: Yang Liu, Pietro Perona, Markus Meister
The authors proposed a data augmentation called PanDA for panoptic segmentation problems. In this augmentation, they use the panoptic labels to separate foreground objects from the background. The background is filled with noise to hide the regions where objects were removed. Then they apply operations like shifting and resizing to the foreground objects and place them back on the background image.
- Self-Ensembling with GAN-based Data Augmentation for Domain Adaptation in Semantic Segmentation
- Link: Arxiv
- Authors: Jaehoon Choi, Taekyung Kim, Changick Kim
The authors proposed a data augmentation called Target-Guided and Cycle-Free Data Augmentation (TGCF-DA). This augmentation is based on Generative Adversarial Networks (GANs) and generates realistic, labeled images from synthetic labeled images and realistic unlabeled images.
- Contrastive Learning for Unpaired Image-to-Image Translation
- Link: Arxiv
- Authors: Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu
- Code: Github
- Project: http://taesung.me/ContrastiveUnpairedTranslation/
The authors use a patch-based strategy to improve the image-to-image translation problem. They use a Generative Adversarial Network to translate images from a specific domain to another. Then they refine the translation by comparing patches from the input image with patches from the resulting image.
- AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
- Link: Arxiv
- Authors: Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan
- Code: Github
The authors proposed mixing different augmentation techniques. Instead of performing a series of augmentations in sequence, they use several augmentation chains in parallel, where each chain is a sequence of augmentation techniques. Each chain generates a different augmented image, and these are mixed through an elementwise convex combination, with each augmented image receiving a different weight. In the end, the resulting augmented image is blended with the original image.
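A NumPy sketch of this mixing scheme: k parallel chains are combined by a Dirichlet-weighted convex sum, and the result is blended with the original image using a Beta-sampled weight. Each chain is assumed to be a list of callables operating on float arrays; `alpha` is an illustrative default.

```python
import numpy as np

def augmix(image, chains, alpha=1.0):
    ws = np.random.dirichlet([alpha] * len(chains))  # per-chain weights
    m = np.random.beta(alpha, alpha)                 # final blend weight
    x = image.astype(np.float32)
    mix = np.zeros_like(x)
    for w, chain in zip(ws, chains):
        aug = x.copy()
        for op in chain:        # apply the chain's ops in sequence
            aug = op(aug)
        mix += w * aug          # elementwise convex combination
    # Blend the mixed result with the original image.
    return ((1 - m) * x + m * mix).astype(image.dtype)
```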
- GridMask Data Augmentation
The proposed augmentation generates a mask with a sequence of black squares uniformly distributed in a grid on top of the image. The regions of the image corresponding to the black squares in the mask are removed by setting their pixels to zero. Like other methods that remove regions from the image, this forces the network to learn the same concepts from different regions of the image, improves its ability to recognize occluded objects, and helps to reduce overfitting. The advantage over similar methods is that those use random positions and sizes for the removed region, which can produce examples where the entire object is removed or where no relevant information is removed at all.
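A simplified NumPy sketch of the grid-of-squares dropout described above; the full method also randomizes the grid offset and rotation, which this sketch omits, and `unit`/`ratio` are placeholder values.

```python
import numpy as np

def gridmask(image, unit=32, ratio=0.5):
    img = image.copy()
    h, w = img.shape[:2]
    d = int(unit * ratio)  # side length of each dropped square
    # Tile the squares at a fixed spacing and zero out the covered pixels.
    for y in range(0, h, unit):
        for x in range(0, w, unit):
            img[y:y + d, x:x + d] = 0
    return img
```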
- COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder
- Link: Arxiv
- Authors: Kuniaki Saito, Kate Saenko, Ming-Yu Liu
- Code: Coming soon
- Project: https://nvlabs.github.io/COCO-FUNIT/
Motivated by content losses when translating more complex images using the FUNIT network, the authors proposed some improvements to the original work. The idea is the same as in the original FUNIT: learn to translate images from a source class to a target class given a few examples of the target class. To mitigate the content loss in FUNIT's image-to-image translation results, the authors proposed an adaptation called the COntent-COnditioned style encoder (COCO) to replace the original style encoder of FUNIT. In the COCO encoder, both the style and the content of the images are used to encode the target class, which reduces the content loss problem.
- SuperMix: Supervising the Mixing Data Augmentation
- Link: Arxiv
- Authors: Ali Dabouei, Sobhan Soleymani, Fariborz Taherkhani, Nasser M. Nasrabadi
- Code: Github
The authors proposed a mixing augmentation based on saliency regions. In SuperMix, for a set of images to be mixed, a set of binary mixing masks is created. A trained model then optimizes these mixing masks so that the salient regions of the input images are preserved in the final mixed image.
- Learning Data Augmentation Strategies for Object Detection
- Link: Arxiv
- Authors: Barret Zoph, Ekin D. Cubuk, Golnaz Ghiasi, Tsung-Yi Lin, Jonathon Shlens, Quoc V. Le
- Code: Github
The authors studied the behavior of several augmentation techniques, usually applied to classification problems, on object detection problems. They also proposed a search method to find the best augmentation policies for an object detection problem. To achieve this goal, they defined an augmentation policy as a set of K sub-policies, with each sub-policy being a set of N image transformations, and trained a Recurrent Neural Network (RNN) to find the K sub-policies that best compose an augmentation policy.
- Data Augmentation Using GANs for Crop/Weed Segmentation in Precision Farming
- Link: Arxiv
- Authors: Mulham Fawakherji, Ciro Potena, Alberto Pretto, Domenico D. Bloisi, Daniele Nardi
The proposed work uses a Generative Adversarial Network to generate new instances for the crop/weed segmentation problem. The idea behind this method is to crop regions of the image close to the crop/weed and use the GAN to generate only a new instance of the object, instead of an entirely new image. Then they use another GAN to replace instances of crop/weed in real images with the ones generated in the previous step.
- ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning
A data augmentation for the semantic segmentation problem, called ClassMix, is proposed. The method takes two random unlabeled images A and B, and uses a neural network to segment both images, generating two segmentation masks Sa and Sb. From Sa, a binary mask is created by randomly selecting half of the classes in Sa. This binary mask is used to mix the images A and B and the segmentation masks Sa and Sb.
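A NumPy sketch of the mixing step described above, given two images and their predicted segmentation masks (H x W integer class maps); function names are illustrative.

```python
import numpy as np

def classmix(image_a, mask_a, image_b, mask_b):
    classes = np.unique(mask_a)
    k = max(1, len(classes) // 2)
    # Randomly select half of the classes present in A.
    chosen = np.random.choice(classes, size=k, replace=False)
    binary = np.isin(mask_a, chosen)  # H x W boolean mask
    # Paste A's selected classes onto B, for both image and mask.
    mixed_img = np.where(binary[..., None], image_a, image_b)
    mixed_mask = np.where(binary, mask_a, mask_b)
    return mixed_img, mixed_mask
```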
-
- Link: Arxiv
- Authors: Iñigo Azqueta-Gavaldon, Florian Fröhlich, Klaus Strobl, Rudolph Triebel
The authors applied CycleGAN to convert non-realistic images (renderings of 3D models) of surgical instruments into realistic images. These new realistic images are then used as augmentations to train a semantic segmentation network.