Authored By: Eungyeup Kim, Jihyeon Lee, Jaegul Choo

BiaSwap: Removing dataset bias with bias-tailored swapping augmentation

Jan 24, 2024

1. Introduction


Modern deep neural networks achieve impressive results across a variety of computer vision tasks, such as object detection and image classification. However, these models often identify and rely on spurious correlations present in the data to make accurate predictions. These so-called 'dataset biases' can result in a model that fails to generalize to real-world, unbiased data. A common example is a classifier trained to identify camels: if the training data predominantly contains images of camels in the desert, the trained model may incorrectly associate the desert environment with the presence of a camel.

Some existing approaches for mitigating dataset bias pre-define the type of bias to prevent the model from learning it, but this proves impractical for large, real-world datasets where identifying specific bias types is prohibitively difficult. A more flexible approach to learning debiased representations is therefore necessary.

2. BiaSwap: A Solution for Bias


The BiaSwap technique introduces a novel method for adapting to and "unlearning" bias in a dataset. It assumes that biases correspond to certain "easy-to-learn" attributes of a dataset. This approach sorts all images in the dataset based on whether a biased classifier is likely to exploit them as a shortcut. From here, the images are divided into "bias-guiding" and "bias-contrary" samples.
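The sorting step can be sketched in code. This is a minimal sketch that assumes the biased classifier's per-sample loss is the ranking signal (low loss suggesting the shortcut worked); the paper's actual ranking criterion may differ, and the function name `split_by_bias` and the `contrary_frac` knob are hypothetical:

```python
import torch
import torch.nn.functional as F

def split_by_bias(model, images, labels, contrary_frac=0.2):
    """Rank samples by a biased classifier's per-sample loss.

    Low-loss samples are treated as bias-guiding (the shortcut works for
    them); high-loss samples as bias-contrary. `contrary_frac` controls
    what fraction is labeled bias-contrary.
    """
    model.eval()
    with torch.no_grad():
        losses = F.cross_entropy(model(images), labels, reduction="none")
    order = torch.argsort(losses)                  # ascending loss
    k = int(len(images) * (1.0 - contrary_frac))
    return order[:k], order[k:]                    # (bias-guiding, bias-contrary)
```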

Image translation models are then used to transfer primarily the bias attributes learned by the classifier. Given a pair of bias-guiding and bias-contrary images, BiaSwap generates a new 'bias-swapped' image that adopts the bias attributes of the bias-contrary image while preserving the bias-irrelevant attributes of the bias-guiding image. Training on this augmentation debiases the model, leading to a marked performance improvement on both bias-guiding and bias-contrary samples.
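The swap itself can be illustrated with a hypothetical interface in the style of a swapping autoencoder: `encode` is assumed to split an image into a bias-irrelevant (structure) code and a bias-carrying (texture) code, and `decode` recombines them. The names here are illustrative, not the paper's API:

```python
def bias_swap(encode, decode, guiding_img, contrary_img):
    """Compose a bias-swapped image from a (guiding, contrary) pair.

    Assumes `encode(img)` returns a (structure, texture) pair, where
    texture carries the bias attribute, and `decode` inverts it.
    """
    structure_g, _ = encode(guiding_img)     # keep bias-irrelevant content
    _, texture_c = encode(contrary_img)      # take the bias attribute
    return decode(structure_g, texture_c)
```

Augmenting training batches with such images breaks the usual co-occurrence between a class label and its bias attribute.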

3. Addressing Existing Shortcomings


While some existing methods have made significant strides in mitigating bias, they generally require explicit labels and prior knowledge of the bias type, a requirement that is both costly and unrealistic, particularly when the bias attributes vary across the dataset. Such assumptions restrict flexibility and limit the potential for broad application. The unsupervised and adaptive nature of BiaSwap eliminates these issues, allowing for efficient training even when the bias type or attribute is unknown.

4. Applying BiaSwap: A Practical Solution for Bias in Deep Learning


The BiaSwap technique provides a practical solution for many of the problems associated with dataset bias in deep learning models. By identifying and removing dataset bias without requiring explicit bias supervision, BiaSwap adapts to different types of bias, improving model performance and our understanding of bias within datasets.
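In practice, the augmented images are simply folded into the training loop. A minimal sketch of one such step, assuming the bias-swapped images keep the labels of their bias-guiding sources (`debiased_step` is a hypothetical helper; the paper's exact loss weighting may differ):

```python
import torch
import torch.nn.functional as F

def debiased_step(model, optimizer, batch, swapped_batch):
    """One optimization step on original plus bias-swapped images."""
    (x, y), (x_sw, y_sw) = batch, swapped_batch
    inputs = torch.cat([x, x_sw])            # mix originals and swaps
    targets = torch.cat([y, y_sw])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```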

5. Conclusion


Through the novel integration of image translation models and automated sorting of bias-guiding and bias-contrary samples, BiaSwap offers a dynamic, unsupervised, and efficient approach towards removing dataset bias in deep learning models. By generating bias-swapped images, BiaSwap allows a model to effectively account for, unlearn, and correct bias in a dataset, resulting in improved and unbiased results.