New Method Drives Machine Learning Algorithms to Focus on Additional Data While Learning a Task

Dec, 2021 - By WMR

A shortcut solution arises in machine learning when a model relies on a single simple feature of a dataset to make a decision rather than learning the underlying structure of the data, which can lead to incorrect predictions.

A new study from MIT researchers investigates shortcuts in a common machine-learning method and proposes a way to avoid them by encouraging the model to use more of the data in its decision-making. By removing the simpler features that the model is focusing on, the researchers compel it to attend to more complex features of the data that it had previously ignored. Then, by having the model solve the same task twice, once with the simpler features and once with the complex features it has learned to identify, they reduce the tendency toward shortcut solutions and improve the model's performance.

The team concentrated on contrastive learning, a powerful type of self-supervised machine learning. Self-supervised learning does not require labeled training data, so it can be used effectively for a wider range of data. A self-supervised learning model learns useful data representations that are then used as inputs for various downstream tasks, such as image classification. However, if the model takes shortcuts and fails to capture key information, these tasks cannot use that information either.

In contrastive learning models, an encoder is trained to distinguish between pairs of similar inputs and pairs of dissimilar inputs. This process encodes rich and complex data, such as images, in a form that the contrastive learning model can interpret. The researchers tested contrastive learning encoders on a variety of images and found that they are susceptible to shortcut solutions during training. To determine which pairs of inputs are similar and which are dissimilar, encoders tend to focus on the simplest features of an image. Ideally, the encoder should focus on all of the useful features of the data when making a decision.
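
To make the idea of telling similar pairs from dissimilar pairs concrete, the sketch below computes a contrastive loss for a single input. It is only an illustration, not the researchers' code: the article does not say which loss the study used, so a widely used InfoNCE-style loss is assumed here, and the names cosine_similarity and contrastive_loss, the temperature value, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor embedding (assumed loss, see lead-in).

    The loss is small when the anchor is more similar to its positive
    (a similar input) than to every negative (a dissimilar input).
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    neg_logits = [cosine_similarity(anchor, n) / temperature for n in negatives]
    logits = [pos_logit] + neg_logits
    # Log-sum-exp over all logits, subtracting the max for numerical stability.
    largest = max(logits)
    log_denominator = largest + math.log(sum(math.exp(x - largest) for x in logits))
    return log_denominator - pos_logit


# Toy embeddings standing in for encoder outputs (hypothetical values).
anchor = [1.0, 0.0]
well_separated = contrastive_loss(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
poorly_separated = contrastive_loss(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(well_separated < poorly_separated)
```

In this toy setup the loss is lower when the similar input sits close to the anchor in embedding space. An encoder can drive such a loss down using whichever feature separates the pairs most easily, which is how a shortcut can emerge during training.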

This research could help improve the effectiveness of machine learning algorithms used to detect disease in clinical images. In this setting, shortcut solutions may lead to incorrect diagnoses and have harmful consequences for patients.