What is Optimal Transport Theory?

This week, I discovered a podcast with a wealth of information on the Artificial Intelligence industry and current developments within it. Over a series of short posts, I hope to share what I've learned from the podcast, simplified into concepts that anyone can understand. I also hope that sharing this information sheds light on current research in the industry and on why that research matters.

Making sure everyone’s properly stocked up

This week's topic is Optimal Transport theory. The episode featured Marco Cuturi, a leading researcher in the AI space at L'École Nationale de la Statistique et de l'Administration Économique (ENSAE) in Paris.

The theoretical foundations of what we call "Optimal Transport theory" today actually date back to the 18th century, when Gaspard Monge first posed the problem in France. The form most relevant here, however, came from the Soviet mathematician Leonid Kantorovich, who in the late 1930s and 1940s did the foundational modern work on what was then called "transportation theory."

Essentially, over the course of his work, Kantorovich sought a mathematical answer to the question of how to allocate resources between locations in the best possible way. In the process, he developed the ideas behind what we now call linear programming.

How linear programming fits into the AI space deserves a piece of its own; listing all of its uses here would take us too far from the topic of Optimal Transport theory.

For now, suffice it to say that linear programming has primarily been applied in business operations: it gives us a mathematical way to lay out an objective and a set of constraints, and then to find the plan with the highest possible profit or the lowest possible cost.
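To make that concrete, here is a minimal sketch of a linear program solved with SciPy. The products, resource limits, and profit figures are invented for illustration; the point is simply how an objective and constraints are written down and handed to a solver.

```python
# A toy linear program (all numbers invented): a factory makes two products
# and wants the production plan with the highest profit, given limited
# labour hours and material.
from scipy.optimize import linprog

# Profit per unit of product A and product B. linprog minimizes, so we negate.
profit = [-40, -30]

# Constraints: 2A + 1B <= 100 labour hours, 1A + 2B <= 80 kg of material.
A_ub = [[2, 1],
        [1, 2]]
b_ub = [100, 80]

result = linprog(c=profit, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("units of A and B to make:", result.x)   # the optimal plan
print("maximum profit:", -result.fun)          # undo the negation
```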

In Artificial Intelligence, linear programming has been applied in much the same spirit: as a way to work out the settings under which a system behaves most efficiently, or optimally. In a sense, this means that linear programming lets us compute the values we need for an AI system to work "correctly," whether the task is identifying objects in pictures or something else entirely.

Putting linear programming together with an updated version of the original theory of Optimal Transport gives us its current applications in Artificial Intelligence. Once this is understood alongside how Marco Cuturi uses his "Optimal Transport loss function," it becomes clearer why both are important to the growth of Artificial Intelligence.

Optimal Prediction Algorithms

According to Cuturi, the most popular application of Optimal Transport theory in AI is using it with known data, that is, with supervised learning, to predict how likely certain outcomes are. He starts from a well-known operations problem: an even distribution of needed resources between sites, or even factories, is not always the best solution. What if one site runs into a major, unforeseen difficulty and suddenly needs more? What if some of the resources are lost in transit?
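As a quick illustration of that operations problem, here is a toy transportation problem solved as a linear program with SciPy. The warehouses, sites, costs, and quantities are all invented; the point is that the cheapest plan is not an even split.

```python
# Two warehouses ship to two sites; shipping costs differ by route.
# All quantities and costs are made-up illustration numbers.
import numpy as np
from scipy.optimize import linprog

supply = np.array([60.0, 40.0])   # units available at warehouse 0 and 1
demand = np.array([50.0, 50.0])   # units needed at site 0 and 1
cost = np.array([[1.0, 4.0],      # cost[i, j] = cost per unit, warehouse i -> site j
                 [3.0, 1.0]])

# Decision variables: the four flows x[i, j], flattened row by row.
c = cost.ravel()

# Each warehouse ships out exactly its supply; each site receives exactly its demand.
A_eq = [[1, 1, 0, 0],   # flow out of warehouse 0
        [0, 0, 1, 1],   # flow out of warehouse 1
        [1, 0, 1, 0],   # flow into site 0
        [0, 1, 0, 1]]   # flow into site 1
b_eq = np.concatenate([supply, demand])

plan = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 4)
# The optimal plan is skewed: warehouse 0 serves almost all of site 0,
# warehouse 1 serves only site 1. An even split would cost more.
print(plan.x.reshape(2, 2))
```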

Cuturi states clearly that, considered mathematically, a more skewed distribution can turn out to be the optimal one. In other words, you can't always split everything into even percentages. To illustrate this, Cuturi used the example of AI systems that try to find or map similarities between words.

Without factoring Optimal Transport theory into how the neural network computes this problem, the ability to compare results is lost. Specifically, the system can suggest words that may be like other words, but it has little or no sense of how similar or different those words actually are. When Optimal Transport theory is applied to such a computation, however, it becomes possible to express synonyms.

To bring this point home, Cuturi notes that existing AI systems usually rely on what is called cross-entropy in their calculations, which produces a kind of ranking of possible word matches but, again, gives no real sense of which matches are the best ones or which words are synonyms. Cuturi has applied Optimal Transport theory directly to such problems as a loss function, which he calls an "Optimal Transport loss function." With it, he can, for example, place cat and kitty together as synonyms, mathematically. The specific math is quite involved, but it draws on linear programming and on current research in Optimal Transport theory at the same time.
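The sketch below is my own rough illustration of that idea, not Cuturi's actual implementation. It uses made-up two-dimensional "embeddings" for three words and a small entropy-regularized optimal transport computation (Sinkhorn iterations, the technique Cuturi is best known for) to show why predicting "kitty" when the answer is "cat" should cost less than predicting "car," even though cross-entropy punishes both misses equally.

```python
# A rough illustration (invented embeddings, not Cuturi's code) of an
# optimal-transport loss over a tiny vocabulary, using Sinkhorn iterations.
import numpy as np

vocab = ["cat", "kitty", "car"]
# Hypothetical 2-D embeddings: "cat" and "kitty" sit close together.
emb = np.array([[0.0, 0.0],
                [0.1, 0.0],
                [1.0, 1.0]])

# Ground cost between words = squared distance between their embeddings.
C = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(-1)

def sinkhorn_cost(p, q, C, reg=0.1, iters=200):
    """Entropy-regularized OT cost between distributions p and q over the vocab."""
    K = np.exp(-C / reg)
    u = np.ones_like(p)
    for _ in range(iters):
        v = q / (K.T @ u)
        u = p / (K @ v)
    plan = u[:, None] * K * v[None, :]   # how much mass moves from word i to word j
    return (plan * C).sum()

target = np.array([1.0, 0.0, 0.0])            # the true word is "cat"
pred_kitty = np.array([0.05, 0.90, 0.05])     # model bets on "kitty"
pred_car = np.array([0.05, 0.05, 0.90])       # model bets on "car"

# Cross-entropy sees both predictions as equally wrong (same probability on "cat")...
print(-np.log(pred_kitty[0]), -np.log(pred_car[0]))
# ...but the OT cost knows "kitty" is a near miss and "car" is a bad one.
print(sinkhorn_cost(pred_kitty, target, C))   # small
print(sinkhorn_cost(pred_car, target, C))     # much larger
```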

Further Applications: Optimal Transport Loss Functions

In closing the interview, Cuturi also mentioned the growing applications of his Optimal Transport loss functions to AI interpretations of shapes and graphics. A particularly interesting direction in this research is identifying what shapes are in their in-between states. For example, when a circle changes into a triangle, what value can we give it partway through the transformation? How can an AI identify this intermediate state? Cuturi says we are close to being able to do this, and it is all thanks to Optimal Transport theory and its new iteration, the Optimal Transport loss function. All in all, it will be interesting to see what else comes from this research in the future, but if you are interested in learning more, look no further than the links below.
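To give a feel for the shape example, here is a small sketch of displacement interpolation between a circle and a triangle. It is not Cuturi's pipeline; it assumes both shapes are sampled with the same number of equally weighted points, in which case the optimal transport matching reduces to an assignment problem that SciPy can solve exactly.

```python
# Sketch: the "in-between" shape when a circle morphs into a triangle.
import numpy as np
from scipy.optimize import linear_sum_assignment

n = 60
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Sample the triangle's outline by walking its three edges.
corners = np.array([[0.0, 1.0], [-1.0, -0.8], [1.0, -0.8], [0.0, 1.0]])
t = np.linspace(0, 3, n, endpoint=False)
edge = t.astype(int)
frac = (t - edge)[:, None]
triangle = corners[edge] * (1 - frac) + corners[edge + 1] * frac

# Optimal matching between the two point clouds under a squared-distance cost.
cost_matrix = ((circle[:, None, :] - triangle[None, :, :]) ** 2).sum(-1)
row, col = linear_sum_assignment(cost_matrix)

def intermediate(s):
    """The shape a fraction s of the way from the circle to the triangle."""
    return (1 - s) * circle[row] + s * triangle[col]

halfway = intermediate(0.5)   # neither circle nor triangle, but a shape in between
print(halfway[:5])
```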

References:

Marco Cuturi’s Further Research in OTT: http://marcocuturi.net/SI.html

Primary Source: https://twimlai.com/twiml-talk-131-optimal-transport-and-machine-learning-with-marco-cuturi/

Transportation Theory: https://en.wikipedia.org/wiki/Transportation_theory_(mathematics)
