Stéphan Tulkens, Machine Learning Engineer at Slimmer AI, recently had his article, “Hyperspherical Alternatives to Softmax,” published by Towards Data Science.
In this article, Stéphan investigates alternatives to softmax outputs in neural networks. This is important because softmax classifiers can slow down dramatically as the number of classes increases, since the output layer needs one unit per class. While academic datasets rarely have that many classes, applied AI problems often do. An example is a classifier that links a specific paper to a specific author from a database of thousands of authors.
A promising direction that might alleviate some of the problems with softmax is the Hyperspherical Prototype Network, which has a constant output size regardless of the number of classes and is therefore hypothesized to work better when there are many classes.
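To make the "constant output size" idea concrete, here is a minimal sketch in plain Python. It is not the code from the article. It assumes each class is represented by a fixed unit-length prototype vector, and that a prediction is the class whose prototype has the highest cosine similarity to the network's output. The function names, the random placement of prototypes, and the layer sizes are all illustrative assumptions. In the hyperspherical prototype literature the prototypes are typically spread out on the sphere more carefully than this.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def random_prototypes(num_classes, dim, seed=0):
    """One fixed unit vector per class (assumption: random placement)."""
    rng = random.Random(seed)
    return [
        normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(num_classes)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def predict(output, prototypes):
    """Return the index of the prototype closest in angle to the output."""
    sims = [cosine(output, p) for p in prototypes]
    return max(range(len(prototypes)), key=sims.__getitem__)


def softmax_head_params(hidden, num_classes):
    """Weights plus biases of a softmax layer: grows with the class count."""
    return hidden * num_classes + num_classes


def prototype_head_params(hidden, dim):
    """Weights plus biases of a prototype head: independent of class count."""
    return hidden * dim + dim


if __name__ == "__main__":
    print(softmax_head_params(256, 1000))   # 257000
    print(prototype_head_params(256, 16))   # 4112
```

In this sketch the trainable output layer stays at 16 units whether there are one thousand or ten thousand classes. Only the table of fixed prototypes grows. Note that prediction still compares the output against every prototype, so the saving here is in the size of the trained layer.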
Investigating this further is important for Slimmer AI, as we often deal with problems that have many different classes.
The results may surprise you. Click here to read more.
Follow us on LinkedIn and Twitter for more stories like this.