Training via quantum superposition circumventing local minima and vanishing gradient of sinusoidal neural network
- Publication Year : 2024
Abstract
- Deep neural networks have been very successful in applications ranging from computer vision and natural language processing to strategy optimization in games. Recently, neural networks with sinusoidal activation functions (SinNNs) were found to be ideally suited to representing complex natural signals and their fine spatial and temporal details, which makes them effective representations of images, sound, and video, and good solvers of differential equations. However, training a SinNN via gradient descent often ends in bad local minima, posing a significant challenge when optimizing its weights. Furthermore, when the weights are discretized for better memory and inference efficiency on small devices, we find that a vanishing gradient problem appears in the resulting discrete SinNN (DSinNN). Brute force search provides an alternative way to find the best weights for a DSinNN, but it is intractable for large numbers of parameters. Here we provide a qualitatively different training method: an algorithm for quantum training of DSinNNs. The quantum training evolves an initially uniform superposition over weight values into one that is guaranteed to peak at the best weights. We demonstrate the algorithm on toy examples and show that it outperforms gradient descent in minimizing the loss function and outperforms brute force search in the time required.
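- The following is a minimal classical sketch of the three ideas in the abstract, not the paper's implementation: a toy one-hidden-layer SinNN trained by gradient descent, exhaustive search over a DSinNN with weights restricted to {-1, +1}, and a simulated superposition over weight strings that is damped toward low-loss configurations. The network size H, the target signal, the binary weight set, and the e^(-beta * loss) damping are all illustrative assumptions.

```python
# Illustrative sketch only (assumptions flagged below), not the paper's method.
import itertools
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 64)       # toy 1-D input grid (assumption)
target = np.sin(3 * x)                   # toy signal to represent (assumption)

def forward(w, v, x):
    """One-hidden-layer SinNN: y = v . sin(w x)."""
    return v @ np.sin(np.outer(w, x))

def loss(w, v):
    return np.mean((forward(w, v, x) - target) ** 2)

# --- Continuous SinNN trained by plain gradient descent ---
H, lr = 4, 0.01                          # hidden width and step size (assumptions)
w, v = rng.normal(size=H), rng.normal(size=H)
for _ in range(2000):
    pre = np.outer(w, x)                 # (H, N) pre-activations
    err = forward(w, v, x) - target      # (N,) residual
    grad_v = 2 * np.sin(pre) @ err / x.size
    grad_w = 2 * (v[:, None] * np.cos(pre) * x[None, :]) @ err / x.size
    w -= lr * grad_w
    v -= lr * grad_v
print("gradient-descent loss:", loss(w, v))   # may be stuck in a bad local minimum

# --- Brute force over a DSinNN with weights in {-1, +1}: 2^(2H) candidates ---
best = (np.inf, None)
for bits in itertools.product([-1.0, 1.0], repeat=2 * H):
    wd, vd = np.array(bits[:H]), np.array(bits[H:])
    best = min(best, (loss(wd, vd), bits), key=lambda t: t[0])
print("brute-force loss:", best[0])            # exact, but cost doubles per weight

# --- Simulated superposition peaking on low-loss weight strings ---
# One generic mechanism, NOT necessarily the paper's circuit: weight each
# basis state |bits> by e^(-beta * loss), then renormalize.
states = list(itertools.product([-1.0, 1.0], repeat=2 * H))
amps = np.ones(len(states)) / np.sqrt(len(states))   # uniform superposition
losses = np.array([loss(np.array(s[:H]), np.array(s[H:])) for s in states])
amps *= np.exp(-5.0 * losses)                        # imaginary-time-style damping
amps /= np.linalg.norm(amps)
print("most probable string:", states[int(np.argmax(amps ** 2))])
```

- The brute-force loop makes the scaling argument concrete: the candidate count 2^(2H) doubles with every discrete weight, which is why the paper turns to a quantum procedure that manipulates all weight configurations in superposition rather than enumerating them one by one.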
- Subjects : Quantum Physics
Details
- Database : arXiv
- Publication Type : Report
- Accession Number : edsarx.2410.22016
- Document Type : Working Paper