Project Waifu – CNN

The speaker verification system now uses a convolutional neural network (CNN) rather than the ANN described previously. The new algorithm is a massive improvement in performance, both in accuracy and in resource usage.

The Performance of the CNN

Without much hyperparameter tuning, the CNN reaches a cost below 0.1% within a few hundred epochs, whereas the old algorithm needs more than a thousand epochs to reach similar performance. The CNN also runs much faster than the ANN, although this may be due to its CUDA implementation.
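
One way to make the epoch comparison concrete is to record the cost after every epoch and look up the first epoch at which it drops below the target. The sketch below is only an illustration: the helper name `epochs_to_reach` is hypothetical, the threshold assumes that 0.1% corresponds to a cost of 0.001, and the cost values are invented toy numbers, not measurements from this project.

```python
from typing import Optional, Sequence


def epochs_to_reach(costs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based epoch at which the cost first falls below
    the threshold, or None if it never does."""
    for epoch, cost in enumerate(costs, start=1):
        if cost < threshold:
            return epoch
    return None


# Toy cost curve, invented for illustration only (not real training data).
toy_costs = [0.5, 0.2, 0.05, 0.0008, 0.0005]

# Assumption: "0.1%" is read as a cost of 0.001.
print(epochs_to_reach(toy_costs, threshold=0.001))  # prints 4
```

Running the same helper on the cost histories of both networks would give the two epoch counts being compared, provided both histories use the same cost function.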
