Fully parallel implementation of an SVM with SGD-based training on FPGA
FPGA, SGD, SVM, Machine Learning
Artificial intelligence, machine learning, and deep learning have proven to be powerful techniques for solving problems in natural language processing, computer vision, and other domains. However, their computational and statistical performance depends on several factors: the algorithm used for training, the computing platform employed, and even the numerical precision used to represent the data. One of the main machine learning algorithms is the Support Vector Machine (SVM), most commonly trained through Sequential Minimal Optimization (SMO), which limits computational performance. As an alternative, algorithms based on Stochastic Gradient Descent (SGD) offer better scalability and are good options for training machine learning models. Even with SGD-based algorithms, however, training and inference times can become long depending on the computing platform employed. Thus, accelerators based on Field Programmable Gate Arrays (FPGAs) can be used to improve performance in terms of processing speed. This work proposes a fully parallel FPGA implementation of an SVM with SGD-based training. Results are presented for hardware occupation, processing speed (throughput), and training and inference accuracy across various quantization formats.
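For reference, the training scheme the abstract describes, a linear SVM fit by SGD on the regularized hinge loss, can be sketched in software as follows. This is a minimal floating-point illustration, not the paper's fixed-point parallel FPGA design; the hyperparameters (`lr`, `lam`, `epochs`) and function names are illustrative assumptions.

```python
import numpy as np

def svm_sgd_train(X, y, lr=0.01, lam=0.01, epochs=100, seed=0):
    """Train a linear SVM by minimizing the L2-regularized hinge loss with SGD.

    X: (n_samples, n_features) array; y: labels in {-1, +1}.
    Hyperparameters are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        # Visit samples in a fresh random order each epoch (stochastic updates)
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:
                # Sample violates the margin: hinge gradient plus regularizer
                w -= lr * (lam * w - y[i] * X[i])
                b += lr * y[i]
            else:
                # Sample is correctly classified: only the regularizer acts
                w -= lr * lam * w
    return w, b

def svm_predict(X, w, b):
    """Return {-1, +1} predictions for the learned hyperplane."""
    return np.sign(X @ w + b)
```

In a hardware implementation, the per-sample update above is the unit that would be parallelized and quantized; the if/else on the margin maps naturally to a comparator selecting between two update paths.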