GPU acceleration for data-intensive processing
Computing is pushing the limits of Moore’s Law, which means that it is becoming increasingly hard for the industry to double the performance of hardware every two years while keeping hardware costs more or less static.
More complex applications in the data analytics space have driven the rise of radically different approaches to database architecture, which allow systems to scale horizontally using low-cost commodity servers.
Sophisticated software engineering techniques have helped limit the impact of this performance plateau. But CPUs have limitations that make them poorly suited to workloads such as graphics manipulation, which has led to the rise of alternative chipsets.
There is also growing awareness today of hardware innovation based on GPUs, FPGAs (field-programmable gate arrays) and ASICs (application-specific integrated circuits), which promise to accelerate computationally intensive applications.
Due to the higher levels of expertise and specialisation needed, as well as the overall costs involved in FPGAs and ASICs, it is the highly scalable graphics processing unit (GPU) that is being widely adopted as the next step up from x86-based CPU architectures. While GPU-based systems were previously used in gaming, supercomputers and high-end engineering workstations, they have become more mainstream with the advent of machine learning and artificial intelligence.
Just as in graphics manipulation, a GPU is optimised to process a stream of data exceedingly quickly. It is able to do this because its architecture comprises hundreds of GPU cores, each of which runs a set of instructions over a subset of the data. Because the cores run in parallel, the same set of instructions can be applied across the entire dataset in an extremely short period of time. This ability to run a set of instructions simultaneously across a large dataset is what makes the GPU the hardware architecture of choice for data-intensive use cases like machine learning.
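The idea of one set of instructions applied to many pieces of data can be sketched in a few lines of Python. This is only a CPU-side simulation of the programming model, not GPU code: the names `scale_kernel` and `launch` are hypothetical, and the loop runs the "threads" one after another, where a GPU would run them concurrently across its cores.

```python
def scale_kernel(thread_id, data, out, factor):
    """The instructions every simulated core runs, each on its own element."""
    # Guard: it is fine to launch more threads than there are elements.
    if thread_id < len(data):
        out[thread_id] = data[thread_id] * factor


def launch(kernel, n_threads, *args):
    """Hypothetical launcher. A GPU would execute these invocations in
    parallel; this sketch simply runs them sequentially on the CPU."""
    for thread_id in range(n_threads):
        kernel(thread_id, *args)


data = [1.0, 2.0, 3.0, 4.0]
out = [0.0] * len(data)
launch(scale_kernel, 8, data, out, 2.0)  # 8 simulated threads, 4 elements
print(out)  # [2.0, 4.0, 6.0, 8.0]
```

Because each invocation touches only its own element, the order in which the invocations run does not matter, which is the property that lets real hardware execute them simultaneously.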
Creating these highly parallel applications is easier nowadays thanks to parallel computing software libraries and frameworks such as Nvidia’s cuDNN, a library of GPU-accelerated building blocks for the deep neural networks used in machine learning applications.