
Adaptable Computing Accelerators

A changing infrastructure for intelligent computing

Some writers claim we are entering a new era of computing, driven in particular by the growing demands of machine learning applications. These workloads can require specialised forms of computation that are best served by somewhat different kinds of hardware. This article is a short summary of Adam Scraba's piece Welcome to the Era of Real-Time AI.

Real-time services are growing alongside intelligent personal assistants and other 'natural language' applications. This wide variety of real-time services requires new computational approaches to run efficiently.

  • “Cloud giants like Amazon Web Services (AWS), Microsoft, Alibaba and SK Telecom are developing the computing infrastructure to deliver those services.”

Data centre operations, and the IT architectures behind them, have to address varied workloads in two ways.

  1. Service providers need an infrastructure platform offering differentiation and performance: high throughput, low latency and flexible software.
  2. They need a hardware stack that can handle algorithms ranging from recurrent neural networks and long short-term memory (LSTM) networks to convolutional neural networks and query acceleration based on the Apache Spark cluster computing framework.

Therefore, service providers are building their own hardware and software stacks.

The AWS Advanced Query Accelerator (AQUA) is one such data analytics platform.

SK Telecom recently developed AI-enabled speech and video analytics on a custom software and programmable hardware stack.

For these reasons, a data centre focused on adaptable acceleration of computing, storage and networking is emerging.

High performance computing (HPC) can be a path to solving some of the world’s most complex problems.

A consortium of some 20,000 scientists at the European Laboratory for Particle Physics (CERN) is attempting to reconstruct the origin of the universe.

They push the limits of technology to do so.

The Large Hadron Collider is the largest particle accelerator in the world.

  • “The 27-kilometer ring is composed of superconducting magnets that accelerate particles to previously unprecedented energy levels. Each proton traverses the ring 11,000 times per second — approaching the speed of light. At four different points on the ring — every 25 nanoseconds — protons collide. The conditions of the collision are captured by particle detectors. This trigger system is implemented in two layers — the first trigger requiring a fixed, extremely low-latency AI inference capability of about three microseconds per event.”

100 meters underground, a network of FPGAs runs algorithms designed to filter the generated data instantaneously and to identify novel particle substructures as evidence of dark matter and other physical phenomena.
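The layered filtering described above can be sketched in miniature: a fast, simple first-stage cut discards most events, and a slower, more selective second stage examines only the survivors. The event model, thresholds and function names below are illustrative assumptions, not CERN's actual trigger logic.

```python
import random

random.seed(42)

def level1_trigger(events, threshold=50.0):
    """Fast, fixed-latency first cut: keep only high-energy events."""
    return [e for e in events if e["energy"] > threshold]

def level2_trigger(events, min_tracks=3):
    """Slower, more selective stage applied only to the survivors."""
    return [e for e in events if e["tracks"] >= min_tracks]

# Toy events standing in for detector readouts (values are arbitrary).
events = [{"energy": random.uniform(0, 100),
           "tracks": random.randint(0, 8)} for _ in range(10_000)]

survivors = level2_trigger(level1_trigger(events))
print(f"kept {len(survivors)} of {len(events)} events")
```

The point of the layering is that the cheap first stage bounds the latency budget: the expensive analysis only ever sees the small fraction of events that pass the initial cut.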

“The focus is shifting from computing horsepower to processing data through computational storage.”

Computing can be moved closer to the data. Integrating data analytics with storage significantly reduces system-level data bottlenecks and increases parallelism while reducing overall power requirements.
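A minimal sketch of why near-data processing reduces bottlenecks: filtering where the data lives means only the matches cross the bus, instead of every record. The record format, sizes and function names here are hypothetical, chosen purely to make the data-movement difference visible.

```python
# Assumed serialized size per record (illustrative).
RECORD_BYTES = 64

records = [{"id": i, "value": i % 97} for i in range(100_000)]

def host_side_filter(records):
    # Conventional path: move every record to the host, then filter on the CPU.
    moved = len(records) * RECORD_BYTES
    kept = [r for r in records if r["value"] == 0]
    return kept, moved

def near_data_filter(records):
    # Computational-storage path: filter at the device, move only the matches.
    kept = [r for r in records if r["value"] == 0]
    moved = len(kept) * RECORD_BYTES
    return kept, moved

kept_a, moved_a = host_side_filter(records)
kept_b, moved_b = near_data_filter(records)
assert kept_a == kept_b  # same answer either way; only the bytes moved differ
print(f"host path moved {moved_a:,} bytes; near-data path moved {moved_b:,}")
```

With a selective filter, the bytes moved shrink by orders of magnitude, which is the system-level saving products like SmartSSD aim for.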

This approach has attracted vendors such as IBM and Micron Technology, who have developed ‘accelerated storage’ and ‘computational storage’ products where processing takes place near the data.

Samsung Electronics has launched SmartSSD to enable high-performance accelerated computing closer to flash storage while overcoming CPU and memory limitations.

When virtual environments scale beyond a single server, they must employ sophisticated overlay networks.

Overlay networks are virtualized systems that are dynamically created and maintained using packet encapsulation. Supervising this encapsulation places a burden on the OS or virtualization kernel. Combined with traditional networking tasks, it can consume nearly 30 percent of a server’s raw CPU cycles.
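To make the encapsulation idea concrete, here is a sketch using the VXLAN header layout from RFC 7348, a common overlay format: an 8-byte header carrying a 24-bit virtual network identifier (VNI) is prepended to each inner Ethernet frame. The outer UDP/IP headers are omitted for brevity, and the frame contents are placeholders.

```python
import struct

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend a VXLAN header: flags byte 0x08 (valid-VNI flag),
    3 reserved bytes, 24-bit VNI, 1 reserved byte."""
    assert 0 <= vni < 2**24
    header = struct.pack("!B3xI", 0x08, vni << 8)
    return header + inner_frame

def vxlan_decapsulate(packet: bytes):
    """Strip the VXLAN header, returning (vni, inner_frame)."""
    flags, vni_field = struct.unpack("!B3xI", packet[:8])
    assert flags & 0x08, "VNI flag not set"
    return vni_field >> 8, packet[8:]

frame = b"\x00" * 14  # placeholder inner Ethernet frame
packet = vxlan_encapsulate(frame, vni=5001)
vni, inner = vxlan_decapsulate(packet)
print(vni, len(packet))  # → 5001 22
```

Every packet pays this pack/unpack cost, which is exactly the per-packet work a SmartNIC can take off the host CPU.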

One suggested remedy is FPGA-based SmartNICs (smart network interface cards). A SmartNIC is a network adapter that offloads processing tasks the system CPU would normally handle. In particular, this can help with safety and security.

“…throwing more CPU-based servers at the problem simply won’t deliver the required performance.”

As computing demand changes and current infrastructure remains structured for other kinds of computation, we could see a rise in data centres designed specifically for real-time services.

This is #500daysofAI and you are reading article 355. I am writing one new article about or related to artificial intelligence every day for 500 days. My focus for day 300–400 is about AI, hardware and the climate crisis.


