Great Expectations

Qualcomm prepares to upset the status quo in AI inferencing

It’s a long time since spring 2019, when Qualcomm first unveiled its Cloud AI 100 silicon, but on 16 September 2020, the company announced that the chip is now available in samples to select customers worldwide. This alone isn’t huge news, nor does it guarantee commercial design wins. Nonetheless, some of the recent disclosures from Qualcomm underline its confidence in its new artificial intelligence (AI) accelerator.

Cloud AI 100 is an accelerator designed for AI inference in a range of segments including safety, retail, manufacturing and machine translation. Qualcomm is also offering a development kit, paired with its Snapdragon 865 chipset and Snapdragon X55 5G modem-RF system, designed for AI and video processing to enable early evaluation.

Qualcomm is adapting its existing partnerships and its capability in low-power signal processing to address the crucial needs of reliability, low latency and low power. This maximizes its existing investment in silicon and software spanning frameworks, tools and run-times, and takes aim at the AI opportunity, from data centres down to a host of different scenarios at the network edge. Regardless of the use case, optimization for low power and low latency is a consistent requirement. Qualcomm’s track record in power-constrained devices and its investment in software and silicon to accelerate AI inferencing on mobile devices give it a good starting point.

If power and performance stack up, Qualcomm faces a relatively low barrier to entry because it is addressing the acceleration of AI inference specifically, rather than trying to transform the operating model of the entire data centre. This contrasts with the broader Arm ecosystem’s quest to position Arm-based server CPUs as an alternative to Intel’s processors. Qualcomm doesn’t need to establish a new instruction set, build out a supporting tool chain and software stack, or convince prospective customers that the Arm architecture can rival x86 in supporting legacy applications and workloads. Instead, it can work alongside existing players and target the opportunity that stretches from data centres to the network edge.

This potential is vast. Over the coming years, the majority of data processing is expected to shift from data centres and on-premises servers to the edge. Competition is intensifying with a long line of contenders, not least Intel and Nvidia (with or without Arm), alongside hundreds of smaller players that are ripe for consolidation. Nonetheless, the total market is enormous.

At Qualcomm’s AI event in April 2019, Facebook said that it processes about 200 trillion inferences per day, adding that power consumption in its data centres has doubled annually for the past three years. This is unsustainable, and highlights the opportunity for accelerators that can bring about real changes in power consumption and performance. Performance per watt will be even more crucial at the network edge as workloads grow, costs rise and any lack of efficiency becomes more evident.
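To put Facebook’s figure in perspective, a quick back-of-envelope conversion (a sketch using only the 200 trillion inferences per day quoted above) gives the per-second rate:

```python
# Back-of-envelope: convert Facebook's stated 200 trillion
# inferences per day into an approximate per-second rate.
inferences_per_day = 200e12
seconds_per_day = 24 * 60 * 60  # 86,400
per_second = inferences_per_day / seconds_per_day
print(f"{per_second:.2e} inferences per second")  # roughly 2.3 billion
```

That works out to more than two billion inferences every second, which makes clear why even small gains in efficiency per inference compound into meaningful power savings at data-centre scale.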

This is where Qualcomm has supreme confidence. The company claims that its top-tier 75 W form factor will be able to process up to 400 trillion operations per second. Using the performance benchmark of the ResNet-50 classification network, Qualcomm’s Dual M.2 edge solution was roughly comparable with Nvidia’s T4 GPU, but at much lower power — a 15 W thermal design point. Its mid-tier Dual M.2 offering outperformed Nvidia’s V100 Tensor Core and T4 processors at a 25 W thermal design point, whilst its top-tier PCIe card delivered performance comparable to Nvidia’s A100 platform, using a fraction of the power consumed by Nvidia’s solution — 75 W compared with more than 300 W.
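Taken at face value, the headline figures above imply a notable efficiency ratio. The following is a minimal sketch using only the numbers quoted in this piece — vendor claims, not independent measurements:

```python
# Rough efficiency comparison using the figures cited above.
# These are vendor-quoted numbers, not independent benchmarks.
cloud_ai_100_tops = 400   # trillion operations per second, top-tier part
cloud_ai_100_watts = 75   # quoted thermal design point

tops_per_watt = cloud_ai_100_tops / cloud_ai_100_watts
print(f"Cloud AI 100: {tops_per_watt:.1f} TOPS/W")  # about 5.3

# The article quotes "more than 300 W" for Nvidia's A100; using 300 W
# as a lower bound, the power gap alone is at least:
a100_watts = 300
print(f"Power ratio: {a100_watts / cloud_ai_100_watts:.0f}x")  # 4x
```

Even granting generous margins of error, a four-fold power advantage at comparable throughput would be a significant differentiator in dense deployments.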

Of course, benchmarks rarely tell a complete story. However, on paper the gap is remarkable and, if borne out, should translate to customer design wins once commercial silicon is available. It should be remembered that all the leading hyperscale cloud providers are trialling Qualcomm’s Cloud AI 100 accelerator, with Facebook and Microsoft Azure attending the company’s AI Day in April 2019.

Power consumption is increasingly becoming a critical parameter of AI performance. If Qualcomm has leapfrogged rivals in the way it suggests, the company will quickly establish itself as a leader in AI inferencing from industry to retail. It has set expectations high; now it needs to deliver.