Local Use of AI Is an Opportunity for Chipmakers

The public beta launch of OpenAI’s DALL-E text-to-image generation technology in July 2022 sparked an ersatz type of creativity: anyone, regardless of artistic training or ability, could create an image simply by describing what they wanted in natural language.

Access to this technology was initially limited, with OpenAI admitting the first million users over several weeks. This gradual roll-out spurred alternatives such as Craiyon (formerly DALL-E Mini), Midjourney and Stable Diffusion, which filled the gap as interest in image generation using artificial intelligence (AI) outpaced availability.

Although cloud platforms promise effectively unlimited computing capacity, they have their limits. The abrupt discontinuation of OpenAI Codex, as well as the rate limits imposed on GPT-4 for paying customers following its launch, points to capacity constraints in Microsoft Azure. Ultimately, confining DALL-E to cloud platforms limits the technology’s possibilities. By contrast, burgeoning interest in rival Stable Diffusion is bringing AI image generation from the cloud to user devices.

With Enough Power, Devices Can Run AI Models Now

Aside from presumably temporary growing pains with cloud capacity, AI users may prefer local querying to cloud access, particularly where privacy is important, as in speech recognition; where model customization is desirable, as in image generation; or where experimentation would be too expensive to perform on cloud platforms.

The latter category is quite broad: tasks such as inpainting a generated image often require multiple iterations to produce the desired result, and programmers or users who run batches of queries with subtle tweaks to compare the differences would find cloud pricing costly relative to the value of the output.

Some professional- and enthusiast-grade systems are powerful enough to perform these tasks. For example, Stable Diffusion can run on any relatively modern GPU with at least 8GB of video memory, including Intel Arc and AMD GPUs. In practice, this constrains use to desktop or workstation computers.
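As a rough way to check whether a machine meets that guideline, Nvidia’s `nvidia-smi` tool can report total video memory per GPU. The sketch below is illustrative, not definitive: the function names are my own, and the 8GB threshold is the guideline from the text (optimized forks of Stable Diffusion can run in less).

```python
import subprocess

def gpu_vram_mib(query_output=None):
    """Return the total video memory (in MiB) of each detected GPU.

    If query_output is None, invoke nvidia-smi; otherwise parse a
    pre-captured string (handy for testing or remote reports).
    """
    if query_output is None:
        query_output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            text=True,
        )
    return [int(line) for line in query_output.splitlines() if line.strip()]

def meets_stable_diffusion_guideline(vram_mib, required_gib=8):
    """Check a GPU's memory against the ~8GB guideline from the text.

    This is a conservative rule of thumb, not a hard requirement.
    """
    return vram_mib >= required_gib * 1024
```

For example, a 12GB card reporting 12288 MiB passes the check, while a 6GB card reporting 6144 MiB does not.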

Some of Nvidia’s RTX GPUs for laptops are equipped with enough video memory to use Stable Diffusion; these are typically gaming notebooks, although Lambda Labs offers an AI-focused Tensorbook, a customized variant of Razer’s Blade laptop series. Although these options are portable, they still rely on relatively power-hungry GPUs.

Making AI Accessible with Dedicated Silicon

Dedicated silicon for accelerating AI and machine learning workloads can significantly lower the power needed to run models locally. This isn’t entirely new — the Pixel Visual Core and Pixel Neural Core in Google’s early Pixel smartphones worked in the background for computational photography and voice processing. Exposing this capability in a way directly addressable by users, however, is comparatively new.

For example, in December 2022 Apple ported Stable Diffusion to the machine learning cores of its M-series silicon in Macs and iPads for higher performance — note that ports for Apple’s GPUs also exist. Relative to PCs, Macs with Apple silicon have the benefit of a unified memory architecture, making it possible to run larger models on client devices. Nvidia’s flagship consumer GPU, the GeForce RTX 4090, is equipped with 24GB of video memory, but the 2023 MacBook Pro with Apple’s M2 Max chip can be configured with up to 96GB of unified memory, and the 2022 Mac Studio based on the M1 Ultra can reach 128GB.

Qualcomm is also getting involved with custom silicon for AI and machine learning, demonstrating Stable Diffusion running on a smartphone with a Snapdragon 8 Gen 2 at MWC 2023 in February.

Beyond image generation, Meta’s LLaMA (Large Language Model Meta AI) was released in several sizes, measured in billions of parameters — 7B, 13B, 33B and 65B — making it possible to run on different classes of device. Other AI models, such as OpenAI’s Whisper speech recognition model, have been ported by independent developers to run on Apple silicon.
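Parameter counts map directly onto memory requirements, which is why model size determines which devices can run a given model. A back-of-the-envelope sketch, counting weights only (activations, context caches and runtime overhead add more in practice; the 4-bit figure assumes community-style quantization, which the text does not mention):

```python
def weight_footprint_gb(params_billions, bits_per_param):
    """Approximate memory needed to hold model weights alone."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

# LLaMA sizes from the text, at 16-bit precision and with 4-bit quantization
for size in (7, 13, 33, 65):
    print(f"{size}B: {weight_footprint_gb(size, 16):.1f}GB at fp16, "
          f"{weight_footprint_gb(size, 4):.1f}GB at 4-bit")
```

By this estimate, the 7B model at 16-bit precision (about 14GB) fits on a 24GB consumer GPU, while the 65B model (about 130GB) exceeds even a 128GB Mac Studio unless quantized, illustrating the advantage high-memory unified architectures hold for local AI.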

Cloud Can Offer a Path to Revenue and Safeguard against Abuse

Running AI models on client devices has drawbacks relative to the cloud. Cloud-powered APIs make a path to revenue more obvious, as charging for a model that runs locally introduces potential problems with technical support, performance updates and software piracy. For open-source models, this concern is less meaningful. Additionally, cloud platforms can serve as an anti-abuse mechanism: users writing prompts that violate the terms of service can be banned from the service, which would be functionally impossible to enforce for locally run models.

More Opportunities and Strategies for Commercializing AI

In my latest Insight Report, I explore this and other emerging topics in developing generative AI products, including the economics of training and inference of AI models, the potential value that could be added to commercial or open-source models, the real and perceived customer value for the content created with these models and implications for ownership, licensing and copyright. Get in touch for access to the report or to discuss the implications of generative AI on your business.