By Max A. Cherney
SAN FRANCISCO (Reuters) – Cerebras Systems on Tuesday launched a tool for AI developers that gives them access to the startup's outsized chips to run applications, which it says is a much cheaper option than industry-standard Nvidia (NASDAQ: NVDA) processors.
Access to Nvidia graphics processing units (GPUs) – often through a cloud computing provider – can be difficult to obtain and expensive. Developers need the chips to train and deploy the large artificial intelligence models behind applications such as OpenAI's ChatGPT; running those models to generate output is a process developers call inference.
“We’re delivering performance that can’t be achieved by a GPU,” Cerebras CEO Andrew Feldman told Reuters in an interview. “We do it with the highest accuracy and offer it at the lowest price.”
The inference portion of the AI market is expected to be fast-growing, attractive, and ultimately worth tens of billions of dollars as consumers and businesses adopt AI tools.
The Sunnyvale, California-based company plans to offer several tiers of its inference product via a developer key and its cloud platform. The company will also sell its AI systems to customers who prefer to operate their own data centers.
Cerebras’ chips – each the size of a dinner plate and called Wafer Scale Engines – sidestep one of the problems of AI data crunching: the data processed by the large models that power AI applications typically cannot fit on a single chip and can require hundreds or thousands of chips strung together.
That means Cerebras’ chips can deliver faster performance, Feldman said.
Cerebras plans to charge users just 10 cents per million tokens, one way companies measure the amount of output data from a large model.
Cerebras is aiming to go public and filed a confidential prospectus with the Securities and Exchange Commission this month, the company said.