By Anna Tong and Katie Paul
ChatGPT maker OpenAI is working on a new approach to its artificial intelligence models in a project codenamed “Strawberry,” according to a person familiar with the matter and internal documentation reviewed by Reuters.
The project, details of which have not previously been reported, comes as the Microsoft-backed startup races to demonstrate that the types of models it offers can provide advanced reasoning capabilities.
Teams within OpenAI are working on Strawberry, according to a copy of a recent internal OpenAI document seen by Reuters in May. Reuters was unable to determine the precise date of the document, which outlines how OpenAI intends to use Strawberry to conduct research. The source described the plan to Reuters as a work in progress. The news agency was unable to determine how close Strawberry is to public availability.
How Strawberry works is a closely guarded secret even within OpenAI, the person said.
The document describes a project that uses Strawberry models with the aim of enabling the company’s AI not only to generate answers to questions, but also to plan ahead enough to navigate the internet autonomously and reliably in order to conduct what OpenAI calls “deep research,” the source said.
This is something that has so far eluded AI models, according to interviews with more than a dozen AI researchers.
Asked about Strawberry and the details reported in this story, an OpenAI spokesperson said in a statement: “We want our AI models to see and understand the world more the way we do. Continuous research into new AI capabilities is a common industry practice, with the shared belief that these systems will improve in reasoning over time.”
The spokesperson did not directly respond to questions about Strawberry.
The Strawberry project was formerly known as Q*, which Reuters reported last year was seen inside the company as a breakthrough.
Two sources described viewing, earlier this year, what OpenAI staffers told them were Q* demos, capable of answering tough science and math questions beyond the reach of today’s commercially available models.
On Tuesday, OpenAI showed off a demo of a research project that it claimed had new human-like reasoning skills during an internal all-hands meeting, according to Bloomberg. An OpenAI spokesperson confirmed the meeting but declined to provide details about its contents. Reuters was unable to determine whether the project demonstrated was Strawberry.
OpenAI hopes the innovation will dramatically improve the reasoning capabilities of its AI models, the person familiar with it said, adding that Strawberry involves a specialized way of processing an AI model after pre-training it on very large data sets.
Researchers Reuters interviewed say reasoning is the key for AI to achieve human- or superhuman-level intelligence.
Although large language models can summarize dense texts and compose elegant prose much faster than any human, the technology often falls short on common-sense problems whose solutions seem intuitive to humans, such as recognizing logical fallacies and playing tic-tac-toe. When a model encounters these types of problems, it often “hallucinates” false information.
AI researchers interviewed by Reuters generally agree that reasoning, in the context of AI, involves building a model that allows AI to plan ahead, represent how the physical world functions and reliably solve challenging multi-step problems.
Improving reasoning in AI models is seen as key to unlocking the models’ capabilities to do everything from making major scientific discoveries to planning and building new software applications.
Sam Altman, CEO of OpenAI, said earlier this year that in AI, “the most important areas of advancement will be in reasoning.”
Other companies such as Google, Meta and Microsoft are also experimenting with different techniques to improve reasoning in AI models, as are most academic labs conducting AI research. However, researchers differ on whether large language models (LLMs) are capable of incorporating ideas and long-term planning into the way they make predictions. For example, one of the pioneers of modern AI, Yann LeCun, who works at Meta, has often said that LLMs are incapable of human reasoning.
AI CHALLENGES
Strawberry is a key part of OpenAI’s plan to overcome these challenges, the source familiar with the matter said. The document seen by Reuters described what Strawberry wants to make possible, but not how.
In recent months, the company has privately told developers and other outside parties that it is about to release technology with significantly more advanced reasoning capabilities, according to four people who heard the company’s pitches. They declined to be identified because they are not authorized to speak about private matters.
Strawberry involves a specialized way of what is known as “post-training” OpenAI’s generative AI models, or tweaking the base models to hone their performance in specific ways after they have already been “trained” on reams of generalized data, one of the sources said.
The post-training phase of developing a model includes methods such as “fine-tuning,” a process used in almost all language models today and which comes in many variations, such as having humans give feedback to the model based on its responses and feeding it examples of good and bad answers.
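To make that idea concrete, the toy sketch below shows one common flavor of such feedback-based post-training: a pairwise preference loss over human-labeled good and bad answers, as used in reward modeling. The vocabulary, data and model sizes here are illustrative assumptions and do not reflect OpenAI’s actual setup.

```python
# Toy illustration (not OpenAI's method) of learning from human-labeled
# examples of good and bad answers via a pairwise preference loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = {w: i for i, w in enumerate(
    "<pad> the capital of france is paris rome 2+2 equals 4 5".split())}

def encode(text: str, max_len: int = 8) -> torch.Tensor:
    ids = [VOCAB.get(w, 0) for w in text.split()][:max_len]
    ids += [0] * (max_len - len(ids))          # pad to a fixed length
    return torch.tensor(ids)

class TinyScorer(nn.Module):
    """Scores an answer; a higher score should mean 'preferred by humans'."""
    def __init__(self, vocab_size: int, dim: int = 16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Mean-pool token embeddings, then map to a single scalar score.
        return self.head(self.emb(ids).mean(dim=-2)).squeeze(-1)

# Human feedback expressed as (good answer, bad answer) pairs.
pairs = [("the capital of france is paris", "the capital of france is rome"),
         ("2+2 equals 4", "2+2 equals 5")]

model = TinyScorer(len(VOCAB))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    good = torch.stack([encode(g) for g, _ in pairs])
    bad = torch.stack([encode(b) for _, b in pairs])
    # Pairwise logistic loss: push score(good) above score(bad).
    loss = -F.logsigmoid(model(good) - model(bad)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("good scores:", model(torch.stack([encode(g) for g, _ in pairs])))
print("bad scores: ", model(torch.stack([encode(b) for _, b in pairs])))
```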
Strawberry has similarities to a method developed at Stanford in 2022 called “Self-Taught Reasoner,” or “STaR,” one of the sources with knowledge of the matter said. STaR allows AI models to bootstrap themselves to higher levels of intelligence by iteratively creating their own training data, and in theory could be used to allow language models to surpass human-level intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.
“I think that’s both exciting and terrifying… if things continue to move in that direction, we as people have some serious things to think about,” Goodman said. Goodman is not affiliated with OpenAI and is not familiar with Strawberry.
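For readers who want to see the shape of the idea, the short sketch below compresses a STaR-style bootstrap loop into toy code: the model writes rationales, only rationales that reach the correct answer are kept, and the model is then trained on that self-generated data. The helper functions are hypothetical stand-ins, not Stanford’s STaR implementation or anything from OpenAI.

```python
# Compressed, illustrative sketch of a STaR-style self-training loop.
# `generate_rationale`, `final_answer` and `finetune` are made-up stand-ins.
import random
from dataclasses import dataclass

@dataclass
class Example:
    question: str
    answer: str            # gold answer, used only to filter rationales

def generate_rationale(model_state: dict, question: str) -> str:
    # Stand-in for sampling a chain of thought from the current model.
    return f"reasoning about '{question}' (skill={model_state['skill']:.2f})"

def final_answer(model_state: dict, rationale: str, ex: Example) -> str:
    # Stand-in: the model answers correctly with probability equal to its skill.
    return ex.answer if random.random() < model_state["skill"] else "wrong"

def finetune(model_state: dict, data: list) -> dict:
    # Stand-in: training on the kept rationales nudges the model's skill upward.
    return {"skill": min(1.0, model_state["skill"] + 0.05 * len(data))}

dataset = [Example("2+2", "4"), Example("capital of France", "Paris"),
           Example("7*6", "42")]
model_state = {"skill": 0.3}

for iteration in range(5):
    kept = []
    for ex in dataset:
        rationale = generate_rationale(model_state, ex.question)
        if final_answer(model_state, rationale, ex) == ex.answer:
            kept.append((ex.question, rationale, ex.answer))  # self-generated training data
    model_state = finetune(model_state, kept)
    print(f"iter {iteration}: kept {len(kept)} rationales, skill={model_state['skill']:.2f}")
```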
One of the capabilities OpenAI is targeting with Strawberry is performing long-horizon tasks (LHT), the document said, referring to complex tasks that require a model to plan ahead and execute a series of actions over an extended period of time, the first source explained.
To do this, OpenAI creates, trains and evaluates the models on what the company calls a “deep-research” dataset, according to OpenAI’s internal documentation. Reuters could not determine what is in that dataset or what an extended period of time would mean.
OpenAI specifically wants its models to use these capabilities to conduct research by browsing the internet autonomously with the help of a “CUA,” or computer-using agent, that can take actions based on its findings, according to the document and one of the sources. OpenAI also plans to test its capabilities on the work of software and machine learning engineers.