GPT-4 is too costly for Microsoft to sustain, and the company is reported to have quietly initiated a Plan B.

According to The Information, while Microsoft is publicly integrating GPT-4 into its flagship products, behind the scenes it has already begun plotting a Plan B: building its own conversational AI large language models that can approach the performance of OpenAI's models.

According to a current employee and a person who recently left Microsoft, in recent weeks Peter Lee, who leads Microsoft's roughly 1,500 researchers, has directed many of them to develop conversational AI models. These models may not perform as well as OpenAI's, but they are smaller and far cheaper to operate.

These sources said that Microsoft’s product team for its search engine Bing is attempting to integrate Microsoft’s proprietary models into Bing Chat.

A current Microsoft employee said that Microsoft researchers are allocating most of their approximately 2,000 GPUs to produce lower-cost, smaller-scale models.

01. GPT-4 is too expensive, Microsoft is formulating Plan B

Microsoft's AI features depend almost entirely on OpenAI, whose cutting-edge technology Microsoft licenses in return for its investment. However, as the cost of running advanced AI models rises, Microsoft researchers and product teams are formulating a Plan B, according to The Information.

As AI costs soar, Microsoft and other large AI developers such as Google are looking for ways to cut the expense of conversational AI software and the server chips that run it. Earlier, Microsoft committed to investing over $10 billion in OpenAI, partly to acquire rights to its intellectual property.

Despite this investment, Microsoft still needs to control costs as it launches OpenAI-powered features, including auto-generating PowerPoint presentations, transcribing Teams meetings, and building Excel spreadsheets from customers' natural-language descriptions. If more than a billion people end up using these features, Microsoft will have to scale down the underlying models' size and complexity to keep the costs from becoming ruinous.

Microsoft also hopes that doing so will free up more of the in-demand AI server chips.

While Microsoft's efforts are still at an early stage, they show how CEO Satya Nadella is carving out a path for Microsoft's AI products that does not run entirely through OpenAI. The two companies will remain closely connected for years, but as they increasingly compete to sell AI software to the same enterprise customers, their relationship is growing tense.

“This had to happen eventually,” said Naveen Rao, an executive at enterprise software company Databricks, when talking about Microsoft’s internal AI work.

He added, “Microsoft is an intelligent enterprise company, they need to be efficient, and when you deploy products that use these large models, like GPT-4 from OpenAI… it’s like saying, ‘I need a medical doctor with two PhDs to answer customer service calls for a Nerf gun company.’ It’s not economically viable.”

02. No Plans to Build Its Own GPT-4, but a Bid for Negotiating Leverage

Microsoft's research team has no illusions about developing a large AI model on the scale of GPT-4. The team has neither OpenAI's computational resources nor its large pool of human reviewers, who provide feedback on how the language models answer questions so that engineers can improve them.

It is undeniable that OpenAI, Google, and another star language model startup called Anthropic, which received a $4 billion investment from Amazon Web Services (AWS) on Monday, are all ahead of Microsoft in developing advanced language models.

However, Microsoft may be able to compete in the race to build AI models that approach the quality of OpenAI's software at a fraction of the cost, as suggested by its release of an internal model called Orca in June.

Large language models are the foundation of conversational AI systems like ChatGPT. For Microsoft, developing high-quality language models without OpenAI's direct help would give it more leverage in future negotiations to renew the partnership.

The current deal appears mutually beneficial: in return for funding OpenAI, Microsoft gains the exclusive right to use OpenAI's existing intellectual property in Microsoft products indefinitely. It also receives 75% of OpenAI's theoretical operating profits until its initial investment is repaid, and 49% of profits thereafter until a certain threshold is reached.

Microsoft hopes to increase its revenue by at least $10 billion over an unspecified period through its alliances with OpenAI and other AI companies. There are early signs of revenue traction from new AI features in Office 365 productivity applications. At least one major customer of its cloud computing competitor AWS has spent a significant amount on Azure OpenAI cloud services. Microsoft also revealed in July that over 27,000 companies have paid for the AI coding tool GitHub Copilot, which is powered by OpenAI software.

However, any desire of Nadella or Microsoft’s research executives to develop complex AI without OpenAI could be wishful thinking.

Since fully embracing OpenAI, Microsoft's research division has largely been relegated to fine-tuning OpenAI's models for Microsoft products rather than developing its own. Over the past year, the division has lost talent in several waves of departures, with some researchers joining Microsoft's internal product teams.

03. Devoting Thousands of GPUs to Developing Lower-Cost “Distilled” Models

But after spending a year in the shadow of OpenAI, some Microsoft researchers have found a new purpose: creating what AI engineers call “distilled” models that mimic large models like GPT-4 but on a smaller scale with much lower operational costs.

Ironically, the terms of the deal between Microsoft and OpenAI are helping Microsoft break its dependence on OpenAI: when customers use Bing's chatbot, Microsoft gets unique access to the outputs generated by OpenAI's models.

Microsoft is now using this data to create smaller models. Its researchers have found that these models can produce similar results with fewer computing resources. Many other AI developers, such as Google and Databricks, are also focused on developing smaller models for specific tasks.

To create its Orca model, Microsoft researchers input millions of answers generated by GPT-4 into a more basic open-source model to teach it to mimic GPT-4.
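The article does not describe Microsoft's actual pipeline, but the basic recipe it outlines (collect a large model's answers, then use them as supervised training targets for a smaller model) can be sketched roughly as follows. All prompts, responses, and function names here are hypothetical illustrations, not Microsoft's code:

```python
import json

# Hypothetical prompts and teacher (GPT-4-style) answers; in practice this
# would be millions of responses collected from the larger model.
teacher_outputs = [
    {"prompt": "Explain why 3/4 is larger than 2/3.",
     "response": "Convert to a common denominator: 9/12 vs 8/12, so 3/4 is larger."},
    {"prompt": "Summarize: The meeting covered Q3 budget cuts.",
     "response": "The meeting focused on budget cuts planned for Q3."},
]

def build_distillation_dataset(teacher_outputs):
    """Turn teacher responses into supervised fine-tuning examples
    (instruction -> target text) for a smaller student model."""
    return [
        {"instruction": ex["prompt"], "output": ex["response"]}
        for ex in teacher_outputs
    ]

dataset = build_distillation_dataset(teacher_outputs)
# Serialize as JSONL, a common format for instruction-tuning pipelines.
jsonl = "\n".join(json.dumps(row) for row in dataset)
print(jsonl.splitlines()[0])
```

The student model is then fine-tuned on these pairs with an ordinary next-token cross-entropy objective, so it learns to imitate the teacher's answers without ever seeing the teacher's weights.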

The researchers reported that Orca performs far better than the base open-source model it was trained from, Meta's Llama 2, on a range of tasks such as explaining how to solve math problems or summarizing meeting transcripts, coming close to GPT-4.

They claim that in some cases, Orca is as good as the free version of OpenAI’s ChatGPT. Orca is able to achieve this using only 1/10 of the computing power used by GPT-4.

In another paper published this month, Microsoft researchers introduced Phi, a model trained entirely on "textbook-quality" data. Phi has about 1% of GPT-4's parameter count, and the research showed that thanks to its high-quality training data, Phi could match open-source models five times its size on mathematics and logic problems.

It is still unclear whether distilled models like Orca and Phi will be useful in the long term, and researchers outside Microsoft are vigorously debating whether the papers truly show that smaller distilled models can match larger advanced models like GPT-4. But their cost advantage gives Microsoft a reason to keep pushing forward.

A current Microsoft employee said that earlier this month, after the Phi paper was published, Peter Lee told employees that validating the quality of such models would be the team's top priority. He also said researchers are devoting most of their roughly 2,000 GPUs to creating distilled models.

Of course, compared with the computing resources Microsoft provides to OpenAI, this cluster is tiny.

An upcoming paper will focus on improving Orca with a method called contrastive learning, in which engineers teach the model to distinguish high-quality from low-quality responses. The person said other Microsoft researchers are developing a multimodal large language model that can interpret and generate both text and images.
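The paper's exact formulation is not public. One common way to train a model to prefer high-quality over low-quality responses is a pairwise (Bradley-Terry-style) loss, sketched here with made-up scores; this is an illustrative assumption, not Microsoft's published method:

```python
import math

def preference_loss(score_good, score_bad):
    """Pairwise loss: -log(sigmoid(score_good - score_bad)).
    It pushes the model to score a high-quality response above a
    low-quality one. Lower loss means a better-ordered pair."""
    return -math.log(1.0 / (1.0 + math.exp(-(score_good - score_bad))))

# A scorer that already ranks the good response higher incurs a small loss...
low = preference_loss(score_good=2.0, score_bad=-1.0)
# ...while one that prefers the bad response is penalized heavily.
high = preference_loss(score_good=-1.0, score_bad=2.0)
print(low < high)  # True
```

Minimizing this loss over many (good, bad) response pairs teaches the model a consistent notion of response quality, which can then guide further fine-tuning.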

Spokespeople for Microsoft and OpenAI declined to comment, and Microsoft did not make Lee or the researchers behind Orca available for interviews.

Models like Orca and Phi can help Microsoft reduce the computing costs of the AI capabilities it provides to customers. A current employee said Microsoft product managers have been testing whether Orca and Phi, rather than OpenAI's models, can handle some queries to the Bing chatbot, such as summarizing short passages or answering yes-or-no questions, while leaving longer queries that require multi-step reasoning to the larger models.
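The kind of query triage described above could, in simplified form, look like the following sketch. The marker list and word-count threshold are purely illustrative assumptions, not Microsoft's actual routing logic:

```python
def needs_large_model(query: str) -> bool:
    """Crude heuristic: send short, factual, or yes/no questions to a small
    model, and multi-step requests to the large one."""
    multi_step_markers = ("step by step", "explain why", "compare", "derive")
    if any(m in query.lower() for m in multi_step_markers):
        return True
    # Long prompts often need more reasoning; threshold chosen arbitrarily.
    return len(query.split()) > 40

def route(query: str) -> str:
    """Return the model tier a query would be routed to."""
    if needs_large_model(query):
        return "large-model (e.g. GPT-4)"
    return "small-model (e.g. Orca/Phi)"

print(route("Is Paris the capital of France?"))            # small-model (e.g. Orca/Phi)
print(route("Explain why the sky is blue, step by step."))  # large-model (e.g. GPT-4)
```

In production such a router would likely use a learned classifier rather than keyword rules, but the cost logic is the same: reserve the expensive model for queries that genuinely need it.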

Microsoft is also considering offering a version of Orca to Azure cloud customers. Demand may already exist.

According to insiders, after Microsoft released the Orca paper, a Microsoft Research manager told colleagues that some Azure customers had asked when they could use it. Given Meta's restrictions on commercial use of its open-source large language models, it is unclear whether Microsoft would need Meta's permission.

04. Conclusion: Cracks Appear Between Microsoft and OpenAI as Microsoft Embraces Other Model Partners

“More and more companies are running small models,” said Alex Ratner, a University of Washington professor and co-founder of Snorkel AI, which sells software to AI developers. GPT-4 is “something eye-catching that can serve as a starting point… but for the specialized use cases Microsoft needs to power its products, we will continue to see this diversification.”

Microsoft also offers other large language models through Azure, including Meta's Llama 2, as a hedge against OpenAI. According to an earlier report by The Information, Microsoft is partnering with Databricks to sell software that lets Azure customers build applications with open-source large language models instead of OpenAI's closed-source ones.
