OpenAI: Trapped by Its Own Distant Lead

The hit global drama "Where Did Sam Altman Go" has come to an end, but OpenAI's worries are far from over.

Sam Altman's swift reinstatement owes much to Microsoft's energetic support. Throughout this year, Microsoft has been helping its close ally grow stronger: not only did it invest an additional $10 billion, it also had Microsoft Research set aside ongoing basic research projects and pour its manpower into turning foundational large models like GPT-4 into products, arming OpenAI to the teeth.

However, what many people don't know is that in September of this year, Peter Lee, the head of Microsoft Research, received a secret assignment: creating an alternative to OpenAI.

The first target of this "de-OpenAI-ization" effort is Microsoft's first large-model application, Bing Chat.

According to confidential sources cited by The Information, Microsoft is attempting to gradually replace the OpenAI models integrated into Bing with self-developed versions. At November's Ignite developer conference, Microsoft announced that Bing Chat had been renamed Copilot, giving it a market positioning so similar to ChatGPT's that it is hard not to read something into it.

The brand new Copilot

However, Microsoft's motivation stems neither from any flaw in OpenAI's technology, nor from foresight about the conflict within OpenAI's management. The real reason is somewhat comical:

Because OpenAI’s technological capabilities are too strong.

Delivering Takeout in a Lamborghini

The impetus for Microsoft to develop its own large models was a failed OpenAI project.

While ChatGPT was making waves worldwide, OpenAI's computer scientists were busy with a project code-named Arrakis, hoping to build a sparse model to rival GPT-4.

This is a special kind of large model: when performing tasks, only specific parts of the model are activated. For example, when a user needs to generate a summary, the model automatically activates the most suitable parts for the job, without having to mobilize the entire large model every time.

Compared to traditional large models, sparse models have faster response times and higher accuracy. More importantly, they can significantly reduce the cost of inference.

In layman’s terms, it means you don’t need to use a sledgehammer to kill a chicken – and this is exactly what Microsoft values.
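The routing idea behind a sparse model can be sketched in a few lines. This is a toy illustration of top-1 mixture-of-experts routing, not OpenAI's Arrakis architecture: the expert count, dimensions, and routing rule are all illustrative assumptions.

```python
# Toy sketch of sparse (mixture-of-experts) routing. The expert count,
# dimensions, and top-1 routing rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 4
DIM = 8

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))  # scores experts per token

def sparse_forward(x):
    """Route each token to its single best-scoring expert.

    Only 1 of NUM_EXPERTS weight matrices runs per token, so compute is
    roughly 1/NUM_EXPERTS of a dense pass through every expert.
    """
    scores = x @ router                  # (tokens, NUM_EXPERTS)
    choices = scores.argmax(axis=-1)     # top-1 expert per token
    out = np.empty_like(x)
    for e in range(NUM_EXPERTS):
        mask = choices == e
        if mask.any():
            out[mask] = x[mask] @ experts[e]
    return out

tokens = rng.standard_normal((5, DIM))
print(sparse_forward(tokens).shape)  # (5, 8)
```

Each token still produces a full-size output; the saving comes from touching only one expert's weights per token instead of all of them.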

Google’s summary of the advantages of sparse models

When public opinion discusses the cost of large models, it always cites training costs running to seven or eight digits, along with astronomical GPU bills. But for most tech companies, model training and data-center construction are one-time capital expenditures they can grit their teeth and accept. What really stops tech companies from going deeper is the expensive inference cost required for day-to-day operations.

This is because, under normal circumstances, large models do not exhibit the same evident economies of scale as the internet.

Every query from the user requires a new inference calculation. This means that the more users who use the product, and the more intensive their usage, the higher the computational cost for tech companies.

Microsoft previously built GitHub Copilot, a large-model application based on GPT-4 that helps programmers write code and charges $10 per month.

According to the Wall Street Journal, due to the expensive inference cost, GitHub Copilot loses an average of $20 per user per month, and heavy users can even cause Microsoft a loss of $80 per month.
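The arithmetic implied by those figures is straightforward. Treating the reported loss as inference cost minus subscription revenue, a rough back-of-envelope calculation looks like this:

```python
# Back-of-envelope unit economics from the figures above: GitHub Copilot
# charges $10/month but loses $20/user on average and up to $80 on heavy
# users. Implied inference cost = subscription price + reported loss.
PRICE_PER_USER = 10        # $/month subscription
AVG_LOSS_PER_USER = 20     # $/month average reported loss
HEAVY_LOSS_PER_USER = 80   # $/month reported loss on heavy users

avg_inference_cost = PRICE_PER_USER + AVG_LOSS_PER_USER      # $30/month
heavy_inference_cost = PRICE_PER_USER + HEAVY_LOSS_PER_USER  # $90/month

print(avg_inference_cost, heavy_inference_cost)  # 30 90
```

In other words, serving an average user costs roughly three times the subscription price, and a heavy user roughly nine times.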

GitHub Copilot

The inability to generate enough revenue from large-scale model applications is the primary reason for Microsoft’s self-developed large-scale models.

OpenAI’s large-scale models continue to lead in technology, consistently ranking first on major lists, but the cost of using them is expensive.

An AI researcher estimated that, in theory, the API price of GPT-3.5 is nearly 3-4 times the inference cost of the open-source model Llama 2-70B, to say nothing of the fully upgraded GPT-4.

However, except for a few scenarios such as code generation and solving complex mathematical problems, most work can be handled by smaller versions and open-source models.

One startup is a living example. Its business is providing tools for summarizing audio and video content, with about 200,000 monthly active users. In its early days, it used GPT-3.5 to power its service.

Later, the company tried replacing the underlying large-scale model with the open-source model Mistral-7B-Instruct and found that users did not perceive any difference, but the monthly inference cost decreased from $2000 to less than $1000.

In other words, OpenAI is selling Lamborghinis with powerful engines to customers who mainly deliver takeout. This is OpenAI's "trouble of being too far ahead."

So, not only Microsoft, but early major customers of OpenAI like Salesforce and Wix have also switched to cheaper technological solutions.

Reducing inference costs, making the "Lamborghini" as affordable as an economy car, became a problem OpenAI had to solve, which led to the aforementioned sparse-model project, Arrakis.

In fact, it’s not just OpenAI; Google is also conducting related research and has made progress. At the Hot Chips conference in August, Jeff Dean, Google’s Chief Scientist and former head of Google Brain, mentioned in his speech that sparsity will be one of the most important trends in the next decade.

Jeff Dean has also published papers on sparse models.

It is the high cost brought by the significant lead that has prompted Microsoft to consider the possibility of “self-sufficiency,” and OpenAI has also noticed this issue:

At the developer conference on November 6th, OpenAI launched GPT-4 Turbo, which lowered its price by one-third, already below Claude 2—the closed-source large-scale model developed by its biggest competitor, Anthropic.

OpenAI's "Lamborghini" is still not exactly cheap, but it is now more affordable than some of the small cars.

Unfortunately, 11 days later, a farce that may go down in tech history undermined this effort. According to foreign media reports, during the weekend when Altman was negotiating his return with the OpenAI board, over 100 clients had already contacted its friendly rival, Anthropic.

The Commercialization Paradox

Even without this internal turmoil, OpenAI’s customer attrition crisis may still exist.

This starts with OpenAI’s model and product design approach:

Not long ago, OpenAI dropped the GPTs bombshell on the developer community: users can customize chatbots with different functions using natural language alone. As of the day Altman resumed his position, users had uploaded 19,000 such GPTs chatbots, with more than 1,000 produced per day on average, rivaling the activity of a large community.

Diverse functionalities of GPTs

It is well known that the GPT models are not open source, and they come with the "troubles of being far ahead" described above. However, for individual developers and small businesses, OpenAI has two advantages over open-source models:

First, its development threshold is low and essentially ready to use. Some small teams building on OpenAI's base models describe their own products on overseas forums as "wrappers": thanks to the GPT models' powerful general ability, sometimes all they need to do is build a UI and find a suitable scenario, and they can land orders.

If developers need to further fine-tune the model, OpenAI also provides a lightweight model fine-tuning technique called LoRA (Low Rank Adaptation).

In simple terms, LoRA freezes the pretrained model's weights and trains a small set of additional low-rank matrices for a specific task, enhancing the model's capability at that task. Because only this small add-on is adjusted rather than the model's full parameters, LoRA does not require large amounts of industry data for fine-tuning.

By contrast, when customizing open-source models, developers sometimes resort to full fine-tuning. Although it performs better on specific tasks, full fine-tuning updates every parameter of the pretrained model, which places extremely high demands on data.

In comparison, the OpenAI approach is clearly more user-friendly for ordinary developers.
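One common way to picture LoRA: keep the pretrained weight matrix frozen and learn only a low-rank correction on top of it. The sketch below uses illustrative shapes and rank; it shows the parameter-count saving, not a full training loop.

```python
# Minimal numpy sketch of the LoRA idea: the pretrained weight W stays
# frozen and only a low-rank correction (A then B) would be trained.
# Shapes and rank are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT, RANK = 64, 64, 4

W = rng.standard_normal((D_IN, D_OUT))        # frozen pretrained weight
A = rng.standard_normal((D_IN, RANK)) * 0.01  # trainable, small init
B = np.zeros((RANK, D_OUT))                   # trainable, zero init: no change at start

def lora_forward(x):
    # Original path plus the low-rank correction; W itself is never updated.
    return x @ W + (x @ A) @ B

full_params = W.size           # 4096 parameters touched by full fine-tuning
lora_params = A.size + B.size  # 512 trainable parameters with LoRA
print(full_params, lora_params)  # 4096 512
```

Even in this tiny example the trainable parameter count drops by 8x; at real model scale the ratio is far more dramatic, which is why LoRA needs so much less data and compute.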

Illustration of LoRA principle

Second, it was mentioned earlier that large models lack economies of scale, but that statement comes with a caveat: given a sufficient volume of compute requests, economies of scale do emerge.

Tests show that the smaller the batches of compute requests sent to the server, the lower the utilization of the computing hardware, which in turn leads to a linear increase in the average cost of a single computation.

OpenAI can bundle millions of computing requests from all customers together and send them at once, but individual developers and small businesses find it difficult to do so because they don’t have as many active users.

In simple terms, it is like delivering packages from Shanghai to Beijing: OpenAI can load 100 packages onto one truck, while others cannot fill theirs.
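The delivery analogy can be modeled as a fixed per-batch cost (loading weights, launching the computation) amortized over the requests in the batch. All numbers below are made up purely for illustration:

```python
# Toy cost model for request batching: a fixed per-batch cost amortized
# over the requests in the batch, plus a small marginal cost per request.
# The numbers are hypothetical, chosen only to show the shape of the curve.
FIXED_COST_PER_BATCH = 100.0  # hypothetical cost of one model pass
COST_PER_REQUEST = 1.0        # hypothetical marginal cost per request

def cost_per_request(batch_size):
    return FIXED_COST_PER_BATCH / batch_size + COST_PER_REQUEST

for n in (1, 10, 100, 1000):
    print(n, cost_per_request(n))
```

A lone developer sending one request at a time pays the full fixed cost per request; a provider that can fill large batches drives the average cost toward the small marginal cost.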

An analyst at the consulting firm Omdia commented that, thanks to economies of scale, OpenAI's profits far exceed those of most startups hosting small open-source models on AWS or Azure.

So, although the phenomenon of “ChatGPT updating and wiping out a group of small companies” objectively exists, there are still many developers willing to take a gamble.

Damon Chen, the founder of one such startup, is a direct victim. His product's main function is letting models read PDF files, but in late October ChatGPT gained the same capability. Chen remains unfazed: "Our mission is not to become another unicorn. A million dollars in annual revenue is already enough for us."

But for large companies with deep pockets, all of OpenAI's advantages turn into disadvantages.

For example, OpenAI has the edge in lightweight development, but as companies go deeper into specific scenarios and need further customization, they quickly run into the "troubles of being too far ahead":

Due to the complexity and size of GPT-4, deep customization requires a minimum of $2 million and several months of development time. In contrast, the cost of fine-tuning open-source models is usually in the hundreds of thousands of dollars, clearly not in the same league.

In addition, big clients like Microsoft and Salesforce have enough computing requests of their own; they don’t need to pool resources with others to reduce costs. This leaves OpenAI with no cost advantage. Even for startups, as the number of users increases, the cost-effectiveness of using OpenAI models will decrease.

As mentioned earlier, the startup with 200,000 monthly active users cut its costs by more than half using the open-source Mistral-7B-Instruct model.

It is worth noting that a small 7B-parameter open-source model can even run on the ancient Nvidia V100, a GPU released in 2017 that has not even made it onto the US chip export control list.

From a business perspective, it is precisely the financially robust large companies that can sustain a company’s revenue. How to capture those “ambitious clients whose revenue exceeds a million dollars a year” is a challenge that OpenAI must face.

Flashpoint Event

It may seem strange that OpenAI should "face commercialization problems." After all, until early 2023, making money was not on OpenAI's agenda, let alone holding developer conferences.

In March of this year, OpenAI President Greg Brockman, the "big brother" who was ousted along with Altman last week, gave an interview in which he candidly stated that OpenAI had never seriously considered building general tools or large-model applications for vertical fields. Although they tried, it did not align with OpenAI's DNA, and their hearts were not in it.

After four and a half days of drama, Brockman is back

Here, "DNA" refers to a culture of pure idealism and of protecting humanity from the threat of superintelligence. After all, OpenAI was founded largely on Musk and Altman's 2015 "joint declaration": the safer path is for AI to be controlled by a research institution untainted by profit motives.

Under the banner of idealism, OpenAI recruited a top team of scientists led by Ilya Sutskever, even though the salaries Altman offered at the time were less than half of Google's.

A key factor that led OpenAI to change was the release of ChatGPT.

Initially, OpenAI leaders did not see ChatGPT as a commercial product, but rather as a “low-key research preview” intended to collect data on interactions between ordinary people and AI, to aid in the development of GPT-4 in the future. In other words, OpenAI did not expect ChatGPT to become such a hit.

The unexpected popularity changed everything and prompted Altman and Brockman to turn to accelerationism.

Accelerationism can be simply understood as having unlimited enthusiasm for the commercialization of AGI and being prepared to rapidly enter the fourth industrial revolution. The counterpart to this is safetyism, which advocates for a cautious approach to developing AI and constantly assesses its threat to humanity.

An anonymous OpenAI employee said in an interview with The Atlantic, “After ChatGPT, there is a clear path to revenue and profit. You can no longer argue for the identity of an ‘idealistic research lab’. There are customers waiting to be served.”

ChatGPT also gave birth to the “best bromance in the tech world”

This shift has led OpenAI to enter unfamiliar territory – converting research achievements into popular products.

For a company that once touted idealism, this work is clearly too "down to earth." For example, technical leader Ilya Sutskever is a computer scientist, not a product manager; at Google he was responsible for theoretical research, while productization fell to Jeff Dean's Google Brain team.

Prior to the release of ChatGPT, OpenAI was more like a small workshop made up of a group of financially independent scientists and engineers. But times have changed, and they have become a bona fide commercial institution.

In the past year, OpenAI has added hundreds of new employees to accelerate commercialization. According to The Information, the total number of OpenAI employees likely exceeds 700. Even if making money is not the goal, there needs to be a way to deal with operating costs – after all, scientists also have mortgages to pay.

The brief but intense "Where Did Altman Go" incident did not solve the problem; instead it sharpened the question: what exactly is OpenAI?

In an interview with CNBC, Musk once described the company he founded and later got kicked out of as, “We started an organization to save the Amazon rainforest, but then it started doing lumber business and selling the trees that were cut down.”

This contradiction has left OpenAI in a quandary, leading to a farcical situation that left everyone stunned.

Earlier this year, a journalist from Wired spent some time with Altman and repeatedly raised this question, but Altman always insisted, "Our mission has not changed." Yet when safety advocate Ilya Sutskever backed down and Altman returned, it was clear that OpenAI had made its choice.
