OpenAI's newly launched developer-focused artificial intelligence (AI) model 'GPT-4.1' comes with significantly lower application programming interface (API) prices. Industry observers read the launch as a strategic shift: OpenAI has entered a price war with competitors such as Google, Anthropic, and xAI.
According to IT outlet TechCrunch and other foreign media on the 17th, GPT-4.1, launched on the 14th (local time), is billed as OpenAI's most capable coding model to date and is competitively priced. A successor to last year's multimodal model GPT-4o, GPT-4.1 comes in three sizes: a base model, a mini model, and a nano model. The model is focused squarely on improving coding performance. OpenAI explained, "GPT-4.1's coding performance is 21.4% higher than GPT-4o's and 26.6% higher than GPT-4.5's."
The base version of GPT-4.1 achieved a 54.6% task completion rate on SWE-bench, which evaluates coding ability, surpassing both GPT-4o (33%) and the previously announced large model GPT-4.5 (38%). It also scored higher than the reasoning model o3-mini (49%). On the math benchmark AIME24, the base, mini, and nano versions scored 48.1%, 49.6%, and 29.4%, respectively, far outperforming GPT-4o (13.1%); all but nano also exceeded the level of GPT-4.5 (36.7%).
GPT-4.1 is also considered highly price-competitive. Per 1 million tokens, the base model charges $2 (2,800 won) for input and $8 (11,300 won) for output, while nano charges $0.1 (140 won) and $0.4 (560 won). Compared with GPT-4o's previous rates of $5 (7,100 won) and $20 (28,400 won), and its mini version's $0.6 (850 won) and $2.4 (3,400 won), prices have fallen by as much as a factor of six. OpenAI noted, "GPT-4.1 is 26% cheaper than GPT-4o for median queries, and nano is our cheapest and fastest model," adding that the discount for repeated (cached) prompts has been raised from the standard 50% to 75%.
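As a back-of-the-envelope check, the savings implied by these rates can be computed directly. The sketch below uses the per-million-token prices cited in this article (not an official rate card); the `request_cost` helper and the model-name keys are illustrative assumptions.

```python
# Illustrative cost comparison using the per-1M-token rates cited in the article.
# These figures come from press reporting, not an official OpenAI rate card.
PRICES = {
    "gpt-4.1":      {"input": 2.00, "output": 8.00},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
    "gpt-4o":       {"input": 5.00, "output": 20.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_discount: float = 0.0) -> float:
    """USD cost of one request; cached_discount applies to input tokens only."""
    p = PRICES[model]
    in_cost = input_tokens / 1_000_000 * p["input"] * (1 - cached_discount)
    out_cost = output_tokens / 1_000_000 * p["output"]
    return in_cost + out_cost

# A request with 10k input tokens and 2k output tokens, no caching:
print(round(request_cost("gpt-4.1", 10_000, 2_000), 6))  # 0.036
print(round(request_cost("gpt-4o", 10_000, 2_000), 6))   # 0.09
# Same request with the 75% cached-prompt discount on input tokens:
print(round(request_cost("gpt-4.1", 10_000, 2_000, cached_discount=0.75), 6))  # 0.021
```

On these figures, the same request costs 2.5x less on GPT-4.1 than on GPT-4o before caching, and the raised cached-prompt discount cuts the input portion by a further three quarters.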
This undercuts competitors. Anthropic's 'Claude 3.7 Sonnet' and xAI's 'Grok-3' both charge $3 (4,200 won) for input and $15 (21,200 won) for output per 1 million tokens. Google's 'Gemini 2.5 Pro', at $1.25 (1,700 won) for input and $10 (14,200 won) for output, is the cheapest among competitors' flagship models. OpenAI says that by making the model lighter it has cut operating costs and lowered end-user fees. Indeed, OpenAI has announced it will discontinue API access to GPT-4.5 in July, citing excessively high computation costs.
The industry views the launch of GPT-4.1 as a strategic shift. As competition in developer-oriented AI has intensified, OpenAI is competing on price to secure a favorable position. Anthropic drew engineers' attention by introducing the coding-specialized 'Claude Code' alongside Claude 3.7 Sonnet last February, and around the same time Google opened unlimited use of Gemini Code Assist to general users.
Google, which operates its own cloud, is pushing cost-effectiveness with ultra-low-cost models such as Gemini 'Flash-Lite'. With U.S. big tech following low-cost entrants like China's DeepSeek in cutting API prices to court the developer market, OpenAI had little choice but to respond. That GPT-4.1 is priced below competitors suggests OpenAI may be absorbing some losses, given that it is already struggling with a shortage of computing resources.
OpenAI's move suggests the price war over AI models will continue. Because large language models (LLMs) depend heavily on economies of scale, attracting users with low prices is crucial for capturing the early market. The rapid decline in inference costs driven by advances in AI technology is also cited as a factor behind the price drop: the industry estimates that inference costs for LLMs of equivalent performance fall to one-tenth each year.
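The tenfold-per-year estimate compounds quickly, which can be seen with simple arithmetic. The figures and the `projected_cost` helper below are illustrative only, assuming the article's cited trend holds, not a forecast.

```python
# If inference cost for equivalent LLM performance falls to one-tenth each year,
# the projected cost after n years is simply cost_now * 0.1**n.
def projected_cost(cost_now: float, years: int) -> float:
    """Project a cost forward under a 10x-per-year decline (the article's industry estimate)."""
    return cost_now * 0.1 ** years

# Starting from $8 per 1M output tokens (the GPT-4.1 output rate cited above):
for n in range(4):
    print(n, round(projected_cost(8.0, n), 4))  # 8.0, 0.8, 0.08, 0.008
```

Under that trend, today's flagship-model pricing would reach today's nano-tier pricing within about two years, which is consistent with the article's point that low prices now are a bid to lock in the early market.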