OpenAI continues its efforts to make artificial intelligence accessible to as wide an audience as possible. As part of this effort, it has announced GPT-4o Mini, its most cost-effective small model to date. GPT-4o Mini is expected to make AI significantly more affordable and accessible, expanding the range of applications built with it.
GPT-4o Mini is priced well below previous models: 15 cents per 1 million input tokens and 60 cents per 1 million output tokens for text. This pricing puts AI within reach of far more projects.
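At those rates, per-request cost is simple arithmetic. A minimal sketch (the prices are the ones quoted above; the token counts in the example are made up for illustration):

```python
# Rough cost estimate at GPT-4o Mini's published text rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A request with 10,000 input tokens and 1,000 output tokens
# costs roughly $0.0021 — about a fifth of a cent.
print(estimate_cost(10_000, 1_000))
```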
Thanks to its low cost and low latency, GPT-4o Mini enables a variety of tasks:
Chaining or parallelizing multiple model calls (e.g., fanning out several API requests at once)
Transferring large amounts of context to the model (e.g., an entire codebase or conversation history)
Interacting with customers through fast, real-time text responses (e.g., customer support chatbots)
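The first of these tasks, parallelizing independent model calls, can be sketched with a thread pool. Here `call_model` is a local stand-in, not a real SDK function; in an actual script it would wrap a request to the GPT-4o Mini API, which is I/O-bound and therefore well suited to threads:

```python
# Sketch: running several independent model calls in parallel.
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    # Placeholder for an API round trip. In real code this would
    # send `prompt` to GPT-4o Mini and return the completion text.
    return f"answer to: {prompt}"

def run_parallel(prompts: list[str]) -> list[str]:
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map preserves input order, so each result
        # lines up with the prompt that produced it.
        return list(pool.map(call_model, prompts))
```

Because the model is cheap, issuing many such calls at once becomes economically practical rather than a budget risk.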
At present, the GPT-4o Mini API supports text and image inputs, with support for text, image, video, and audio inputs and outputs planned for the future. The model has a 128,000-token context window, supports up to 16,000 output tokens per request, and has a knowledge cutoff of October 2023. Because it shares GPT-4o's improved tokenizer, working with non-English text is also cheaper.
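Those two limits interact: the prompt and the requested output must fit in the same window. A small guard function, using only the figures stated above, makes the budget explicit:

```python
# GPT-4o Mini's published limits (see text above).
CONTEXT_WINDOW = 128_000     # total tokens per request
MAX_OUTPUT_TOKENS = 16_000   # output tokens per request

def fits_limits(prompt_tokens: int, max_output: int) -> bool:
    """True if the prompt plus the requested output fits the window."""
    return (max_output <= MAX_OUTPUT_TOKENS
            and prompt_tokens + max_output <= CONTEXT_WINDOW)
```

For example, a 120,000-token codebase plus a 16,000-token answer would exceed the window, so the prompt would need trimming first.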
GPT-4o Mini outperforms GPT-3.5 Turbo and other small models on academic benchmarks for both textual intelligence and multimodal reasoning, and it supports the same range of languages as GPT-4o. It also demonstrates strong function-calling performance, enabling developers to retrieve data from external systems or trigger actions, and handles long contexts better than GPT-3.5 Turbo.
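Function calling works by describing your functions to the model as JSON schemas; the model then replies with a function name and JSON-encoded arguments, which your application executes and feeds back. A minimal sketch of that shape — `get_weather` is a hypothetical example function, not part of any SDK:

```python
# Sketch of a function-calling tool definition and argument decoding.
import json

# Schema describing one callable function to the model.
# "get_weather" is a hypothetical external lookup used for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

def parse_tool_arguments(raw_arguments: str) -> dict:
    """Decode the JSON argument string the model returns for a tool call."""
    return json.loads(raw_arguments)
```

In a live request, a list like `tools` is sent along with the messages; when the model decides a function is needed, the application decodes its arguments, runs the real function, and returns the result to the model for the final answer.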
GPT-4o Mini has been evaluated across various key benchmarks, with the following results:
Reasoning tasks: GPT-4o Mini performs better than other small models in reasoning tasks involving both text and images. It scored 82.0% on the MMLU benchmark for textual intelligence and reasoning, while Gemini Flash scored 77.9% and Claude Haiku scored 73.8%.
Mathematical and coding proficiency: GPT-4o Mini demonstrates superior performance in mathematical reasoning and coding tasks compared to previous small models. On the MGSM benchmark for mathematical reasoning, GPT-4o Mini scored 87.0%, while Gemini Flash scored 75.5% and Claude Haiku scored 71.7%. On the HumanEval benchmark for coding performance, GPT-4o Mini scored 87.2%, while Gemini Flash scored 71.5% and Claude Haiku scored 75.9%.
Multimodal reasoning: GPT-4o Mini also shows strong performance on the MMMU benchmark for multimodal (text and vision) reasoning, scoring 59.4%, while Gemini Flash scored 56.1% and Claude Haiku scored 50.2%.
Beyond raw performance, GPT-4o Mini incorporates several safety measures:
Reinforcement Learning from Human Feedback (RLHF): RLHF is used to align the model’s behavior with OpenAI’s policies and improve the accuracy and reliability of its responses.
External Expert Evaluation: Over 70 independent experts in fields such as social psychology and disinformation have evaluated GPT-4o Mini for potential risks.
Instruction Hierarchy: GPT-4o Mini is the first model to use the “instruction hierarchy” method. This approach enhances the model’s resistance to jailbreak attacks, prompt injections, and system prompt extraction attempts.
You can experience GPT-4o Mini through SkyStudio and take advantage of a free demo by reaching out to us!