OpenAI has announced the launch of GPT-4, the latest iteration of its deep learning models, featuring improved creativity and reliability, the ability to accept image inputs, and the capacity to work with texts around the same length as a short novel.
The company has also released details of big-name companies and organisations already using GPT-4, including Morgan Stanley, Stripe, Duolingo and the Government of Iceland, the last of which is using OpenAI’s technology to help preserve its national language.
With human-level performance on various professional and academic benchmarks, GPT-4 surpasses GPT-3.5 by a significant margin, exhibiting an increased ability to handle complex tasks and more nuanced instructions.
Over the past two years, OpenAI has rebuilt its entire deep learning stack in collaboration with Microsoft’s Azure, co-designing a supercomputer specifically for its workload. GPT-4's training run proved to be unprecedentedly stable, allowing OpenAI to accurately predict its performance for the first time. As the organisation continues to focus on reliable scaling, it says it aims to refine its methodology for predicting and preparing for future capabilities increasingly far in advance.
Initially, GPT-4's text input capability will be available via ChatGPT and the API (with a waitlist), while the image input capability will be prepared for wider availability in collaboration with a single partner. OpenAI will also open-source OpenAI Evals, its framework for automated evaluation of AI model performance, to encourage community participation in reporting model shortcomings.
The model also accepts prompts of text and images, allowing users to specify vision or language tasks. Although image inputs are still a research preview and not publicly available, GPT-4's performance on standard academic vision benchmarks has been promising.
Despite its capabilities, GPT-4 shares some limitations with earlier GPT models, such as “hallucinating” facts and making reasoning errors. However, GPT-4 has significantly reduced hallucinations compared to previous models, scoring 40% higher than GPT-3.5 on internal adversarial factuality evaluations.
OpenAI says GPT-4’s new feature highlights are:
- Enhanced creativity and collaboration: GPT-4 can generate, edit, and collaborate on creative and technical writing tasks, including song composition, screenplay writing, and adapting to a user's writing style.
- Image input capabilities: GPT-4 can process images and generate captions, classifications, and analyses.
- Extended text handling: GPT-4 can manage over 25,000 words of text, enabling long-form content creation, prolonged conversations, and document search and analysis; for reference, that limit is around the same word count as The Old Man and the Sea by Ernest Hemingway and only slightly less than George Orwell’s Animal Farm or John Steinbeck’s Of Mice and Men.
- Improved problem-solving: GPT-4 can tackle complex problems more accurately due to its expanded general knowledge and problem-solving skills.
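To give a rough sense of what the 25,000-word figure means in practice, the short Python sketch below checks whether a document fits under that limit and splits it into pieces if it does not. It is an illustration only: the function names are hypothetical, and it counts whitespace-separated words, whereas models of this kind typically measure input in tokens, so the result should be read as an approximation rather than an exact limit.

```python
from typing import List

# Figure quoted in the feature list above. Assumption: we treat it as a hard
# cap on whitespace-separated words, which is only an approximation.
WORD_LIMIT = 25_000


def count_words(text: str) -> int:
    """Count whitespace-separated words in the text."""
    return len(text.split())


def fits_within_limit(text: str, limit: int = WORD_LIMIT) -> bool:
    """Return True if the text has at most `limit` words."""
    return count_words(text) <= limit


def split_into_chunks(text: str, limit: int = WORD_LIMIT) -> List[str]:
    """Split the text into consecutive pieces of at most `limit` words each."""
    words = text.split()
    return [" ".join(words[i:i + limit]) for i in range(0, len(words), limit)]
```

A novella-length manuscript would pass `fits_within_limit` in one piece, while a longer book would come back from `split_into_chunks` as several pieces to be handled separately.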
“ChatGPT Plus subscribers will get GPT-4 access on chat.openai.com with a usage cap,” says the OpenAI team. “We will adjust the exact usage cap depending on demand and system performance in practice, but we expect to be severely capacity constrained (though we will scale up and optimise over upcoming months).
“Depending on the traffic patterns we see, we may introduce a new subscription level for higher-volume GPT-4 usage; we also hope at some point to offer some amount of free GPT-4 queries so those without a subscription can try it too.”
Chatbot helps Morgan Stanley unlock potential
Morgan Stanley hosts a content library comprising hundreds of thousands of pages covering investment strategies, market research, commentary, and analyst insights. This massive repository is stored across numerous internal sites, primarily in PDF format, so advisors must sift through extensive information to address specific queries.
This process can be both time-consuming and unwieldy, so Morgan Stanley is using GPT-4 to power an internal-facing chatbot that performs a comprehensive search of wealth management content and attempts to unlock the cumulative knowledge of Morgan Stanley Wealth Management, says Jeff McMillan, Head of Analytics, Data & Innovation, whose team is leading the initiative.
“You essentially have the knowledge of the most knowledgeable person in Wealth Management — instantly”, says McMillan. “Think of it as having our Chief Investment Strategist, Chief Global Economist, Global Equities Strategist, and every other analyst around the globe on call for every advisor, every day. We believe that is a transformative capability for our company.”
Stripe, the global payments platform, recently asked 100 employees to come up with potential applications for GPT-4. The company hoped to identify products and workflows that could be accelerated by LLMs.
Eugene Mann, Product Lead for Stripe's Applied Machine Learning team, noted that GPT-4 was a game-changer, opening up numerous possibilities. The team compiled a list of 50 potential applications to test with GPT-4, and 15 prototypes were deemed strong candidates for integration, including support customisation, question answering, and fraud detection.
One application involved better understanding users' businesses, with GPT-4 scanning websites and delivering summaries that outperformed human-written ones. In another, GPT-4 acted as a virtual assistant, digesting and understanding Stripe's extensive technical documentation to answer support questions and troubleshoot issues.
“Our mission was to identify products and workflows across Stripe that could be accelerated by large language models and to really understand where LLMs work well today and where they still struggle,” says Mann. “But just having access to GPT-4 enabled us to realise, ‘Oh, there are all these problems that could be solved with GPT surprisingly well.’ ”