OpenAI’s Sora 2: Redefining Safe, Physics-Driven Video AI

The competition to develop video generation systems that can accurately simulate physical reality has intensified across the technology sector over the past year.
OpenAI has now released Sora 2, a video and audio generation model that the firm says demonstrates improved physics simulation capabilities compared to earlier systems.
The model is a development in what OpenAI describes as world simulation technology, which uses neural networks to generate video content that adheres more closely to physical laws.
The system can now model scenarios such as gymnastics routines and basketball rebounds, along with physical dynamics such as buoyancy and rigidity.
The Sora team writes that the model marks progress from the original Sora, which launched in February 2024.
“The Sora team has been focused on training models with more advanced world simulation capabilities,” the team writes in an OpenAI blog post.
“We believe such systems will be critical for training AI models that deeply understand the physical world.
“A major milestone for this is mastering pre-training and post-training on large-scale video data, which are in their infancy compared to language.”
Sora 2’s enhanced capabilities
The OpenAI Sora team notes that earlier video generation models would alter objects and deform scenarios to match text prompts.
In contrast, Sora 2 attempts to model outcomes that follow physics constraints.
“Prior video models are overoptimistic – they will morph objects and deform reality to successfully execute upon a text prompt,” the team writes.
“For example, if a basketball player misses a shot, the ball may spontaneously teleport to the hoop.
“In Sora 2, if a basketball player misses a shot, it will rebound off the backboard.”
The system can follow instructions across multiple shots while maintaining consistency of elements within generated scenes.
It operates across visual styles including photorealistic, cinematic and anime formats.
The model also generates audio elements including background soundscapes, speech and sound effects alongside video content.
Additionally, OpenAI has introduced a feature that allows users to insert recordings of people or objects into generated environments.
“By observing a video of one of our teammates, the model can insert them into any Sora-generated environment with an accurate portrayal of appearance and voice,” the team writes.
How does Sora 2 work?
The company has released an iOS mobile application that provides access to Sora 2 through an invite system.
The app includes a feature called cameos, which requires users to record a video and audio sample for identity verification before they can appear in generated content.
Users can maintain control over their digital likeness through permission settings.
“Only you decide who can use your cameo and you can revoke access or remove any video that includes it at any time,” the team says.
OpenAI has implemented what it describes as a natural language recommender system, which uses the company’s language models to allow users to instruct the content feed through text commands.
The company states it has designed the algorithm to prioritise content from accounts users follow and videos that might serve as creative inspiration.
“We explicitly designed the app to maximise creation, not consumption,” they say.
Sora 2’s safety features
The company has introduced multiple features to protect the user experience.
Distinguishing AI content
Sora 2 embeds visible watermarks and C2PA metadata in every video, alongside internal tracing tools, to ensure AI-generated content is identifiable and accountable.
- Physics-accurate video generation with realistic motion and outcomes
- Audio and dialogue synchronisation alongside video content
- Multi-style output: photorealistic, cinematic and anime formats
- Cameo feature with identity-verified likeness control
- Built-in safety systems: watermarks, filters, parental controls
Consent-based likeness
Users control how their likeness is used through cameo features, with the ability to revoke access, review drafts, delete or report content and set custom preferences. Public figures are blocked unless they opt in.
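OpenAI has not published how these permissions are implemented, but the rules described above can be read as a simple access check. The Python sketch below is purely illustrative: the `Cameo` class, its field names and its methods are hypothetical and are not part of any OpenAI API.

```python
from dataclasses import dataclass, field


@dataclass
class Cameo:
    """Hypothetical record of one person's likeness and who may use it."""
    owner: str
    is_public_figure: bool = False
    opted_in: bool = False  # only consulted for public figures
    allowed_users: set[str] = field(default_factory=set)

    def grant(self, user: str) -> None:
        """The owner gives another user permission to use the cameo."""
        self.allowed_users.add(user)

    def revoke(self, user: str) -> None:
        """The owner withdraws a permission; safe to call if never granted."""
        self.allowed_users.discard(user)

    def can_be_used_by(self, user: str) -> bool:
        # Owners can always use their own likeness.
        if user == self.owner:
            return True
        # Public figures are blocked for everyone else unless they opt in.
        if self.is_public_figure and not self.opted_in:
            return False
        # Everyone else needs an explicit grant from the owner.
        return user in self.allowed_users


# Example: a grant can be withdrawn at any time.
cameo = Cameo(owner="alice")
cameo.grant("bob")
print(cameo.can_be_used_by("bob"))  # True
cameo.revoke("bob")
print(cameo.can_be_used_by("bob"))  # False
```

The sketch defaults to denial: a likeness is unavailable to anyone but its owner until access is explicitly granted, which mirrors the opt-in approach the company describes.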
Safeguards for teens
The platform restricts mature content, blocks adults from initiating contact with teens and introduces parental controls via ChatGPT. Teen users face limits on scrolling and receive a non-personalised feed by default.
Filtering harmful content
Sora deploys layered defences to block unsafe prompts and outputs, filters feed content against global policies and applies tighter rules because of the realism of video, with human moderation supplementing automation.
Audio safeguards
Generated audio is reviewed for policy violations. The system also blocks imitation of living artists and honours takedown requests from creators.
User control and recourse
Users decide when to publish content, can remove or report videos and accounts and maintain control over visibility and interactions.
“Video models are getting very good, very quickly,” the team writes.



