Imagine waking up in the middle of the night from the weirdest dream of your cats organizing the shelf. What if you could relive that dream by generating a video of them doing so? Sounds impossible, right? Well, not anymore. Thanks to Sora, creativity knows no bounds! OpenAI, the maker of ChatGPT, has unveiled Sora, a new AI model that can produce realistic-looking videos up to one minute long from just text prompts.

What is Sora and how does it work?

Sora, OpenAI’s text-to-video generator, can generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. It can also create videos based on a still image or extend existing footage with fresh content. Beyond understanding what the user asks for in a prompt, it can comprehend how those things exist in the real world. Sora can pick up on the user’s preferences for the style and mood of a video and adjust the lighting, color, and camera angles accordingly. It can handle different genres and themes, including comedy, horror, sci-fi, and fantasy, and it can help viewers explore and learn more about different topics and subjects, based on their curiosity and questions.

Following the introduction of the model, Sam Altman, CEO of OpenAI, shared Sora creations based on prompts requested by his followers. From cycling dolphins to a squirrel riding a dragon, he posted sample videos that showcase Sora's versatility. The results generated from users’ prompts looked very impressive. However, it is crucial to be cautious about a model that can effortlessly generate a one-minute video from simple text prompts, as it could easily be misused.

Challenges and Limitations

The software is currently in the red teaming phase, where the company is working to identify flaws in the system. On its official website, OpenAI has stated that it is taking several safety measures before making Sora available in its products. The company added that it is working with a team of domain experts in misinformation, hateful content, and bias, who will be adversarially testing Sora. The company is also building tools such as a detection classifier that can flag misleading content and tell whether a video was generated by Sora. A text classifier deployed by OpenAI will check and reject prompts that violate the company’s usage policies, including requests for extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others. The company also has strong image classifiers that will review the frames of every generated video to ensure that they comply with its usage policies.

OpenAI’s Sora comes at a time when text-to-video models have already shown the astounding capabilities of AI video generation. OpenAI presents Sora as a step toward artificial general intelligence, and the model appears to be well ahead of existing generative AI video tools. Google and Meta have also introduced similar models; however, OpenAI seems to have surpassed them.

