Sora from OpenAI: Artificial intelligence creates photorealistic videos

Something is happening again in the field of artificial intelligence. Now that the initial hype surrounding generative text and image programs such as ChatGPT and DALL-E has died down, OpenAI is once again attracting attention. This time it is about the artificial creation of photorealistic videos. The newly presented program is called Sora, and although it is initially only available to a small group of users, it is already causing quite a stir online. We explain what Sora can do and what impact artificial intelligence could have in the field of video.

Sora is a so-called text-to-video program. OpenAI describes how Sora works on its website. According to that description, Sora is based on a similar principle to OpenAI's ChatGPT. The AI text generator ChatGPT splits words into smaller units (called "tokens") and generates new text from them. This requires training on a large amount of text over a long period of time. Sora was also trained on huge amounts of data, in this case videos rather than text. These videos are likewise broken down into smaller individual parts (called "patches"), from which the program can generate new videos after a long training period.
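To make the idea of "patches" more concrete, here is a minimal sketch of how a video can be cut into small blocks that span a few frames and a few pixels each. It is only an illustration under simplifying assumptions: the function name `split_into_patches`, the toy video and the patch sizes are all hypothetical, the video is grayscale, and the sketch works directly on pixel values. It does not claim to show how Sora itself is implemented.

```python
def split_into_patches(video, pt, ph, pw):
    """Cut a video into blocks of pt frames x ph rows x pw columns.

    `video` is a nested list indexed as video[frame][row][column].
    Each returned patch is a flat list of the pixel values it covers.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for f0 in range(0, frames, pt):          # step through time
        for y0 in range(0, height, ph):      # step through rows
            for x0 in range(0, width, pw):   # step through columns
                patch = [
                    video[f][y][x]
                    for f in range(f0, f0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# Toy video (assumption): 4 frames of 4x4 pixels. Each pixel value encodes
# its own position as frame*100 + row*10 + column, so patches are easy to read.
video = [[[f * 100 + y * 10 + x for x in range(4)] for y in range(4)]
         for f in range(4)]

patches = split_into_patches(video, pt=2, ph=2, pw=2)
print(len(patches), len(patches[0]))  # 8 8
print(patches[0])                     # [0, 1, 10, 11, 100, 101, 110, 111]
```

Each patch plays a role comparable to a token in a text model: a small, uniform unit that a model can learn from and later recombine into new content.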

As with the release of the generative language model ChatGPT at the end of 2022, many users are particularly enthusiastic about the high quality of the artificially generated content. On the Sora website, you can get an idea of how deceptively real some of the videos look. OpenAI itself points out, however, that there are still some things Sora has problems with, and it provides examples. One of the videos, for instance, shows a candle with two flames burning at the wick. Whether such errors are the exception or the rule cannot yet be reliably assessed, as Sora is not yet available to the public.

What about the safety of Sora?

OpenAI now seems to be a little more sensitive to the fact that artificial intelligence could cause great harm in the wrong hands. The company therefore says that it will take several important safety steps before making Sora publicly available. For now, Sora is only available to a closed group of test users. According to OpenAI, this group includes experts in the areas of misinformation, hateful content and bias.

In addition, technical methods are to be used to prevent misuse of Sora. For example, OpenAI is working on a method for detecting after the fact whether a video was created with Sora. The company also uses safety methods that are already in place for its image generator, DALL-E 3. For example, text prompts are checked and rejected if they violate OpenAI's terms of use. This is intended to prevent the depiction of extreme violence, sexual content, prominent people, protected brands and other problematic content. The generated videos are also to be checked automatically before they are shown to users. Despite these precautions and the promised exchange with civil society (OpenAI cites politicians, educators and artists as examples), OpenAI points out that it is impossible to predict exactly how people will use the new technology.

What dangers could arise from text-to-video programs?

  1. Misinformation and manipulation: The ability to create photorealistic video content from text can make it easier to spread misinformation and manipulated content. If artificial videos are difficult or impossible to recognize, trust in visual media could be shaken.
  2. Privacy and data protection: Programs such as Sora could lead to significant data protection problems. This applies in particular if personal or copyrighted content is created and distributed without consent. It is also conceivable that people's privacy could be violated if fake videos are created of them.
  3. Identity theft and abuse: Text-to-video programs could be used by malicious actors to create fake identities or manipulated videos for fraudulent purposes. This could lead to serious legal and social problems, including identity theft and reputational damage.
  4. Social impact and ethics: The widespread adoption of text-to-video technologies such as Sora could lead to a shift in social norms and ethical standards. Issues of authenticity, accountability in content creation and control over digital identities will become increasingly relevant and require careful consideration.