OpenAI's Sora: Navigating Legal and Data Protection Challenges under GDPR

Publication date: 08.05.2024

Legal and data protection experts are turning their attention to OpenAI's video generation tool, Sora, as the Italian Data Protection Authority, the Garante, opens a probe into it. The outcome of this investigation could have repercussions for generative AI companies that rely on legitimate interest as their legal basis for data processing under the General Data Protection Regulation (GDPR).

What is OpenAI’s Sora?

Generative AI technologies, such as OpenAI's Sora, generate new material based on the data on which they have been trained, which frequently consists of large amounts of internet text. While these tools can be extremely useful for quickly producing realistic, high-quality content, their data usage may be incompatible with GDPR requirements. This depends on whether the data used by these systems is considered personal data under the law and, if so, whether the company can justify processing it on the basis of legitimate interest.

Italian Authority Probes OpenAI's Sora

The Garante is taking on Sora while still conducting its original probe into OpenAI's ChatGPT, which it temporarily banned in March 2023. According to a March 8 press release from the Garante, the authority specifically asked OpenAI to explain "how the algorithm is trained; what data is collected and processed to train the algorithm, especially if it is personal data; whether particular categories of data (religious or philosophical beliefs, political opinions, genetic data, health, sexual life) are collected; and which sources are used."

The core of Sora's possible legal issues in Europe is not simply the data used to train its algorithm, but also how OpenAI intends to use its image and video database. In that respect, many of the legal issues that Sora may pose for European users are the same as those raised by text-based generative tools. However, some data privacy professionals believe that Sora and other video-based generative AI technologies may not fit the legal precedents that experts and regulators have previously relied on, and that the Garante's Sora probe may be venturing into uncharted territory.

Garante's Evolving Approach to OpenAI's Sora

Several privacy experts consider the Garante's inquiry into Sora to be, in many ways, a continuation of its previous work on ChatGPT. Still, a lot has changed since the Garante first rapped OpenAI on the knuckles. The Garante is sure to draw on its previous findings in shaping its approach to Sora, and may already be doing so. After receiving criticism for its ban on ChatGPT, the authority chose a different tactic this time, targeting Sora before its expected introduction in the European market.

Assessing Sora's Data Processing

Until now, providers of large language models (LLMs) have most likely relied on legitimate interest to justify their data processing. This legal basis requires a balancing test that weighs the company's legitimate interest in, and the usefulness of, its service or product against data subjects' rights and freedoms. Many experts think that the impact on individuals' privacy rights is likely to be greater with videos and photographs than with text.

The major concern with Sora is whether it merely collects personal data, i.e., images and videos of data subjects, or whether it also performs biometric processing on that information. If it does the latter, OpenAI's new tool will enter the heavily restricted realm of special categories of personal data under GDPR.

Balancing AI Innovation with Legal Compliance

This is only one example of a bigger trend in which legal authorities are wrestling with the consequences of new technology for existing laws and regulations. With technology growing at a quick pace, integrating it with the current legal framework is a difficult but necessary undertaking. Legal practitioners must continue to adapt and broaden their understanding of how technology such as AI fits into present privacy regulations and how these laws may need to evolve in the future.

Conclusion

The legal community is closely monitoring the Garante's inquiry into Sora. If the authority rules in favor of OpenAI and finds that the company's use of data to train Sora is acceptable, it might establish a precedent for other generative AI businesses. Conversely, if the Garante rules against OpenAI, it may prompt generative AI businesses to reassess their practices to ensure GDPR compliance.
