OpenAI failed to deliver the opt-out tool it promised by 2025

Back in May, OpenAI said it was developing a tool to let creators specify how they want their works to be included in — or excluded from — its AI training data. But seven months later, this feature has yet to see the light of day.

Called Media Manager, the tool would "identify copyrighted text, images, audio, and video," OpenAI said at the time, to reflect creators' preferences "across multiple sources." It was intended to stave off some of the company's fiercest critics, and potentially shield OpenAI from IP-related legal challenges.

But people familiar with the matter tell TechCrunch that the tool was rarely viewed internally as an important launch. "I don't think it was a priority," one former OpenAI employee said. "To be honest, I don't remember anyone working on it."

A non-employee who coordinates work with the company told TechCrunch in December that they had discussed the tool with OpenAI in the past, but that there haven't been any recent updates. (These people declined to be publicly identified discussing confidential business matters.)

And a member of OpenAI's legal team who was working on Media Manager, Fred von Lohmann, transitioned to a part-time consultant role in October. OpenAI PR confirmed von Lohmann's move to TechCrunch via email.

OpenAI has yet to give an update on Media Manager's progress, and the company missed a self-imposed deadline to have the tool in place "by 2025." (To be clear, "by 2025" could be read as inclusive of the year 2025, but TechCrunch interpreted OpenAI's language to mean leading up to January 1, 2025.)

IP issues

AI models like OpenAI's learn patterns in sets of data to make predictions — for instance, that a person biting into a burger will leave a bite mark. This allows models to learn how the world works, to a degree, by observing it. ChatGPT can write convincing emails and essays, while Sora, OpenAI's video generator, can create relatively realistic footage.

The ability to draw on examples of writing, film, and more to generate new works is what makes AI so powerful. But it is also what makes it regurgitative. When prompted in certain ways, models, most of which are trained on countless web pages, videos, and images, can produce near-copies of that training data, which, despite being "publicly available," was never meant to be used this way.
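To make the pattern-learning and regurgitation points concrete, here is a deliberately tiny sketch: a bigram word model in Python. This is an illustration, not OpenAI's actual architecture, and every name in it (corpus, successors, generate) is invented for the example. The point is that the same statistics that let a model generate fluent-looking text can also cause it to reproduce its training text verbatim when the training set is small relative to what the model can memorize.

```python
import random
from collections import defaultdict

# Toy illustration (not OpenAI's architecture): a bigram model trained on a
# tiny corpus. It "learns" which word tends to follow which, then generates
# text by sampling those learned patterns.
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count which words follow each word in the training text.
successors = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word].append(next_word)

def generate(start: str, length: int = 8) -> str:
    """Sample a sequence by repeatedly picking an observed next word."""
    words = [start]
    for _ in range(length - 1):
        options = successors.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))
# With so little training data, the output is often a near-copy of the
# corpus, e.g. "the cat sat on the mat and the" -- an exaggerated analogue
# of the regurgitation problem described above.
```

Production models are vastly larger and far more capable of generalizing, but the underlying tension is similar: the patterns that power generation are extracted from, and can sometimes reproduce, the training data itself.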

For example, Sora can generate clips featuring TikTok's logo and popular video game characters. The New York Times has gotten ChatGPT to quote its articles verbatim (OpenAI blamed the behavior on a "hack").