We finally have an 'official' definition for open source AI

There's finally an "official" definition of open source AI.

The Open Source Initiative (OSI), a long-running institution aiming to define and "steward" all things open source, today released version 1.0 of its Open Source AI Definition (OSAID). The product of several years of collaboration with academia and industry, the OSAID is intended to offer a standard by which anyone can determine whether AI is open source — or not.

You might be wondering — as this reporter was — why consensus matters for a definition of open source AI. Well, a big motivation is getting policymakers and AI developers on the same page, said OSI EVP Stefano Maffulli.

"Regulators are already watching the space," Maffulli told TechCrunch, noting that bodies like the European Commission have sought to give special recognition to open source. "We did explicit outreach to a diverse set of stakeholders and communities — not only the usual suspects in tech. We even tried to reach out to the organizations that most often talk to regulators in order to get their early feedback."

Open AI

To be considered open source under the OSAID, an AI model has to provide enough information about its design so that a person could "substantially" recreate it. The model must also disclose any pertinent details about its training data, including the provenance, how the data was processed, and how it can be obtained or licensed.

"An open source AI is an AI model that allows you to fully understand how it's been built," Maffulli said. "That means that you have access to all the components, such as the complete code used for training and data filtering."

The OSAID also lays out usage rights developers should expect with open source AI, like the freedom to use the model for any purpose and modify it without having to ask anyone's permission. "Most importantly, you should be able to build on top," added Maffulli.

The OSI has no enforcement mechanisms to speak of; it can't compel developers to abide by the OSAID. But it does intend to flag models that are described as "open source" yet fall short of the definition.

"Our hope is that when someone tries to abuse the term, the AI community will say, 'We don't recognize this as open source,' and it gets corrected," Maffulli said. Historically, this has had mixed results, but it isn't entirely without effect.

Many startups and big tech companies, most prominently Meta, have employed the term "open source" to describe their AI model release strategies — but few meet the OSAID's criteria. For example, Meta mandates that platforms with more than 700 million monthly active users request a special license to use its Llama models.