
First of all, what does “open source AI” mean?

By TechBrunch | June 22, 2024 | 9 Mins Read


The battle between open source and proprietary software is well known, but the tensions that have permeated the software industry for decades have also found their way into the burgeoning field of artificial intelligence, where they've sparked fierce debate.

The New York Times recently published a glowing appraisal of Meta CEO Mark Zuckerberg, noting that his embrace of “open source AI” has rekindled his popularity in Silicon Valley. The problem is that Meta's Llama-branded large language models are not actually open source.

Or are they?

By most estimates, they are not. But the episode does highlight that the concept of “open source AI” is likely to become even more contested in the future. This is something the Open Source Initiative (OSI) is trying to address. Led by Executive Director Stefano Maffulli, the organization has been working on the problem for over two years through a global effort that includes conferences, workshops, panels, webinars, and reports.

AI is not software code

Image credit: Westend61 via Getty

For over 25 years, OSI has been the custodian of the Open Source Definition (OSD), defining how the term “open source” can and should be applied to software. Any license that meets this definition can legitimately be considered “open source,” but a wide range of licenses are permitted, from very permissive to not so permissive.

But applying traditional software licensing and naming conventions to AI is problematic. Joseph Jacks, open source evangelist and founder of venture capital firm OSS Capital, goes so far as to say that “there is no such thing as open source AI,” noting that “open source was invented specifically for software source code.”

In contrast, “neural network weights” (NNWs), a term used in the artificial intelligence world to describe the parameters or coefficients that a network learns during training, cannot be meaningfully compared to software.

“Neural net weights are not software source code; they cannot be read or debugged by humans,” Jacks points out. “Furthermore, the fundamental rights of open source do not apply in quite the same way to NNWs.”
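To make the point about weights concrete, here is a minimal sketch in Python, assuming a toy one-variable model rather than a real neural network. The data, learning rate, and step count are illustrative choices, not anything taken from Jacks or OSI. It fits a line by gradient descent and shows that what training produces is just numbers.

```python
# Minimal sketch: fit y = w*x + b by gradient descent, then look at the result.
# The data, learning rate, and step count are illustrative assumptions.

def train_linear(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1

w, b = train_linear(xs, ys)
print(w, b)  # two learned floats, close to 2 and 1
```

The training code above is ordinary source code that a person can read and debug. The learned pair `(w, b)` is not: it is data produced by running that code on a dataset, and a large language model holds billions of such values instead of two.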

This led Jacks and his OSS Capital colleague Heather Meeker to come up with their own definition, centered around the concept of “open weights.”

So before we arrive at a meaningful definition of “open source AI,” we can see that trying to get there will create some inherent tensions: how can we agree on a definition if we can't agree that the “thing” we're defining exists?

Maffulli agrees.

“You're right,” he told TechCrunch. “One of the early discussions was whether we should even call this open source AI, but everyone was already using that term.”

This mirrors a challenge in the broader field of AI, where there is a lot of debate about whether what we call “AI” today is really AI, or just powerful systems taught to find patterns in reams of data. But skeptics have mostly accepted that the “AI” label is here to stay, and see no point in fighting it.

Llama illustration. Image credit: Larysa Amosova via Getty

Founded in 1998, OSI is a non-profit public benefit corporation that focuses on advocacy, education, and a wide range of open source related activities with the Open Source Definition at its core. Today, the organization relies on sponsors for funding and includes such notable members as Amazon, Google, Microsoft, Cisco, Intel, Salesforce, and Meta.

Meta's involvement with OSI is especially notable in relation to the current concept of “open source AI.” While Meta positions its AI as open source, the company places notable restrictions on how the Llama models can be used. True, they can be used free of charge for research and commercial purposes, but app developers with more than 700 million monthly users must request a special license from Meta, which it grants purely at its own discretion.

Simply put, Meta's Big Tech peers are out of luck if they want in.

Meta's language around its LLMs has been somewhat malleable: the company called its Llama 2 model open source, but with the arrival of Llama 3 in April, it toned down that terminology in favor of phrases like “openly available” and “openly accessible,” though it still calls the model “open source” in some places.

“Everyone else involved in this discussion is in complete agreement that Llama itself cannot be considered open source,” Maffulli said. “People I've talked to who work at Meta know that it's a bit of a stretch.”

On top of that, one might argue there is a conflict of interest here: are the companies that have demonstrated a desire to piggyback on the open source brand also funding the maintainers of the “definition”?

That's one reason OSI is looking to diversify its funding. It recently won a grant from the Sloan Foundation, which is helping to fund OSI's multi-stakeholder, global effort to reach a definition of open source AI. TechCrunch revealed that the grant was worth about $250,000, and Maffulli hopes it will change perceptions of OSI's reliance on corporate funding.

“One of the things the Sloan grant makes even clearer is that we can say goodbye to Meta's funding at any time,” Maffulli said. “We can do that even before the Sloan grant is paid out, because we know that we're going to be getting donations from other people, and Meta knows that very well. They're not going to interfere with this [process] at all. Microsoft, GitHub, Amazon, and Google all fully understand that the organization's structure means they cannot interfere.”

A working definition of open source AI

Conceptual diagram depicting finding a definition. Image credit: Alexei Morozov/Getty Images

The current draft Open Source AI Definition is at version 0.0.8 and consists of three main parts: a “Preamble” that outlines the scope of the document, the Open Source AI Definition itself, and a checklist of necessary components for an open source compliant AI system.

According to the current draft, open source AI systems must grant the freedom to use the system for any purpose without asking permission, the freedom for others to study how the system works and inspect its components, and the freedom to modify and share the system for any purpose.
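One way to read those freedoms is as a checklist applied to a license's terms. The sketch below is a toy illustration under that assumption. The field names are hypothetical paraphrases of the draft wording, and this is not OSI's actual evaluation process. It shows how a single condition, such as requiring large operators to request permission, leaves one freedom unmet.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LicenseTerms:
    # Hypothetical field names paraphrasing the freedoms in the draft definition.
    use_any_purpose_without_permission: bool
    study_and_inspect_components: bool
    modify_for_any_purpose: bool
    share_for_any_purpose: bool

def missing_freedoms(terms):
    """Return the names of the freedoms that the given terms do not grant."""
    return [name for name, granted in vars(terms).items() if not granted]

# A license that grants everything unconditionally.
permissive = LicenseTerms(True, True, True, True)

# A license that makes some users ask for permission first (for example,
# operators above a monthly-user threshold), so use is not unconditional.
capped = LicenseTerms(
    use_any_purpose_without_permission=False,
    study_and_inspect_components=True,
    modify_for_any_purpose=True,
    share_for_any_purpose=True,
)

print(missing_freedoms(permissive))  # []
print(missing_freedoms(capped))      # ['use_any_purpose_without_permission']
```

Under this reading, a license qualifies only when the list comes back empty. Generous terms for most users do not offset a freedom that is withheld from some.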

But one of the biggest challenges was around data: whether an AI system can be classified as “open source” if a company doesn't make its training datasets available to others. Maffulli says it's more important to know where the data came from and how the developers labeled, deduplicated, and filtered it. It's also important to have access to the code that was used to assemble the datasets from various sources.

“Knowing that information is much better than just having a dataset without the rest of the information,” Maffulli said.

While it would be nice to have access to the full dataset (OSI lists this as an “optional” component), Maffulli says that in many cases, this is not possible or practical. This may be because the dataset contains confidential or copyrighted information that developers are not allowed to redistribute. Additionally, there are ways to train machine learning models without the data itself being shared with the system, such as federated learning, differential privacy, and homomorphic encryption.
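Of the three techniques named above, federated learning is the simplest to sketch. The example below is a toy version of federated averaging in Python, assuming a one-parameter model `y = w * x` and two hypothetical clients. It does not cover differential privacy or homomorphic encryption, and real deployments involve considerably more machinery. Each client trains on its own data and returns only an updated weight, so the server never handles the raw examples.

```python
# Toy federated averaging for the model y = w * x.
# Client data, learning rate, and round counts are illustrative assumptions.

def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient steps on one client's private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(client_datasets, rounds=50):
    w_global = 0.0
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        # Each client starts from the shared weight and trains locally.
        local_weights = [local_update(w_global, d) for d in client_datasets]
        # The server sees only weights, averaged by client dataset size.
        w_global = sum(
            w * len(d) for w, d in zip(local_weights, client_datasets)
        ) / total
    return w_global

clients = [
    [(1.0, 2.0), (2.0, 4.0)],               # stays on client 1
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],  # stays on client 2
]

w = federated_average(clients)
print(w)  # close to 2, the slope shared by both clients' data
```

In a setup like this there is no single training dataset that the model's publisher could release, which is one reason the draft treats the full dataset as optional.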

And this highlights a fundamental difference between “open source software” and “open source AI”: they may be similar in intent, but they are not directly comparable, and it is this difference that OSI is trying to capture in its definition.

In software, source code and binary code are two views of the same artifact: they reflect the same program in different forms. However, a training dataset and the subsequent trained model are different things. Using the same dataset does not necessarily allow you to consistently recreate the same model.

“There's a lot of statistical and random logic that happens during training, so it can't be replicated in the same way as software,” Maffulli added.
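The effect of that randomness shows up even in a very small example. The sketch below assumes a toy one-parameter model trained with stochastic gradient descent. The dataset, seeds, and hyperparameters are made up for illustration. It trains on the same data under different random seeds, where the seed controls both the starting weight and the order in which examples are drawn.

```python
import random

def train(data, seed, steps=20, lr=0.01):
    """Fit y = w * x with stochastic gradient descent under a given seed."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)         # random initialization
    for _ in range(steps):
        x, y = rng.choice(data)        # randomly drawn training example
        w -= lr * 2 * (w * x - y) * x  # one SGD step on the squared error
    return w

# Same dataset for every run (roughly y = 2x, with some noise).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

run_a = train(data, seed=0)
run_b = train(data, seed=1)
run_a_again = train(data, seed=0)

print(run_a, run_b)          # same data, different learned weights
print(run_a == run_a_again)  # same data and same seed give the same weight
```

In this toy, recording the seed is enough to reproduce a run exactly. Large-scale training has further sources of variation, such as hardware and parallel execution order, so a released dataset alone says even less about whether a model can be rebuilt.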

Therefore, an open source AI system should be easy to replicate with clear instructions. This is where the checklist aspect of the Open Source AI Definition comes in. The checklist is based on a recently published academic paper called “The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence.”

The paper proposes the Model Openness Framework (MOF), a classification system that evaluates machine learning models “based on their completeness and openness.” MOF requires that certain components of AI model development, such as details about training methods and model parameters, be “included and released under an appropriate open license.”
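A checklist of this kind maps naturally onto a small data structure. The following Python sketch is an illustration only: the component names are drawn from the items this article mentions, and which ones are marked as required is an assumption made for the example, not the actual MOF or OSI checklist.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    required: bool      # assumption for this example, not the official list
    included: bool      # was the component published with the release?
    open_license: bool  # is it under an appropriate open license?

def unmet_requirements(components):
    """Names of required components that are missing or not openly licensed."""
    return [
        c.name
        for c in components
        if c.required and not (c.included and c.open_license)
    ]

# A hypothetical model release.
release = [
    Component("model parameters", required=True, included=True, open_license=True),
    Component("training method details", required=True, included=True, open_license=False),
    Component("dataset assembly code", required=True, included=False, open_license=False),
    Component("full training dataset", required=False, included=False, open_license=False),
]

print(unmet_requirements(release))
# ['training method details', 'dataset assembly code']
```

The check captures both halves of the framework's test: a component must be present (completeness) and it must be openly licensed (openness). The optional dataset is not flagged when it is absent.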

Stable definition

Stefano Maffulli presents at the Digital Public Goods Alliance (DPGA) Member Summit in Addis Ababa. Image courtesy of OSI

OSI calls its official releases of definitions “stable versions,” much like a company might release an application that has been thoroughly tested and debugged before prime time. OSI deliberately avoids calling them “final releases” because parts of the definitions are likely to evolve.

“We can't expect this definition to last 26 years like the Open Source Definition,” Maffulli says. “I don't think the first part of the definition, like 'what is an AI system,' will change much. But the part that we refer to in the checklist, the list of components, will depend on the technology. Who knows what the technology will be tomorrow?”

A stable Open Source AI Definition is expected to be approved by the OSI board at the All Things Open conference at the end of October. In the interim, OSI has embarked on a global roadshow across five continents to solicit more “diverse opinions” on how “open source AI” should be defined going forward. But the final changes are likely to be just “small tweaks” here and there.

“This is the final stage,” says Maffulli. “We've reached a feature-complete version of the definition. We have all the pieces we need. We have the checklist, so we're making sure there are no surprises, that there are no systems that should be included or excluded.”


