2023-07-26

AI has had a strong open-source tradition since the deep learning era took off more than a decade ago. But in the brave new world of generative AI, does the mostly altruistic open-source community stand a chance against the Big Tech moneybags?

That is the $1.3 trillion question: according to a recent report by Bloomberg Intelligence, that is how much the generative AI market could be worth over the next decade alone. And the answer could be "Yes."

Tech companies are worried. In May, a leaked document from a Google employee claimed that open-source AI will beat out products from both Google and the startup OpenAI, which despite the name mostly focuses on proprietary and not open-source products.

AI’s massive resource requirements have meant that companies with the deep pockets to underwrite data and computing needs have been putting out some of the most impressive AI tools. Still, the open-source community has had major successes in making free versions of AI that are quickly catching up with the tech giants' offerings. As the Google employee warns in the memo, “We have no moat.”

The battle between open-source creators and large companies has been going on since before the term “open source” came into use in 1998 to describe software released under a license that allows others to use, study, change and distribute it and its source code.

AI and previous open-source software have some differences. For example, AI has a major need for data. But the history of open-source software could still reveal something about the future of open-source AI.

The Linux Lesson

When Linux, an open-source operating system, started taking off, Microsoft's then-CEO Steve Ballmer in 2001 called it a “cancer.” Current Microsoft President Brad Smith has since walked that back, and the company is now a major supporter of the Linux Foundation, a nonprofit that supports the development and adoption of Linux.

Linux today is one of the most successful examples of open-source software and the basis for many computing systems. Almost 40 percent of all websites rely on Linux, and more than 85 percent of phones use Android, which is based on Linux.

Generative AI could follow a similar path. Its foundational models, huge and expensive to train, are the sort of open-source products that could spawn multiple, bespoke offshoots, according to Irving Wladawsky-Berger, a researcher affiliated with MIT.

“Usually the more infrastructure layers that are widely used by different companies, those tend to be the most open source because there is usually no differentiation,” he said. “I think that's what will happen with AI -- the higher you go in the stack, the more proprietary you get.”

As in other areas of the software industry, open-source AI is thriving, but that doesn’t mean it's without challenges, according to Derek Slater, a founding partner of the tech policy, strategy and advocacy firm Proteus Strategies. The software industry has changed since open source got its start and is now increasingly dominated by tech giants that can undermine open-source projects.

"Open source is not magic pixie dust that eliminates all questions about market concentration,” Slater says.

Self-Serving

There have always been challenges in open-source software when one or two companies become the main backers of a project, crowding out others and ending up serving their interest the most. For example, Google has restricted Android in ways that have frustrated developers and caused governments around the world to pursue antitrust claims.

"We still have to figure out how to make sure this isn't just the province of large capitalized entities," Slater says.

Other issues came up in the 1990s when software companies tried to game open-source systems, according to Christopher Tozzi, a senior lecturer at Rensselaer Polytechnic Institute and author of “For Fun and Profit: A History of the Free and Open Source Software Revolution.”

“People accused Microsoft of doing this and the term was ‘embrace, extend, extinguish,’” he says. “The charge was that Microsoft was basically adding proprietary extensions onto open-source products in ways that made the open-source products no longer compatible and you had to use Microsoft's special version.”

The same could happen with generative AI: a company could end up creating an open-source large language model that gets tightly coupled with proprietary pieces of software users would need for the open-source product to work at all.

“It would be like, sure, the core of it is open source,” Tozzi says. “But practically speaking, it doesn't have the freedoms that open source is supposed to deliver.”

Muddied Language

The community has spent the last 30 years carefully defining open source and creating licenses for it. Despite that work, some companies are calling their AI releases open source when they are not, according to Stephen O'Grady, an industry analyst with RedMonk.

“Many of the players want the upside of open source without any of the costs,” he says. “Which in many cases is somebody taking this code and potentially competing with you.”

Meta recently said it was releasing a code base as open source when in reality it was “source available,” which limits what users can do. Muddied language means it's unclear for companies and developers what can be built on the software or AI. That can diminish the uptake and stifle open-source development.

This history just isn’t known to some working in AI, many of whom were children or weren’t born yet when those fights took place, points out O'Grady.

“I think one of the problems that we have with open source today is that it's taken for granted,” he says. “They never had to sort of contend with people trying to kill it.”

High Stakes

Despite challenges, the open-source movement is still strong, according to Christine Peterson, who coined the term “open-source software” in 1998 and is co-founder and senior fellow at the Foresight Institute, a research non-profit that promotes emerging technologies.

“Things that are not proprietary can always use more resources,” she says. “But I think that open-source effort is still ongoing and we can still build on that.”

The stakes of getting it right could be even higher for open-source development in AI. Openness has played a critical role in democratic societies, and that includes open-source software, points out Peterson. Authoritarian governments can collect far more data than countries with civil liberties are able to amass. That data could give China an edge unless it has to keep competing with both open-source AI, built by many people, and the deep pockets of Big Tech, without power concentrating in one place.

“It's possible to make the case that ultimately, enabling open-source artificial intelligence to continue may be necessary for the continued openness of the more democratically oriented countries in our world,” Peterson says. “An open approach enables a wide variety of flowers to bloom in the more open countries.”