Meta’s Open Source Llama Upsets the AI Horse Race (www.wired.com)

submitted 1 year ago by Xepher@lemm.ee to c/technology@lemmy.world

Meta is giving its answer to OpenAI’s GPT-4 away for free. The move could intensify the generative AI boom by making it easier for entrepreneurs to build powerful new AI systems.

In May an anonymous memo apparently written by a Google researcher concerned about the company’s future leaked online. It argued that, while executives squabbled about the competitive threat of text-generation technology from OpenAI, open source software was “quietly eating our lunch.”

As proof, the memo cited Llama, a large language model made by Meta that was initially available only to researchers by invitation but within days leaked on 4Chan, and quickly became popular with programmers who adapted and built on the project. Within weeks of its release, variants called Alpaca and Vicuna were nearly as good as ChatGPT but agile enough to customize on a laptop computer. “The impact on the community cannot be overstated,” the leaked Google memo said. “Suddenly anyone is able to experiment.”

Last week, Meta released the second version of its unexpectedly popular model, Llama 2. This time, it is open source and free for commercial use from the start. The new version was made using 40 percent more data than the original, and a chatbot built with the model is capable of generating results on par with OpenAI’s ChatGPT, Meta claims.

Just like ChatGPT, Google’s Bard, and other generative AI models released recently, Llama 2 likely cost millions to create. But only Meta’s system is available for free to developers, startups, and others interested in creating custom variations of the model. By supplying a cheaper option, Meta’s Llama 2 makes it easier for small companies or lone coders to create new products and services, potentially accelerating the current AI boom.

Meta isn’t offering up Llama 2 alone. It has support from major partners that are already making the model available to their customers, including AI startups Hugging Face, Databricks, and OctoML.

Microsoft, which has invested $10 billion in OpenAI, will nonetheless also offer Llama 2 downloads to developers for use in the cloud or on Windows. At a conference for Microsoft customers last week, CEO Satya Nadella talked excitedly about developers being able to use Meta’s open source AI alongside the proprietary offerings of OpenAI. Amazon’s cloud division, AWS, also offers access to Llama 2.

Ahmad Al-Dahle, Meta’s vice president for generative AI, declines to say what role the leak of the first Llama model played in the company’s new strategy for Llama 2.

“If you look back at Meta’s history, we've been a huge proponent of open source,” he says, pointing to the example of PyTorch, a popular tool for developers working with machine learning. “One of the major motivations for building a community around this was that we saw there was demand beyond researchers to work on these models and improve them.” Al-Dahle says work is already underway on the development of Llama 3, but he would not specify how it will be different.

Though Llama 2 lends credibility to Meta as a leader in open source AI, not all aspects of the release can be characterized as open. The training data used to create the model is described in release materials only as “publicly available online sources,” and the company won’t offer further details about what went into the model’s creation.

Meta’s license for Llama 2 also requires companies with more than 700 million monthly active users to establish a separate license agreement with Meta. It is not clear why, but the clause creates a barrier to other tech giants building on the system. The model also comes with an acceptable use policy, which prohibits generating malicious code, promoting violence, or enabling criminal activity, abuse, or harassment. Meta did not respond to a question about what actions it might take if Llama 2 was used in breach of that policy.

Jon Turow, an investor at Madrona Ventures in Seattle, says Meta’s pivot from trying to restrict distribution of the first Llama model to open-sourcing the second could enable a new wave of creativity using large language models. “Developers and entrepreneurs are very resourceful, and they are going to find out what they can squeeze out of Llama 2,” he says.

Turow likens Meta’s choice to release Llama 2 this month to Google introducing the Android mobile operating system in 2007 to rival Apple’s iOS. By giving away a cheap but powerful alternative, Meta can become a counterbalance to proprietary systems like the kind developed by OpenAI, sparking innovation that could feed back ideas that help improve Meta products and services.

Llama 2 is the first openly released model on par with ChatGPT, says Nathan Lambert, an AI researcher at Hugging Face, a startup that releases open source machine-learning software, including generative models. He doesn’t consider the project truly open source, because of Meta’s limited disclosures about its development, but he is astonished by the number of Llama 2 variations he sees in his social media feed. One example is the latest version of WizardLM, an AI system, similar to ChatGPT, designed to follow complex instructions. Eight out of 10 models trending currently on Hugging Face, a number of which are made to generate conversational text, are variations of Llama 2.

“I think there’s a case to be made that Llama 2 is the biggest event of the year in AI,” Lambert says. He says proprietary models have the advantage today, but he believes that later versions of Llama will catch up and, before long, will be able to perform most tasks that people turn to ChatGPT for today.

Lambert also says the Llama 2 release leaves a number of questions unanswered, in part due to the lack of documentation of training data. And it will still remain the case that only major players like Meta, Google, Microsoft, and OpenAI will have the computing resources and staff needed to make leading large language models.

But he is hopeful that, despite the success of OpenAI’s proprietary approach, language models are shifting into an era of transparency. A voluntary agreement between the White House and seven major AI companies calls for tests of things like potential for discrimination or impact on society or national security before deployment.

It’s a trend that could be challenged by growing questions about legal liability for AI systems and increasing regulatory pressure from politicians, who fear that malicious actors will start using open source models.

Like Demis Hassabis, the AI researcher now leading Google’s AI development, Turow disagrees with the leaked memo’s assertion that Google and other major AI companies are threatened by open source AI. He thinks data, talent, and access to computing power will continue to protect the biggest tech companies, but not make them invincible.

He’s now watching to see what startups and researchers do with Llama 2, expecting to see them rapidly improve it, as happened with the first iteration of Meta’s model. He says that should create new possibilities for both startups and the broader field of AI. “We're seeing open source continually get better and better, so there may be surprises that upset the early leaders,” Turow says. “I don't know what will happen.”

defiant@lemm.ee 4 points 1 year ago

The restrictions on commercial use mean it does not meet the non-discrimination part of the open source definition.

This blog post from the OSI explains it well: https://blog.opensource.org/metas-llama-2-license-is-not-open-source/