Meta releases new model LLaMA to roam a world of public data

Meta introduces LLaMA, saying it shows that state-of-the-art language models can be trained using publicly available datasets alone, without resorting to proprietary data

Meta - the company formerly known as Facebook - has thrown its generative AI model hat into the ring in the form of LLaMA, the Large Language Model Meta AI. 

In a blog post, Meta states that this foundational large language model will be released to help advance research in the field of AI. The company says it has committed itself to open science, and LLaMA is a part of that. 

Smaller and more performant models like LLaMA enable researchers who lack access to substantial infrastructure to study these models, democratising access to this important and fast-changing field, says Meta.

“We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets,” say the report authors.

Guidelines required for responsible AI

The LLaMA model takes a sequence of words as an input and predicts the next word to generate text recursively, much like other large language models. The model was trained on text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets. 
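For readers curious what "predicting the next word recursively" looks like in practice, the sketch below shows a minimal greedy decoding loop using the open-source Hugging Face transformers library. It is an illustration only, not Meta's own code: the model path is a placeholder, since LLaMA weights are released on request for research use.

```python
# Minimal sketch of an autoregressive generation loop (greedy decoding).
# Assumes the transformers library and locally downloaded LLaMA weights;
# "path/to/llama-7b" is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-7b"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

prompt = "Large language models are"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Generate recursively: feed the growing sequence back into the model
# and append the highest-scoring next token at each step.
for _ in range(20):
    logits = model(input_ids).logits            # scores over the vocabulary
    next_id = logits[:, -1, :].argmax(dim=-1)   # pick the most likely next token
    input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```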

Researchers believe that the entire AI community, including policymakers, academics, civil society, and industry, must collaborate to develop clear guidelines on responsible AI and large language models. 

Meta hopes that releasing LLaMA will promote further research in these crucial areas. Access to the model will be granted on a case-by-case basis for research use cases.

“There is still more research that needs to be done to address the risks of bias, toxic comments, and hallucinations in large language models,” write the authors. “Like other models, LLaMA shares these challenges. As a foundation model, LLaMA is designed to be versatile and can be applied to many different use cases, versus a fine-tuned model that is designed for a specific task. 

“By sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating these problems in large language models.”
