LLaMA 2: Meta’s Open Source AI Model
Is the newest LLM in town worth the hype?
A couple of days ago, Meta released its latest LLM, Llama 2, in collaboration with Microsoft. If you have been following the LLM hype, you might have already heard about it or even read about its new features. To simplify things, here are four reasons why Llama 2 is generating so much hype, along with a look at how it compares with some of the best LLMs.
Free for Research and Commercial Use
One significant reason Llama 2 has caught people’s interest is that Meta made the entire model free for almost everyone; the exception is very large companies, which need to request a separate license from Meta. This move opens up exciting opportunities for individuals thinking of starting their own businesses or venturing into the world of Generative AI. Now is the perfect time to dive into the waters of AI, especially with a language model of this caliber being freely accessible. Multiple open-source models were already available, but none of them came from a company of Meta’s stature, and none could serve as a direct competitor to GPT.
“There have been public releases of pretrained LLMs (such as BLOOM (Scao et al., 2022), LLaMa-1 (Touvron et al., 2023), and Falcon (Penedo et al., 2023)) that match the performance of closed pretrained competitors like GPT-3 (Brown et al., 2020) and Chinchilla (Hoffmann et al., 2022), but none of these models are suitable substitutes for closed “product” LLMs, such as ChatGPT, BARD, and Claude.” — Meta Research Paper
According to the results reported in Meta’s research paper, Llama 2 (in its 7B, 13B, and 70B chat versions) outperforms other open-source models on the helpfulness and safety benchmarks, and the paper presents it as competitive with ChatGPT on these measures. However, it is important to note that the paper acknowledges its evaluation data may be biased in favor of Llama 2, which should be taken into consideration when interpreting the results. Nevertheless, even if Llama 2 only comes close to ChatGPT on these benchmarks, it deserves commendation.
One of the most significant factors contributing to the safety of Llama 2 is data privacy. Unlike hosted models, Llama 2 does not require sending your data to an external server, such as OpenAI’s, to fetch responses. This makes the model particularly valuable for critical and sensitive use cases, as it helps safeguard users’ data and maintain their privacy. Users can run the model on private servers, with their data staying within their own infrastructure.
The most popular LLMs currently in use operate as black boxes, with users having limited insight into their functioning. In contrast, open-source models provide a transparent approach, allowing users to understand their inner workings. This transparency instills confidence and assurance when using such models, despite the challenges they may face, such as generating spam or disinformation.
Additionally, the open-source nature of these models encourages collaborative efforts, leading to continuous improvement and development in the field of LLMs. As a result, open-source models play a crucial role in driving advancements in the world of language models.
“And we believe it’s safer. Opening access to today’s AI models means a generation of developers and researchers can stress test them, identifying and solving problems fast, as a community. By seeing how these tools are used by others, our own teams can learn from them, improve those tools, and fix vulnerabilities.” — Meta Website
Although Llama 2 is openly licensed, Meta has still not disclosed the data it was trained on, which remains a concern for the privacy of Meta users’ data. In the Llama 2 research paper, Meta says it “made an effort to remove data from certain sites known to contain a high volume of personal information about private individuals”, but it does not list those sites.
Llama 2 was released in three sizes: 7B, 13B, and 70B parameters. (The research paper also describes a 34B variant, but it was not part of the initial release.) The size refers to the number of parameters in the model. Generally, larger models give more accurate and reliable responses, but they also require greater computational resources. To make its responses more helpful and human-like, the chat version of Llama 2 is fine-tuned using instruction tuning and RLHF (Reinforcement Learning with Human Feedback), an approach that is also used for ChatGPT.
While 70B parameters is substantial, it is still well short of the 175B parameters of GPT-3 (OpenAI has not published the size of GPT-3.5). As a result, Llama 2’s performance may not match that of GPT-3.5, but benchmark tests indicate close competition despite the smaller size. According to the paper, Llama 2 also outperforms the other open-source models it was compared against on most benchmarks.
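To make the link between parameter count and computational resources more concrete, here is a rough back-of-the-envelope sketch. It only estimates the memory needed to hold the weights, and it assumes common storage precisions (4 bytes per parameter for fp32, 2 for fp16, and so on). The function name and the precision table are illustrative, not part of any Llama 2 tooling, and real deployments need extra memory for activations and caches.

```python
# Rough estimate of the memory needed just to store a model's weights.
# Assumption: every parameter is stored at the same precision; activations,
# the KV cache, and framework overhead are ignored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Released Llama 2 sizes, expressed as parameter counts.
    for name, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
        row = ", ".join(
            f"{p}: {weight_memory_gb(params, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"Llama 2 {name} -> {row}")
```

Under these assumptions the 7B model needs about 14 GB at fp16 while the 70B model needs about 140 GB, which is why the larger sizes are usually run on multiple GPUs or in a lower-precision format.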
“RLHF is a model training procedure that is applied to a fine-tuned language model to further align model behavior with human preferences and instruction following. We collect data that represents empirically sampled human preferences, whereby human annotators select which of two model outputs they prefer. This human feedback is subsequently used to train a reward model, which learns patterns in the preferences of the human annotators and can then automate preference decisions.” — Meta Research Paper
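The idea of a reward model learning from "which of two outputs they prefer" can be illustrated with a pairwise ranking loss. The sketch below is a generic formulation that is common for reward models; treating it as Meta's exact training objective would be an assumption, and the function name and the example scores are hypothetical. The loss is small when the reward model scores the human-preferred answer higher than the rejected one, and large when it gets the order wrong.

```python
import math


def pairwise_ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    reward_chosen / reward_rejected are the scalar scores a (hypothetical)
    reward model assigns to the answer the annotator preferred and to the
    answer the annotator rejected.
    """
    diff = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Made-up scores, only to show how the loss behaves.
    print(pairwise_ranking_loss(2.0, 0.0))  # model agrees with the annotator
    print(pairwise_ranking_loss(0.0, 0.0))  # model cannot tell the answers apart
    print(pairwise_ranking_loss(0.0, 2.0))  # model disagrees with the annotator
```

Minimizing this loss over many annotated pairs pushes the reward model to rank answers the way the annotators did, which is what lets it "automate preference decisions" in the later reinforcement learning step.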
There is indeed a multitude of open-source models emerging, and with the release of Llama 2, the possibilities seem limitless. While it may take some time for these open-source models to directly compete with something as advanced as GPT-4, the excitement lies in getting a model that comes close to the capabilities of GPT-3.5. This progress in itself is truly remarkable.
Looking ahead, as LLM training becomes more efficient, the potential for having a personalized ChatGPT, fine-tuned with your data on your local device, becomes a tantalizing prospect. One platform that offers such capabilities is Cody, an intelligent AI assistant tailored to support businesses in various aspects. Much like ChatGPT, Cody can be trained on your business data, team, processes, and clients, using your unique knowledge base.
With Cody, businesses can harness the power of AI to create a personalized and intelligent assistant that caters specifically to their needs, making it a promising addition to the world of AI-driven business solutions.