Falcon 180B: The Most Powerful Open LLM Is Here!

Large language models (LLMs) have taken the world by storm ever since the release of ChatGPT. Almost every major tech company or educational institution wants to release its own LLM, but very few have done so at a scale that can match the top contender, OpenAI’s GPT. One exception is Falcon, an LLM from the Technology Innovation Institute (TII) in Abu Dhabi, which is challenging some of the most powerful models from major tech companies. Now TII has released its most powerful model to date: Falcon 180B.

Technical Specification of Falcon 180B

Falcon 180B is a state-of-the-art LLM and, as the name suggests, has 180 billion parameters, making it the largest open LLM currently available. A general rule of thumb within the machine learning community is that models with more parameters tend to outperform models with fewer (though there are quite a few exceptions). Let’s take a look at the technical specifications of Falcon 180B.

  • 180 billion parameters.
  • Trained on 3.5 trillion tokens using TII’s RefinedWeb dataset.
  • Longest single-epoch pretraining for an open model. One reading is that the model is highly data efficient: it can extract patterns, nuances, and information without needing to see the data multiple times. Another reading, which I think is more likely, is computational efficiency: the model reached good results within a single epoch, which saves time and money.
  • The model was trained using 4096 GPUs simultaneously on Amazon SageMaker.
  • Trained for a total of 7 million GPU hours (that is quite some time).
  • Falcon 180B is a scaled-up version of Falcon 40B, with major innovations such as multi-query attention for scalability.
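
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers quoted in the list; the function names are my own, and the wall-clock estimate assumes every one of the 4096 GPUs was busy for the whole run, which is an idealisation rather than something TII has stated.

```python
# Figures quoted in the specification list above.
PARAMETERS = 180e9        # 180 billion parameters
TRAINING_TOKENS = 3.5e12  # 3.5 trillion tokens
GPU_COUNT = 4096          # GPUs used simultaneously
GPU_HOURS = 7e6           # total GPU hours


def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """How many training tokens the model saw per parameter."""
    return tokens / parameters


def wall_clock_days(gpu_hours: float, gpu_count: int) -> float:
    """Approximate training duration in days.

    Assumption: all GPUs ran for the entire job with no downtime.
    """
    return gpu_hours / gpu_count / 24


print(f"Tokens per parameter: {tokens_per_parameter(TRAINING_TOKENS, PARAMETERS):.1f}")
print(f"Approximate wall-clock time: {wall_clock_days(GPU_HOURS, GPU_COUNT):.0f} days")
```

Under those assumptions the run works out to roughly 19 tokens per parameter and a little over two months of continuous training.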

Falcon 180B Comparison

With its 180 billion parameters, Falcon 180B is larger than, and outperforms, OpenAI’s GPT-3.5 and Meta’s Llama 2 70B. It was trained with four times more compute than Llama 2. This is an incredible feat considering that the model is openly available for anyone to use, under a licence that allows limited commercial use.

In terms of performance, it is comparable to Google’s PaLM 2 and sits between GPT-3.5 and GPT-4, according to Hugging Face. It is also currently leading Hugging Face’s leaderboard for openly released LLMs.

Training Falcon 180B

Training a model as large as Falcon 180B is no small feat. The memory and GPU requirements vary based on the training method. Below are the configurations reported by the team; note that these are not minimum requirements, just the hardware they had access to.

  1. Full Fine-tuning:
  • Memory: A whopping 5120GB (5.12TB)
  • Configuration: Think of it as 8 sets of 8x A100 GPUs, with 80GB of memory per GPU.
  2. LoRA with ZeRO-3:
  • Memory: 1280GB (1.28TB)
  • Configuration: This method uses 2 sets of 8x A100 GPUs, with 80GB of memory per GPU.
  3. QLoRA:
  • Memory: A more modest 160GB
  • Configuration: Here, you’d need 2x A100 GPUs, each with 80GB of memory.
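
The memory figures above are simply the total GPU memory of each configuration. A minimal sketch that reproduces them, assuming "sets" means nodes and using a hypothetical `Cluster` helper of my own (it is not part of any library):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical helper describing a group of identical GPU nodes."""
    nodes: int
    gpus_per_node: int
    gb_per_gpu: int

    @property
    def total_gb(self) -> int:
        # Total GPU memory across every GPU in the cluster.
        return self.nodes * self.gpus_per_node * self.gb_per_gpu


# Configurations as listed above (A100 GPUs with 80GB each).
TRAINING_SETUPS = {
    "Full fine-tuning": Cluster(nodes=8, gpus_per_node=8, gb_per_gpu=80),
    "LoRA with ZeRO-3": Cluster(nodes=2, gpus_per_node=8, gb_per_gpu=80),
    "QLoRA": Cluster(nodes=1, gpus_per_node=2, gb_per_gpu=80),
}

for name, cluster in TRAINING_SETUPS.items():
    print(f"{name}: {cluster.total_gb} GB")
```

Going from full fine-tuning to QLoRA cuts the listed hardware footprint by a factor of 32, which is why parameter-efficient methods matter so much at this scale.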

Deploying Falcon 180B: Inference Hardware Insights

Once trained, using Falcon 180B for inference also demands substantial hardware:

  1. BF16/FP16 Precision:
  • Memory: 640GB
  • Configuration: This requires 8x A100 GPUs, each with 80GB of memory.
  2. GPTQ/int4 Precision:
  • Memory: 320GB
  • Configuration: For this precision level, you’d need 8x A100 GPUs, each with 40GB of memory.
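
For a sense of where those numbers come from, here is a rough sketch comparing the memory needed just to hold the weights with the hardware listed above. The bytes-per-parameter values are the standard sizes for each format (2 bytes for BF16/FP16, half a byte for 4-bit); real GPTQ checkpoints carry some extra overhead for quantization metadata, so treat the int4 figure as a lower bound. The function name is my own.

```python
PARAMETERS = 180e9  # 180 billion parameters

# Storage cost of a single weight in each format.
BYTES_PER_PARAM = {
    "BF16/FP16": 2.0,  # 16 bits
    "GPTQ/int4": 0.5,  # 4 bits, ignoring quantization metadata
}

# Total GPU memory of the configurations listed above.
LISTED_GB = {
    "BF16/FP16": 8 * 80,  # 8x A100 80GB
    "GPTQ/int4": 8 * 40,  # 8x A100 40GB
}


def weight_memory_gb(parameters: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return parameters * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    weights = weight_memory_gb(PARAMETERS, nbytes)
    listed = LISTED_GB[precision]
    print(f"{precision}: ~{weights:.0f} GB of weights, {listed} GB of GPU memory listed")
```

The gap between the two numbers is headroom for things like activations, the attention cache, and framework overhead, and it also reflects that these are the configurations the team used rather than strict minimums.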

Currently, two variations of the model have been released to the public: a base model and a chat model, both available on Hugging Face along with a chat demo. Looking at the requirements, you can tell this is an enormous model; even running inference is out of reach for a hobbyist’s own hardware. But hey, there are services out there such as Runpod that let you rent the GPUs needed to run these models, though it is going to cost you a pretty penny.

Demo of Falcon 180B for chat

The most interesting part of Falcon is going to be the downstream tasks and what the community builds on top of this large model. Can’t wait to see those!
