DBRX
DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks and released on March 27, 2024. It is a mixture-of-experts (MoE) Transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token. The released weights come in two variants: a base foundation model and an instruction-tuned model.
| Developer(s) | Mosaic ML and Databricks team |
|---|---|
| Initial release | March 27, 2024 |
| Repository | https://github.com/databricks/dbrx |
| License | Databricks Open Model License |
| Website | https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm |
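The 4-of-16 expert routing described in the lead can be illustrated with a short, self-contained sketch. The layer below is a generic top-k mixture-of-experts block in PyTorch; the dimensions, expert shapes, and router design are illustrative placeholders, not DBRX's published hyperparameters.

```python
# Minimal sketch of fine-grained top-k mixture-of-experts routing:
# 16 expert networks per layer, with the router picking the top 4 per token.
# All sizes here are illustrative, not DBRX's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                              # (tokens, n_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # keep the 4 best experts
        weights = F.softmax(top_vals, dim=-1)                # normalize over the chosen 4
        out = torch.zeros_like(x)
        # Only the selected experts run for a given token, so only a fraction
        # of the expert parameters are "active" on each forward pass.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = TopKMoELayer()
tokens = torch.randn(8, 512)   # 8 tokens, model dimension 512
print(layer(tokens).shape)     # torch.Size([8, 512])
```

Because only 4 of the 16 expert networks run for any given token, most of the expert parameters sit idle on each forward pass, which is how a 132-billion-parameter model can use only about 36 billion parameters per token.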
DBRX outperforms other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok, as well as closed-source models such as GPT-3.5, on benchmarks covering language understanding, programming, and mathematics. As of March 28, 2024, this made DBRX the most powerful open-source model available.
It was trained over 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabits-per-second InfiniBand, at a training cost of approximately US$10 million.
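Since the released weights come in base and instruct-tuned variants, a minimal loading example is sketched below. The Hugging Face model identifier `databricks/dbrx-instruct`, the chat-template call, and the memory estimate are assumptions about the release artifacts rather than details stated in this article.

```python
# A hedged sketch of loading DBRX Instruct with Hugging Face transformers.
# The model id "databricks/dbrx-instruct" and the hardware notes are
# assumptions about the release, not facts taken from this article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,  # ~2 bytes/param, so ~264 GB of weights for 132B params
    device_map="auto",           # shard the weights across all visible GPUs
)

messages = [{"role": "user", "content": "What is a mixture-of-experts model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Depending on the transformers version, passing `trust_remote_code=True` to the loading calls may also be required.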