DBRX

DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks and released on March 27, 2024. It is a mixture-of-experts Transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token. The model was released in two variants: a base foundation model and an instruction-tuned model.
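The sparse routing described above can be illustrated with a minimal sketch. This is not the DBRX implementation; the layer sizes, module names, and router design below are simplified assumptions. The idea is that a router scores every expert for each token, only the 4 highest-scoring of 16 experts are evaluated for that token, and their outputs are combined using the softmax-normalized router weights, so most of the model's parameters stay inactive on any given token.

```python
# Illustrative top-4-of-16 mixture-of-experts layer (simplified sketch,
# NOT the DBRX source code; dimensions and router design are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                  # x: (n_tokens, d_model)
        scores = self.router(x)                            # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # keep 4 of 16 experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = chosen[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():                             # only selected experts run
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(8, 512)                               # 8 example token embeddings
print(SimpleMoELayer()(tokens).shape)                      # torch.Size([8, 512])
```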

DBRX
Developer(s): Mosaic ML and Databricks
Initial release: March 27, 2024
Repository: https://github.com/databricks/dbrx
License: Databricks Open Model License
Website: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

DBRX outperforms other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok, as well as the closed-source GPT-3.5, on several benchmarks spanning language understanding, programming, and mathematics. As of March 28, 2024, this made DBRX the world's most powerful open-source model.

It was trained over the course of 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabit-per-second InfiniBand networking, at a training cost of approximately US$10 million.
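As a rough back-of-the-envelope check, the reported cluster size and duration are consistent with a cost on this order. The per-GPU-hour rate below is an illustrative assumption, not a figure from the source.

```python
# Sanity check of the reported ~$10M training cost.
# The hourly H100 rate is a hypothetical assumption for illustration only.
gpus = 3072                  # Nvidia H100s, as reported
hours = 2.5 * 30 * 24        # ~2.5 months of wall-clock training
gpu_hours = gpus * hours     # ~5.5 million GPU-hours
assumed_rate_usd = 1.80      # hypothetical cost per H100-hour
print(f"{gpu_hours:,.0f} GPU-hours -> ${gpu_hours * assumed_rate_usd / 1e6:.1f}M")
# 5,529,600 GPU-hours -> $10.0M
```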
