Closed Source Models


Closed-source LLMs are proprietary language models whose source code, weights, and architecture are kept private. Unlike their open-source counterparts, these models do not expose their internal workings. This approach offers enhanced data security and control over intellectual property, making them suitable for industries such as healthcare, finance, and legal services, where data privacy and confidentiality are paramount.



Models

| Model | Developed by | Description |
| --- | --- | --- |
| GPT-3 | OpenAI | GPT-3 caused quite a stir when it was unveiled in May 2020. With a staggering 175 billion parameters, it far surpassed its predecessor GPT-2 and promised major advances in NLP. |
| GPT-4 | OpenAI | OpenAI's latest model astounds with human-like text generation and creative responses to prompts, continuing the company's dominance in large language models. |
| Megatron-Turing NLG | Microsoft & NVIDIA | NVIDIA and Microsoft announced Megatron-Turing NLG with 530 billion parameters, three times more than its closest competitors. The model is powered by DeepSpeed and the Megatron transformer framework. |
| Jurassic-1 | AI21 Labs | AI21 Labs billed Jurassic-1 as the largest language model available for public use at launch. With 178 billion parameters, it edges out GPT-3 in size and has five times the vocabulary. Trained on a massive 300-billion-token dataset called Jumbo, culled from websites, it demonstrates leading scale and performance. |
| Gopher | DeepMind | DeepMind's 280-billion-parameter model nearly halves the gap to human performance relative to GPT-3, surpassing forecasts and state-of-the-art models on 81% of tasks. DeepMind asserts this giant transformer makes major strides toward matching human linguistic skills. |
| Chinchilla | DeepMind | With 70 billion parameters and 1.4 trillion training tokens, Chinchilla shows that scaling model size and training data equally is optimal. Using the same compute as Gopher but 4x more training data, it proves formidable, advancing capabilities within a fixed compute budget. |
| LaMDA | Google | Google's 137-billion-parameter model gained notoriety when an engineer controversially called it sentient during testing. Built by fine-tuning transformer models on a massive 1.5-trillion-word dataset, LaMDA was pre-trained on 40x more data than prior models, but claims of its sentience sparked intense debate. |
| AlexaTM | Amazon | Amazon's 20-billion-parameter model uses an encoder-decoder architecture that excels at translation despite being just 1/8 the scale of GPT-3. It surpassed GPT-3 on the SQuAD and SuperGLUE benchmarks, demonstrating Amazon's prowess in efficient language model design. |
| BloombergGPT | Bloomberg | A large generative model tailored for finance. Trained on huge financial datasets, this specialized language model shows promise for advancing natural language processing in the intricate world of banking and investments. |
| PanGu-Σ | Huawei | Trained on Ascend 910 AI processors using the MindSpore framework, PanGu-Σ underwent rigorous training on 329 billion tokens over a hundred days. |
| Kosmos-1 | Microsoft | A pioneering multimodal large language model combining language and vision. In the paper "Language Is Not All You Need: Aligning Perception with Language Models," Microsoft argues that Kosmos-1 represents a critical advance in comprehension, far surpassing traditional text-only models. |
| Wu Dao 2.0 | Beijing Academy of Artificial Intelligence (BAAI) | With an astonishing 1.75 trillion parameters, Wu Dao 2.0 dwarfs GPT-3 and Google's models. It handles both English and Chinese, conversing, composing poems, generating recipes, and more, representing China's latest breakthrough in enormous multilingual language models. |
| Galactica | Meta AI | The exponential growth of scientific data makes extracting insights challenging. Meta introduced Galactica, a language model trained on scientific papers to organize and reason about scientific knowledge. It outperformed previous models on scientific tasks but was withdrawn three days after launch over concerns, despite its potential. |
| HyperCLOVA | Naver Corp | Trained on a huge 560-billion-token dataset. An upgraded multimodal version, HyperCLOVA X, integrates speech and images. Naver Cloud CEO Kim Yu-won believes this massive Korean language model could be a game changer for natural language processing. |
| ERNIE 3.0 Titan | Baidu & Peng Cheng Laboratory | Baidu and Peng Cheng Lab created a mammoth 260-billion-parameter model. Trained on massive knowledge graphs and data, it has topped benchmarks on over 60 natural language tasks. Baidu touted it as the first hundred-billion-scale knowledge-enhanced Chinese model. |
| Claude | Anthropic | A cutting-edge Anthropic creation stemming from the company's research on ethical AI training. It excels at conversational and text tasks while upholding reliability and predictability. Key features include Constitutional AI principles, extensive language model training, and integrations with Notion and Robin AI. |
| Claude 2.1 | Anthropic | Offers advanced capabilities with a 200K-token context window, enhancing comprehension and summarization of long documents. It improves accuracy, reduces incorrect answers by 30%, and supports diverse applications, including translation, summarization, Q&A, and document analysis. |
| Grok | xAI | Inspired by The Hitchhiker's Guide to the Galaxy, Grok not only answers questions but also suggests what to ask. With a touch of wit and a rebellious streak, it adds humor to responses. Real-time knowledge from the 𝕏 platform and a willingness to tackle spicy questions set it apart from other AI systems. |
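
Because closed-source models are reachable only through token-limited APIs, it helps to estimate whether a document plausibly fits a model's context window (such as Claude 2.1's 200K tokens) before sending it. The sketch below uses a rough rule of thumb of ~4 characters per token for English text; that ratio and the `reserve_for_output` headroom are assumptions for illustration, not a provider's official tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate for English text.

    Assumes ~4 characters per token, a common heuristic; a vendor's
    real tokenizer may count noticeably differently.
    """
    return max(1, round(len(text) / chars_per_token))


def fits_context_window(text: str, window: int = 200_000,
                        reserve_for_output: int = 4_000) -> bool:
    """Check whether `text` plausibly fits a model's context window,
    leaving headroom for the model's reply."""
    return estimate_tokens(text) + reserve_for_output <= window


# Example: a ~500-page book at ~2,000 characters per page
book = "x" * (500 * 2000)            # ~1,000,000 chars ≈ 250,000 tokens
print(fits_context_window(book))          # too large for a 200K window
print(fits_context_window(book[:400_000]))  # ~100K tokens fits easily
```

For production use, replace the heuristic with the provider's actual token-counting endpoint or tokenizer, since heuristics drift badly for code, non-English text, and dense markup.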