Datasets

Falcon Dataset: A large English web dataset used to train the Falcon LLM.
The MusicCaps dataset contains 5,521 music examples, each labeled with an English aspect list and a free-text caption written by musicians.
MMLU (Massive Multitask Language Understanding) is a benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. (Papers with Code)
The MMMU (Massive Multi-discipline Multimodal Understanding and Reasoning) benchmark assesses multimodal models on college-level tasks, including image and text retrieval, question answering, and language modeling. It gauges AI models' ability to understand and reason across diverse disciplines. (website)
HumanEval is a benchmark of 164 hand-written Python programming problems for evaluating the functional correctness of code produced by code generation models. (Its multilingual extension is HumanEval-X.) (Papers with Code)
GSM8K is a dataset of about 8.5K high-quality, linguistically diverse grade school math word problems. It was created to support question answering on basic mathematical problems that require multi-step reasoning. (Papers with Code)
ShareGPT: Lets users share their conversations with GPT chatbots; the shared conversations can then be used to fine-tune models.
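
The MMLU entry above mentions zero-shot and few-shot evaluation. As a rough illustration of the difference, the sketch below builds a multiple-choice prompt with or without worked examples. The field names (`question`, `choices`, `answer`) and the prompt layout are assumptions chosen for this example, not the official MMLU format or evaluation harness.

```python
from typing import Optional, Sequence

CHOICE_LABELS = "ABCD"  # assumes four answer options per question


def format_question(question: str, choices: Sequence[str],
                    answer: Optional[str] = None) -> str:
    """Render one question; leave the answer blank if none is given."""
    lines = [question]
    for label, choice in zip(CHOICE_LABELS, choices):
        lines.append(f"{label}. {choice}")
    lines.append("Answer:" if answer is None else f"Answer: {answer}")
    return "\n".join(lines)


def build_prompt(test_item: dict, examples: Sequence[dict] = ()) -> str:
    """Zero-shot when `examples` is empty, k-shot when it holds k solved items."""
    blocks = [
        format_question(e["question"], e["choices"], e["answer"])
        for e in examples
    ]
    # The test question always comes last, with its answer left open.
    blocks.append(format_question(test_item["question"], test_item["choices"]))
    return "\n\n".join(blocks)


# Hypothetical items, invented purely for illustration.
solved = {"question": "What is 2 + 2?", "choices": ["3", "4", "5", "6"], "answer": "B"}
unsolved = {"question": "What is 3 * 3?", "choices": ["6", "8", "9", "12"]}

zero_shot_prompt = build_prompt(unsolved)
one_shot_prompt = build_prompt(unsolved, examples=[solved])
```

In the zero-shot case the model sees only the unanswered question; in the few-shot case the solved examples come first, so the model can infer the expected answer format from them.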