
The Hugging Face Datasets library provides fast, memory-efficient access to thousands of ML datasets, with a consistent API for loading, processing, and sharing data.
from datasets import load_dataset

# Load any dataset from the Hub
dataset = load_dataset("imdb")  # Movie reviews
dataset = load_dataset("squad")  # Q&A dataset
dataset = load_dataset("wikitext", "wikitext-103-raw-v1")  # Second argument selects the configuration

# Load your own data
dataset = load_dataset("csv", data_files="my_data.csv")
dataset = load_dataset("json", data_files="my_data.jsonl")
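
To make the "load your own data" case concrete, here is a minimal sketch that writes a tiny CSV and loads it back. The helper names (write_sample_csv, load_local_csv, demo), the column names, and the two sample rows are hypothetical and chosen only for illustration. It assumes the library is installed (pip install datasets) and relies on load_dataset placing files passed through data_files in a split named "train" when no split is requested.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(directory):
    """Write a tiny two-column CSV (hypothetical sample data) and return its path."""
    path = Path(directory) / "my_data.csv"
    rows = [
        {"text": "A wonderful film", "label": 1},
        {"text": "Not worth watching", "label": 0},
    ]
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["text", "label"])
        writer.writeheader()
        writer.writerows(rows)
    return path


def load_local_csv(path):
    """Load a local CSV file and return its "train" split."""
    # Imported here so write_sample_csv works without the third-party package.
    from datasets import load_dataset

    # Without a split argument, load_dataset returns a DatasetDict;
    # files passed via data_files are placed in the "train" split.
    dataset = load_dataset("csv", data_files=str(path))
    return dataset["train"]


def demo():
    """Write the sample file to a temporary directory, load it, and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        train = load_local_csv(write_sample_csv(tmp))
        print(train.column_names)  # column names come from the CSV header row
        print(train[0])            # first row as a plain dict
```

Calling demo() prints the column names and the first row. The same pattern should carry over to JSON Lines files by swapping "csv" for "json" and pointing data_files at a .jsonl file.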